Proficiency in Python, Java or Golang is preferred.
Extensive experience in feature engineering and developing data-driven frameworks that enhance identity matching algorithms.
Strong background in the foundations of machine learning and the building blocks of modern deep learning.
Deep understanding of machine learning frameworks and libraries such as TensorFlow, PyTorch, or Scikit-learn.
Experience with big data technologies like Apache Spark or Hadoop, and familiarity with cloud platforms (AWS, Azure, Google Cloud) for scalable data processing.
Familiarity with MLOps concepts for maintaining models in production, such as testing, retraining, and monitoring.
Experience with modern data storage, messaging, and processing tools (Kafka, Apache Spark, Hadoop, Presto, DynamoDB, etc.), and demonstrated experience designing and coding big-data components such as DynamoDB or similar.
Experience working in an agile team environment with changing priorities.
Experience working on AWS.
Responsibilities:
Design, implement, and refine machine learning models that improve the precision and recall of identity resolution algorithms.
Develop and optimize feature engineering methodologies to extract meaningful patterns from large and complex datasets that enhance identity matching and unification.
Develop and maintain scalable data infrastructure to support the deployment and training of machine learning models, ensuring that they run efficiently under varying loads.
Build and maintain scalable machine learning solutions in production.
Train and validate both deep learning-based and statistical models, taking into account use case, complexity, performance, and robustness.
Demonstrate end-to-end understanding of applications and develop a deep understanding of the “why” behind our models and systems.
Partner with product managers, tech leads, and stakeholders to analyze business problems, clarify requirements, and define the scope of the systems needed.
Ensure high standards of operational excellence by implementing efficient processes, monitoring system performance, and proactively addressing potential issues.
Drive engineering best practices around code reviews, automated testing, and monitoring.