Data Scientist II - Big Data R&D, Identity Graph & KYC
Socure - Identity Trust Infrastructure
Remote - US, Full-Time, Middle
Salary not disclosed
Job Details
- Experience: Master’s degree with 2+ years of experience, or Ph.D. with 1+ years of experience
- Required Skills: AWS, Python, SQL, Spark, Scala, scikit-learn, PySpark
Requirements
- Master’s degree with 2+ years of experience, or Ph.D. with 1+ years of experience in data science or analytics
- Proficiency in Python or Scala
- Solid experience writing and optimizing SQL for large datasets
- Comfort working in data lake / warehouse environments
- Hands‑on experience with Spark or PySpark
- Experience with common ML libraries (e.g., scikit‑learn, XGBoost)
- Familiarity with UNIX environments and the AWS ecosystem (e.g., EMR, S3)
- Working knowledge of supervised/unsupervised ML and basic statistics (similarity measures, clustering, evaluation metrics)
- Exposure to graph techniques or graph databases (Neo4j, AWS Neptune, GraphFrames) is a strong plus
- Experience with Elasticsearch or DynamoDB is a bonus
- Experience with workflow tools such as Airflow for automating data pipelines is a bonus
- Ability to break down loosely defined problems and iterate quickly with feedback
Responsibilities
- Contribute to the design and implementation of machine learning, data mining, statistical, and graph-based algorithms for identity verification and anomaly detection.
- Analyze large datasets to develop and refine entity-resolution and identity-matching algorithms for KYC and compliance solutions.
- Build and maintain data-processing pipelines (ETL, feature generation, normalization) using Spark/PySpark and AWS (e.g., EMR, S3).
- Support senior data scientists with feature engineering, data exploration, error analysis, and A/B test setup.
- Evaluate new third‑party and internal data sources, profile data quality, design offline experiments, and summarize impact.
- Implement and maintain SQL and Python/R code for data extraction, transformation, and validation, including code reviews and testing.
- Provide analytical support to compliance and regulatory product teams, including ad hoc investigations, dashboards, and data deep dives.
- Communicate findings clearly to peers and cross‑functional partners (Product, Engineering, Client Analysis).
- Work effectively in a fast‑paced, cross‑functional environment, demonstrating ownership of tasks.