Senior Data Engineer

United States, Full-Time, Senior

Salary: 103,500 - 192,000 USD per year

Job Details

Experience: 5+ years
Required Skills: AWS, Python, SQL, Kafka, Airflow, Spark, CI/CD, Terraform, Data modeling, Databricks

Requirements

5+ years of experience in high-volume data engineering or distributed data systems.
Strong expertise in Databricks, AWS (S3, EMR, Kinesis/Kafka), Apache Spark/Spark Streaming, Airflow, SQL, and Python (Scala or Java is a plus).
Proven experience building and maintaining large-scale batch and streaming data pipelines in production environments.
Solid understanding of data modeling (logical and physical), SQL optimization, and performance tuning.
Hands-on experience with data quality and validation frameworks (e.g., Great Expectations or similar tools).
Familiarity with Infrastructure-as-Code tools such as Terraform and cloud infrastructure best practices.
Demonstrated ability to work effectively in AI-native environments using LLM tools for code generation, review, and workflow automation.
Strong ability to evaluate, refine, and take ownership of AI-generated code and outputs before production deployment.
Excellent communication skills with the ability to work independently in a remote-first environment.
Bachelor’s degree in Computer Science, Engineering, Mathematics, or equivalent practical experience.

Responsibilities

Design, build, and maintain scalable data platforms supporting real-time analytics, batch processing, and exploratory data use cases.
Own end-to-end data pipelines, including ingestion, transformation, storage, and serving layers across large-scale distributed systems.
Develop and optimize streaming and batch pipelines using technologies such as Kafka, Kinesis, Databricks, Spark, AWS, and Airflow (MWAA).
Architect and maintain medallion (Bronze/Silver/Gold) data models to ensure clean, consistent, and well-governed datasets.
Implement robust data quality frameworks, automated testing, monitoring, and CI/CD pipelines to ensure reliability and correctness.
Build AI-powered engineering workflows, including prompts, automation scripts, and tooling that accelerate pipeline development and documentation.
Develop and maintain natural language data interfaces and chatbot solutions (e.g., Databricks Genie) for non-technical users.
Collaborate with analytics engineering, data science, and product teams to transform complex datasets into production-ready insights.
Contribute to infrastructure-as-code initiatives (Terraform and related tools) for scalable cloud resource provisioning and management.
