Senior Data Engineer
United States | Full-Time | Senior
Salary: 103,500 - 192,000 USD per year
Job Details
- Experience: 5+ years
- Required Skills: AWS, Python, SQL, Kafka, Airflow, Spark, CI/CD, Terraform, Data modeling, Databricks
Requirements
- 5+ years of experience in high-volume data engineering or distributed data systems.
- Strong expertise in Databricks, AWS (S3, EMR, Kinesis/Kafka), Apache Spark/Spark Streaming, Airflow, SQL, and Python (Scala or Java is a plus).
- Proven experience building and maintaining large-scale batch and streaming data pipelines in production environments.
- Solid understanding of data modeling (logical and physical), SQL optimization, and performance tuning.
- Hands-on experience with data quality and validation frameworks (e.g., Great Expectations or similar tools).
- Familiarity with Infrastructure-as-Code tools such as Terraform and cloud infrastructure best practices.
- Demonstrated ability to work effectively in AI-native environments using LLM tools for code generation, review, and workflow automation.
- Strong ability to evaluate, refine, and take ownership of AI-generated code and outputs before production deployment.
- Excellent communication skills with the ability to work independently in a remote-first environment.
- Bachelor’s degree in Computer Science, Engineering, Mathematics, or equivalent practical experience.
Responsibilities
- Design, build, and maintain scalable data platforms supporting real-time analytics, batch processing, and exploratory data use cases.
- Own end-to-end data pipelines, including ingestion, transformation, storage, and serving layers across large-scale distributed systems.
- Develop and optimize streaming and batch pipelines using technologies such as Kafka, Kinesis, Databricks, Spark, AWS, and Airflow (MWAA).
- Architect and maintain medallion (Bronze/Silver/Gold) data models to ensure clean, consistent, and well-governed datasets.
- Implement robust data quality frameworks, automated testing, monitoring, and CI/CD pipelines to ensure reliability and correctness.
- Build AI-powered engineering workflows, including prompts, automation scripts, and tooling that accelerate pipeline development and documentation.
- Develop and maintain natural language data interfaces and chatbot solutions (e.g., Databricks Genie) for non-technical users.
- Collaborate with analytics engineering, data science, and product teams to transform complex datasets into production-ready insights.
- Contribute to infrastructure-as-code initiatives (Terraform and related tools) for scalable cloud resource provisioning and management.