Senior Data Engineer

Posted 3 months ago

πŸ’Ž Seniority level: Senior, 5+ years

πŸ“ Location: United States

πŸ’Έ Salary: 100,000 - 120,000 USD per year

πŸ” Industry: Healthcare

🏒 Company: Found πŸ‘₯ 51-100 πŸ’° $45,999,997 Series C, 10 months ago · Financial Services, Banking, FinTech

πŸ—£οΈ Languages: English

⏳ Experience: 5+ years

πŸͺ„ Skills: Python, SQL, Apache Airflow, ETL, Snowflake, Pandas, Spark

Requirements:
  • 5+ years of experience in data engineering or related areas. You are an end-to-end data engineer whose experience goes beyond creating ETL pipelines.
  • Expertise in SQL and data manipulation languages.
  • Proficiency in data pipeline tools (Airflow, AWS Glue, Spark/PySpark, Pandas).
  • Strong programming skills in Python.
  • Experience with data storage technologies like warehouses (Snowflake, Redshift) and data lakes (Databricks, Glue Catalog/S3).
Responsibilities:
  • Design, implement, and manage robust and scalable data pipelines to ingest, process, and transform data from various sources.
  • Develop and maintain data models to support business intelligence, reporting, and analytics needs.
  • Design and implement data warehousing solutions to store and organize large volumes of data efficiently.
  • Develop and optimize ETL (Extract, Transform, Load) processes to ensure data accuracy and integrity.
  • Implement data quality checks and monitoring processes to maintain data integrity and reliability (see the sketch after this list).
  • Continuously monitor and optimize data pipelines and queries for performance and scalability.
  • Work closely with data analysts and other stakeholders to understand their data needs and provide solutions.
  • Create and maintain clear and comprehensive documentation of data architecture, processes, and data dictionaries.
Related Jobs

πŸ“ Worldwide

πŸ” Hospitality

🏒 Company: Lighthouse

  • 4+ years of professional experience using Python, Java, or Scala for data processing (Python preferred)
  • You stay up-to-date with industry trends, emerging technologies, and best practices in data engineering.
  • Improve, manage, and teach standards for code maintainability and performance in code submitted and reviewed
  • Ship large features independently, generate architecture recommendations and have the ability to implement them
  • Great communication: Regularly achieve consensus amongst teams
  • Familiarity with GCP, Kubernetes (GKE preferred), and CI/CD tools (GitLab CI preferred); familiarity with the concept of Lambda Architecture.
  • Experience with Apache Beam or Apache Spark for distributed data processing or event sourcing technologies like Apache Kafka.
  • Familiarity with monitoring tools like Grafana & Prometheus.
  • Design and develop scalable, reliable data pipelines using the Google Cloud stack (see the sketch after this list).
  • Optimise data pipelines for performance and scalability.
  • Implement and maintain data governance frameworks, ensuring data accuracy, consistency, and compliance.
  • Monitor and troubleshoot data pipeline issues, implementing proactive measures for reliability and performance.
  • Collaborate with the DevOps team to automate deployments and improve developer experience on the data front.
  • Work with data science and analytics teams to enable them to bring their research to production-grade data solutions, using technologies like Airflow, dbt, or MLflow (but not limited to these)
  • As part of a platform team, communicate effectively with teams across the entire engineering organisation to provide reliable foundational data models and data tools.
  • Mentor and provide technical guidance to other engineers working with data.

Python, SQL, Apache Airflow, ETL, GCP, Kubernetes, Apache Kafka, Data engineering, CI/CD, Mentoring, Terraform, Scala, Data modeling

Posted 3 days ago

πŸ“ United States

🧭 Full-Time

πŸ’Έ 183,600 - 216,000 USD per year

πŸ” Software Development

  • 6+ years of experience in a data engineering role building products, ideally in a fast-paced environment
  • Good foundations in Python and SQL.
  • Experience with Spark, PySpark, dbt, Snowflake, and Airflow
  • Knowledge of visualization tools such as Metabase and Jupyter Notebooks (Python)
  • Collaborate on the design and improvements of the data infrastructure
  • Partner with product and engineering to advocate best practices and build supporting systems and infrastructure for the various data needs
  • Create data pipelines that stitch together various data sources in order to produce valuable business insights (see the sketch after this list)
  • Create real-time data pipelines in collaboration with the Data Science team

Python, SQL, Snowflake, Airflow, Data engineering, Spark, Data visualization, Data modeling

Posted 4 days ago

πŸ“ United States

🧭 Full-Time

πŸ” Healthcare

🏒 Company: Rad AI πŸ‘₯ 101-250 πŸ’° $60,000,000 Series C, 2 months ago · Artificial Intelligence (AI), Enterprise Software, Health Care

  • 4+ years relevant experience in data engineering.
  • Expertise in designing and developing distributed data pipelines using big data technologies on large scale data sets.
  • Deep and hands-on experience designing, planning, productionizing, maintaining and documenting reliable and scalable data infrastructure and data products in complex environments.
  • Solid experience with big data processing and analytics on AWS, using services such as Amazon EMR and AWS Batch.
  • Experience in large scale data processing technologies such as Spark.
  • Expertise in orchestrating workflows using tools like Metaflow.
  • Experience with various database technologies including SQL, NoSQL databases (e.g., AWS DynamoDB, ElasticSearch, Postgresql).
  • Hands-on experience with containerization technologies, such as Docker and Kubernetes.
  • Design and implement the data architecture, ensuring scalability, flexibility, and efficiency using pipeline authoring tools like Metaflow and large-scale data processing technologies like Spark (see the sketch after this list).
  • Define and extend our internal standards for style, maintenance, and best practices for a high-scale data platform.
  • Collaborate with researchers and other stakeholders to understand their data needs including model training and production monitoring systems and develop solutions that meet those requirements.
  • Take ownership of key data engineering projects and work independently to design, develop, and maintain high-quality data solutions.
  • Ensure data quality, integrity, and security by implementing robust data validation, monitoring, and access controls.
  • Evaluate and recommend data technologies and tools to improve the efficiency and effectiveness of the data engineering process.
  • Continuously monitor, maintain, and improve the performance and stability of the data infrastructure.

AWS, Docker, SQL, ElasticSearch, ETL, Kubernetes, Data engineering, NoSQL, Spark, Data modeling

Posted 4 days ago

πŸ“ Worldwide

🧭 Full-Time

NOT STATED

  • Own the design and implementation of cross-domain data models that support key business metrics and use cases.
  • Partner with analysts and data engineers to translate business logic into performant, well-documented dbt models.
  • Champion best practices in testing, documentation, CI/CD, and version control, and guide others in applying them (see the sketch after this list).
  • Act as a technical mentor to other analytics engineers, supporting their development and reviewing their code.
  • Collaborate with central data platform and embedded teams to improve data quality, metric consistency, and lineage tracking.
  • Drive alignment on model architecture across domainsβ€”ensuring models are reusable, auditable, and trusted.
  • Identify and lead initiatives to reduce technical debt and modernise legacy reporting pipelines.
  • Contribute to the long-term vision of analytics engineering at Pleo and help shape our roadmap for scalability and impact.

SQL, Data Analysis, ETL, Data engineering, CI/CD, Mentoring, Documentation, Data visualization, Data modeling, Data analytics, Data management

Posted 4 days ago

πŸ“ United States

🧭 Full-Time

πŸ’Έ 183,600 - 216,000 USD per year

πŸ” Mental Healthcare

🏒 Company: Headway πŸ‘₯ 201-500 πŸ’° $125,000,000 Series C, over 1 year ago · Mental Health Care

  • 6+ years of experience in a data engineering role building products, ideally in a fast-paced environment
  • Good foundations in Python and SQL.
  • Experience with Spark, PySpark, dbt, Snowflake, and Airflow
  • Knowledge of visualization tools such as Metabase and Jupyter Notebooks (Python)
  • A knack for simplifying data, expressing information in charts and tables
  • Collaborate on the design and improvements of the data infrastructure
  • Partner with product and engineering to advocate best practices and build supporting systems and infrastructure for the various data needs
  • Create data pipelines that stitch together various data sources in order to produce valuable business insights
  • Create real-time data pipelines in collaboration with the Data Science team (see the sketch below)

Python, SQL, ETL, Snowflake, Airflow, Data engineering, RDBMS, Spark, RESTful APIs, Data visualization, Data modeling

Posted 4 days ago

πŸ“ United States, Canada

πŸ” Software Development

AWS, SQL, Cloud Computing, Data Analysis, ETL, Data engineering, Data visualization, Data modeling

Posted 11 days ago

πŸ“ United States of America

🧭 Full-Time

πŸ’Έ 92,700 - 185,400 USD per year

πŸ” Health Solutions

  • Bachelor's in Computer Science or equivalent
  • Proven ability to complete projects in a timely manner while clearly measuring progress
  • Strong software engineering fundamentals (data structures, algorithms, async programming patterns, object-oriented design, parallel programming)
  • Strong understanding and demonstrated experience with at least one popular programming language (.NET or Java) and SQL constructs.
  • Experience writing and maintaining frontend client applications, Angular preferred
  • Strong experience with revision control (Git)
  • Experience with cloud-based systems (Azure / AWS / GCP).
  • High-level understanding of big data design (data lake, data mesh, data warehouse) and data normalization patterns
  • Demonstrated experience with queuing technologies (Kafka, SNS, RabbitMQ, etc.)
  • Demonstrated experience with Metrics, Logging, Monitoring and Alerting tools
  • Strong communication skills
  • Strong experience with use of RESTful APIs
  • High-level understanding of HL7 V2.x / FHIR-based interface messages.
  • High-level understanding of system deployment tasks and technologies (CI/CD pipelines, K8s, Terraform).
  • Communicate with business leaders to help translate requirements into functional specifications
  • Develop broad understanding of business logic and functionality of current systems
  • Analyze and manipulate data by writing and running SQL queries
  • Analyze logs to identify and prevent potential issues from occurring
  • Deliver clean and functional code in accordance with business requirements
  • Consume data from any source, such as flat files, streaming systems, or RESTful APIs (see the sketch after this list)
  • Interface with Electronic Health Records
  • Engineer scalable, reliable, and performant systems to manage data
  • Collaborate closely with other Engineers, QA, Scrum master, Product Manager in your team as well as across the organization
  • Build quality systems while expanding offerings to dependent teams
  • Comfortable in multiple roles, from design and development to code deployment, monitoring, and investigation in production systems.

AWS, SQL, Cloud Computing, GCP, Git, Java, Kafka, Kubernetes, Algorithms, Apache Kafka, Azure, Data engineering, Data Structures, .NET, Angular, REST API, CI/CD, RESTful APIs, Terraform

Posted 15 days ago

πŸ“ United States

πŸ” Software Development

🏒 Company: phData πŸ‘₯ 501-1000 πŸ’° $2,499,997 Seed, about 7 years ago · Information Services, Analytics, Information Technology

  • 4+ years as a hands-on Data Engineer and/or Software Engineer designing and implementing data solutions
  • Programming expertise in Java, Python and/or Scala
  • Experience with core cloud data platforms including Snowflake, AWS, Azure, Databricks and GCP
  • Experience using SQL
  • Client-facing written and verbal communication skills and experience
  • Design and implement data solutions
  • Ensure performance, security, scalability, and robust data integration.
  • Create and deliver detailed presentations
  • Produce detailed solution documentation (e.g., POCs and roadmaps, sequence diagrams, class hierarchies, logical system views)

AWS, Python, Software Development, SQL, Cloud Computing, Data Analysis, ETL, GCP, Java, Kafka, Snowflake, Azure, Data engineering, Spark, Communication Skills, CI/CD, Problem Solving, Agile methodologies, RESTful APIs, Documentation, Scala, Data modeling

Posted 21 days ago

πŸ“ United States

🧭 Full-Time

πŸ’Έ 144,000 - 180,000 USD per year

πŸ” Software Development

🏒 Company: Hungryroot πŸ‘₯ 101-250 πŸ’° $40,000,000 Series C, almost 4 years ago · Artificial Intelligence (AI), Food and Beverage, E-Commerce, Retail, Consumer Goods, Software

  • 5+ years of experience in ETL development and data modeling
  • 5+ years of experience in both Scala and Python
  • 5+ years of experience in Spark
  • Excellent problem-solving skills and the ability to translate business problems into practical solutions
  • 2+ years of experience working with the Databricks Platform
  • Develop pipelines in Spark (Python + Scala) in the Databricks Platform
  • Build cross-functional working relationships with business partners in Food Analytics, Operations, Marketing, and Web/App Development teams to power pipeline development for the business
  • Ensure system reliability and performance
  • Deploy and maintain data pipelines in production
  • Set an example of code quality, data quality, and best practices
  • Work with Analysts and Data Engineers to enable high quality self-service analytics for all of Hungryroot
  • Investigate datasets to answer business questions, ensuring data quality and business assumptions are understood before deploying a pipeline (see the sketch below)

AWS, Python, SQL, Apache Airflow, Data Mining, ETL, Snowflake, Algorithms, Amazon Web Services, Data engineering, Data Structures, Spark, CI/CD, RESTful APIs, Microservices, JSON, Scala, Data visualization, Data modeling, Data analytics, Data management

Posted 21 days ago

πŸ“ United States

πŸ’Έ 135,000 - 155,000 USD per year

πŸ” Software Development

🏒 Company: Jobgether πŸ‘₯ 11-50 πŸ’° $1,493,585 Seed, about 2 years ago · Internet

  • 8+ years of experience as a data engineer, with a strong background in data lake systems and cloud technologies.
  • 4+ years of hands-on experience with AWS technologies, including S3, Redshift, EMR, Kafka, and Spark.
  • Proficient in Python or Node.js for developing data pipelines and creating ETLs.
  • Strong experience with data integration and frameworks like Informatica and Python/Scala.
  • Expertise in creating and managing AWS services (EC2, S3, Lambda, etc.) in a production environment.
  • Solid understanding of Agile methodologies and software development practices.
  • Strong analytical and communication skills, with the ability to influence both IT and business teams.
  • Design and develop scalable data pipelines that integrate enterprise systems and third-party data sources.
  • Build and maintain data infrastructure to ensure speed, accuracy, and uptime.
  • Collaborate with data science teams to build feature engineering pipelines and support machine learning initiatives.
  • Work with AWS cloud technologies like S3, Redshift, and Spark to create a world-class data mesh environment.
  • Ensure proper data governance and implement data quality checks and lineage at every stage of the pipeline.
  • Develop and maintain ETL processes using AWS Glue, Lambda, and other AWS services (see the sketch after this list).
  • Integrate third-party data sources and APIs into the data ecosystem.

AWS, Node.js, Python, SQL, ETL, Kafka, Data engineering, Spark, Agile methodologies, Scala, Data modeling, Data management

Posted 21 days ago