Apply

Data Engineer

Posted 3 months ago

💎 Seniority level: Middle, minimum of three (3) years

📍 Location: United States

💸 Salary: 100,000 - 130,000 USD per year

🔍 Industry: Nonprofit, Civic Engagement, Data Analytics

🏢 Company: Murmuration 👥 1-10

🗣️ Languages: English

⏳ Experience: Minimum of three (3) years

🪄 Skills: AWS, Docker, Python, Software Development, Snowflake, Airflow, Data engineering, Communication Skills, CI/CD, Problem Solving

Requirements:
  • Problem-solver with a passion for using data and technology to drive social impact.
  • Education and/or experience in Computer Science, Computer Engineering, or a relevant field.
  • Minimum of three (3) years of relevant experience in data engineering or a related field.
  • Curiosity and a drive to continuously learn and adapt to new technologies and challenges.
  • Familiarity with data orchestration tools (e.g., Dagster, Airflow) and ELT processes (e.g., dbt).
  • Familiarity with analytic databases (e.g., Snowflake) and cloud infrastructure (e.g., AWS).
  • Experience working flexibly within smaller teams.
  • Practical knowledge of software development lifecycle (SDLC).
  • Proficiency in Python, Docker, and container orchestration tools.
  • Understanding of CI/CD pipelines and automation tools.
  • Strong written and verbal communication skills.
Responsibilities:
  • Collaborate closely with cross-functional teams to understand challenges, design solutions, and implement data pipelines that meet both immediate and long-term needs.
  • Build and maintain scalable, reliable data pipelines using tools such as Dagster, Airflow, Snowflake, AWS, MongoDB, and dbt.
  • Manage data from various sources, ensuring timely ingestion, quality, and integrity.
  • Transform raw data into structured, usable formats that empower our analytical and product teams.
  • Implement and maintain robust monitoring, alerting, and documentation processes.
  • Continuously optimize our data infrastructure for performance and efficiency.
  • Provide support and troubleshooting for data-related issues across the organization.
  • Contribute to a culture of knowledge sharing and continuous improvement within the team.
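The pipeline work described above follows the usual extract, transform, load shape that orchestrators like Dagster or Airflow schedule. A minimal sketch of that shape, using only the standard library: sqlite3 stands in for a warehouse such as Snowflake, and the table and field names are illustrative assumptions, not taken from the posting.

```python
# Hypothetical ELT sketch: the stages a tool like Dagster or Airflow would
# orchestrate. sqlite3 is a stand-in for a warehouse such as Snowflake.
import sqlite3

def extract():
    # In practice this would pull from an API or a source database.
    return [
        {"id": 1, "name": " Alice ", "signups": "3"},
        {"id": 2, "name": "Bob", "signups": "7"},
    ]

def transform(rows):
    # Turn raw records into a clean, typed, structured format.
    return [(r["id"], r["name"].strip(), int(r["signups"])) for r in rows]

def load(conn, rows):
    conn.execute("CREATE TABLE IF NOT EXISTS signups (id INTEGER, name TEXT, n INTEGER)")
    conn.executemany("INSERT INTO signups VALUES (?, ?, ?)", rows)

conn = sqlite3.connect(":memory:")
load(conn, transform(extract()))
total = conn.execute("SELECT SUM(n) FROM signups").fetchone()[0]
```

In a real deployment each stage would be a separate task or asset so the orchestrator can retry and monitor them independently.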
Apply

Related Jobs

Apply

πŸ“ US, Europe

🧭 Full-Time

πŸ’Έ 175000.0 - 205000.0 USD per year

πŸ” Cloud computing and AI services

🏒 Company: CoreWeaveπŸ’° $642,000,000 Secondary Market about 1 year agoCloud ComputingMachine LearningInformation TechnologyCloud Infrastructure

  • 5+ years of experience with Kubernetes and Helm, with a deep understanding of container orchestration.
  • Hands-on experience administering and optimizing clustered computing technologies on Kubernetes, such as Spark, Trino, Flink, Ray, Kafka, StarRocks or similar.
  • 5+ years of programming experience in C++, C#, Java, or Python.
  • 3+ years of experience scripting in Python or Bash for automation and tooling.
  • Strong understanding of data storage technologies, distributed computing, and big data processing pipelines.
  • Proficiency in data security best practices and managing access in complex systems.

  • Architect, deploy, and scale data storage and processing infrastructure to support analytics and data science workloads.
  • Manage and maintain data lake and clustered computing services, ensuring reliability, security, and scalability.
  • Build and optimize frameworks and tools to simplify the usage of big data technologies.
  • Collaborate with cross-functional teams to align data infrastructure with business goals and requirements.
  • Ensure data governance and security best practices across all platforms.
  • Monitor, troubleshoot, and optimize system performance and resource utilization.

Python, Bash, Kubernetes, Apache Kafka

Posted 7 days ago
Apply

πŸ“ US

🧭 Full-Time

πŸ’Έ 110000.0 - 125000.0 USD per year

πŸ” Beauty industry

  • Bachelor's degree in data engineering or relevant discipline.
  • 5+ years of hands-on experience as a data engineer managing data pipelines.
  • Advanced proficiency in modern ETL/ELT stacks - expertise in Fivetran, DBT, and Snowflake.
  • Understanding of data analytics and tools, including Metabase and PowerBI.
  • Expert-level Python and SQL skills with deep experience in DBT transformations.
  • Strong understanding of cloud-native data architectures and modern data warehousing principles.
  • Familiarity with data security, governance, and compliance standards.
  • Adept at designing and delivering interactive dashboards.

  • Building and managing robust ETL/ELT pipelines using Fivetran, DBT, and Snowflake.
  • Developing and optimizing data models and analytical reports.
  • Collaborating with stakeholders to create data pipelines that align with BI needs.
  • Designing and developing SQL-based solutions to transform data.
  • Building the reporting infrastructure from ambiguous requirements.
  • Continuously improving customer experience and providing technical leadership.
  • Evangelizing best practices and technologies.
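The responsibilities above center on SQL-based transformations in a dbt-and-Snowflake stack. A hedged sketch of that pattern, with sqlite3 standing in for Snowflake: the raw schema and the daily-revenue model are hypothetical, but the transformation itself is plain SQL, which is exactly what a dbt model compiles to.

```python
# Illustrative dbt-style transform: reshape a raw orders table into an
# analytics-ready daily revenue model. Schema names are assumptions.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (order_id INTEGER, order_date TEXT, amount_cents INTEGER)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, "2024-05-01", 1250), (2, "2024-05-01", 800), (3, "2024-05-02", 1999)],
)

# The model: aggregate order-level rows into one row per day.
conn.execute("""
    CREATE VIEW daily_revenue AS
    SELECT order_date, COUNT(*) AS orders, SUM(amount_cents) / 100.0 AS revenue
    FROM raw_orders
    GROUP BY order_date
""")
rows = conn.execute("SELECT * FROM daily_revenue ORDER BY order_date").fetchall()
```

A BI tool such as Metabase or PowerBI would then read `daily_revenue` directly rather than the raw table.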

Python, SQL, Business Intelligence, ETL, Snowflake, Data engineering, Data modeling

Posted 8 days ago
Apply

πŸ“ USA, Canada, Mexico

🧭 Full-Time

πŸ’Έ 175000.0 USD per year

πŸ” Digital tools for hourly employees

🏒 Company: TeamSenseπŸ‘₯ 11-50πŸ’° Seed 11 months agoInformation ServicesInformation TechnologySoftware

  • Bachelor's or Master's degree in Computer Science, Software Engineering, or a related technical field.
  • 7+ years of professional experience in software engineering including 5+ years of experience in data engineering.
  • Proven expertise in building and managing scalable data platforms.
  • Proficiency in Python.
  • Strong knowledge of SQL, data modeling, data migration and database systems such as PostgreSQL and MongoDB.
  • Exceptional problem-solving skills optimizing data systems.

  • As a Senior Data Engineer, your primary responsibility is to contribute to the design, development, and maintenance of a scalable and reliable data platform.
  • Analyze the current database and warehouse.
  • Design and develop scalable ETL/ELT pipelines to support data migration.
  • Build and maintain robust, scalable, and high-performing data platforms, including data lakes and/or warehouses.
  • Implement data engineering best practices and design patterns.
  • Guide design reviews for new features impacting data.

PostgreSQL, Python, SQL, ETL, MongoDB, Data engineering, Data modeling

Posted 9 days ago
Apply

πŸ“ South Africa, Mauritius, Kenya, Nigeria

πŸ” Technology, Marketplaces

  • BSc degree in Computer Science, Information Systems, Engineering, or related technical field or equivalent work experience.
  • 3+ years related work experience.
  • Minimum of 2 years of experience building and optimizing 'big data' pipelines and architectures, and maintaining data sets.
  • Experienced in Python.
  • Experienced in SQL (PostgreSQL, MS SQL).
  • Experienced in using cloud services: AWS, Azure or GCP.
  • Proficiency in version control, CI/CD and GitHub.
  • Understanding/experience in Glue and PySpark highly desirable.
  • Experience in managing data life cycle.
  • Proficiency in manipulating, processing and architecting large disconnected data sets for analytical requirements.
  • Ability to maintain and optimise processes supporting data transformation, data structures, metadata, dependency and workload management.
  • Good understanding of data management principles - data quality assurance and governance.
  • Strong analytical skills related to working with unstructured datasets.
  • Understanding of message queuing, stream processing, and highly scalable 'big data' datastores.
  • Strong attention to detail.
  • Good communication and interpersonal skills.

  • Suggest efficiencies and execute on implementation of internal process improvements in automating manual processes.
  • Implement enhancements and new features across data systems.
  • Improve and streamline processes within data systems with support from the Senior Data Engineer.
  • Test CI/CD process for optimal data pipelines.
  • Assemble large, complex data sets that meet functional / non-functional business requirements.
  • Maintain highly efficient ETL processes.
  • Develop and conduct unit tests on data pipelines as well as ensuring data consistency.
  • Develop and maintain automated monitoring solutions.
  • Support reporting and analytics infrastructure.
  • Maintain data quality and data governance as well as upkeep of overall maintenance of data infrastructure systems.
  • Maintain data warehouse and data lake metadata, data catalogue, and user documentation for internal business users.
  • Ensure best practices are implemented and maintained across databases.
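One of the responsibilities above is developing unit tests for pipelines and ensuring data consistency. A minimal sketch of what such data-quality checks look like in practice: the column names and the sample batch are hypothetical, but the not-null and uniqueness checks are the standard starting point.

```python
# Illustrative unit-test-style data-quality checks, run on a batch before
# loading it. Column names here are assumptions for the example.
def check_not_null(rows, column):
    """Return indexes of rows where the column is missing."""
    return [i for i, r in enumerate(rows) if r.get(column) is None]

def check_unique(rows, column):
    """Return values that appear more than once in the column."""
    seen, dupes = set(), []
    for r in rows:
        v = r[column]
        if v in seen:
            dupes.append(v)
        seen.add(v)
    return dupes

batch = [
    {"user_id": 1, "email": "a@example.com"},
    {"user_id": 2, "email": None},
    {"user_id": 2, "email": "c@example.com"},
]
null_rows = check_not_null(batch, "email")   # the row with a missing email
dupe_keys = check_unique(batch, "user_id")   # the duplicated primary key
```

In a CI/CD setup these checks would run as assertions in the pipeline's test suite, failing the build before bad data reaches the warehouse.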

AWS, PostgreSQL, Python, SQL, ETL, Git, CI/CD

Posted 9 days ago
Apply

πŸ“ United States

πŸ’Έ 170000.0 - 220000.0 USD per year

πŸ” Healthcare technology

🏒 Company: Red Cell PartnersπŸ‘₯ 11-50Financial ServicesVenture CapitalFinance

  • Proven experience in a leadership role driving ML system development and optimization, preferably in healthcare or related fields.
  • Demonstrated expertise in training ML models and building robust training pipelines for healthcare applications.
  • Strong understanding of machine learning frameworks such as TensorFlow, PyTorch, or similar, with applications in healthcare.
  • Proficient in programming languages like Python or Go, with the ability to write efficient, clean, and maintainable code for healthcare systems.
  • Excellent written and verbal communication skills, with the ability to convey technical concepts to both technical and non-technical audiences in healthcare settings.
  • A track record of delivering impactful machine learning solutions that have been successfully deployed in real-world healthcare applications.
  • Familiarity with healthcare data privacy regulations and best practices for handling sensitive medical information.

  • Lead the team in architecting, building, and optimizing ML systems to deliver high-quality, real-world results in healthcare settings.
  • Design and implement robust training pipelines for machine learning models, ensuring efficiency and scalability for healthcare data.
  • Fine-tune ML models to meet specific healthcare needs and optimize their performance for various medical applications.
  • Develop and implement feedback mechanisms to continuously improve the accuracy and effectiveness of ML in healthcare contexts.
  • Collaborate with cross-functional teams to understand healthcare business requirements and translate them into actionable ML solutions.
  • Stay up-to-date with the latest advancements in machine learning and healthcare technology, implementing best practices to enhance our ML infrastructure.
  • Coach and mentor junior data engineers, fostering a culture of continuous learning and growth within the Lightbox Health team.
  • Communicate complex technical concepts and findings to non-technical stakeholders in a clear and concise manner, particularly in healthcare contexts.

Leadership, Python, Machine Learning, PyTorch, Data engineering, TensorFlow

Posted 10 days ago
Apply

πŸ“ Texas, Maryland, Pennsylvania, Minnesota, Florida, Georgia, Illinois

πŸ” Ecommerce, collectible card games

🏒 Company: TCGPlayer_External_Career

  • Bachelor's degree in computer science, information technology, or related field, or equivalent experience.
  • 12 or more years of experience designing scalable and reliable datastores.
  • Mastery of MongoDB data modeling and query design, with significant experience in RDBMS technologies, preferably PostgreSQL.
  • Experience designing datastores for microservices and event-driven applications.
  • Experience with data governance support in medium-to-large organizations.
  • Strong written and verbal communication skills for collaboration across roles.

  • Act as a subject matter expert for MongoDB, providing guidance and materials to improve proficiency.
  • Guide selection of datastore technologies for applications to meet data needs.
  • Consult on database design for performance and scalability.
  • Write effective code for data management.
  • Support engineers with database interface advice.
  • Develop data flow strategies and define storage requirements for microservices.
  • Troubleshoot and enhance existing database designs.
  • Collaborate to ensure data architectures are efficient and scalable.
  • Lead cross-application datastore projects related to security and data governance.
  • Research emerging datastore capabilities for strategic planning.
  • Define and implement data storage strategies for microservices.

PostgreSQL, MongoDB, Data engineering, Microservices, Data modeling

Posted 11 days ago
Apply

πŸ“ US

🧭 Full-Time

πŸ’Έ 206700.0 - 289400.0 USD per year

πŸ” Social Media / Technology

  • MS or PhD in a quantitative discipline: engineering, statistics, operations research, computer science, informatics, applied mathematics, economics, etc.
  • 7+ years of experience with large-scale ETL systems and building clean, maintainable code (Python preferred).
  • Strong programming proficiency in Python, SQL, Spark, Scala.
  • Experience with data modeling, ETL concepts, and patterns for data governance.
  • Experience with data workflows, data modeling, and engineering.
  • Experience in data visualization and dashboard design using tools like Looker, Tableau, and D3.
  • Deep understanding of relational and MPP database designs.
  • Proven track record of cross-functional collaboration and excellent communication skills.

  • Act as the analytics engineering lead within Ads DS team contributing to data science data quality and automation initiatives.
  • Work on ETLs, reporting dashboards, and data aggregations for business tracking and ML model development.
  • Develop and maintain robust data pipelines for data ingestion, processing, and transformation.
  • Create user-friendly tools for internal team use, streamlining analysis and reporting processes.
  • Lead efforts to build a data-driven culture by enabling data self-service.
  • Provide technical guidance and mentorship to data analysts.

Python, SQL, ETL, Airflow, Spark, Scala, Data visualization, Data modeling

Posted 12 days ago
Apply

πŸ“ US

πŸ’Έ 103200.0 - 128950.0 USD per year

πŸ” Genetics and healthcare

🏒 Company: NateraπŸ‘₯ 1001-5000πŸ’° $250,000,000 Post-IPO Equity over 1 year agoπŸ«‚ Last layoff almost 2 years agoWomen'sBiotechnologyMedicalGeneticsHealth Diagnostics

  • BS degree in computer science or a comparable program or equivalent experience.
  • 8+ years of overall software development experience, ideally in complex data management applications.
  • Experience with SQL and No-SQL databases including Dynamo, Cassandra, Postgres, Snowflake.
  • Proficiency in data technologies such as Hive, Hbase, Spark, EMR, Glue.
  • Ability to manipulate and extract value from large datasets.
  • Knowledge of data management fundamentals and distributed systems.

  • Work with other engineers and product managers to make design and implementation decisions.
  • Define requirements in collaboration with stakeholders and users to create reliable applications.
  • Implement best practices in development processes.
  • Write specifications, design software components, fix defects, and create unit tests.
  • Review design proposals and perform code reviews.
  • Develop solutions for the Clinicogenomics platform utilizing AWS cloud services.

AWS, Python, SQL, Agile, DynamoDB, Snowflake, Data engineering, Postgres, Spark, Data modeling, Data management

Posted 19 days ago
Apply

πŸ“ United States

πŸ” Utilities

🏒 Company: ScalepexπŸ‘₯ 11-50Staffing AgencyFinanceProfessional Services

  • Minimum of 5 years of experience in data engineering.
  • Proficiency in AWS services such as Step Functions, Lambda, Glue, S3, DynamoDB, and Redshift.
  • Strong programming skills in Python with experience using PySpark and Pandas for large-scale data processing.
  • Hands-on experience with distributed systems and scalable architectures.
  • Knowledge of ETL/ELT processes for integrating diverse datasets.
  • Familiarity with utilities-specific datasets is highly desirable.
  • Strong analytical skills to work with unstructured datasets.
  • Knowledge of data governance practices.

  • Design and build scalable data pipelines using AWS services to process and transform large datasets from utility systems.
  • Orchestrate workflows using AWS Step Functions.
  • Implement ETL/ELT processes to clean, transform, and integrate data.
  • Leverage distributed systems experience to ensure reliability and performance.
  • Utilize AWS Lambda for serverless application development.
  • Design data models for analytics tailored to utilities use cases.
  • Continuously monitor and optimize data pipeline performance.
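The serverless piece of the stack above is AWS Lambda. A hedged sketch of what a Lambda-style handler for this role might look like: the event shape and the meter-reading fields are assumptions for illustration, and a real deployment would read from and write to S3 or DynamoDB via boto3 rather than working purely in memory.

```python
# Hypothetical AWS Lambda-style handler that cleans a batch of utility
# meter readings passed in the event. Field names are illustrative.
def handler(event, context=None):
    cleaned = []
    for reading in event.get("readings", []):
        kwh = reading.get("kwh")
        if kwh is None or kwh < 0:  # drop missing or impossible values
            continue
        cleaned.append({"meter_id": reading["meter_id"], "kwh": round(kwh, 2)})
    return {"count": len(cleaned), "readings": cleaned}

result = handler({"readings": [
    {"meter_id": "m-1", "kwh": 12.5},
    {"meter_id": "m-2", "kwh": -1.0},   # rejected: negative
    {"meter_id": "m-3", "kwh": None},   # rejected: missing
]})
```

In a Step Functions workflow, a handler like this would be one state in the larger orchestration, with Glue or Redshift handling the heavier transforms downstream.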

AWS, Python, DynamoDB, ETL, Data engineering, Serverless, Pandas, Data modeling

Posted 23 days ago
Apply

πŸ“ US

🧭 Full-Time

πŸ” Healthcare

🏒 Company: Particle HealthπŸ‘₯ 11-50πŸ’° $10,000,000 about 1 month agoDeveloper APIsElectronic Health Record (EHR)Health Care

  • 8+ years of experience in data engineering with a strong background in data architecture and modeling.
  • 5+ years of experience in data architecture, data modeling, and data governance, preferably for SaaS products or cloud applications.
  • Experience with healthcare data formats like CCDA, FHIR, HL7v2.
  • Strong knowledge in data modeling tools and languages such as ERD, UML, SQL, and NoSQL.
  • Expertise in Python and hands-on experience in Spark for large-scale data processing.

  • Build and optimize efficient, scalable data pipelines for ingestion, transformation, and enrichment.
  • Lead the design and development of a robust data architecture guiding data modeling and quality.
  • Monitor data issues and ensure data availability, reliability, and scalability for the SaaS product.
  • Define and maintain standards for data usage and compliance, collaborating across teams to align architecture with business goals.

Python, SQL, Cloud Computing, ETL, Machine Learning, Data engineering, NoSQL, Spark, Data modeling, Data management

Posted 23 days ago
Apply