Data Engineer

Posted 5 months ago

📍 Location: United States

🔍 Industry: Consulting and Technology

🏢 Company: BCC-NIH

🗣️ Languages: English

🪄 Skills: Python, SQL, Agile, Bash, Kubernetes, SCRUM, Jira, Interpersonal skills, Troubleshooting, Scripting

Requirements:

Bachelor's degree in a STEM field (Engineering, Computer Science, Mathematics, Physics) or equivalent industry experience in bioinformatics.
Experience in large and complex data operations environments including relational databases and SQL.
Proficiency in scripting languages like Bash and Python.
Familiarity with LINUX/UNIX systems and troubleshooting operational pipelines.
Excellent interpersonal skills and team collaboration.

Responsibilities:

Support data engineering efforts at the National Institutes of Health (NIH) by managing and optimizing data pipelines.
Troubleshoot operational pipelines to resolve priority issues and implement effective solutions.

Related Jobs

🔥 Sr. Data Engineer (GC25001)

Posted 1 day ago

📍 United States

🧭 Full-Time

💸 150,363 - 180,870 USD per year

🔍 Software Development

🔧 Requirements

At least a Bachelor's degree or foreign equivalent in Computer Science, Computer Engineering, Electrical and Electronics Engineering, or a closely related technical field, and at least five (5) years of post-bachelor’s, progressive experience writing shell scripts; validating data; and engaging in data wrangling.
Experience must include at least three (3) years of experience debugging data; transforming data into Microsoft SQL server; developing processes to import data into HDFS using Sqoop; and using Java, UNIX Shell Scripts, and Python.
Experience must also include at least one (1) year of experience developing Hive scripts for data transformation on data lake projects; converting Hive scripts to Pyspark applications; automating in Hadoop; and implementing CI/CD pipelines.

💡 Responsibilities

Design, develop, test, and implement Big Data technical solutions.
Recommend the right technologies and solutions for a given use case, from the application layer to infrastructure.
Lead the delivery of compiling and installing database systems, integrating data from a variety of data sources (data warehouse, data marts) utilizing on-prem or cloud-based data structures.
Drive solution architecture and perform deployments of data pipelines and applications.
Author DDL and DML SQL spanning technical stacks.
Develop data transformation code and highly complex provisioning pipelines.
Ingest data from relational databases.
Execute automation strategy.

AWS, Python, SQL, Hadoop, Java, Kafka, Snowflake, Data engineering, Spark, CI/CD, Scala, Scripting, Debugging

🔥 Data Engineer-I

Posted 1 day ago

📍 USA

🔍 Healthcare

🏢 Company: Innovaccer Inc.

🔧 Requirements

SQL knowledge
ETL/ELT/Data pipeline knowledge
Python knowledge
PowerShell / Bash knowledge
Excellent problem-solving and effective communication skills
Self-motivation, integrity and honesty

💡 Responsibilities

Collaborate with the team, management, and other departments using virtual tools
Run Production data pipelines/processes, ensure the integrity of the data, and send out deliverables based on requirement/runbook documentation
Coordinate with the various technical teams to resolve issues and bugs and to optimize production processes
Coordinate with internal client facing team members to communicate the status of deliverables
Help develop/improve technical documentation to guide future software development projects and operations
Dedicate time to exploring how to build out the tech stack and capabilities where there are applicable use cases
Provide critical thinking, technical innovation, and extra attention to detail by serving as a trusted team member and peer code reviewer
Assist with external client communications when deliverables or receivables do not meet technical or project requirements, ensuring timely resolution and alignment

Python, SQL, Bash, ETL, Microsoft Azure, Postgres, Data modeling

🔥 Senior Data Engineer

Posted 4 days ago

📍 Worldwide

🔍 Hospitality

🏢 Company: Lighthouse

🔧 Requirements

4+ years of professional experience using Python, Java, or Scala for data processing (Python preferred)
You stay up-to-date with industry trends, emerging technologies, and best practices in data engineering.
Improve, manage, and teach standards for code maintainability and performance in code submitted and reviewed
Ship large features independently, generate architecture recommendations and have the ability to implement them
Great communication: Regularly achieve consensus amongst teams
Familiarity with GCP, Kubernetes (GKE preferred), CI/CD tools (GitLab CI preferred), and the concept of Lambda Architecture.
Experience with Apache Beam or Apache Spark for distributed data processing or event sourcing technologies like Apache Kafka.
Familiarity with monitoring tools like Grafana & Prometheus.

💡 Responsibilities

Design and develop scalable, reliable data pipelines using the Google Cloud stack.
Optimise data pipelines for performance and scalability.
Implement and maintain data governance frameworks, ensuring data accuracy, consistency, and compliance.
Monitor and troubleshoot data pipeline issues, implementing proactive measures for reliability and performance.
Collaborate with the DevOps team to automate deployments and improve developer experience on the data front.
Work with data science and analytics teams to help them turn their research into production-grade data solutions, using technologies such as (but not limited to) Airflow, dbt, or MLflow.
As a part of a platform team, you will communicate effectively with teams across the entire engineering organisation, to provide them with reliable foundational data models and data tools.
Mentor and provide technical guidance to other engineers working with data.

Python, SQL, Apache Airflow, ETL, GCP, Kubernetes, Apache Kafka, Data engineering, CI/CD, Mentoring, Terraform, Scala, Data modeling

🔥 Senior Data Engineer

Posted 5 days ago

📍 United States

🧭 Full-Time

💸 183,600 - 216,000 USD per year

🔍 Software Development

🔧 Requirements

6+ years of experience in a data engineering role building products, ideally in a fast-paced environment
Good foundations in Python and SQL.
Experience with Spark, PySpark, DBT, Snowflake and Airflow
Knowledge of visualization tools, such as Metabase, Jupyter Notebooks (Python)

💡 Responsibilities

Collaborate on the design and improvements of the data infrastructure
Partner with product and engineering to advocate best practices and build supporting systems and infrastructure for the various data needs
Create data pipelines that stitch together various data sources in order to produce valuable business insights
Create real-time data pipelines in collaboration with the Data Science team

Python, SQL, Snowflake, Airflow, Data engineering, Spark, Data visualization, Data modeling

🔥 Senior Data Engineer

Posted 5 days ago

📍 United States

🧭 Full-Time

🔍 Healthcare

🏢 Company: Rad AI

👥 101-250

💰 $60,000,000 Series C 2 months ago

Artificial Intelligence (AI), Enterprise Software, Health Care

🔧 Requirements

4+ years relevant experience in data engineering.
Expertise in designing and developing distributed data pipelines using big data technologies on large scale data sets.
Deep and hands-on experience designing, planning, productionizing, maintaining and documenting reliable and scalable data infrastructure and data products in complex environments.
Solid experience with big data processing and analytics on AWS, using services such as Amazon EMR and AWS Batch.
Experience in large scale data processing technologies such as Spark.
Expertise in orchestrating workflows using tools like Metaflow.
Experience with various database technologies, including SQL and NoSQL databases (e.g., AWS DynamoDB, Elasticsearch, PostgreSQL).
Hands-on experience with containerization technologies, such as Docker and Kubernetes.

💡 Responsibilities

Design and implement the data architecture, ensuring scalability, flexibility, and efficiency using pipeline authoring tools like Metaflow and large-scale data processing technologies like Spark.
Define and extend our internal standards for style, maintenance, and best practices for a high-scale data platform.
Collaborate with researchers and other stakeholders to understand their data needs including model training and production monitoring systems and develop solutions that meet those requirements.
Take ownership of key data engineering projects and work independently to design, develop, and maintain high-quality data solutions.
Ensure data quality, integrity, and security by implementing robust data validation, monitoring, and access controls.
Evaluate and recommend data technologies and tools to improve the efficiency and effectiveness of the data engineering process.
Continuously monitor, maintain, and improve the performance and stability of the data infrastructure.

AWS, Docker, SQL, ElasticSearch, ETL, Kubernetes, Data engineering, NoSQL, Spark, Data modeling

🔥 Senior Data Engineer - Data Services

Posted 5 days ago

📍 Worldwide

🧭 Full-Time

🔧 Requirements

NOT STATED

💡 Responsibilities

Own the design and implementation of cross-domain data models that support key business metrics and use cases.
Partner with analysts and data engineers to translate business logic into performant, well-documented dbt models.
Champion best practices in testing, documentation, CI/CD, and version control, and guide others in applying them.
Act as a technical mentor to other analytics engineers, supporting their development and reviewing their code.
Collaborate with central data platform and embedded teams to improve data quality, metric consistency, and lineage tracking.
Drive alignment on model architecture across domains—ensuring models are reusable, auditable, and trusted.
Identify and lead initiatives to reduce technical debt and modernise legacy reporting pipelines.
Contribute to the long-term vision of analytics engineering at Pleo and help shape our roadmap for scalability and impact.

SQL, Data Analysis, ETL, Data engineering, CI/CD, Mentoring, Documentation, Data visualization, Data modeling, Data analytics, Data management

🔥 Senior Data Engineer

Posted 5 days ago

📍 United States

🧭 Full-Time

💸 183,600 - 216,000 USD per year

🔍 Mental Healthcare

🏢 Company: Headway

👥 201-500

💰 $125,000,000 Series C over 1 year ago

Mental Health Care

🔧 Requirements

6+ years of experience in a data engineering role building products, ideally in a fast-paced environment
Good foundations in Python and SQL.
Experience with Spark, PySpark, DBT, Snowflake and Airflow
Knowledge of visualization tools, such as Metabase, Jupyter Notebooks (Python)
A knack for simplifying data, expressing information in charts and tables

💡 Responsibilities

Collaborate on the design and improvements of the data infrastructure
Partner with product and engineering to advocate best practices and build supporting systems and infrastructure for the various data needs
Create data pipelines that stitch together various data sources in order to produce valuable business insights
Create real-time data pipelines in collaboration with the Data Science team

Python, SQL, ETL, Snowflake, Airflow, Data engineering, RDBMS, Spark, RESTful APIs, Data visualization, Data modeling

🔥 Staff Automation & Data Engineer

Posted 6 days ago

📍 United States

🧭 Full-Time

💸 114,000 - 171,599 USD per year

🔍 Fintech

🔧 Requirements

Strong expertise in data pipeline development (ETL/ELT) and workflow automation.
Proficiency in Python, SQL, and scripting languages for data processing and automation.
Hands-on experience with Workato, Google Apps Script, and API-driven automation.

💡 Responsibilities

Automate customer support, success, and service workflows to improve speed, accuracy, and responsiveness.
Build and maintain scalable ETL/ELT pipelines to ensure real-time access to critical customer data.
Implement self-service automation to enable customers and internal teams to quickly access information.

Python, SQL, ETL, Jira, API testing, Data engineering, CI/CD, RESTful APIs, Data visualization, Scripting, Customer Success

🔥 Staff Data Engineer

Posted 6 days ago

📍 United States

🏢 Company: ge_externalsite

🔧 Requirements

Hands-on experience in programming languages like Java, Python or Scala and experience in writing SQL scripts for Oracle, MySQL, PostgreSQL or HiveQL
Exposure to industry standard data modeling tools (e.g., ERWin, ER Studio, etc.).
Exposure to Extract, Transform & Load (ETL) tools like Informatica or Talend
Exposure to industry-standard data catalog, automated data discovery, and data lineage tools (e.g., Alation, Collibra, etc.)
Experience with Big Data / Hadoop / Spark / Hive / NoSQL database engines (e.g., Cassandra or HBase)
Exposure to unstructured datasets and ability to handle XML, JSON file formats
Conduct exploratory data analysis and generate visual summaries of data. Identify data quality issues proactively.
Developing reusable code pipelines through CI/CD.
Hands-on experience with big data or MPP databases.
Developing and executing integrated test plans.

💡 Responsibilities

Be responsible for identifying solutions for complex data analysis and data structure.
Be responsible for creating digital thread requirements
Be responsible for change management of database artifacts to support next gen QMS applications
Be responsible for monitoring data availability and data health of complex systems
Understand industry trends and stay up to date on associated Quality and tech landscape.
Design & build technical data dictionaries and support business glossaries to analyze the datasets
This role may also work on other Quality team digital and strategic deliveries that support the business.
Perform data profiling and data analysis for source systems, manually maintained data, machine or sensor generated data and target data repositories
Design & build both logical and physical data models for both Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) solutions
Develop and maintain data mapping specifications based on the results of data analysis and functional requirements
Build a variety of data loading & data transformation methods using multiple tools and technologies.
Design & build automated Extract, Transform & Load (ETL) jobs based on data mapping specifications
Manage metadata structures needed for building reusable Extract, Transform & Load (ETL) components.
Analyze reference datasets and become familiar with Master Data Management (MDM) tools.
Analyze the impact of changes to downstream systems/products and recommend alternatives to minimize the impact.
Derive solutions and make recommendations from deep dive data analysis proactively.
Design and build Data Quality (DQ) rules.
Drive the design and implementation of the roadmap.
Design and develop complex code in multiple languages.

PostgreSQL, Python, SQL, Data Analysis, ETL, Hadoop, Java, MySQL, Oracle, Data engineering, NoSQL, Spark, CI/CD, Agile methodologies, JSON, Scala, Data visualization, Data modeling

🔥 Staff Data Engineer

Posted 7 days ago

📍 United States

🧭 Full-Time

🔍 Software Development

🏢 Company: Apollo.io

👥 501-1000

💰 $100,000,000 Series D over 1 year ago

Software Development

🔧 Requirements

8+ years of experience as a data platform engineer, a software engineer working in data, or a big data engineer.
Experience in data modeling, data warehousing, APIs, and building data pipelines.
Deep knowledge of databases and data warehousing with an ability to collaborate cross-functionally.
Bachelor's degree in a quantitative field (Physical/Computer Science, Engineering, Mathematics, or Statistics).

💡 Responsibilities

Develop and maintain scalable data pipelines and build new integrations to support continuing increases in data volume and complexity.
Develop and improve Data APIs used in machine learning / AI product offerings
Implement automated monitoring, alerting, and self-healing (restartable/graceful failure) features while building the consumption pipelines.
Implement processes and systems to monitor data quality, ensuring production data is always accurate and available.
Write unit/integration tests, contribute to the engineering wiki, and document work.
Define company data models and write jobs to populate data models in our data warehouse.
Work closely with all business units and engineering teams to develop a strategy for long-term data platform architecture.

Python, SQL, Apache Airflow, Apache Hadoop, Cloud Computing, ETL, Apache Kafka, Data engineering, FastAPI, Data modeling, Data analytics
