
Staff Data Engineer

Posted 3 days ago


💎 Seniority level: Staff

📍 Location: United States

🏢 Company: ge_externalsite

🗣️ Languages: English

⏳ Experience: 4+ years (Bachelor’s degree) OR 7+ years (Associate’s degree) OR 9+ years (High School Diploma)

🪄 Skills: PostgreSQL, Python, SQL, Data Analysis, ETL, Hadoop, Java, MySQL, Oracle, Data engineering, NoSQL, Spark, CI/CD, Agile methodologies, JSON, Scala, Data visualization, Data modeling

Requirements:
  • Hands-on experience with programming languages such as Java, Python, or Scala, and experience writing SQL scripts for Oracle, MySQL, PostgreSQL, or HiveQL
  • Exposure to industry-standard data modeling tools (e.g., ERwin, ER Studio)
  • Exposure to Extract, Transform & Load (ETL) tools such as Informatica or Talend
  • Exposure to industry-standard data catalog, automated data discovery, and data lineage tools (e.g., Alation, Collibra)
  • Experience with Big Data / Hadoop / Spark / Hive / NoSQL database engines (e.g., Cassandra or HBase)
  • Exposure to unstructured datasets and the ability to handle XML and JSON file formats
  • Ability to conduct exploratory data analysis, generate visual summaries of data, and proactively identify data quality issues
  • Experience developing reusable code pipelines through CI/CD
  • Hands-on experience with big data or MPP (massively parallel processing) databases
  • Experience developing and executing integrated test plans
Responsibilities:
  • Be responsible for identifying solutions for complex data analysis and data structure problems
  • Be responsible for creating digital thread requirements
  • Be responsible for change management of database artifacts to support next gen QMS applications
  • Be responsible for monitoring data availability and data health of complex systems
  • Understand industry trends and stay up to date on the associated Quality and technology landscape.
  • Design & build technical data dictionaries and support business glossaries to analyze the datasets
  • Perform data profiling and data analysis for source systems, manually maintained data, machine or sensor generated data and target data repositories
  • Design & build both logical and physical data models for both Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) solutions
  • Develop and maintain data mapping specifications based on the results of data analysis and functional requirements
  • Build a variety of data loading & data transformation methods using multiple tools and technologies.
  • Design & build automated Extract, Transform & Load (ETL) jobs based on data mapping specifications (see the sketch after this list)
  • Manage metadata structures needed for building reusable Extract, Transform & Load (ETL) components.
  • Analyze reference datasets and become familiar with Master Data Management (MDM) tools.
  • Analyze the impact of changes to downstream systems/products and recommend alternatives to minimize the impact.
  • Derive solutions and make recommendations from deep dive data analysis proactively.
  • Design and build Data Quality (DQ) rules.
  • Drive design and implementation of the roadmap.
  • Design and develop complex code in multiple languages.
  • This role may also work on other Quality team digital and strategic deliveries that support the business.
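
As an illustration of the mapping-specification-driven ETL work described above, here is a minimal Python sketch; the spec, field names, and target table are hypothetical, and SQLite stands in for the actual target database:

```python
# Minimal sketch: an ETL step driven by a data mapping specification.
# The spec, source fields, and target table below are hypothetical.
import sqlite3

# Mapping specification: source field -> target column.
MAPPING_SPEC = {
    "supplier_id": "supplier_id",
    "inspection_result": "result_code",
    "recorded_at": "event_ts",
}

def transform(records):
    """Apply the mapping spec, dropping any unmapped fields."""
    return [
        {target: rec.get(source) for source, target in MAPPING_SPEC.items()}
        for rec in records
    ]

def load(rows, conn):
    """Load mapped rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS quality_events "
        "(supplier_id TEXT, result_code TEXT, event_ts TEXT)"
    )
    conn.executemany(
        "INSERT INTO quality_events VALUES (:supplier_id, :result_code, :event_ts)",
        rows,
    )
    conn.commit()

if __name__ == "__main__":
    source = [{"supplier_id": "S1", "inspection_result": "PASS", "recorded_at": "2024-01-01"}]
    conn = sqlite3.connect(":memory:")
    load(transform(source), conn)
    print(conn.execute("SELECT COUNT(*) FROM quality_events").fetchone())
```

Keeping the mapping as data rather than code is what makes the ETL component reusable across specifications.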

Related Jobs


📍 United States

🧭 Full-Time

🔍 Software Development

🏢 Company: Apollo.io · 👥 501-1000 · 💰 $100,000,000 Series D over 1 year ago · Software Development

  • 8+ years of experience as a data platform engineer, software engineer in data, or big data engineer.
  • Experience in data modeling, data warehousing, APIs, and building data pipelines.
  • Deep knowledge of databases and data warehousing with an ability to collaborate cross-functionally.
  • Bachelor's degree in a quantitative field (Physical/Computer Science, Engineering, Mathematics, or Statistics).
  • Develop and maintain scalable data pipelines and build new integrations to support continuing increases in data volume and complexity.
  • Develop and improve Data APIs used in machine learning / AI product offerings (see the sketch after this list)
  • Implement automated monitoring, alerting, self-healing (restartable/graceful failures) features while building the consumption pipelines.
  • Implement processes and systems to monitor data quality, ensuring production data is always accurate and available.
  • Write unit/integration tests, contribute to the engineering wiki, and document work.
  • Define company data models and write jobs to populate data models in our data warehouse.
  • Work closely with all business units and engineering teams to develop a strategy for long-term data platform architecture.
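
Since the posting's skill tags include FastAPI, here is a minimal sketch of the kind of read-only Data API an ML product might call; the route, fields, and in-memory "feature store" are hypothetical:

```python
# Minimal sketch of a Data API endpoint (hypothetical route and fields);
# a dict stands in for the real feature store or warehouse query layer.
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()

FEATURES = {"acct_42": {"sessions_7d": 18, "intent_score": 0.91}}

class FeatureVector(BaseModel):
    account_id: str
    sessions_7d: int
    intent_score: float

@app.get("/features/{account_id}", response_model=FeatureVector)
def get_features(account_id: str) -> FeatureVector:
    row = FEATURES.get(account_id)
    if row is None:
        raise HTTPException(status_code=404, detail="unknown account")
    return FeatureVector(account_id=account_id, **row)
```

Run with `uvicorn main:app`, assuming the file is saved as main.py.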

Python, SQL, Apache Airflow, Apache Hadoop, Cloud Computing, ETL, Apache Kafka, Data engineering, FastAPI, Data modeling, Data analytics

Posted 4 days ago

📍 United States

💸 204,000 - 260,000 USD per year

🔍 Software Development

🏢 Company: Airbnb · 👥 5001-10000 · 💰 Secondary Market almost 5 years ago · 🫂 Last layoff about 2 years ago · Hospitality, Travel Accommodations, PropTech, Marketplace, Mobile Apps, Travel

  • 9+ years of experience with a BS/Master's, or 6+ years with a PhD
  • Expertise in SQL and proficient in at least one data engineering language, such as Python or Scala
  • Experience with Superset and Tableau
  • Expertise in large-scale distributed data processing frameworks like Presto or Spark
  • Experience with an ETL framework like Airflow
  • Extensive knowledge of data management concepts, including data modeling, ETL processes, data warehousing, and data governance.
  • Understanding of data security and privacy principles, as well as regulatory compliance requirements (e.g., GDPR, CCPA).
  • Strong problem-solving skills and the ability to translate business requirements into technical solutions.
  • Excellent communication skills, both written and verbal, and the ability to distill complex ideas for technical and non-technical stakeholders
  • Strong capability to forge trusted partnerships across working teams
  • Design and implement data pipelines by leveraging best-in-class tools and infrastructure to meet critical business and product requirements (see the sketch after this list).
  • Develop high-quality data assets for product and AI/ML use cases
  • Collaborate with cross-functional teams to gather requirements, assess data needs, and design efficient solutions that align with business objectives.
  • Contribute to the development of long-term data strategies and roadmaps and ML infrastructure development within the organization.
  • Influence the trajectory of data in decision-making
  • Improve trust in our data by championing data quality across the stack
  • Identify and act on opportunities for automation, and implement data management tools and frameworks to enhance efficiency and productivity.
  • Mentor and coach team members, providing guidance in data engineering best practices and support to enhance their skills and performance.
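
Airflow is named as the ETL framework, so here is a minimal sketch of a daily DAG, assuming Airflow 2.x; the DAG id, task names, and task logic are hypothetical placeholders:

```python
# Minimal sketch of a daily Airflow DAG (hypothetical tasks), for Airflow 2.x.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_listings():
    print("pull the daily source extract")

def build_metrics():
    print("aggregate into the metrics table")

with DAG(
    dag_id="daily_listing_metrics",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # use schedule_interval on Airflow < 2.4
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_listings", python_callable=extract_listings)
    metrics = PythonOperator(task_id="build_metrics", python_callable=build_metrics)
    extract >> metrics  # metrics runs only after the extract succeeds
```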

Leadership, Python, SQL, ETL, Machine Learning, Cross-functional Team Leadership, Tableau, Airflow, Data engineering, REST API, Spark, CI/CD, Problem Solving, Excellent communication skills, Scala, Data visualization, Data modeling, Data management

Posted 13 days ago
🔥 Staff Data Engineer
Posted 24 days ago

📍 United States

🧭 Full-Time

🔍 Software Development

🏢 Company: Life360 · 👥 251-500 · 💰 $33,038,258 Post-IPO Equity over 2 years ago · 🫂 Last layoff about 2 years ago · Android, Family, Apps, Mobile Apps, Mobile

  • Minimum 7 years of experience working with high volume data infrastructure.
  • Experience with Databricks and AWS.
  • Experience with dbt.
  • Experience with job orchestration tooling like Airflow.
  • Proficient programming in Python.
  • Proficient with SQL and the ability to optimize complex queries.
  • Proficient with large-scale data processing using Spark and/or Presto/Trino.
  • Proficient in data modeling and database design.
  • Experience with streaming data using a tool like Kinesis or Kafka (see the sketch after this list).
  • Experience working with high-volume, event-based data architectures such as Amplitude and Braze.
  • Experience in modern development lifecycle including Agile methodology, CI/CD, automated deployments using Terraform, GitHub Actions, etc.
  • Knowledge and proficiency in the latest open source and data frameworks, modern data platform tech stacks and tools.
  • Always learning and staying up to speed with the fast-moving data world.
  • Good communication and collaboration skills, with the ability to work independently.
  • BS in Computer Science, Software Engineering, Mathematics, or equivalent experience.
  • Design, implement, and manage scalable data processing platforms used for real-time analytics and exploratory data analysis.
  • Manage our financial data from ingestion through ETL to storage and batch processing.
  • Automate, test and harden all data workflows.
  • Architect logical and physical data models to ensure the needs of the business are met.
  • Collaborate across the data teams (engineering, data science, and analytics) to understand their needs while applying engineering best practices.
  • Architect and develop systems and algorithms for distributed real-time analytics and data processing.
  • Implement strategies for acquiring data to develop new insights.
  • Mentor junior engineers, imparting best practices and institutionalizing efficient processes to foster growth and innovation within the team.
  • Champion data engineering best practices and institutionalize efficient processes to foster growth and innovation within the team.
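
For the streaming requirement, here is a minimal consumer sketch assuming the confluent-kafka Python client; the broker address, topic, group id, and event fields are hypothetical:

```python
# Minimal sketch of a Kafka event consumer (hypothetical broker/topic/fields),
# assuming the confluent-kafka client.
import json
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "event-etl",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["app-events"])

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue  # nothing new within the poll window
        if msg.error():
            print("consumer error:", msg.error())
            continue
        event = json.loads(msg.value())
        # Downstream: validate, enrich, and land the event in the lake.
        print(event.get("user_id"), event.get("event_type"))
finally:
    consumer.close()
```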

AWS, Project Management, Python, SQL, Apache Airflow, ETL, Kafka, Algorithms, Data engineering, Data Structures, REST API, Spark, Communication Skills, Analytical Skills, Collaboration, CI/CD, Problem Solving, Agile methodologies, Mentoring, Terraform, Data visualization, Technical support, Data modeling, Data analytics, Data management, Debugging

🔥 Staff Data Engineer
Posted about 1 month ago

📍 United States, Canada

🧭 Full-Time

💸 200,000 - 228,000 USD per year

🔍 Software Development

🏢 Company: Later · 👥 1-10 · Consumer Electronics, iOS, Apps, Software

  • 10+ years of experience in data engineering, software engineering, or related fields.
  • Proven experience leading the technical strategy and execution of large-scale data platforms.
  • Expertise in cloud technologies (Google Cloud Platform, AWS, Azure) with a focus on scalable data solutions (BigQuery, Snowflake, Redshift, etc.).
  • Strong proficiency in SQL, Python, and distributed data processing frameworks (Apache Spark, Flink, Beam, etc.).
  • Extensive experience with streaming data architectures using Kafka, Flink, Pub/Sub, Kinesis, or similar technologies.
  • Expertise in data modeling, schema design, indexing, partitioning, and performance tuning for analytical workloads, including data governance (security, access control, compliance: GDPR, CCPA, SOC 2)
  • Strong experience designing and optimizing scalable, fault-tolerant data pipelines using workflow orchestration tools like Airflow, Dagster, or Dataflow.
  • Ability to lead and influence engineering teams, drive cross-functional projects, and align stakeholders towards a common data vision.
  • Experience mentoring senior and mid-level data engineers to enhance team performance and skill development.
  • Lead the design and evolution of a scalable data architecture that meets analytical, machine learning, and operational needs.
  • Architect and optimize data pipelines for batch and real-time data processing, ensuring efficiency and reliability.
  • Implement best practices for distributed data processing, ensuring scalability, performance, and cost-effectiveness of data workflows.
  • Define and enforce data governance policies, implement automated validation checks, and establish monitoring frameworks to maintain data integrity (see the sketch after this list).
  • Ensure data security and compliance with industry regulations by designing appropriate access controls, encryption mechanisms, and auditing processes.
  • Drive innovation in data engineering practices by researching and implementing new technologies, tools, and methodologies.
  • Work closely with data scientists, engineers, analysts, and business stakeholders to understand data requirements and deliver impactful solutions.
  • Develop reusable frameworks, libraries, and automation tools to improve efficiency, reliability, and maintainability of data infrastructure.
  • Guide and mentor data engineers, fostering a high-performing engineering culture through best practices, peer reviews, and knowledge sharing.
  • Establish and monitor SLAs for data pipelines, proactively identifying and mitigating risks to ensure high availability and reliability.
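
As a concrete reading of "automated validation checks," here is a minimal Python sketch; the table, rules, and the SQLite stand-in for the warehouse connection are all hypothetical:

```python
# Minimal sketch of automated data-quality checks run against a table.
# The table, rules, and SQLite connection are hypothetical stand-ins.
import sqlite3

CHECKS = [
    ("row_count_positive", "SELECT COUNT(*) FROM events", lambda n: n > 0),
    ("no_null_user_ids", "SELECT COUNT(*) FROM events WHERE user_id IS NULL", lambda n: n == 0),
]

def run_checks(conn):
    """Return (name, observed value) for every failing check."""
    failures = []
    for name, sql, passes in CHECKS:
        (value,) = conn.execute(sql).fetchone()
        if not passes(value):
            failures.append((name, value))
    return failures

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (user_id TEXT, ts TEXT)")
    conn.execute("INSERT INTO events VALUES ('u1', '2024-01-01')")
    print(run_checks(conn))  # [] means every check passed
```

In practice checks like these run inside the orchestrator (Airflow, Dagster) and alert on failure, which is how the SLA monitoring mentioned above is typically enforced.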

AWS, Python, SQL, Apache Airflow, Cloud Computing, Data Analysis, ETL, GCP, Kafka, Machine Learning, Snowflake, Data engineering, Data modeling, Data management

🔥 Staff Data Engineer
Posted about 1 month ago

📍 United States

🧭 Full-Time

💸 85,500 - 117,500 USD per year

🔍 Software Development

  • 5+ years of work experience as a data engineer or full-stack engineer, coding in Python.
  • 5+ years of experience building web scraping tools in Python, using Beautiful Soup, Scrapy, Selenium, or similar tooling
  • 3-5 years of deployment experience with CI/CD
  • Strong experience with HTML, CSS, JavaScript, and browser behavior.
  • Experience with RESTful APIs and JSON/XML data formats.
  • Knowledge of cloud platforms and containerization technologies (e.g., Docker, Kubernetes).
  • Advanced understanding of how at least one big data processing technology works under the hood (e.g. Spark / Hadoop / HDFS / Redshift / BigQuery / Snowflake)
  • Use modern tooling to build a robust, extensible, and performant web scraping platform (see the sketch after this list)
  • Build thoughtful and reliable data acquisition and integration solutions to meet business requirements and data sourcing needs.
  • Deliver best-in-class infrastructure solutions for flexible and repeatable applications across disparate sources.
  • Troubleshoot, improve and scale existing data pipelines, models and solutions
  • Build upon data engineering's CI/CD deployments, and infrastructure-as-code for provisioning AWS and 3rd party (Apify) services.
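
Given the Beautiful Soup requirement, here is a minimal scraper sketch; the URL, CSS selector, and user agent are hypothetical:

```python
# Minimal sketch of a scraper using requests + Beautiful Soup.
# URL, CSS selector, and User-Agent are hypothetical placeholders.
import requests
from bs4 import BeautifulSoup

HEADERS = {"User-Agent": "example-scraper/0.1"}

def scrape_titles(url: str) -> list[str]:
    resp = requests.get(url, headers=HEADERS, timeout=10)
    resp.raise_for_status()  # surface HTTP errors instead of parsing error pages
    soup = BeautifulSoup(resp.text, "html.parser")
    return [el.get_text(strip=True) for el in soup.select("h2.product-title")]

if __name__ == "__main__":
    print(scrape_titles("https://example.com/catalog"))
```

A production platform layers retries, rate limiting, and proxy rotation around this core; Scrapy or Selenium replace requests when sites require crawling state or JavaScript rendering.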

AWS, Backend Development, PostgreSQL, Python, SQL, Apache Airflow, ETL, Data engineering, REST API, NodeJS, Software Engineering, Data analytics

🔥 Staff Data Engineer
Posted about 2 months ago

📍 United States

💸 131,414 - 197,100 USD per year

🔍 Mental healthcare

🏢 Company: Headspace · 👥 11-50 · Wellness, Health Care, Child Care

  • 10+ years of success in enterprise data solutions and high-impact initiatives.
  • Expertise in platforms like Databricks, Snowflake, dbt, and Redshift.
  • Experience designing and optimizing real-time and batch ETL pipelines.
  • Demonstrated leadership and mentorship abilities in engineering.
  • Strong collaboration skills with product and analytics stakeholders.
  • Bachelor’s or advanced degree in Computer Science, Engineering, or a related field.
  • Drive the architecture and implementation of PySpark data pipelines (see the sketch after this list).
  • Create and enforce design patterns in code and schema.
  • Design and lead secure and compliant data warehousing platforms.
  • Partner with analytics and product leaders for actionable insights.
  • Mentor team members on dbt architecture and foster a data-first culture.
  • Act as a thought leader on data strategy and cross-functional roadmaps.
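
Here is a minimal sketch of the kind of PySpark pipeline step this role drives; the paths, columns, and metric are hypothetical:

```python
# Minimal sketch of a batch PySpark aggregation step.
# Paths and column names are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("daily-session-metrics").getOrCreate()

sessions = spark.read.parquet("s3://example-bucket/raw/sessions/")

daily = (
    sessions
    .withColumn("day", F.to_date("started_at"))
    .groupBy("day")
    .agg(
        F.countDistinct("user_id").alias("active_users"),
        F.avg("duration_sec").alias("avg_session_sec"),
    )
)

daily.write.mode("overwrite").parquet("s3://example-bucket/marts/daily_sessions/")
```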

SQL, Cloud Computing, ETL, Snowflake, Data engineering, Data modeling, Data analytics

🔥 Staff Data Engineer
Posted 2 months ago

📍 United States

🧭 Full-Time

💸 130,000 - 170,000 USD per year

🔍 Data Engineering

  • 8+ years of experience in a data engineering role
  • Strong knowledge of REST-based APIs and cloud technologies (AWS, Azure, GCP)
  • Experience with Python/SQL for building data pipelines
  • Bachelor's degree in computer science or related field
  • Design and build data pipelines across various source systems (see the sketch after this list)
  • Collaborate with teams to develop data acquisition and integration strategies
  • Coach and guide others in scalable pipeline building
  • Deploy to cloud-based platforms and troubleshoot issues
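
For the REST-based sources this role works with, here is a minimal paginated-ingestion sketch; the endpoint and page-number scheme are hypothetical:

```python
# Minimal sketch of paginated ingestion from a REST source.
# The endpoint and paging scheme are hypothetical.
import requests

def fetch_all(base_url: str, page_size: int = 100):
    """Yield records from a page-numbered JSON API until a page comes back empty."""
    page = 1
    while True:
        resp = requests.get(
            f"{base_url}/records",
            params={"page": page, "per_page": page_size},
            timeout=10,
        )
        resp.raise_for_status()
        batch = resp.json()
        if not batch:
            return
        yield from batch
        page += 1

if __name__ == "__main__":
    for record in fetch_all("https://api.example.com/v1"):
        print(record)
```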

AWS, Docker, Python, SQL, Apache Airflow, Cloud Computing, ETL, GCP, Machine Learning, Snowflake, Data engineering, REST API, Data modeling

🔥 Staff Data Engineer
Posted 2 months ago

📍 United States, Canada

🧭 Full-Time

🔍 Software Development

  • 12+ years of experience in data engineering
  • Expertise in designing scalable data architectures
  • Strong programming skills in Python and Scala
  • Experience with Apache Spark, Databricks, Delta Lake
  • Proficiency with relational and NoSQL databases
  • Design and implement scalable data pipelines
  • Define and enforce data engineering best practices
  • Conduct code reviews and mentor team members
  • Build and maintain batch and real-time data pipelines (see the sketch after this list)
  • Ensure data quality, governance, and security
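
Given the Spark and Delta Lake stack named above, here is a minimal batch-pipeline sketch, assuming a Spark session with the delta-spark package configured; the paths and columns are hypothetical:

```python
# Minimal sketch of a batch pipeline landing data as a Delta table.
# Assumes delta-spark is configured; paths/columns are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-to-delta").getOrCreate()

orders = spark.read.json("s3://example-bucket/raw/orders/")

cleaned = (
    orders
    .dropDuplicates(["order_id"])  # basic data-quality guard
    .withColumn("ingested_at", F.current_timestamp())
)

cleaned.write.format("delta").mode("append").save("s3://example-bucket/delta/orders")
```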

PostgreSQL, Python, DynamoDB, ETL, MySQL, Data engineering, CI/CD, Scala

🔥 Staff Data Engineer
Posted 2 months ago

📍 United States

🧭 Full-Time

💸 170,000 - 195,000 USD per year

🔍 Healthcare

🏢 Company: Parachute Health · 👥 101-250 · 💰 $1,000 over 5 years ago · Medical, Health Care, Software

  • 5+ years of relevant experience.
  • Experience in Data Engineering with Python.
  • Experience building customer-facing software.
  • Strong listening and communication skills.
  • Time management and organizational skills.
  • Proactive, driven self-starter who can work independently or as part of a team.
  • Ability to think with the 'big picture' in mind.
  • Passionate about improving patient outcomes in the healthcare space.
  • Architect solutions to integrate and manage large volumes of data across various internal and external systems.
  • Establish best practices and data governance standards to ensure that data infrastructure is built for long-term scalability.
  • Build and maintain a reporting product for external customers that visualizes data and provides tabular reports.
  • Collaborate across the organization to assess data engineering needs.

Python, ETL, Airflow, Data engineering, Data visualization

🔥 Staff Data Engineer
Posted 3 months ago

📍 United States; Ontario, Canada

🔍 Food waste reduction and grocery technology

🏢 Company: Afresh · 👥 51-100 · 💰 $115,000,000 Series B over 2 years ago · Artificial Intelligence (AI), Logistics, Food and Beverage, Machine Learning, Agriculture, Supply Chain Management, Software

  • Significant experience designing and maintaining ETLs that process large-scale datasets.
  • Proficiency with Python, PySpark, and SQL, and experience with tools like Databricks, Snowflake, or dbt.
  • Strong problem-solving skills with ambiguous requirements.
  • Focus on practical outcomes balancing technical rigor and execution.
  • Experience with complex, unclean datasets and innovative processing methods.
  • Ability to identify areas for tooling or automation to simplify workflows.
  • Excellent communication skills for technical presentation.
  • Proven leadership in technical projects with mentoring ability.
  • Build tools and frameworks that streamline customer integrations.
  • Create robust ETLs in PySpark and dbt to process billions of records.
  • Collaborate with teams to design and deliver data solutions for new products.
  • Identify optimizations to improve ETL runtime and scalability (see the sketch after this list).
  • Solve data quality challenges with messy datasets.
  • Investigate and implement new technologies into the data platform.
  • Support team members by mentoring and leading technical discussions.
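
One common ETL runtime optimization of the kind mentioned above is broadcasting a small dimension table so the large fact table is never shuffled; the tables and join key here are hypothetical:

```python
# Minimal sketch of a broadcast-join optimization in PySpark.
# Table paths and the join key are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("etl-join-optimization").getOrCreate()

sales = spark.read.parquet("s3://example-bucket/raw/sales/")    # billions of rows
stores = spark.read.parquet("s3://example-bucket/dim/stores/")  # small lookup table

# broadcast() ships the small table to every executor, so the large
# table is never shuffled across the network for the join.
enriched = sales.join(broadcast(stores), on="store_id", how="left")

enriched.write.mode("overwrite").parquet("s3://example-bucket/enriched/sales/")
```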

Python, SQL, ETL, Data engineering, Data management
