
Staff Data Engineer

Posted 10 days ago


πŸ’Ž Seniority level: Staff, 7+ years

πŸ“ Location: United States

πŸ’Έ Salary: 117,800 - 214,300 USD per year

πŸ” Industry: Software Development

🏒 Company: careers_gm

πŸ—£οΈ Languages: English

⏳ Experience: 7+ years

πŸͺ„ Skills: AWS, Docker, PostgreSQL, Python, SQL, Apache Hadoop, Cloud Computing, Data Analysis, ETL, Java, Kubernetes, MySQL, Algorithms, Apache Kafka, Data engineering, Data science, Data Structures, REST API, NoSQL, CI/CD, Problem Solving, JSON, Scala, Data visualization, Data modeling, Scripting, Data analytics, Data management

Requirements:
  • 7+ years of hands-on experience.
  • Bachelor's degree (or equivalent work experience) in Computer Science, Data Science, Software Engineering, or a related field.
  • Strong understanding and ability to provide mentorship in the areas of data ETL processes and tools for designing and managing data pipelines
  • Proficient with big data frameworks and tools like Apache Hadoop, Apache Spark, or Apache Kafka for processing and analyzing large datasets.
  • Hands on experience with data serialization formats like JSON, Parquet and XML
  • Consistently models and leads in best practices and optimization for scripting skills in languages like Python, Java, Scala, etc for automation and data processing.
  • Proficient with database administration and performance tuning for databases like MySQL, PostgresSQL or NoSQL databases
  • Proficient with containerization (e.g., Docker) and orchestration platforms (e.g., Kubernetes) for managing data applications.
  • Experience with cloud platforms and data services for data storage and processing
  • Consistently designs solutions and build data solutions that are highly automated, performant, with quality checks that provide data consistency and accuracy outcomes
  • Experienced at actively managing large-scale data engineering projects, including planning, resource allocation, risk management, and ensuring successful project delivery and adjust style for all delivery methods (ie: Waterfall, Agile, POD, etc)
  • Understands data governance principles, data privacy regulations, and experience implementing security measures to protect data
  • Able to integrate data engineering pipelines with machine learning models and platforms
  • Strong problem-solving skills to identify and resolve complex data engineering issues efficiently.
  • Ability to work effectively in cross-functional teams, collaborate with data scientists, analysts, and stakeholders to deliver data solutions.
  • Ability to lead and mentor junior data engineers, providing guidance and support in complex data engineering projects.
  • Influential communication skills to effectively convey technical concepts to non-technical stakeholders and document data engineering processes.
  • Models a mindset of continuous learning, staying updated with the latest advancements in data engineering technologies, and a drive for innovation.
Responsibilities:
  • Design, construct, install and maintain data architectures, including database and large-scale processing systems.
  • Develop and maintain ETL (Extract, Transform, Load) processes to collect, cleanse and transform data from various sources inclusive of cloud.
  • Design and implement data pipelines to collect, process and transfer data from various sources to storage systems (data warehouses, data lakes, etc)
  • Implement security measures to protect sensitive data and ensure compliance with data privacy regulations.
  • Build data solutions that ensure data quality, integrity and security through data validation, monitoring, and compliance with data governance policies
  • Administer and optimize databases for performance and scalability
  • Maintain Master Data, Metadata, Data Management Repositories, Logical Data Models, and Data Standards
  • Troubleshoot and resolve data-related issues affecting data quality fidelity
  • Document data architectures, processes and best practices for knowledge sharing across the GM data engineering community
  • Participate in the evaluation and selection of data related tools and technologies
  • Collaborate across other engineering functions within EDAI, Marketing Technology, and Software & Services

Related Jobs


πŸ“ United States

🏒 Company: ge_externalsite

  • Hands-on experience in programming languages like Java, Python or Scala and experience in writing SQL scripts for Oracle, MySQL, PostgreSQL or HiveQL
  • Exposure to industry standard data modeling tools (e.g., ERWin, ER Studio, etc.).
  • Exposure to Extract, Transform & Load (ETL) tools like Informatica or Talend
  • Exposure to industry standard data catalog, automated data discovery and data lineage tools (e.g., Alation, Collibra, etc., )
  • Experience with Big Data / Hadoop / Spark / Hive / NoSQL database engines (i.e. Cassandra or HBase)
  • Exposure to unstructured datasets and ability to handle XML, JSON file formats
  • Conduct exploratory data analysis and generate visual summaries of data. Identify data quality issues proactively.
  • Developing reusable code pipelines through CI/CD.
  • Hands-on experience of big data or MPP databases.
  • Developing and executing integrated test plans.
  • Be responsible for identifying solutions for complex data analysis and data structure.
  • Be responsible for creating digital thread requirements
  • Be responsible for change management of database artifacts to support next gen QMS applications
  • Be responsible for monitoring data availability and data health of complex systems
  • Understand industry trends and stay up to date on associated Quality and tech landscape.
  • Design & build technical data dictionaries and support business glossaries to analyze the datasets
  • This role may also work on other Quality team digital and strategic deliveries that support the business.
  • Perform data profiling and data analysis for source systems, manually maintained data, machine or sensor generated data and target data repositories
  • Design & build both logical and physical data models for both Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) solutions
  • Develop and maintain data mapping specifications based on the results of data analysis and functional requirements
  • Build a variety of data loading & data transformation methods using multiple tools and technologies.
  • Design & build automated Extract, Transform & Load (ETL) jobs based on data mapping specifications
  • Manage metadata structures needed for building reusable Extract, Transform & Load (ETL) components.
  • Analyze reference datasets and familiarize with Master Data Management (MDM) tools.
  • Analyze the impact of changes to downstream systems/products and recommend alternatives to minimize the impact.
  • Derive solutions and make recommendations from deep dive data analysis proactively.
  • Design and build Data Quality (DQ) rules.
  • Drives design and implementation of the roadmap.
  • Design and develop complex code in multiple languages.
  • This role may also work on other Quality team digital and strategic deliveries that support the business.

PostgreSQL, Python, SQL, Data Analysis, ETL, Hadoop, Java, MySQL, Oracle, Data engineering, NoSQL, Spark, CI/CD, Agile methodologies, JSON, Scala, Data visualization, Data modeling

Posted 3 days ago

πŸ“ United States

🧭 Full-Time

πŸ” Software Development

🏒 Company: Apollo.io πŸ‘₯ 501-1000 πŸ’° $100,000,000 Series D over 1 year ago · Software Development

  • 8+ years of experience as a data platform engineer or a software engineer in data or big data engineer.
  • Experience in data modeling, data warehousing, APIs, and building data pipelines.
  • Deep knowledge of databases and data warehousing with an ability to collaborate cross-functionally.
  • Bachelor's degree in a quantitative field (Physical/Computer Science, Engineering, Mathematics, or Statistics).
  • Develop and maintain scalable data pipelines and build new integrations to support continuing increases in data volume and complexity.
  • Develop and improve Data APIs used in machine learning / AI product offerings
  • Implement automated monitoring, alerting, self-healing (restartable/graceful failures) features while building the consumption pipelines.
  • Implement processes and systems to monitor data quality, ensuring production data is always accurate and available.
  • Write unit/integration tests, contribute to the engineering wiki, and document work.
  • Define company data models and write jobs to populate data models in our data warehouse.
  • Work closely with all business units and engineering teams to develop a strategy for long-term data platform architecture.

Python, SQL, Apache Airflow, Apache Hadoop, Cloud Computing, ETL, Apache Kafka, Data engineering, FastAPI, Data modeling, Data analytics

Posted 4 days ago

πŸ“ United States

πŸ’Έ 204,000 - 260,000 USD per year

πŸ” Software Development

🏒 Company: Airbnb πŸ‘₯ 5001-10000 πŸ’° Secondary Market almost 5 years ago πŸ«‚ Last layoff about 2 years ago · Hospitality, Travel Accommodations, PropTech, Marketplace, Mobile Apps, Travel

  • 9+ years of experience with a BS/Masters or 6+ years with a PhD
  • Expertise in SQL and proficient in at least one data engineering language, such as Python or Scala
  • Experience with Superset and Tableau
  • Expertise in large-scale distributed data processing frameworks like Presto or Spark
  • Experience with an ETL framework like Airflow
  • Extensive knowledge of data management concepts, including data modeling, ETL processes, data warehousing, and data governance.
  • Understanding of data security and privacy principles, as well as regulatory compliance requirements (e.g., GDPR, CCPA).
  • Strong problem-solving skills and the ability to translate business requirements into technical solutions.
  • Excellent communication skills, both written and verbal, ability to distill complex ideas for technical and non-technical stakeholders
  • Strong capability to forge trusted partnerships across working teams
  • Design and implement data pipelines by leveraging best-in-class tools and infrastructure to meet critical business and product requirements.
  • Develop high quality data assets for product and AI/ML use-cases
  • Collaborate with cross-functional teams to gather requirements, assess data needs, and design efficient solutions that align with business objectives.
  • Contribute to the development of long-term data strategies and roadmaps and ML infrastructure development within the organization.
  • Influence the trajectory of data in decision making
  • Improve trust in our data by championing for data quality across the stack
  • Identify and actively work upon opportunities for automation and implement data management tools and frameworks to enhance efficiency and productivity.
  • Mentor and coach team members, providing guidance in data engineering best practices and support to enhance their skills and performance.

Leadership, Python, SQL, ETL, Machine Learning, Cross-functional Team Leadership, Tableau, Airflow, Data engineering, REST API, Spark, CI/CD, Problem Solving, Excellent communication skills, Scala, Data visualization, Data modeling, Data management

Posted 13 days ago

πŸ“ United States

🧭 Full-Time

πŸ” Software Development

🏒 Company: Life360 πŸ‘₯ 251-500 πŸ’° $33,038,258 Post-IPO Equity over 2 years ago πŸ«‚ Last layoff about 2 years ago · Android, Family, Apps, Mobile Apps, Mobile

  • Minimum 7 years of experience working with high volume data infrastructure.
  • Experience with Databricks and AWS.
  • Experience with dbt.
  • Experience with job orchestration tooling like Airflow.
  • Proficient programming in Python.
  • Proficient with SQL and the ability to optimize complex queries.
  • Proficient with large-scale data processing using Spark and/or Presto/Trino.
  • Proficient in data modeling and database design.
  • Experience with streaming data with a tool like Kinesis or Kafka.
  • Experience working with high volume event based data architecture like Amplitude and Braze.
  • Experience in modern development lifecycle including Agile methodology, CI/CD, automated deployments using Terraform, GitHub Actions, etc.
  • Knowledge and proficiency in the latest open source and data frameworks, modern data platform tech stacks and tools.
  • Always learning and staying up to speed with the fast moving data world.
  • You have good communication and collaboration skills and can work independently.
  • BS in Computer Science, Software Engineering, Mathematics, or equivalent experience.
  • Design, implement, and manage scalable data processing platforms used for real-time analytics and exploratory data analysis.
  • Manage our financial data from ingestion through ETL to storage and batch processing.
  • Automate, test and harden all data workflows.
  • Architect logical and physical data models to ensure the needs of the business are met.
  • Collaborate across the data teams, engineering, data science, and analytics, to understand their needs, while applying engineering best practices.
  • Architect and develop systems and algorithms for distributed real-time analytics and data processing.
  • Implement strategies for acquiring data to develop new insights.
  • Mentor junior engineers, imparting best practices and institutionalizing efficient processes to foster growth and innovation within the team.
  • Champion data engineering best practices and institutionalizing efficient processes to foster growth and innovation within the team.

AWS, Project Management, Python, SQL, Apache Airflow, ETL, Kafka, Algorithms, Data engineering, Data Structures, REST API, Spark, Communication Skills, Analytical Skills, Collaboration, CI/CD, Problem Solving, Agile methodologies, Mentoring, Terraform, Data visualization, Technical support, Data modeling, Data analytics, Data management, Debugging

Posted 24 days ago
πŸ”₯ Staff Data Engineer
Posted about 1 month ago

πŸ“ United States, Canada

🧭 Full-Time

πŸ’Έ 200,000 - 228,000 USD per year

πŸ” Software Development

🏒 Company: Later πŸ‘₯ 1-10 · Consumer Electronics, iOS, Apps, Software

  • 10+ years of experience in data engineering, software engineering, or related fields.
  • Proven experience leading the technical strategy and execution of large-scale data platforms.
  • Expertise in cloud technologies (Google Cloud Platform, AWS, Azure) with a focus on scalable data solutions (BigQuery, Snowflake, Redshift, etc.).
  • Strong proficiency in SQL, Python, and distributed data processing frameworks (Apache Spark, Flink, Beam, etc.).
  • Extensive experience with streaming data architectures using Kafka, Flink, Pub/Sub, Kinesis, or similar technologies.
  • Expertise in data modeling, schema design, indexing, partitioning, and performance tuning for analytical workloads, including data governance (security, access control, compliance: GDPR, CCPA, SOC 2)
  • Strong experience designing and optimizing scalable, fault-tolerant data pipelines using workflow orchestration tools like Airflow, Dagster, or Dataflow.
  • Ability to lead and influence engineering teams, drive cross-functional projects, and align stakeholders towards a common data vision.
  • Experience mentoring senior and mid-level data engineers to enhance team performance and skill development.
  • Lead the design and evolution of a scalable data architecture that meets analytical, machine learning, and operational needs.
  • Architect and optimize data pipelines for batch and real-time data processing, ensuring efficiency and reliability.
  • Implement best practices for distributed data processing, ensuring scalability, performance, and cost-effectiveness of data workflows.
  • Define and enforce data governance policies, implement automated validation checks, and establish monitoring frameworks to maintain data integrity.
  • Ensure data security and compliance with industry regulations by designing appropriate access controls, encryption mechanisms, and auditing processes.
  • Drive innovation in data engineering practices by researching and implementing new technologies, tools, and methodologies.
  • Work closely with data scientists, engineers, analysts, and business stakeholders to understand data requirements and deliver impactful solutions.
  • Develop reusable frameworks, libraries, and automation tools to improve efficiency, reliability, and maintainability of data infrastructure.
  • Guide and mentor data engineers, fostering a high-performing engineering culture through best practices, peer reviews, and knowledge sharing.
  • Establish and monitor SLAs for data pipelines, proactively identifying and mitigating risks to ensure high availability and reliability.

AWS, Python, SQL, Apache Airflow, Cloud Computing, Data Analysis, ETL, GCP, Kafka, Machine Learning, Snowflake, Data engineering, Data modeling, Data management

πŸ”₯ Staff Data Engineer
Posted about 1 month ago

πŸ“ United States

🧭 Full-Time

πŸ’Έ 85,500 - 117,500 USD per year

πŸ” Software Development

  • 5+ years of work experience as a data engineer/full stack engineering, coding in Python.
  • 5+ years of experience building web scraping tools in python, using Beautiful Soup, Scrapy, Selenium, or similar tooling
  • 3-5 years of deployment experience with CI/CD
  • Strong experience of HTML, CSS, JavaScript, and browser behavior.
  • Experience with RESTful APIs and JSON/XML data formats.
  • Knowledge of cloud platforms and containerization technologies (e.g., Docker, Kubernetes).
  • Advanced understanding of how at least one big data processing technology works under the hood (e.g. Spark / Hadoop / HDFS / Redshift / BigQuery / Snowflake)
  • Use modern tooling to build robust, extensible, and performant web scraping platform
  • Build thoughtful and reliable data acquisition and integration solutions to meet business requirements and data sourcing needs.
  • Deliver best in class infrastructure solutions for flexible and repeatable applications across disparate sources.
  • Troubleshoot, improve and scale existing data pipelines, models and solutions
  • Build upon data engineering's CI/CD deployments, and infrastructure-as-code for provisioning AWS and 3rd party (Apify) services.

AWS, Backend Development, PostgreSQL, Python, SQL, Apache Airflow, ETL, Data engineering, REST API, NodeJS, Software Engineering, Data analytics

πŸ”₯ Staff Data Engineer
Posted about 2 months ago

πŸ“ United States

πŸ’Έ 131,414 - 197,100 USD per year

πŸ” Mental healthcare

🏒 Company: Headspace πŸ‘₯ 11-50 · Wellness, Health Care, Child Care

  • 10+ years of success in enterprise data solutions and high-impact initiatives.
  • Expertise in platforms like Databricks, Snowflake, dbt, and Redshift.
  • Experience designing and optimizing real-time and batch ETL pipelines.
  • Demonstrated leadership and mentorship abilities in engineering.
  • Strong collaboration skills with product and analytics stakeholders.
  • Bachelor’s or advanced degree in Computer Science, Engineering, or a related field.
  • Drive the architecture and implementation of pySpark data pipelines.
  • Create and enforce design patterns in code and schema.
  • Design and lead secure and compliant data warehousing platforms.
  • Partner with analytics and product leaders for actionable insights.
  • Mentor team members on dbt architecture and foster a data-first culture.
  • Act as a thought leader on data strategy and cross-functional roadmaps.

SQL, Cloud Computing, ETL, Snowflake, Data engineering, Data modeling, Data analytics


πŸ“ United States

🧭 Full-Time

πŸ’Έ 130,000 - 170,000 USD per year

πŸ” Data Engineering

  • 8+ years experience in a data engineering role
  • Strong knowledge of REST-based APIs and cloud technologies (AWS, Azure, GCP)
  • Experience with Python/SQL for building data pipelines
  • Bachelor's degree in computer science or related field
  • Design and build data pipelines across various source systems
  • Collaborate with teams to develop data acquisition and integration strategies
  • Coach and guide others in scalable pipeline building
  • Deploy to cloud-based platforms and troubleshoot issues

AWS, Docker, Python, SQL, Apache Airflow, Cloud Computing, ETL, GCP, Machine Learning, Snowflake, Data engineering, REST API, Data modeling

Posted 2 months ago

πŸ“ United States, Canada

🧭 Full-Time

πŸ” Software Development

  • 12+ years of experience in data engineering
  • Expertise in designing scalable data architectures
  • Strong programming skills in Python and Scala
  • Experience with Apache Spark, Databricks, Delta Lake
  • Proficiency with relational and NoSQL databases
  • Design and implement scalable data pipelines
  • Define and enforce data engineering best practices
  • Conduct code reviews and mentor team members
  • Build and maintain batch and real-time data pipelines
  • Ensure data quality, governance, and security

PostgreSQL, Python, DynamoDB, ETL, MySQL, Data engineering, CI/CD, Scala

Posted 2 months ago

πŸ“ United States

🧭 Full-Time

πŸ’Έ 170,000 - 195,000 USD per year

πŸ” Healthcare

🏒 Company: Parachute Health πŸ‘₯ 101-250 πŸ’° $1,000 over 5 years ago · Medical, Health Care, Software

  • 5+ years of relevant experience.
  • Experience in Data Engineering with Python.
  • Experience building customer-facing software.
  • Strong listening and communication skills.
  • Time management and organizational skills.
  • Proactive, a driven self-starter who can work independently or as part of a team.
  • Ability to think with the 'big picture' in mind.
  • Passionate about improving patient outcomes in the healthcare space.
  • Architect solutions to integrate and manage large volumes of data across various internal and external systems.
  • Establish best practices and data governance standards to ensure that data infrastructure is built for long-term scalability.
  • Build and maintain a reporting product for external customers that visualizes data and provides tabular reports.
  • Collaborate across the organization to assess data engineering needs.

Python, ETL, Airflow, Data engineering, Data visualization

Posted 2 months ago