
Staff Data Engineer

Posted about 2 months ago


💎 Seniority level: Staff, 10+ years

📍 Location: United States, Canada

💸 Salary: 200,000 - 228,000 USD per year

🔍 Industry: Software Development

🏢 Company: Later 👥 1-10 · Consumer Electronics · iOS · Apps · Software

🗣️ Languages: English

⏳ Experience: 10+ years

🪄 Skills: AWS, Python, SQL, Apache Airflow, Cloud Computing, Data Analysis, ETL, GCP, Kafka, Machine Learning, Snowflake, Data engineering, Data modeling, Data management

Requirements:
  • 10+ years of experience in data engineering, software engineering, or related fields.
  • Proven experience leading the technical strategy and execution of large-scale data platforms.
  • Expertise in cloud technologies (Google Cloud Platform, AWS, Azure) with a focus on scalable data solutions (BigQuery, Snowflake, Redshift, etc.).
  • Strong proficiency in SQL, Python, and distributed data processing frameworks (Apache Spark, Flink, Beam, etc.).
  • Extensive experience with streaming data architectures using Kafka, Flink, Pub/Sub, Kinesis, or similar technologies.
  • Expertise in data modeling, schema design, indexing, partitioning, and performance tuning for analytical workloads, including data governance (security, access control, compliance: GDPR, CCPA, SOC 2).
  • Strong experience designing and optimizing scalable, fault-tolerant data pipelines using workflow orchestration tools like Airflow, Dagster, or Dataflow.
  • Ability to lead and influence engineering teams, drive cross-functional projects, and align stakeholders towards a common data vision.
  • Experience mentoring senior and mid-level data engineers to enhance team performance and skill development.
Responsibilities:
  • Lead the design and evolution of a scalable data architecture that meets analytical, machine learning, and operational needs.
  • Architect and optimize data pipelines for batch and real-time data processing, ensuring efficiency and reliability.
  • Implement best practices for distributed data processing, ensuring scalability, performance, and cost-effectiveness of data workflows.
  • Define and enforce data governance policies, implement automated validation checks, and establish monitoring frameworks to maintain data integrity.
  • Ensure data security and compliance with industry regulations by designing appropriate access controls, encryption mechanisms, and auditing processes.
  • Drive innovation in data engineering practices by researching and implementing new technologies, tools, and methodologies.
  • Work closely with data scientists, engineers, analysts, and business stakeholders to understand data requirements and deliver impactful solutions.
  • Develop reusable frameworks, libraries, and automation tools to improve efficiency, reliability, and maintainability of data infrastructure.
  • Guide and mentor data engineers, fostering a high-performing engineering culture through best practices, peer reviews, and knowledge sharing.
  • Establish and monitor SLAs for data pipelines, proactively identifying and mitigating risks to ensure high availability and reliability.
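
The responsibilities above revolve around orchestrated, monitored pipelines with SLAs and automated validation checks. As a minimal sketch of the orchestration piece, assuming Airflow (one of the tools named in the requirements), the hypothetical DAG below chains a load task and a data-quality gate; the task logic, names, and schedule are illustrative, not taken from the posting:

```python
# Minimal Airflow DAG sketch: a daily batch load followed by a data-quality gate.
# All task logic, table names, and thresholds here are illustrative assumptions.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def load_events(**context):
    # Placeholder for the actual extract/load step (e.g., source -> warehouse).
    print("loading events batch")


def validate_events(**context):
    # Placeholder validation: a real pipeline would query the warehouse and
    # raise (failing the task) when a freshness or row-count check fails.
    row_count = 1_000  # stand-in for a warehouse query result
    if row_count == 0:
        raise ValueError("validation failed: no rows loaded")


with DAG(
    dag_id="events_daily",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    load = PythonOperator(task_id="load_events", python_callable=load_events)
    validate = PythonOperator(task_id="validate_events", python_callable=validate_events)
    load >> validate  # the validation task gates downstream consumers
```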

Related Jobs


๐Ÿ“ United States

๐Ÿข Company: ge_externalsite

  • Hands-on experience in programming languages like Java, Python, or Scala, and experience writing SQL scripts for Oracle, MySQL, PostgreSQL, or HiveQL.
  • Exposure to industry-standard data modeling tools (e.g., ERWin, ER Studio).
  • Exposure to Extract, Transform & Load (ETL) tools like Informatica or Talend.
  • Exposure to industry-standard data catalog, automated data discovery, and data lineage tools (e.g., Alation, Collibra).
  • Experience with Big Data / Hadoop / Spark / Hive / NoSQL database engines (e.g., Cassandra or HBase).
  • Exposure to unstructured datasets and the ability to handle XML and JSON file formats.
  • Conduct exploratory data analysis and generate visual summaries of data. Identify data quality issues proactively.
  • Developing reusable code pipelines through CI/CD.
  • Hands-on experience with big data or MPP databases.
  • Developing and executing integrated test plans.
  • Be responsible for identifying solutions for complex data analysis and data structures.
  • Be responsible for creating digital thread requirements.
  • Be responsible for change management of database artifacts to support next-gen QMS applications.
  • Be responsible for monitoring data availability and data health of complex systems.
  • Understand industry trends and stay up to date on the associated Quality and tech landscape.
  • Design & build technical data dictionaries and support business glossaries to analyze the datasets.
  • This role may also work on other Quality team digital and strategic deliveries that support the business.
  • Perform data profiling and data analysis for source systems, manually maintained data, machine- or sensor-generated data, and target data repositories.
  • Design & build both logical and physical data models for both Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) solutions
  • Develop and maintain data mapping specifications based on the results of data analysis and functional requirements
  • Build a variety of data loading & data transformation methods using multiple tools and technologies.
  • Design & build automated Extract, Transform & Load (ETL) jobs based on data mapping specifications
  • Manage metadata structures needed for building reusable Extract, Transform & Load (ETL) components.
  • Analyze reference datasets and familiarize with Master Data Management (MDM) tools.
  • Analyze the impact of changes to downstream systems/products and recommend alternatives to minimize the impact.
  • Derive solutions and make recommendations from deep dive data analysis proactively.
  • Design and build Data Quality (DQ) rules.
  • Drive design and implementation of the roadmap.
  • Design and develop complex code in multiple languages.
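
Several of the items above (exploratory data analysis, proactive data-quality detection, DQ rules) come down to routine profiling code. A minimal sketch in pandas, with a hypothetical extract path and made-up rule thresholds:

```python
# Quick data-profiling pass over a source extract; the file path and the
# specific DQ rules are illustrative assumptions, not from the posting.
import pandas as pd

df = pd.read_csv("source_extract.csv")  # hypothetical source-system extract

# Visual/statistical summary of the dataset.
print(df.describe(include="all"))

# Proactive data-quality checks: nulls, duplicates, and a simple range rule.
null_rates = df.isna().mean().sort_values(ascending=False)
print("null rate per column:\n", null_rates)

dup_count = df.duplicated().sum()
print(f"duplicate rows: {dup_count}")

# Example DQ rule: a 'quantity' column, if present, must be non-negative.
if "quantity" in df.columns:
    violations = df[df["quantity"] < 0]
    print(f"DQ rule violations (negative quantity): {len(violations)}")
```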

PostgreSQL, Python, SQL, Data Analysis, ETL, Hadoop, Java, MySQL, Oracle, Data engineering, NoSQL, Spark, CI/CD, Agile methodologies, JSON, Scala, Data visualization, Data modeling

Posted 5 days ago

๐Ÿ“ United States

๐Ÿงญ Full-Time

๐Ÿ” Software Development

๐Ÿข Company: Apollo.io๐Ÿ‘ฅ 501-1000๐Ÿ’ฐ $100,000,000 Series D over 1 year agoSoftware Development

  • 8+ years of experience as a data platform engineer, or as a software engineer working in data or big data engineering.
  • Experience in data modeling, data warehousing, APIs, and building data pipelines.
  • Deep knowledge of databases and data warehousing with an ability to collaborate cross-functionally.
  • Bachelor's degree in a quantitative field (Physical/Computer Science, Engineering, Mathematics, or Statistics).
  • Develop and maintain scalable data pipelines and build new integrations to support continuing increases in data volume and complexity.
  • Develop and improve Data APIs used in machine learning / AI product offerings.
  • Implement automated monitoring, alerting, self-healing (restartable/graceful failures) features while building the consumption pipelines.
  • Implement processes and systems to monitor data quality, ensuring production data is always accurate and available.
  • Write unit/integration tests, contribute to the engineering wiki, and document work.
  • Define company data models and write jobs to populate data models in our data warehouse.
  • Work closely with all business units and engineering teams to develop a strategy for long-term data platform architecture.
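
For the "Develop and improve Data APIs" bullet above, FastAPI (which appears in this posting's skill list) is one plausible shape. A minimal read-only sketch; the endpoint path, feature names, and in-memory store are assumptions for illustration:

```python
# Minimal sketch of a read-only "data API" endpoint of the kind the listing
# describes. The feature-store lookup is a stand-in dictionary; a real
# service would query a database or feature store.
from fastapi import FastAPI, HTTPException

app = FastAPI()

# Hypothetical precomputed ML features keyed by account id.
FEATURES = {
    "acct_1": {"sessions_7d": 42, "churn_score": 0.12},
}

@app.get("/v1/features/{account_id}")
def get_features(account_id: str) -> dict:
    """Return ML features for one account; 404 if unknown."""
    features = FEATURES.get(account_id)
    if features is None:
        raise HTTPException(status_code=404, detail="unknown account")
    return {"account_id": account_id, "features": features}
```

Served locally with `uvicorn module_name:app --reload`, assuming the file is importable as `module_name`.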

Python, SQL, Apache Airflow, Apache Hadoop, Cloud Computing, ETL, Apache Kafka, Data engineering, FastAPI, Data modeling, Data analytics

Posted 6 days ago

๐Ÿ“ United States

๐Ÿงญ Full-Time

๐Ÿ’ธ 117800.0 - 214300.0 USD per year

๐Ÿ” Software Development

๐Ÿข Company: careers_gm

  • 7+ years of hands-on experience.
  • Bachelor's degree (or equivalent work experience) in Computer Science, Data Science, Software Engineering, or a related field.
  • Strong understanding of, and ability to mentor in, data ETL processes and tools for designing and managing data pipelines.
  • Proficient with big data frameworks and tools like Apache Hadoop, Apache Spark, or Apache Kafka for processing and analyzing large datasets.
  • Hands-on experience with data serialization formats like JSON, Parquet, and XML.
  • Consistently models and leads best practices and optimization in scripting languages like Python, Java, and Scala for automation and data processing.
  • Proficient with database administration and performance tuning for databases like MySQL, PostgreSQL, or NoSQL databases.
  • Proficient with containerization (e.g., Docker) and orchestration platforms (e.g., Kubernetes) for managing data applications.
  • Experience with cloud platforms and data services for data storage and processing
  • Consistently designs and builds data solutions that are highly automated and performant, with quality checks that ensure data consistency and accuracy.
  • Experienced at actively managing large-scale data engineering projects, including planning, resource allocation, risk management, and ensuring successful delivery, adjusting style for all delivery methods (e.g., Waterfall, Agile, POD).
  • Understands data governance principles and data privacy regulations, with experience implementing security measures to protect data.
  • Able to integrate data engineering pipelines with machine learning models and platforms.
  • Strong problem-solving skills to identify and resolve complex data engineering issues efficiently.
  • Ability to work effectively in cross-functional teams, collaborate with data scientists, analysts, and stakeholders to deliver data solutions.
  • Ability to lead and mentor junior data engineers, providing guidance and support in complex data engineering projects.
  • Influential communication skills to effectively convey technical concepts to non-technical stakeholders and document data engineering processes.
  • Models a mindset of continuous learning and a drive for innovation, staying updated with the latest advancements in data engineering technologies.
  • Design, construct, install, and maintain data architectures, including databases and large-scale processing systems.
  • Develop and maintain ETL (Extract, Transform, Load) processes to collect, cleanse, and transform data from various sources, including cloud sources.
  • Design and implement data pipelines to collect, process, and transfer data from various sources to storage systems (data warehouses, data lakes, etc.).
  • Implement security measures to protect sensitive data and ensure compliance with data privacy regulations.
  • Build data solutions that ensure data quality, integrity, and security through data validation, monitoring, and compliance with data governance policies.
  • Administer and optimize databases for performance and scalability.
  • Maintain Master Data, Metadata, Data Management Repositories, Logical Data Models, and Data Standards.
  • Troubleshoot and resolve data-related issues affecting data quality and fidelity.
  • Document data architectures, processes, and best practices for knowledge sharing across the GM data engineering community.
  • Participate in the evaluation and selection of data-related tools and technologies.
  • Collaborate across other engineering functions within EDAI, Marketing Technology, and Software & Services.
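
Among the responsibilities above, the pipeline and serialization bullets (JSON, Parquet) map to a common PySpark pattern: land raw JSON and rewrite it as partitioned Parquet for analytical workloads. A minimal sketch; the paths, columns, and partition key are assumptions:

```python
# PySpark sketch of one pipeline step: land raw JSON and rewrite it as
# partitioned Parquet for analytics. Paths and columns are illustrative.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("json_to_parquet").getOrCreate()

raw = spark.read.json("s3://example-bucket/raw/events/")  # hypothetical path

cleaned = (
    raw
    .withColumn("event_date", F.to_date("event_ts"))  # assumes an event_ts column
    .dropDuplicates(["event_id"])                     # assumes an event_id key
)

# Partitioning by date keeps analytical scans cheap.
(
    cleaned.write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3://example-bucket/curated/events/")
)
```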

AWS, Docker, PostgreSQL, Python, SQL, Apache Hadoop, Cloud Computing, Data Analysis, ETL, Java, Kubernetes, MySQL, Algorithms, Apache Kafka, Data engineering, Data science, Data Structures, REST API, NoSQL, CI/CD, Problem Solving, JSON, Scala, Data visualization, Data modeling, Scripting, Data analytics, Data management

Posted 11 days ago

๐Ÿ“ United States

๐Ÿ’ธ 204000.0 - 260000.0 USD per year

๐Ÿ” Software Development

๐Ÿข Company: Airbnb๐Ÿ‘ฅ 5001-10000๐Ÿ’ฐ Secondary Market almost 5 years ago๐Ÿซ‚ Last layoff about 2 years agoHospitalityTravel AccommodationsPropTechMarketplaceMobile AppsTravel

  • 9+ years of experience with a BS/Master's, or 6+ years with a PhD.
  • Expertise in SQL and proficiency in at least one data engineering language, such as Python or Scala.
  • Experience with Superset and Tableau
  • Expertise in large-scale distributed data processing frameworks like Presto or Spark
  • Experience with an ETL framework like Airflow
  • Extensive knowledge of data management concepts, including data modeling, ETL processes, data warehousing, and data governance.
  • Understanding of data security and privacy principles, as well as regulatory compliance requirements (e.g., GDPR, CCPA).
  • Strong problem-solving skills and the ability to translate business requirements into technical solutions.
  • Excellent communication skills, both written and verbal, with the ability to distill complex ideas for technical and non-technical stakeholders.
  • Strong capability to forge trusted partnerships across working teams.
  • Design and implement data pipelines by leveraging best-in-class tools and infrastructure to meet critical business and product requirements.
  • Develop high-quality data assets for product and AI/ML use cases.
  • Collaborate with cross-functional teams to gather requirements, assess data needs, and design efficient solutions that align with business objectives.
  • Contribute to long-term data strategies, roadmaps, and ML infrastructure development within the organization.
  • Influence the trajectory of data in decision-making.
  • Improve trust in our data by championing data quality across the stack.
  • Identify and act on opportunities for automation, and implement data management tools and frameworks to enhance efficiency and productivity.
  • Mentor and coach team members, providing guidance in data engineering best practices and support to enhance their skills and performance.
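
The "championing data quality across the stack" bullet above usually materializes as SQL assertions run on a schedule. A minimal sketch using sqlite3 as a stand-in for a warehouse connection; the table and the specific checks are invented for illustration:

```python
# Sketch of SQL-based data-quality assertions. sqlite3 stands in for a real
# warehouse connection; the table and checks are illustrative assumptions.
import sqlite3

CHECKS = {
    # Each check returns a count of offending rows; 0 means the check passes.
    "no_null_user_ids": "SELECT COUNT(*) FROM bookings WHERE user_id IS NULL",
    "no_negative_amounts": "SELECT COUNT(*) FROM bookings WHERE amount < 0",
}

def run_checks(conn: sqlite3.Connection) -> dict[str, bool]:
    results = {}
    for name, sql in CHECKS.items():
        offending = conn.execute(sql).fetchone()[0]
        results[name] = offending == 0
    return results

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE bookings (user_id TEXT, amount REAL)")
    conn.execute("INSERT INTO bookings VALUES ('u1', 120.0), (NULL, -5.0)")
    # Both checks fail on the seeded rows: one NULL id, one negative amount.
    print(run_checks(conn))
```

In practice a scheduler (e.g., Airflow, which this listing names) would run such checks and fail the pipeline on a False result.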

Leadership, Python, SQL, ETL, Machine Learning, Cross-functional Team Leadership, Tableau, Airflow, Data engineering, REST API, Spark, CI/CD, Problem Solving, Excellent communication skills, Scala, Data visualization, Data modeling, Data management

Posted 15 days ago

๐Ÿ“ United States

๐Ÿงญ Full-Time

๐Ÿ” Software Development

๐Ÿข Company: Life360๐Ÿ‘ฅ 251-500๐Ÿ’ฐ $33,038,258 Post-IPO Equity over 2 years ago๐Ÿซ‚ Last layoff about 2 years agoAndroidFamilyAppsMobile AppsMobile

  • Minimum 7 years of experience working with high volume data infrastructure.
  • Experience with Databricks and AWS.
  • Experience with dbt.
  • Experience with job orchestration tooling like Airflow.
  • Proficient programming in Python.
  • Proficient with SQL and the ability to optimize complex queries.
  • Proficient with large-scale data processing using Spark and/or Presto/Trino.
  • Proficient in data modeling and database design.
  • Experience with streaming data with a tool like Kinesis or Kafka.
  • Experience working with high-volume, event-based data architectures like Amplitude and Braze.
  • Experience with the modern development lifecycle, including Agile methodology and CI/CD with automated deployments using Terraform, GitHub Actions, etc.
  • Knowledge of and proficiency in the latest open-source and data frameworks, modern data platform tech stacks, and tools.
  • Always learning and staying up to speed with the fast-moving data world.
  • Good communication and collaboration skills, with the ability to work independently.
  • BS in Computer Science, Software Engineering, Mathematics, or equivalent experience.
  • Design, implement, and manage scalable data processing platforms used for real-time analytics and exploratory data analysis.
  • Manage our financial data from ingestion through ETL to storage and batch processing.
  • Automate, test and harden all data workflows.
  • Architect logical and physical data models to ensure the needs of the business are met.
  • Collaborate across the data teams (engineering, data science, and analytics) to understand their needs, while applying engineering best practices.
  • Architect and develop systems and algorithms for distributed real-time analytics and data processing.
  • Implement strategies for acquiring data to develop new insights.
  • Mentor junior engineers, imparting best practices and institutionalizing efficient processes to foster growth and innovation within the team.
  • Champion data engineering best practices and institutionalize efficient processes to foster growth and innovation within the team.
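
The streaming bullets above (Kafka/Kinesis feeding real-time analytics) typically look like a Spark Structured Streaming job. A minimal sketch; the broker, topic, schema, and sink paths are assumptions, not details from the posting:

```python
# Sketch of a real-time ingestion job: read a Kafka event stream, parse JSON
# payloads, and land them for analytics. Requires the spark-sql-kafka
# connector on the classpath; all names and paths are illustrative.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("events_stream").getOrCreate()

event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("user_id", StringType()),
    StructField("event_ts", TimestampType()),
])

stream = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
    .option("subscribe", "app-events")                 # hypothetical topic
    .load()
)

events = (
    stream.select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Append to a partitioned sink; checkpointing makes the job restartable,
# matching the listing's reliability emphasis.
query = (
    events.writeStream
    .format("parquet")
    .option("path", "s3://example-bucket/streams/events/")
    .option("checkpointLocation", "s3://example-bucket/checkpoints/events/")
    .outputMode("append")
    .start()
)
query.awaitTermination()
```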

AWS, Project Management, Python, SQL, Apache Airflow, ETL, Kafka, Algorithms, Data engineering, Data Structures, REST API, Spark, Communication Skills, Analytical Skills, Collaboration, CI/CD, Problem Solving, Agile methodologies, Mentoring, Terraform, Data visualization, Technical support, Data modeling, Data analytics, Data management, Debugging

Posted 25 days ago
🔥 Staff Data Engineer

๐Ÿ“ United States

๐Ÿงญ Full-Time

๐Ÿ’ธ 85500.0 - 117500.0 USD per year

๐Ÿ” Software Development

  • 5+ years of work experience as a data engineer or full-stack engineer, coding in Python.
  • 5+ years of experience building web scraping tools in Python, using Beautiful Soup, Scrapy, Selenium, or similar tooling.
  • 3-5 years of deployment experience with CI/CD.
  • Strong experience with HTML, CSS, JavaScript, and browser behavior.
  • Experience with RESTful APIs and JSON/XML data formats.
  • Knowledge of cloud platforms and containerization technologies (e.g., Docker, Kubernetes).
  • Advanced understanding of how at least one big data processing technology works under the hood (e.g., Spark, Hadoop/HDFS, Redshift, BigQuery, Snowflake).
  • Use modern tooling to build a robust, extensible, and performant web scraping platform.
  • Build thoughtful and reliable data acquisition and integration solutions to meet business requirements and data sourcing needs.
  • Deliver best-in-class infrastructure solutions for flexible and repeatable applications across disparate sources.
  • Troubleshoot, improve, and scale existing data pipelines, models, and solutions.
  • Build upon data engineering's CI/CD deployments and infrastructure-as-code for provisioning AWS and third-party (Apify) services.
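
Since this listing is explicit about Beautiful Soup and related tooling, a minimal scraping sketch may help calibrate the level: fetch with retries, then extract structured fields. The URL and CSS selectors are hypothetical:

```python
# Minimal, polite scraping sketch: fetch a page with retries and backoff,
# then parse structured fields. The URL and selectors are assumptions.
import time

import requests
from bs4 import BeautifulSoup

def fetch(url: str, retries: int = 3, backoff: float = 2.0) -> str:
    """GET a page, retrying with exponential backoff on transient failures."""
    for attempt in range(retries):
        try:
            resp = requests.get(url, timeout=10, headers={"User-Agent": "example-bot/1.0"})
            resp.raise_for_status()
            return resp.text
        except requests.RequestException:
            if attempt == retries - 1:
                raise
            time.sleep(backoff ** attempt)

def parse_listings(html: str) -> list[dict]:
    soup = BeautifulSoup(html, "html.parser")
    return [
        {
            "title": card.select_one("h2").get_text(strip=True),
            "price": card.select_one(".price").get_text(strip=True),
        }
        for card in soup.select("div.listing")  # hypothetical page structure
    ]

if __name__ == "__main__":
    html = fetch("https://example.com/listings")
    print(parse_listings(html))
```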

AWS, Backend Development, PostgreSQL, Python, SQL, Apache Airflow, ETL, Data engineering, REST API, NodeJS, Software Engineering, Data analytics

Posted about 2 months ago
🔥 Staff Data Engineer

๐Ÿ“ United States

๐Ÿ’ธ 131414.0 - 197100.0 USD per year

๐Ÿ” Mental healthcare

๐Ÿข Company: Headspace๐Ÿ‘ฅ 11-50WellnessHealth CareChild Care

  • 10+ years of success in enterprise data solutions and high-impact initiatives.
  • Expertise in platforms like Databricks, Snowflake, dbt, and Redshift.
  • Experience designing and optimizing real-time and batch ETL pipelines.
  • Demonstrated leadership and mentorship abilities in engineering.
  • Strong collaboration skills with product and analytics stakeholders.
  • Bachelor's or advanced degree in Computer Science, Engineering, or a related field.
  • Drive the architecture and implementation of PySpark data pipelines.
  • Create and enforce design patterns in code and schema.
  • Design and lead secure and compliant data warehousing platforms.
  • Partner with analytics and product leaders for actionable insights.
  • Mentor team members on dbt architecture and foster a data-first culture.
  • Act as a thought leader on data strategy and cross-functional roadmaps.
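
One concrete reading of "create and enforce design patterns in code and schema" above is schema enforcement at the pipeline boundary. A minimal PySpark sketch: declare the expected schema and fail fast on drift rather than letting inference paper over it. Field names and paths are assumptions:

```python
# Sketch of a schema-enforcement pattern: declare the expected schema
# explicitly and fail fast on malformed records instead of letting Spark
# infer types. Field names and paths are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.types import DoubleType, StringType, StructField, StructType

spark = SparkSession.builder.appName("schema_enforced_load").getOrCreate()

SESSIONS_SCHEMA = StructType([
    StructField("session_id", StringType(), nullable=False),
    StructField("user_id", StringType(), nullable=False),
    StructField("duration_sec", DoubleType(), nullable=True),
])

# FAILFAST surfaces schema drift at read time instead of corrupting
# downstream tables with silently-null columns.
sessions = (
    spark.read
    .schema(SESSIONS_SCHEMA)
    .option("mode", "FAILFAST")
    .json("s3://example-bucket/raw/sessions/")
)
sessions.write.mode("overwrite").parquet("s3://example-bucket/curated/sessions/")
```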

SQL, Cloud Computing, ETL, Snowflake, Data engineering, Data modeling, Data analytics

Posted about 2 months ago

๐Ÿ“ United States

๐Ÿงญ Full-Time

๐Ÿ’ธ 130000.0 - 170000.0 USD per year

๐Ÿ” Data Engineering

  • 8+ years of experience in a data engineering role.
  • Strong knowledge of REST-based APIs and cloud technologies (AWS, Azure, GCP)
  • Experience with Python/SQL for building data pipelines
  • Bachelor's degree in computer science or related field
  • Design and build data pipelines across various source systems
  • Collaborate with teams to develop data acquisition and integration strategies
  • Coach and guide others in scalable pipeline building
  • Deploy to cloud-based platforms and troubleshoot issues
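
The REST-based acquisition this listing asks about often reduces to paging through a JSON API and handing batches to a loader. A minimal sketch against a hypothetical page-numbered endpoint; the URL, parameters, and response shape are assumptions:

```python
# Sketch of REST-based data acquisition: page through a hypothetical JSON
# API and yield rows for a downstream loader. Endpoint, pagination scheme,
# and field names are illustrative assumptions.
import requests

def iter_records(base_url: str, page_size: int = 100):
    """Yield records from a page-numbered JSON API until it runs dry."""
    page = 1
    while True:
        resp = requests.get(
            f"{base_url}/records",
            params={"page": page, "per_page": page_size},
            timeout=30,
        )
        resp.raise_for_status()
        batch = resp.json().get("data", [])
        if not batch:
            return
        yield from batch
        page += 1

if __name__ == "__main__":
    for record in iter_records("https://api.example.com/v1"):
        print(record)  # a real pipeline would batch these into the warehouse
```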

AWS, Docker, Python, SQL, Apache Airflow, Cloud Computing, ETL, GCP, Machine Learning, Snowflake, Data engineering, REST API, Data modeling

Posted 2 months ago

๐Ÿ“ United States

๐Ÿงญ Full-Time

๐Ÿ’ธ 170000.0 - 195000.0 USD per year

๐Ÿ” Healthcare

๐Ÿข Company: Parachute Health๐Ÿ‘ฅ 101-250๐Ÿ’ฐ $1,000 over 5 years agoMedicalHealth CareSoftware

  • 5+ years of relevant experience.
  • Experience in Data Engineering with Python.
  • Experience building customer-facing software.
  • Strong listening and communication skills.
  • Time management and organizational skills.
  • Proactive: a driven self-starter who can work independently or as part of a team.
  • Ability to think with the 'big picture' in mind.
  • Passionate about improving patient outcomes in the healthcare space.
  • Architect solutions to integrate and manage large volumes of data across various internal and external systems.
  • Establish best practices and data governance standards to ensure that data infrastructure is built for long-term scalability.
  • Build and maintain a reporting product for external customers that visualizes data and provides tabular reports.
  • Collaborate across the organization to assess data engineering needs.
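
For the customer-facing "tabular reports" responsibility above, the core is a grouped summary table. A minimal pandas sketch with invented sample data and column names:

```python
# Minimal sketch of the tabular-report piece of a reporting product:
# aggregate order-level data into a per-customer summary. Column names
# and the sample data are illustrative assumptions.
import pandas as pd

orders = pd.DataFrame({
    "customer": ["acme", "acme", "globex"],
    "status": ["delivered", "pending", "delivered"],
    "amount": [120.0, 80.0, 200.0],
})

report = (
    orders.groupby(["customer", "status"], as_index=False)
    .agg(order_count=("amount", "size"), total_amount=("amount", "sum"))
)

# Export for the customer-facing product (CSV plus a printable table).
report.to_csv("customer_report.csv", index=False)
print(report.to_string(index=False))
```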

Python, ETL, Airflow, Data engineering, Data visualization

Posted 2 months ago

๐Ÿ“ US, Ontario, CAN

๐Ÿ” Food waste reduction and grocery technology

๐Ÿข Company: Afresh๐Ÿ‘ฅ 51-100๐Ÿ’ฐ $115,000,000 Series B over 2 years agoArtificial Intelligence (AI)LogisticsFood and BeverageMachine LearningAgricultureSupply Chain ManagementSoftware

  • Significant experience designing and maintaining ETLs that process large-scale datasets.
  • Proficiency with Python, PySpark, and SQL, plus experience with tools like Databricks, Snowflake, or dbt.
  • Strong problem-solving skills with ambiguous requirements.
  • Focus on practical outcomes balancing technical rigor and execution.
  • Experience with complex, unclean datasets and innovative processing methods.
  • Ability to identify areas for tooling or automation to simplify workflows.
  • Excellent communication skills for technical presentations.
  • Proven leadership in technical projects, with mentoring ability.
  • Build tools and frameworks that streamline customer integrations.
  • Create robust ETLs in PySpark and dbt to process billions of records.
  • Collaborate with teams to design and deliver data solutions for new products.
  • Identify optimizations to improve ETL runtime and scalability.
  • Solve data quality challenges with messy datasets.
  • Investigate and implement new technologies into the data platform.
  • Support team members by mentoring and leading technical discussions.
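
The "messy datasets" and ETL-scale bullets above suggest a cleansing pass early in the pipeline. A minimal PySpark sketch: normalize keys, coerce bad numerics to null, and deduplicate on an assumed grain; all names and rules are illustrative:

```python
# Sketch of a cleansing pass over a messy extract in PySpark: normalize
# strings, coerce non-numeric values to null, and drop duplicate keys.
# Input path, column names, and rules are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("clean_inventory").getOrCreate()

raw = spark.read.csv("s3://example-bucket/raw/inventory/", header=True)

cleaned = (
    raw
    .withColumn("sku", F.upper(F.trim("sku")))           # normalize keys
    .withColumn("qty", F.col("qty").cast("double"))      # non-numeric -> null
    .withColumn("qty", F.when(F.col("qty") < 0, None).otherwise(F.col("qty")))
    .dropDuplicates(["sku", "store_id", "as_of_date"])   # assumed record grain
)

cleaned.write.mode("overwrite").parquet("s3://example-bucket/clean/inventory/")
```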

Python, SQL, ETL, Data engineering, Data management

Posted 3 months ago