
Senior Data Engineer

Posted 6 months ago


πŸ’Ž Seniority level: Senior, 5+ years experience

πŸ“ Location: USA, CAN, MEX

πŸ” Industry: Transportation technology

🏒 Company: Fleetio

πŸ—£οΈ Languages: English

⏳ Experience: 5+ years

πŸͺ„ Skills: AWS, Project Management, Python, SQL, Business Intelligence, Design Patterns, Kafka, Snowflake, Tableau, Data engineering, Serverless, Communication Skills, CI/CD

Requirements:
  • 5+ years of experience working in a data engineering or data-focused software engineering role.
  • Experience transforming raw data into clean models using standard tools of the modern data stack.
  • Deep understanding of ELT and data modeling concepts.
  • Experience with streaming data and pipelines (Kafka or Kinesis).
  • Proficiency in Python with a proven track record of delivering production-ready Python applications.
  • Experience in designing, building, and administering modern data pipelines and data warehouses.
  • Experience with dbt.
  • Familiarity with semantic layers like Cube or MetricFlow.
  • Experience with Snowflake, BigQuery, or Redshift.
  • Knowledge of version control tools such as GitHub or GitLab.
  • Experience with ELT tools like Stitch or Fivetran.
  • Experience with orchestration tools such as Prefect or Dagster.
  • Knowledge of CI/CD and IaC (infrastructure-as-code) tooling such as GitHub Actions and Terraform.
  • Experience with business intelligence solutions (Metabase, Looker, Tableau, Periscope, Mode).
  • Familiarity with serverless cloud functions (AWS Lambda, Google Cloud Functions, etc.).
  • Excellent communication and project management skills with a customer service-focused mindset.
Responsibilities:
  • Enable and scale self-serve analytics for all Fleetio team members by modeling data and metrics via tools like dbt.
  • Develop data destinations and custom integrations, and maintain open-source packages for customer data integration.
  • Maintain and develop custom data pipelines from operational source systems for both streaming and batch sources (a minimal orchestration sketch follows this list).
  • Work on the development of internal data infrastructure, improving data hygiene and integrity through ELT pipeline monitoring.
  • Architect, design, and implement core components of the data platform, including data observability and data science products.
  • Develop and maintain streaming data pipelines from various databases and sources.
  • Collaborate across the company to tailor data needs and ensure data is appropriately modeled and available.
  • Document best practices and coach others on data modeling and SQL query optimization; manage roles, permissions, and deprecated projects.
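
For illustration only, here is a minimal sketch of the kind of batch ELT pipeline this role describes, using Prefect for orchestration and shelling out to the dbt CLI for modeling. The task names, the raw.orders table, and the print-based load step are hypothetical placeholders, not Fleetio's actual pipeline, and the dbt step assumes a dbt project is present in the working directory.

```python
# Hypothetical sketch: extract from an operational source, load raw rows into
# the warehouse, then run dbt to build the modeled layer.
import subprocess
from prefect import flow, task


@task(retries=2, retry_delay_seconds=60)
def extract_orders() -> list[dict]:
    # A real pipeline would page through an operational API or read replica.
    return [{"order_id": 1, "status": "complete"}]


@task
def load_to_warehouse(rows: list[dict]) -> None:
    # Placeholder for a COPY/INSERT into Snowflake, BigQuery, or Redshift.
    print(f"loading {len(rows)} rows into raw.orders")


@task
def run_dbt_models() -> None:
    # dbt builds the clean, documented models analysts consume.
    subprocess.run(["dbt", "build"], check=True)


@flow(name="nightly-orders-elt")
def nightly_orders_elt() -> None:
    rows = extract_orders()
    load_to_warehouse(rows)
    run_dbt_models()


if __name__ == "__main__":
    nightly_orders_elt()
```

The same three-step shape could be expressed in Dagster instead; Prefect is used here only because the requirements list names it.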

Related Jobs


πŸ“ Worldwide

πŸ” Hospitality

🏒 Company: Lighthouse

  • 4+ years of professional experience using Python, Java, or Scala for data processing (Python preferred)
  • You stay up-to-date with industry trends, emerging technologies, and best practices in data engineering.
  • Improve, manage, and teach standards for code maintainability and performance in code submitted and reviewed
  • Ship large features independently, generate architecture recommendations and have the ability to implement them
  • Great communication: Regularly achieve consensus amongst teams
  • Familiarity with GCP, Kubernetes (GKE preferred), CI/CD tools (GitLab CI preferred), and the concept of Lambda Architecture.
  • Experience with Apache Beam or Apache Spark for distributed data processing or event sourcing technologies like Apache Kafka.
  • Familiarity with monitoring tools like Grafana & Prometheus.
  • Design and develop scalable, reliable data pipelines using the Google Cloud stack (see the Beam sketch after this list).
  • Optimise data pipelines for performance and scalability.
  • Implement and maintain data governance frameworks, ensuring data accuracy, consistency, and compliance.
  • Monitor and troubleshoot data pipeline issues, implementing proactive measures for reliability and performance.
  • Collaborate with the DevOps team to automate deployments and improve developer experience on the data front.
  • Work with data science and analytics teams to bring their research to production-grade data solutions, using technologies like Airflow, dbt, or MLflow (but not limited to these)
  • As a part of a platform team, you will communicate effectively with teams across the entire engineering organisation, to provide them with reliable foundational data models and data tools.
  • Mentor and provide technical guidance to other engineers working with data.
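
As a rough illustration of the Google Cloud stack mentioned above, the sketch below shows the shape of an Apache Beam pipeline. It runs locally on the DirectRunner over inline sample events; the hotel_id field and the counting logic are assumptions for illustration, not Lighthouse's actual Dataflow jobs.

```python
# Hypothetical sketch: parse JSON events and count them per hotel.
import json
import apache_beam as beam

events = [
    '{"hotel_id": "h1", "event": "search"}',
    '{"hotel_id": "h1", "event": "booking"}',
    '{"hotel_id": "h2", "event": "search"}',
]

with beam.Pipeline() as pipeline:
    (
        pipeline
        | "CreateEvents" >> beam.Create(events)
        | "ParseJson" >> beam.Map(json.loads)
        | "KeyByHotel" >> beam.Map(lambda e: (e["hotel_id"], 1))
        | "CountPerHotel" >> beam.CombinePerKey(sum)
        | "Print" >> beam.Map(print)
    )
```

On Dataflow the Create step would be replaced by a streaming source (for example a Pub/Sub or Kafka read) and the print step by a warehouse sink.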

Python, SQL, Apache Airflow, ETL, GCP, Kubernetes, Apache Kafka, Data engineering, CI/CD, Mentoring, Terraform, Scala, Data modeling

Posted 3 days ago

πŸ“ United States

🧭 Full-Time

πŸ’Έ 183,600 - 216,000 USD per year

πŸ” Software Development

  • 6+ years of experience in a data engineering role building products, ideally in a fast-paced environment
  • Good foundations in Python and SQL.
  • Experience with Spark, PySpark, DBT, Snowflake and Airflow
  • Knowledge of visualization tools, such as Metabase, Jupyter Notebooks (Python)
  • Collaborate on the design and improvements of the data infrastructure
  • Partner with product and engineering to advocate best practices and build supporting systems and infrastructure for the various data needs
  • Create data pipelines that stitch together various data sources in order to produce valuable business insights
  • Create real-time data pipelines in collaboration with the Data Science team

Python, SQL, Snowflake, Airflow, Data engineering, Spark, Data visualization, Data modeling

Posted 4 days ago

πŸ“ United States

🧭 Full-Time

πŸ” Healthcare

🏒 Company: Rad AI πŸ‘₯ 101-250 πŸ’° $60,000,000 Series C (2 months ago) Β· Artificial Intelligence (AI), Enterprise Software, Health Care

  • 4+ years relevant experience in data engineering.
  • Expertise in designing and developing distributed data pipelines using big data technologies on large scale data sets.
  • Deep and hands-on experience designing, planning, productionizing, maintaining and documenting reliable and scalable data infrastructure and data products in complex environments.
  • Solid experience with big data processing and analytics on AWS, using services such as Amazon EMR and AWS Batch.
  • Experience in large scale data processing technologies such as Spark.
  • Expertise in orchestrating workflows using tools like Metaflow.
  • Experience with various database technologies, including SQL and NoSQL databases (e.g., AWS DynamoDB, Elasticsearch, PostgreSQL).
  • Hands-on experience with containerization technologies, such as Docker and Kubernetes.
  • Design and implement the data architecture, ensuring scalability, flexibility, and efficiency, using pipeline authoring tools like Metaflow and large-scale data processing technologies like Spark (a minimal Metaflow sketch follows this list).
  • Define and extend our internal standards for style, maintenance, and best practices for a high-scale data platform.
  • Collaborate with researchers and other stakeholders to understand their data needs, including model training and production monitoring systems, and develop solutions that meet those requirements.
  • Take ownership of key data engineering projects and work independently to design, develop, and maintain high-quality data solutions.
  • Ensure data quality, integrity, and security by implementing robust data validation, monitoring, and access controls.
  • Evaluate and recommend data technologies and tools to improve the efficiency and effectiveness of the data engineering process.
  • Continuously monitor, maintain, and improve the performance and stability of the data infrastructure.
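
A minimal Metaflow sketch of the workflow-authoring pattern referenced above. The flow name, the inline records, and the filtering step are illustrative assumptions rather than Rad AI's pipelines; a real run would pull data from S3, DynamoDB, or PostgreSQL and could dispatch heavy steps to AWS Batch or EMR.

```python
# Hypothetical sketch: a tiny start -> transform -> end Metaflow flow.
from metaflow import FlowSpec, step


class ExamReportFlow(FlowSpec):

    @step
    def start(self):
        self.records = [{"study_id": 1, "status": "final"},
                        {"study_id": 2, "status": "draft"}]
        self.next(self.transform)

    @step
    def transform(self):
        # Keep only finalized reports; real logic might run on Spark at scale.
        self.final_reports = [r for r in self.records if r["status"] == "final"]
        self.next(self.end)

    @step
    def end(self):
        print(f"{len(self.final_reports)} finalized reports ready for downstream use")


if __name__ == "__main__":
    ExamReportFlow()
```

Saved as exam_report_flow.py, this would be executed with `python exam_report_flow.py run`.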

AWS, Docker, SQL, Elasticsearch, ETL, Kubernetes, Data engineering, NoSQL, Spark, Data modeling

Posted 4 days ago

πŸ“ Worldwide

🧭 Full-Time

NOT STATED
  • Own the design and implementation of cross-domain data models that support key business metrics and use cases.
  • Partner with analysts and data engineers to translate business logic into performant, well-documented dbt models.
  • Champion best practices in testing, documentation, CI/CD, and version control, and guide others in applying them.
  • Act as a technical mentor to other analytics engineers, supporting their development and reviewing their code.
  • Collaborate with central data platform and embedded teams to improve data quality, metric consistency, and lineage tracking.
  • Drive alignment on model architecture across domainsβ€”ensuring models are reusable, auditable, and trusted.
  • Identify and lead initiatives to reduce technical debt and modernise legacy reporting pipelines.
  • Contribute to the long-term vision of analytics engineering at Pleo and help shape our roadmap for scalability and impact.

SQL, Data Analysis, ETL, Data engineering, CI/CD, Mentoring, Documentation, Data visualization, Data modeling, Data analytics, Data management

Posted 4 days ago

πŸ“ United States

🧭 Full-Time

πŸ’Έ 183,600 - 216,000 USD per year

πŸ” Mental Healthcare

🏒 Company: Headway πŸ‘₯ 201-500 πŸ’° $125,000,000 Series C (over 1 year ago) Β· Mental Health Care

  • 6+ years of experience in a data engineering role building products, ideally in a fast-paced environment
  • Good foundations in Python and SQL.
  • Experience with Spark, PySpark, DBT, Snowflake and Airflow
  • Knowledge of visualization tools, such as Metabase, Jupyter Notebooks (Python)
  • A knack for simplifying data, expressing information in charts and tables
  • Collaborate on the design and improvements of the data infrastructure
  • Partner with product and engineering to advocate best practices and build supporting systems and infrastructure for the various data needs
  • Create data pipelines that stitch together various data sources in order to produce valuable business insights
  • Create real-time data pipelines in collaboration with the Data Science team

Python, SQL, ETL, Snowflake, Airflow, Data engineering, RDBMS, Spark, RESTful APIs, Data visualization, Data modeling

Posted 5 days ago

πŸ“ Costa Rica, Brazil, Argentina, Chile, Mexico

πŸ” Insider Risk Management

🏒 Company: Teramind πŸ‘₯ 51-100 Β· Productivity Tools, Security, Cyber Security, Enterprise Software, Software

  • 6+ years of experience in data engineering, with a proven track record of successfully delivering data-driven solutions.
  • Strong expertise in designing and building scalable data pipelines using industry-standard tools and frameworks.
  • Experience with big data technologies and distributed systems, such as Hadoop, Spark, or similar frameworks.
  • Proficient programming skills in languages such as Python, Java, or Scala, alongside a solid understanding of database management systems (SQL and NoSQL).
  • Understanding of data requirements for machine learning applications and how to optimize data for model training.
  • Experience with security data processing and compliance standards is preferred, ensuring that data handling meets industry regulations and best practices.
  • Design and implement robust data architecture tailored for AI-driven features, ensuring it meets the evolving needs of our platform.
  • Build and maintain efficient data pipelines for processing user activity data, ensuring data flows seamlessly throughout our systems.
  • Develop comprehensive systems for data storage, retrieval, and processing, facilitating quick and reliable access to information.
  • Ensure high standards of data quality and availability, enabling machine learning models to produce accurate and actionable insights.
  • Enhance the performance and scalability of our data infrastructure to accommodate growing data demands and user activity.
  • Work closely with data scientists and machine learning engineers to understand their data requirements and ensure data solutions are tailored to their needs.

Python, SQL, Apache Hadoop, ETL, Machine Learning, Azure, Data engineering, NoSQL, Compliance, Scala, Data visualization, Data modeling, Data management

Posted 6 days ago

πŸ“ United States, Canada

🧭 Full-Time

πŸ” Software Development

  • Strong hands-on experience with Python and core Python Data Processing tools such as pandas, numpy, scipy, scikit
  • Experience with cloud tools and environments like Docker, Kubernetes, GCP, and/or Azure
  • Experience with Spark/PySpark
  • Experience with Data Lineage and Data Cataloging
  • Relational and non-relational database experience
  • Experience with Data Warehouses and Lakes, such as Bigquery, Databricks, or Snowflake
  • Experience in designing and building data pipelines that scale
  • Strong communication skills, with the ability to convey technical solutions to both technical and non-technical stakeholders
  • Experience working effectively in a fast-paced, agile environment as part of a collaborative team
  • Ability to work independently and as part of a team
  • Willingness and enthusiasm to learn new technologies and tackle challenging problems
  • Experience in Infrastructure as Code tools like Terraform
  • Advanced SQL expertise, including experience with complex queries, query optimization, and working with various database systems
  • Work with business stakeholders to understand their goals, challenges, and decisions
  • Assist with building solutions that standardize their data approach to common problems across the company
  • Incorporate observability and testing best practices into projects
  • Assist in the development of processes to ensure their data is trusted and well-documented
  • Effectively work with data analysts on refining the data model used for reporting and analytical purposes
  • Improve the availability and consistency of data points crucial for analysis
  • Standing up a reporting system in BigQuery from scratch, including data replication, infrastructure setup, dbt model creation, and integration with reporting endpoints (see the sketch after this list)
  • Revamping orchestration and execution to reduce critical data delivery times
  • Database archiving to move data from a live database to cold storage
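
As a hedged illustration of the BigQuery reporting setup described above, the sketch below queries a dbt-built model with the official google-cloud-bigquery Python client and hands rows to a reporting endpoint. The project, dataset, and table names are placeholders, and authentication is assumed to come from the ambient GCP credentials.

```python
# Hypothetical sketch: read a dbt-built reporting model from BigQuery.
from google.cloud import bigquery

client = bigquery.Client(project="acme-analytics")  # placeholder project id

query = """
    SELECT order_date, COUNT(*) AS orders
    FROM `acme-analytics.reporting.fct_orders`
    GROUP BY order_date
    ORDER BY order_date
"""

for row in client.query(query).result():
    # In practice these rows would feed a dashboard or reporting API.
    print(row["order_date"], row["orders"])
```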

AWS, SQL, Cloud Computing, Data Analysis, ETL, Data engineering, Data visualization, Data modeling

Posted 11 days ago

πŸ“ Canada

🧭 Full-Time

πŸ” Fintech

🏒 Company: Coinme πŸ‘₯ 51-100 πŸ’° $772,801 Seed (over 2 years ago) Β· Cryptocurrency, Blockchain, Bitcoin, FinTech, Virtual Currency

  • 7+ years of experience with ETL, SQL, PowerBI, Tableau, or similar technologies
  • Strong understanding of data modeling, database design, and SQL
  • Experience working with Apache Kafka or MSK solution
  • Extensive experience delivering solutions on Snowflake or other cloud-based data warehouses
  • Proficiency in Python/R and familiarity with modern data engineering practices
  • Strong analytical and problem-solving skills
  • Experience with machine learning (ML)
  • Design, develop, and maintain scalable data pipelines.
  • Implement data ingestion frameworks (see the Kafka sketch after this list).
  • Optimize data pipelines for performance.
  • Develop and deliver data assets.
  • Evaluate and improve existing data solutions.
  • Experience in data quality management.
  • Collaborate with engineers and product managers.
  • Lead the deployment and maintenance of data solutions.
  • Champion best practices in data development.
  • Conduct code reviews and provide mentorship.
  • Create and maintain process documentation.
  • Monitor data pipelines for performance.
  • Implement logging, monitoring, and alerting systems.
  • Drive the team’s Agile process.
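
A minimal sketch of the kind of Kafka-based ingestion loop this role implies, written with the kafka-python client. The topic name, broker address, batch size, and the print-based load step are assumptions, not Coinme's implementation; a real loader would COPY each batch into Snowflake or another warehouse before committing offsets.

```python
# Hypothetical sketch: micro-batch events from Kafka into a warehouse loader.
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "user-activity",                      # placeholder topic
    bootstrap_servers="localhost:9092",   # placeholder broker
    group_id="warehouse-loader",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    auto_offset_reset="earliest",
    enable_auto_commit=False,
)

batch = []
for message in consumer:
    batch.append(message.value)
    if len(batch) >= 500:
        # Placeholder for loading the batch into the warehouse.
        print(f"flushing {len(batch)} events")
        batch.clear()
        consumer.commit()  # commit offsets only after a successful load
```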

Python, SQL, Agile, ETL, Machine Learning, Snowflake, Tableau, Apache Kafka, Data engineering, Data visualization, Data modeling

Posted 13 days ago

πŸ“ United States of America

🧭 Full-Time

πŸ’Έ 92,700 - 185,400 USD per year

πŸ” Health Solutions

  • Bachelor's degree in Computer Science or equivalent
  • Proven ability to complete projects in a timely manner while clearly measuring progress
  • Strong software engineering fundamentals (data structures, algorithms, async programming patterns, object-oriented design, parallel programming)
  • Strong understanding and demonstrated experience with at least one popular programming language (.NET or Java) and SQL constructs.
  • Experience writing and maintaining frontend client applications, Angular preferred
  • Strong experience with revision control (Git)
  • Experience with cloud-based systems (Azure / AWS / GCP).
  • High level understanding of big data design (data lake, data mesh, data warehouse) and data normalization patterns
  • Demonstrated experience with Queuing technologies (Kafka / SNS / RabbitMQ etc)
  • Demonstrated experience with Metrics, Logging, Monitoring and Alerting tools
  • Strong communication skills
  • Strong experience with use of RESTful APIs
  • High level understanding of HL7 V2.x / FHIR based interface messages.
  • High level understanding of system deployment tasks and technologies. (CI/CD Pipeline, K8s, Terraform).
  • Communicate with business leaders to help translate requirements into functional specification
  • Develop broad understanding of business logic and functionality of current systems
  • Analyze and manipulate data by writing and running SQL queries
  • Analyze logs to identify and prevent potential issues from occurring
  • Deliver clean and functional code in accordance with business requirements
  • Consume data from any source, such as flat files, streaming systems, or RESTful APIs
  • Interface with Electronic Health Records
  • Engineer scalable, reliable, and performant systems to manage data
  • Collaborate closely with other Engineers, QA, Scrum master, Product Manager in your team as well as across the organization
  • Build quality systems while expanding offerings to dependent teams
  • Comfortable in multiple roles, from design and development to code deployment, monitoring, and investigation in production systems.

AWS, SQL, Cloud Computing, GCP, Git, Java, Kafka, Kubernetes, Algorithms, Apache Kafka, Azure, Data engineering, Data Structures, .NET, Angular, REST API, CI/CD, RESTful APIs, Terraform

Posted 15 days ago

πŸ“ India, Canada, United Kingdom

🧭 Full-Time

πŸ” Software Development

🏒 Company: Loopio Inc.

  • 5+ years of experience in data engineering in a high-growth agile software development environment
  • Strong understanding of database concepts, modeling, SQL, query optimization
  • Ability to learn fast and translate data into actionable results
  • Experience developing in Python and Pyspark
  • Hands-on experience with the AWS services (RDS, S3, Redshift, Glue, Quicksight, Athena, ECS)
  • Strong understanding of relational databases (RDS, MySQL) and NoSQL
  • Experience with ETL & Data warehousing, building fact & dimensional data models
  • Experience with data processing frameworks such as Spark / Databricks
  • Experience in developing Big Data solutions (migration, storage, processing)
  • Experience with CI/CD tools (Jenkins) and pipeline orchestration tools (Databricks Jobs, Airflow)
  • Experience working with data visualization and BI platforms (Quicksight, Tableau, Sisense, etc)
  • Experience working with Clickstream data (Amplitude, Pendo, etc)
  • Experience building and supporting large-scale systems in a production environment
  • Strong communication, collaboration, and analytical skills
  • Demonstrated ability to work with a high degree of ambiguity, and leadership within a team (mentorship, ownership, innovation)
  • Ability to clearly communicate technical roadmap, challenges, and mitigation
  • Be responsible for building, evolving and scaling data platforms and ETL pipelines, with an eye towards the growth of our business and the reliability of our data
  • Promote data-driven decision-making across the organization through data expertise
  • Build advanced automation tooling for data orchestration, evaluation, testing, monitoring, administration, and data operations.
  • Integrate various data sources into our Data lake, including clickstream, relational, and unstructured data
  • Develop and maintain a feature store for use in analytics & modeling
  • Partner with data scientists to create predictive models to help drive insights and decisions, both in Loopio’s product and internal teams (RevOps, Marketing, CX)
  • Work closely with stakeholders within and across teams to understand the data needs of the business and produce processes that enable a better product and support data-driven decision-making
  • Build scalable data pipelines using Databricks, AWS (Redshift, S3, RDS), and other cloud technologies (a minimal PySpark sketch follows this list)
  • Build and support Loopio’s data warehouse (Redshift) and data lake (Databricks delta lake)
  • Orchestrate pipelines using workflow frameworks/tooling
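
For illustration, a small PySpark aggregation of the sort that might run as a Databricks job over clickstream data. The inline rows and the commented Delta write path are hypothetical placeholders, not Loopio's actual pipelines.

```python
# Hypothetical sketch: count clickstream events per user and event type.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("clickstream-daily-agg").getOrCreate()

events = spark.createDataFrame(
    [("u1", "page_view"), ("u1", "click"), ("u2", "page_view")],
    ["user_id", "event_type"],
)

daily_counts = (
    events.groupBy("user_id", "event_type")
    .agg(F.count("*").alias("events"))
)

daily_counts.show()
# In production the result would typically land in the lake as Delta, e.g.:
# daily_counts.write.format("delta").mode("overwrite").save("s3://bucket/path")
spark.stop()
```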

AWS, Python, SQL, Data Analysis, ETL, Jenkins, Machine Learning, Airflow, Data engineering, NoSQL, Spark, Communication Skills, Analytical Skills, Collaboration, CI/CD, Data visualization, Data modeling

Posted 18 days ago