
Data Engineer

Posted 24 days ago


💎 Seniority level: Mid-level, 3+ years

πŸ“ Location: Mexico

🏢 Company: Enroute (👥 1-10, 💰 Non-equity Assistance over 4 years ago) | E-Commerce, Enterprise Software, Software

🗣️ Languages: English

⏳ Experience: 3+ years

Requirements:
  • 3+ years of hands-on experience with Databricks for data processing and analytics
  • 3+ years of experience as a Data Engineer or in a similar role
  • Strong proficiency in SQL and experience with database management (relational and NoSQL)
  • Hands-on experience with ETL development, data wrangling, and pipeline automation
  • Familiarity with big data processing frameworks (Spark, Delta Lake)
  • Experience working with cloud platforms (AWS, Azure, or GCP)
  • Knowledge of dashboarding tools like Power BI, Tableau, or similar
  • Strong programming skills in Python, Scala, or SQL
  • Understanding of data governance, security, and compliance
  • Experience working with structured and unstructured data sources
Responsibilities:
  • Work on data migration
  • Develop new workflows
  • Data wrangling (see the sketch below)
  • Dashboard creation
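
The Databricks, Spark, and Delta Lake requirements above translate directly into pipeline code. As a rough illustration only (not this employer's actual stack), here is a minimal PySpark sketch of one ETL step; the `raw.orders` source and `analytics.orders_clean` target table names are hypothetical:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# On Databricks a `spark` session is provided; build one when running elsewhere.
spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Extract: read a raw source table (hypothetical name).
raw = spark.read.table("raw.orders")

# Transform: basic wrangling -- deduplicate, normalize types, drop bad rows.
clean = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .filter(F.col("amount") > 0)
)

# Load: write a Delta table that downstream dashboards can query.
clean.write.format("delta").mode("overwrite").saveAsTable("analytics.orders_clean")
```
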

Related Jobs


🧭 Full-Time

πŸ” Consulting

🏢 Company: P3 Adaptive

  • US Citizenship or Green Card (We don’t sponsor work visas)
  • Strong written and spoken English
  • Proven time management skills
  • Proven ability to connect with a diverse range of technical and non-technical stakeholders
  • Experienced in Project Management
  • Intermediate or better knowledge of T-SQL for DDL and DML applications (a T-SQL example follows this list)
  • Experience with Azure Active Directory Security Groups and Role-Based Access Controls
  • Experience with SSIS and SSAS preferred
  • Experience with PowerShell and Python preferred
  • Insatiable curiosity and love of learning
  • Support the execution of Power BI projects, working alongside expert Principal Consultants and Solution Architects.
  • Create Data Storage Solutions with SQL Server and Data Lakes.
  • Develop ETL Pipelines with Azure Data Factory.
  • Provision Azure Subscriptions and Resources.
  • Develop Automation Solutions using languages such as PowerShell and Python
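
Since the list above calls out T-SQL DDL and DML alongside Python, here is a minimal sketch of running both through pyodbc against SQL Server; the connection string, table, and columns are hypothetical placeholders:

```python
import pyodbc  # assumes the pyodbc package and a SQL Server ODBC driver are installed

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=myserver;DATABASE=mydb;Trusted_Connection=yes;"
)
cur = conn.cursor()

# DDL: create a staging table (hypothetical schema).
cur.execute("""
    CREATE TABLE dbo.StagingSales (
        SaleID   INT PRIMARY KEY,
        Amount   DECIMAL(10, 2),
        SaleDate DATE
    )
""")

# DML: insert a row using a parameterized statement.
cur.execute(
    "INSERT INTO dbo.StagingSales (SaleID, Amount, SaleDate) VALUES (?, ?, ?)",
    (1, 99.95, "2024-01-15"),
)
conn.commit()
```
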
Posted 9 minutes ago

πŸ“ Germany, Italy, Netherlands, Portugal, Romania, Spain, UK

🧭 Full-Time

πŸ” Wellness

  • You have a proven track record of designing and building robust, scalable, and maintainable data models and corresponding pipelines from business requirements.
  • You are skilled at engaging with engineering and product teams to elicit requirements.
  • You are comfortable with big data concepts, ensuring data is efficiently ingested, processed, and made available for data scientists, business analysts, and product teams.
  • You are experienced in maintaining data consistency across the entire data ecosystem.
  • You have experience maintaining and debugging data pipelines in production environments with high criticality, ensuring reliability and performance.
  • Develop and maintain efficient and scalable data models and structures to support analytical workloads.
  • Design, develop, and maintain data pipelines that transform and process large volumes of data while embedding business context and semantics.
  • Implement automated data quality checks to ensure consistency, accuracy, and reliability of data (see the sketch after this list).
  • Ensure correct adoption and usage of Wellhub’s data by data practitioners across the company
  • Live the mission: inspire and empower others by genuinely caring for your own wellbeing and that of your colleagues. Bring wellbeing to the forefront of work, and create a supportive environment where everyone feels comfortable taking care of themselves, taking time off, and finding work-life balance.
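
To make the automated data quality checks bullet concrete, here is a minimal pandas sketch; the dataframe and column names are hypothetical, and a production pipeline would more likely use a framework such as dbt tests or Great Expectations:

```python
import pandas as pd

def run_quality_checks(df: pd.DataFrame) -> list[str]:
    """Return human-readable failures; an empty list means all checks passed."""
    failures = []
    if df["user_id"].isna().any():
        failures.append("user_id contains nulls")
    if df["user_id"].duplicated().any():
        failures.append("user_id contains duplicates")
    if (df["session_minutes"] < 0).any():
        failures.append("session_minutes has negative values")
    return failures

df = pd.DataFrame({"user_id": [1, 2, 2], "session_minutes": [30, -5, 12]})
print(run_quality_checks(df))
# ['user_id contains duplicates', 'session_minutes has negative values']
```
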

SQL, Apache Airflow, Kubernetes, Apache Kafka, Data engineering, Spark, Data modeling

Posted about 4 hours ago

πŸ“ Portugal

🧭 Full-Time

🏢 Company: Wellhub

  • Proven track record of designing and building robust, scalable, and maintainable data models and corresponding pipelines from business requirements.
  • Skilled at engaging with engineering and product teams to elicit requirements.
  • Comfortable with big data concepts, ensuring data is efficiently ingested, processed, and made available for data scientists, business analysts, and product teams.
  • Experienced in maintaining data consistency across the entire data ecosystem.
  • Experience maintaining and debugging data pipelines in production environments with high criticality, ensuring reliability and performance.
  • Motivated to contribute to a data-driven culture and take pride in seeing the impact of your work across the company
  • Develop and maintain efficient and scalable data models and structures to support analytical workloads.
  • Design, develop, and maintain data pipelines that transform and process large volumes of data while embedding business context and semantics (an orchestration sketch follows this list).
  • Implement automated data quality checks to ensure consistency, accuracy, and reliability of data.
  • Ensure correct adoption and usage of Wellhub’s data by data practitioners across the company
  • Live the mission: inspire and empower others by genuinely caring for your own wellbeing and that of your colleagues. Bring wellbeing to the forefront of work, and create a supportive environment where everyone feels comfortable taking care of themselves, taking time off, and finding work-life balance.
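
The stack tags below list Apache Airflow; here is a minimal Airflow 2.x DAG sketch of the transform-then-validate pipeline pattern described above (the DAG id, schedule, and task bodies are hypothetical placeholders):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def transform():
    ...  # transform and load data, embedding business context and semantics

def validate():
    ...  # run automated data quality checks on the output

with DAG(
    dag_id="wellbeing_metrics_daily",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # `schedule_interval` on Airflow < 2.4
    catchup=False,
) as dag:
    t = PythonOperator(task_id="transform", python_callable=transform)
    v = PythonOperator(task_id="validate", python_callable=validate)
    t >> v  # validate runs only after transform succeeds
```
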

SQL, Apache Airflow, ETL, Kubernetes, Apache Kafka, Data engineering, Spark, Data visualization, Data modeling, Data analytics, Data management

Posted about 7 hours ago

🧭 Full-Time

πŸ” E-Learning

🏢 Company: Truelogic (👥 101-250) | Consulting, Web Development, Web Design, Software

  • 3-5 years of experience working with PySpark and Apache Spark in Big Data environments.
  • Experience with SQL and relational and NoSQL databases (PostgreSQL, MySQL, MongoDB, etc.).
  • Knowledge of ETL processes and data processing in distributed environments.
  • Familiarity with Apache Hadoop, Hive, or Delta Lake.
  • Experience with cloud storage (AWS S3, Google Cloud Storage, Azure Blob).
  • Proficiency in Git and version control.
  • Strong problem-solving skills and a proactive attitude.
  • A passion for learning and continuous improvement.
  • Design, develop, and optimize data pipelines using PySpark and Apache Spark.
  • Integrate and process data from multiple sources (databases, APIs, files, streaming).
  • Implement efficient data transformations for Big Data in distributed environments.
  • Optimize code to improve performance, scalability, and efficiency in data processing (see the sketch below).
  • Collaborate with Data Science, BI, and DevOps teams to ensure seamless integration.
  • Monitor and debug data processes to ensure quality and reliability.
  • Apply best practices in data engineering and maintain clear documentation.
  • Stay up to date with the latest trends in Big Data and distributed computing.
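
One common PySpark optimization the bullets above allude to is broadcasting a small dimension table so a join avoids a full shuffle; a sketch with hypothetical paths and column names:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("join-optimization").getOrCreate()

events = spark.read.parquet("s3a://my-bucket/events/")        # large fact data (hypothetical)
countries = spark.read.parquet("s3a://my-bucket/countries/")  # small dimension table

# Broadcasting the small side ships it to every executor,
# turning a shuffle join into a cheaper map-side join.
enriched = events.join(broadcast(countries), on="country_code", how="left")

# Partitioning output by a common filter column speeds up downstream reads.
enriched.write.mode("overwrite").partitionBy("country_code").parquet(
    "s3a://my-bucket/events_enriched/"
)
```
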
Posted 2 days ago

πŸ” Health & Bioinformatics

🏢 Company: Gradient AI (👥 101-250, 💰 $20,000,000 Series B almost 4 years ago) | Artificial Intelligence (AI), Machine Learning, InsurTech, Insurance, Health Care

  • 5+ years of relevant working experience, with a significant portion focused on healthcare data.
  • Proven experience working with and interpreting health, medical, and bioinformatics data is required, including experience with real-world healthcare datasets.
  • Expertise as a subject matter expert (SME) in health and bioinformatics data, with a deep understanding of the nuances and challenges associated with processing medical and bioinformatics data, and a strong understanding of the healthcare industry.
  • Experience working in Python in a professional environment, ideally in a healthcare or life sciences setting.
  • Desire to learn new skills and tools (e.g., Redshift, Tableau, AWS Lambda, etc.); bonus for experience with healthcare-specific data analysis and visualization tools.
  • Design, build, and implement data systems that fuel our ML and AI models for our health insurance clients, ensuring compliance with healthcare data privacy and security regulations (e.g., HIPAA).
  • Develop tools to extract and process diverse healthcare data sources, including electronic health records (EHRs), medical claims, pharmacy data, and genomic data, and create tools to profile and validate data (a profiling sketch follows this list).
  • Work cross-functionally with data scientists to transform large amounts of health-related and bioinformatics data and store it in a format that facilitates modeling, paying close attention to data quality and integrity in the context of healthcare applications.
  • Contribute to production operations, data pipelines, workflow management, reliability engineering, and more, with an understanding of the critical nature of data reliability in healthcare settings.
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a variety of sources using SQL and AWS ‘big data’ technologies, including experience with healthcare-specific data warehousing and analytics platforms.
  • Leverage expertise as a health and bioinformatics SME to ensure that data pipelines align with the specific requirements of health, medical, and bioinformatics data processing, including the ability to translate complex medical and biological concepts into data requirements.
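
As a toy illustration of the profiling and validation tooling mentioned above (not Gradient AI's actual code), here is a pandas sketch over a hypothetical claims extract; the ICD-10 check is a simplified shape test, not a full code validation:

```python
import pandas as pd

# Hypothetical medical-claims extract; real pipelines would read from S3 or a warehouse.
claims = pd.DataFrame({
    "claim_id": ["C1", "C2", "C3"],
    "icd10_code": ["E11.9", "BAD!", "I10"],
    "paid_amount": [120.0, None, 89.5],
})

# Profile: per-column null rates, a first-pass data quality signal.
print(claims.isna().mean())

# Validate: a letter, two digits, optional dotted extension (simplified ICD-10 shape).
valid = claims["icd10_code"].str.match(r"^[A-Z]\d{2}(\.\d{1,4})?$", na=False)
print(claims.loc[~valid, "claim_id"])  # flags C2 for review
```
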
Posted 2 days ago

πŸ” Health & Bioinformatics

  • BS in Computer Science, Bioinformatics, or another quantitative discipline; 5+ years of relevant working experience, with a significant portion focused on healthcare data.
  • Proven experience working with and interpreting health, medical, and bioinformatics data is required, including experience with real-world healthcare datasets.
  • Expertise as a subject matter expert (SME) in health and bioinformatics data, with a deep understanding of the nuances and challenges associated with processing medical and bioinformatics data, and a strong understanding of the healthcare industry.
  • Experience working in Python in a professional environment, ideally in a healthcare or life sciences setting.
  • Desire to learn new skills and tools (e.g., Redshift, Tableau, AWS Lambda, etc.); bonus for experience with healthcare-specific data analysis and visualization tools.
  • Design, build, and implement data systems that fuel our ML and AI models for our health insurance clients, ensuring compliance with healthcare data privacy and security regulations (e.g., HIPAA).
  • Develop tools to extract and process diverse healthcare data sources, including electronic health records (EHRs), medical claims, pharmacy data, and genomic data, and create tools to profile and validate data (see the sketch after this list).
  • Work cross-functionally with data scientists to transform large amounts of health-related and bioinformatics data and store it in a format that facilitates modeling, paying close attention to data quality and integrity in the context of healthcare applications.
  • Contribute to production operations, data pipelines, workflow management, reliability engineering, and more, with an understanding of the critical nature of data reliability in healthcare settings.
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a variety of sources using SQL and AWS ‘big data’ technologies, including experience with healthcare-specific data warehousing and analytics platforms.
  • Leverage expertise as a health and bioinformatics SME to ensure that data pipelines align with the specific requirements of health, medical, and bioinformatics data processing, including the ability to translate complex medical and biological concepts into data requirements.
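
For the extraction-and-loading side of the list above, a minimal sketch that lands a raw claims CSV as typed, columnar Parquet ready for warehouse ingestion; file names and schema are hypothetical:

```python
import pandas as pd

# Extract: read a raw claims file with explicit types (hypothetical path and schema).
claims = pd.read_csv(
    "claims_2024.csv",
    dtype={"claim_id": "string", "member_id": "string"},
    parse_dates=["service_date"],
)

# Transform: normalize column names and drop exact duplicate records.
claims.columns = [c.strip().lower() for c in claims.columns]
claims = claims.drop_duplicates()

# Load: write Parquet (requires pyarrow or fastparquet), a common
# hand-off format for warehouse COPY/LOAD jobs.
claims.to_parquet("claims_2024.parquet", index=False)
```
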
Posted 2 days ago

πŸ“ Worldwide

πŸ” Hospitality

🏢 Company: Lighthouse

  • 4+ years of professional experience using Python, Java, or Scala for data processing (Python preferred)
  • You stay up-to-date with industry trends, emerging technologies, and best practices in data engineering.
  • Improve, manage, and teach standards for code maintainability and performance in the code you submit and review
  • Ship large features independently, generate architecture recommendations, and implement them
  • Great communication: Regularly achieve consensus amongst teams
  • Familiarity with GCP, Kubernetes (GKE preferred), CI/CD tools (GitLab CI preferred), and the concept of Lambda Architecture.
  • Experience with Apache Beam or Apache Spark for distributed data processing, or with event-sourcing technologies like Apache Kafka (a Beam sketch follows this list).
  • Familiarity with monitoring tools like Grafana & Prometheus.
  • Design and develop scalable, reliable data pipelines using the Google Cloud stack.
  • Optimise data pipelines for performance and scalability.
  • Implement and maintain data governance frameworks, ensuring data accuracy, consistency, and compliance.
  • Monitor and troubleshoot data pipeline issues, implementing proactive measures for reliability and performance.
  • Collaborate with the DevOps team to automate deployments and improve developer experience on the data front.
  • Work with data science and analytics teams to enable them to bring their research to production-grade data solutions, using technologies such as Airflow, dbt, or MLflow (but not limited to these)
  • As a part of a platform team, you will communicate effectively with teams across the entire engineering organisation, to provide them with reliable foundational data models and data tools.
  • Mentor and provide technical guidance to other engineers working with data.
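
A minimal Apache Beam sketch of the distributed-processing style this role asks for; the in-memory source is a stand-in for a real Pub/Sub or BigQuery input, and the same pipeline can run on Dataflow by switching runners:

```python
import apache_beam as beam  # pip install apache-beam

# Count bookings per hotel from a toy in-memory source (hypothetical data).
with beam.Pipeline() as p:
    (
        p
        | "Create" >> beam.Create([("hotel_a", 1), ("hotel_b", 1), ("hotel_a", 1)])
        | "SumPerKey" >> beam.CombinePerKey(sum)
        | "Print" >> beam.Map(print)  # ('hotel_a', 2), ('hotel_b', 1)
    )
```
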

Python, SQL, Apache Airflow, ETL, GCP, Kubernetes, Apache Kafka, Data engineering, CI/CD, Mentoring, Terraform, Scala, Data modeling

Posted 2 days ago
🔥 Data Engineer

πŸ“ Canada

🧭 Full-Time

πŸ” FinTech

🏢 Company: KOHO

  • 5+ years of mastery in data manipulation and analytics architecture
  • Advanced expertise in dbt: incremental modeling, materializations, snapshots, variables, macros, Jinja (see the merge sketch after this list)
  • Strong command of SQL, efficient query writing, query optimization, and data warehouse design
  • Building strong relationships with stakeholders (the finance team), and scoping and prioritizing their analytics requests.
  • Understanding business needs and translating them to requirements.
  • Using dbt (Core for development and Cloud for orchestration) to transform, test, deploy, and document financial data while applying software engineering best practices.
  • Troubleshooting variances in reports, and striving to eliminate them at the source.
  • Building game-changing data products that empower the finance team
  • Architecting solutions that transform complex financial data into actionable insights
  • Monitoring, optimizing and troubleshooting warehouse performance (AWS Redshift).
  • Creating scalable, self-service analytics solutions that democratize data access
  • Occasionally building dashboards and reports in Sigma and Drivetrain.
  • Defining processes, building tools, and offering training to empower all data users in the organization.
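
dbt incremental models boil down to inserting or updating only the rows that changed since the last run. Here is a minimal pandas sketch of that merge logic (dbt itself generates the equivalent SQL MERGE; the data and key column are hypothetical):

```python
import pandas as pd

existing = pd.DataFrame({"txn_id": [1, 2], "amount": [10.0, 20.0], "updated_at": ["2024-01-01", "2024-01-01"]})
incoming = pd.DataFrame({"txn_id": [2, 3], "amount": [25.0, 30.0], "updated_at": ["2024-01-02", "2024-01-02"]})

# Incremental merge: incoming rows win on key collisions, new keys are appended.
merged = (
    pd.concat([existing, incoming])
      .sort_values("updated_at")
      .drop_duplicates(subset="txn_id", keep="last")
)
print(merged.sort_values("txn_id"))  # txn 2 updated to 25.0, txn 3 appended
```
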

AWS, Python, SQL, ETL, Data engineering, Data visualization, Data modeling, Finance, Data analytics

Posted 2 days ago

🏢 Company: Workato (👥 501-1000, 💰 $200,000,000 Series E over 3 years ago, 🫂 last layoff about 2 years ago) | Sales Automation, Cloud Computing, SaaS, Data Integration, Marketing Automation

  • 5+ years of work experience building & maintaining data pipelines in data-heavy environments (Data Engineering, or Backend with an emphasis on data processing)
  • Fluent knowledge of SQL.
  • Strong knowledge of common analytical-domain programming languages such as Java and Scala, and basic knowledge of Python.
  • Strong experience with Flink and Spark.
  • Experience with Data Pipeline Orchestration tools (Airflow, Dagster or similar).
  • Develop a new usage tracking/billing platform that will provide accurate near real-time data for both circuits.
  • Integrate the new platform smoothly with the back office, the internal data warehouse, and Workato Insights, the in-product analytics and reporting tool.
  • Address advanced use cases like usage forecasting, anomaly detection, and real-time alerts (see the sketch below).
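
For the anomaly-detection use case in the last bullet, a toy rolling z-score check over a usage stream (pure Python; the window, threshold, and data are hypothetical placeholders for a real streaming job):

```python
from collections import deque
import statistics

def detect_anomalies(stream, window=20, threshold=3.0):
    """Yield (value, z-score) for points far from the recent rolling mean."""
    recent = deque(maxlen=window)
    for value in stream:
        if len(recent) >= 5:  # wait for a minimal baseline
            mean = statistics.fmean(recent)
            stdev = statistics.stdev(recent) or 1e-9  # guard against zero variance
            z = (value - mean) / stdev
            if abs(z) > threshold:
                yield value, round(z, 1)
        recent.append(value)

usage = [100, 102, 98, 101, 99, 100, 103, 97, 500, 101]  # simulated per-minute task counts
print(list(detect_anomalies(usage)))  # the 500 spike is flagged
```
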
Posted 3 days ago

πŸ” Software Development

🏢 Company: Workato (👥 501-1000, 💰 $200,000,000 Series E over 3 years ago, 🫂 last layoff about 2 years ago) | Sales Automation, Cloud Computing, SaaS, Data Integration, Marketing Automation

  • 5+ years of work experience building & maintaining data pipelines in data-heavy environments
  • Fluent knowledge of SQL
  • Strong knowledge of common analytical-domain programming languages such as Java and Scala, and basic knowledge of Python
  • Strong experience with Flink and Spark
  • Experience with Data Pipeline Orchestration tools (Airflow, Dagster or similar)
  • Experience with data warehousing solutions such as Snowflake, Redshift, or BigQuery (a Snowflake sketch follows this list)
  • Confidence in using Git, K8s and Terraform
  • Develop a new usage tracking/billing platform
  • Integrate the platform with the back office, the internal data warehouse, and Workato Insights, the in-product analytics and reporting tool
  • Work closely with the ML team
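
For the warehousing bullet, a minimal sketch of upserting usage aggregates into Snowflake with the official snowflake-connector-python package; the credentials, table, and MERGE keys are hypothetical placeholders:

```python
import snowflake.connector  # pip install snowflake-connector-python

conn = snowflake.connector.connect(
    account="my_account",  # hypothetical credentials; use a secrets manager in practice
    user="etl_user",
    password="...",
    warehouse="ETL_WH",
    database="ANALYTICS",
    schema="USAGE",
)
cur = conn.cursor()

# Upsert the latest usage aggregates (hypothetical table and keys).
cur.execute(
    """
    MERGE INTO usage_daily t
    USING (SELECT %s AS account_id, %s AS usage_date, %s AS task_count) s
    ON t.account_id = s.account_id AND t.usage_date = s.usage_date
    WHEN MATCHED THEN UPDATE SET task_count = s.task_count
    WHEN NOT MATCHED THEN INSERT (account_id, usage_date, task_count)
        VALUES (s.account_id, s.usage_date, s.task_count)
    """,
    ("acct_42", "2024-01-15", 1234),
)
conn.commit()
```
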
Posted 3 days ago

Related Articles

Posted about 1 month ago

Why is remote work such a nice opportunity?

Why is remote work so appealing? Let's take a look!

Posted 7 months ago

Insights into the evolving landscape of remote work in 2024 reveal the importance of certifications and continuous learning. This article breaks down emerging trends, sought-after certifications, and provides practical solutions for enhancing your employability and expertise. What skills will be essential for remote job seekers, and how can you navigate this dynamic market to secure your dream role?

Posted 8 months ago

Explore the challenges and strategies of maintaining work-life balance while working remotely. Learn about unique aspects of remote work, associated challenges, historical context, and effective strategies to separate work and personal life.

Posted 8 months ago

Google is gearing up to expand its remote job listings, promising more opportunities across various departments and regions. Find out how this move can benefit job seekers and impact the market.

Posted 8 months ago

Learn about the importance of pre-onboarding preparation for remote employees, including checklist creation, documentation, tools and equipment setup, communication plans, and feedback strategies. Discover how proactive pre-onboarding can enhance job performance, increase retention rates, and foster a sense of belonging from day one.