Data Engineer

Posted 2024-11-07

πŸ’Ž Seniority level: Senior, 4+ years of relevant professional experience

πŸ“ Location: Ukraine

πŸ” Industry: Transportation and Data Analytics

⏳ Experience: 4+ years of relevant professional experience

πŸͺ„ Skills: Python, SQL, Bash, ETL, Airflow, Data engineering, Spark

Requirements:
  • 4+ years of relevant professional experience.
  • Strong experience with Spark and SQL.
  • Experience with scripting languages like Python or Bash.
  • Familiarity with the Hadoop ecosystem and related tools (S3, DynamoDB, MapReduce, etc.).
  • Experience building complex data models and pipelines.
  • 2+ years of experience with workflow management tools (Airflow or similar).
  • Nice to have: experience working with cross-functional analytics, data science, and engineering teams.
Responsibilities:
  • Own the core company data pipeline and be responsible for scaling the data processing flow.
  • Evolve data model and schema based on business and engineering needs.
  • Implement systems to track data quality and consistency.
  • Propose and develop tools for self-service data management.
  • Tune SQL and MapReduce jobs for performance improvement.
  • Drive tech roadmap aligned with team and stakeholders.
  • Write maintainable code considering infrastructure cost and scalability.
  • Participate in code reviews and on-call rotations.
  • Support communication with partners to achieve results.
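The data-quality tracking duty above can be illustrated with a minimal sketch (hypothetical field names and checks, standard library only; not the company's actual system):

```python
# Minimal sketch of a row-level data quality check for a pipeline.
# Field names and rules are hypothetical, for illustration only.

REQUIRED_FIELDS = {"ride_id", "driver_id", "fare_usd"}

def check_row(row: dict) -> list[str]:
    """Return a list of quality issues found in one pipeline record."""
    issues = []
    missing = REQUIRED_FIELDS - row.keys()
    if missing:
        issues.append(f"missing fields: {sorted(missing)}")
    fare = row.get("fare_usd")
    if isinstance(fare, (int, float)) and fare < 0:
        issues.append("negative fare")
    return issues

def quality_report(rows: list[dict]) -> dict:
    """Aggregate issue counts so consistency can be tracked over time."""
    bad = [r for r in rows if check_row(r)]
    return {
        "total": len(rows),
        "bad": len(bad),
        "bad_ratio": len(bad) / len(rows) if rows else 0.0,
    }
```

In practice a report like this would be emitted per pipeline run and alerted on when the bad-row ratio drifts.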
Related Jobs

πŸ“ Any European country

🧭 Full-Time

πŸ” Software development

🏒 Company: Janea Systems

  • Proven experience as a data engineer, preferably with at least 3 years of relevant experience.
  • Experience designing cloud native solutions and implementations with Kubernetes.
  • Experience with Airflow or similar pipeline orchestration tools.
  • Strong Python programming skills.
  • Experience collaborating with Data Science and Engineering teams in production environments.
  • Solid understanding of SQL and relational data modeling schemas.
  • Experience with Databricks or Spark is preferred.
  • Familiarity with modern data stack design and data lifecycle management.
  • Experience with distributed systems, microservices architecture, and cloud platforms like AWS, Azure, Google Cloud.
  • Excellent problem-solving skills and strong communication skills.

  • Develop and maintain data pipelines using Databricks, Airflow, or similar orchestration systems.
  • Design and implement cloud-native solutions using Kubernetes for high availability.
  • Gather product data requirements and implement solutions to ingest and process data for applications.
  • Collaborate with Data Science and Engineering teams to optimize production-ready applications.
  • Curate data from various sources for data scientists and maintain documentation.
  • Design modern data stack for data scientists and ML engineers.
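The orchestration work described above (Airflow-style pipelines) boils down to running tasks in dependency order. A tiny sketch of that core idea, using hypothetical task names and the standard library rather than the Airflow API:

```python
# Tiny sketch of dependency-ordered task execution, the idea behind
# Airflow-style orchestration. Task names are hypothetical.
from graphlib import TopologicalSorter

# task -> set of upstream tasks it depends on
deps = {
    "ingest": set(),
    "clean": {"ingest"},
    "feature_build": {"clean"},
    "publish": {"feature_build"},
}

def run_order(dependencies: dict) -> list:
    """Return tasks in an order that respects upstream dependencies."""
    return list(TopologicalSorter(dependencies).static_order())
```

A real orchestrator adds scheduling, retries, and backfills on top of this ordering, but the dependency graph is the same shape.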

AWS, Python, Software Development, SQL, Kubernetes, Airflow, Azure, Data science, Spark, Collaboration

Posted 2024-11-07
πŸ”₯ Data Engineer
Posted 2024-09-06

πŸ“ Ukraine

🧭 Full-Time

πŸ” Transportation

🏒 Company: Lyft | πŸ‘₯ 5001-10000 | πŸ’° Post-IPO Equity on 2021-02-01 | πŸ«‚ on 2023-04-21 | Ride Sharing, Transportation, Apps, Mobile Apps, Software

  • 5+ years of relevant professional experience.
  • Strong experience with Spark.
  • Experience with Hadoop ecosystem, S3, DynamoDB, MapReduce, Yarn, HDFS, Hive, Presto, Pig, HBase, Parquet.
  • Strong skills in at least one scripting language (Python, Bash, etc.).
  • Experience in building complex data models and pipelines.
  • Proficient in SQL languages (MySQL, PostgreSQL, etc.).
  • 4+ years of experience with workflow management tools (Airflow or similar).
  • Experience working with cross-functional teams to bridge business goals with data engineering.

  • Own the core company data pipeline, responsible for scaling up the data processing flow to meet rapid data growth.
  • Evolve data model and schema based on business and engineering needs.
  • Implement systems to track data quality and consistency.
  • Propose and develop tools for self-service data pipeline management (ETL).
  • Tune SQL and MapReduce jobs for improved performance.
  • Drive the data engineering team's tech roadmap, aligning it with the team and stakeholders.
  • Build maintainable code considering data infrastructure cost.
  • Participate in code reviews for quality assurance.
  • Engage in on-call rotations for high availability and reliability.
  • Support communication with internal and external partners.

PostgreSQL, Python, SQL, Bash, DynamoDB, ETL, Hadoop, MySQL, Yarn, Airflow, Data engineering, Data science, Spark

Posted 2024-09-06
πŸ”₯ Data Engineer
Posted 2024-07-18

πŸ“ Africa, United Kingdom, Europe, Middle East

🧭 Full-Time

πŸ” Sports and Digital Entertainment

  • 4+ years of experience in a data engineering or similar role.
  • Excellent programming skills in Python and Spark (PySpark / Databricks).
  • 2+ years' experience with Databricks and Azure data services.
  • Experience with other cloud-based data management environments (AWS, Google Cloud, etc.) is an advantage.
  • Experience working with Customer Data Platforms is a plus.
  • Knowledge of managing data quality, including monitoring and alerting.
  • Good understanding of application and database development lifecycles.
  • Experience with remote working and ideally with hyper-growth startups.

  • Building and managing a highly robust and scalable Data Lake/ETL infrastructure.
  • Creating a scalable data pipeline for streaming and batch processing.
  • Ensuring data integrity through fault-tolerant systems and automated data quality monitoring.
  • Continuously improving processes and optimizing performance and scalability.
  • Ensuring privacy and data security are prioritized.
  • Documenting the Data Platform stack comprehensively.
  • Partnering with business stakeholders and product engineering to deliver data products.
  • Collaborating with stakeholders to shape requirements and drive the data platform roadmap.

Python, Agile, ETL, Azure, Data engineering, Spark, Documentation

Posted 2024-07-18
πŸ”₯ Data Engineer
Posted 2024-07-18

πŸ“ Africa, United Kingdom, Europe, Middle East

🧭 Full-Time

πŸ” Sports and Digital Entertainment

  • 4+ years of experience in a data engineering or similar role.
  • Excellent programming language skills in Python and Spark (PySpark / Databricks).
  • 2+ years’ experience working with Databricks and Azure data services.
  • Experience with cloud-based data management environments like AWS, Google Cloud, Hadoop, Snowflake, Spark, Storm, Kafka is an advantage.
  • Experience managing data quality including monitoring and alerting.
  • Good understanding of the application and database development lifecycle.
  • Remote working experience is a must; hyper-growth startup experience is a strong plus.

  • Design and manage a highly robust and scalable Data Lake/ETL infrastructure and scalable data pipelines for streaming and batch processing.
  • Ensure fault-tolerant systems and processes with a high priority on data integrity, supported by automated quality monitoring and alerting.
  • Continuously seek improvements through fixing recurring problems, delivering helpful features, and optimizing for performance and scalability.
  • Maintain top-class and up-to-date documentation for the entire Data Platform stack.
  • Partner with business stakeholders and product engineering to deliver high-value data products and understand requirements.

Python, ETL, Azure, Data engineering, Spark

Posted 2024-07-18