Data Engineer

Posted 3 months ago

πŸ’Ž Seniority level: Middle (at least 3 years)

πŸ“ Location: Poland

πŸ” Industry: Healthcare

🏒 Company: Sunscrapers sp. z o.o.

πŸ—£οΈ Languages: English

⏳ Experience: At least 3 years

πŸͺ„ Skills: Spark, Analytical Skills, CI/CD, Customer service, Attention to detail

Requirements:
  • At least 3 years of professional experience as a data engineer.
  • Undergraduate or graduate degree in Computer Science, Engineering, Mathematics, or similar.
  • Excellent command of spoken and written English (at least C1).
  • Experience in designing data infrastructure using Python, PySpark, Apache Spark, and Delta Spark.
  • Experience managing production Spark clusters in Databricks.
  • Proficiency in SQL and experience with Delta Lake architectures.
  • Great analytical skills and attention to detail.
  • Creative problem-solving skills.
  • Great customer service and troubleshooting skills.
Responsibilities:
  • Design and optimize data infrastructure using Python, PySpark, Apache Spark, and Delta Spark.
  • Implement strong data governance frameworks to ensure quality, security, and compliance.
  • Connect Delta Tables to a SQL engine (like Databricks SQL) for efficient querying and analytics.
  • Leverage strong DevOps expertise to deploy and maintain data systems in Azure.
  • Create batch and streaming pipelines for data processing (see the sketch below).
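For illustration only (not part of the posting), here is a minimal PySpark sketch of the kind of work these bullets describe: reading a Delta table, exposing it to a SQL engine through a temp view, and running a small streaming pipeline. All paths and table names are hypothetical.

```python
# Illustrative sketch only: batch + streaming over Delta tables with PySpark.
# Assumes the delta-spark package is installed; all paths are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder.appName("delta-pipeline-sketch")
    # These two settings enable Delta Lake support in a plain Spark session.
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Batch: read a Delta table and expose it to a SQL engine via a temp view.
events = spark.read.format("delta").load("/mnt/lake/events")  # hypothetical path
events.createOrReplaceTempView("events")
daily = spark.sql(
    "SELECT date(event_ts) AS day, count(*) AS n FROM events GROUP BY date(event_ts)"
)
daily.write.format("delta").mode("overwrite").save("/mnt/lake/daily_counts")

# Streaming: continuously append incoming records to a curated Delta table.
query = (
    spark.readStream.format("delta").load("/mnt/lake/raw_events")
    .withColumn("ingested_at", F.current_timestamp())
    .writeStream.format("delta")
    .option("checkpointLocation", "/mnt/lake/_checkpoints/raw_events")
    .start("/mnt/lake/clean_events")
)
query.awaitTermination()
```

The temp view is the simplest stand-in for a SQL endpoint such as Databricks SQL; in Databricks itself the table would typically be registered in the metastore or Unity Catalog instead.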

Related Jobs

πŸ“ Poland

🧭 Full-Time

πŸ” Software Development

🏒 Company: N-iX (πŸ‘₯ 1001-5000 · IT Services and IT Consulting)

Requirements:
  • Minimum of 3-4 years as a data engineer, or in a relevant field
  • Advanced experience in Python, particularly in delivering production-grade data pipelines and troubleshooting code-based bugs.
  • Structured approach to data insights
  • Familiarity with cloud platforms (preferably Azure)
  • Experience with Databricks, Snowflake, or similar data platforms
  • Knowledge of relational databases, with proficiency in SQL
  • Experience using Apache Spark
  • Experience in creating and maintaining structured documentation
  • Proficiency in utilizing testing frameworks to ensure code reliability and maintainability
  • Experience with Gitlab or equivalent tools
  • English Proficiency: B2 level or higher
Responsibilities:
  • Design, build, and maintain data pipelines using Python
  • Collaborate with an international team to develop scalable data solutions
  • Conduct in-depth analysis and debugging of system bugs (Tier 2)
  • Develop and maintain smart documentation for process consistency, including the creation and refinement of checklists and workflows
  • Set up and configure new tenants, collaborating closely with team members to ensure smooth onboarding
  • Write integration tests to ensure the quality and reliability of data services
  • Work with Gitlab to manage code and collaborate with team members
  • Utilize Databricks for data processing and management

Docker, Python, SQL, Cloud Computing, Data Analysis, ETL, Git, Kubernetes, Snowflake, Apache Kafka, Azure, Data engineering, RDBMS, REST API, Pandas, CI/CD, Documentation, Microservices, Debugging

Posted 6 days ago
πŸ”₯ Sr Data Engineer

πŸ“ US, Europe and India

πŸ” Software Development

Requirements:
  • Extensive experience in developing data and analytics applications in geographically distributed teams
  • Hands-on experience in using modern architectures and frameworks, structured, semi-structured and unstructured data, and programming with Python
  • Hands-on SQL knowledge and experience with relational databases such as MySQL, PostgreSQL, and others
  • Hands-on ETL knowledge and experience
  • Knowledge of commercial data platforms (Databricks, Snowflake) or cloud data warehouses (Redshift, BigQuery)
  • Knowledge of data catalog and MDM tooling (Atlan, Alation, Informatica, Collibra)
  • Experience with CI/CD pipelines for continuous deployment (CloudFormation templates)
  • Knowledge of how machine learning / AI workloads are implemented in batch and streaming, including preparing datasets, training models, and using pre-trained models
  • Exposure to software engineering processes that can be applied to Data Ecosystems
  • Excellent analytical and troubleshooting skills
  • Excellent communication skills
  • Excellent English (both verbal and written)
  • B.S. in Computer Science or equivalent
Responsibilities:
  • Design and develop our best-in-class cloud platform, working on all parts of the code stack: front-end, REST and asynchronous APIs, back-end application logic, SQL/NoSQL databases, and integrations with external systems
  • Develop solutions across the data and analytics stack, from ETL to streaming data
  • Design and develop reusable libraries
  • Enhance strong processes in Data Ecosystem
  • Write unit and integration tests

Python, SQL, Apache Airflow, Cloud Computing, ETL, Machine Learning, Snowflake, Algorithms, Apache Kafka, Data engineering, Data Structures, Communication Skills, Analytical Skills, CI/CD, RESTful APIs, DevOps, Microservices, Excellent communication skills, Data visualization, Data modeling, Data analytics, Data management

Posted 12 days ago

πŸ“ Poland

🧭 Full-Time

πŸ” Data Engineering

🏒 Company: Softeta

Requirements:
  • Advanced degree in Computer Science or related fields
  • 5+ years of experience as a data engineer
  • Proficiency with Airflow, dbt, Dataflow, or similar products
  • Strong knowledge of data structures and data modeling
  • CI/CD pipeline and MLOps experience
  • Experience with large data sets
  • Experience with GCP / BigQuery
Responsibilities:
  • Create and maintain pipeline architectures in Airflow and dbt (a minimal sketch follows this list)
  • Assemble large and/or complex datasets for business requirements
  • Improve processes and infrastructure for scale, delivery and automation
  • Maintain and improve data warehouse structure
  • Adjust methods and techniques for large data environments
  • Adopt best-practice coding and review processes
  • Communicate technical details to stakeholders
  • Investigate and resolve anomalies in data
  • Develop and maintain documentation for data products
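For illustration only, a minimal sketch of the Airflow-plus-dbt pipeline work named above; the DAG id, schedule, and dbt project path are assumptions, not details from the posting.

```python
# Illustrative Airflow 2.x DAG that builds dbt models daily, then tests them.
# The DAG id, schedule, and dbt project path are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="dbt_daily_build",           # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command="dbt run --project-dir /opt/dbt/analytics",   # hypothetical path
    )
    dbt_test = BashOperator(
        task_id="dbt_test",
        bash_command="dbt test --project-dir /opt/dbt/analytics",
    )
    dbt_run >> dbt_test  # build models first, then run their data-quality tests
```

Keeping dbt run and dbt test as separate tasks makes model builds and data-quality checks independently retryable.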

SQL, GCP, Airflow, Data engineering, CI/CD, Data modeling

Posted 12 days ago

πŸ“ Poland, Spain, United Kingdom

πŸ” Beauty marketplace

🏒 Company: Booksy (πŸ‘₯ 501-1000 · πŸ’° Debt Financing 5 months ago · Mobile Payments, Marketplace, SaaS, Payments, Mobile Apps, Wellness, Software)

Requirements:
  • 5+ years of experience in backend and data engineering, with strong system design skills.
  • Practical proficiency in cloud technologies (ideally GCP), with expertise in tools like BigQuery, Dataflow, Pub/Sub, or similar.
  • Hands-on experience with CI/CD tools (e.g., GitLab CI) and infrastructure as code.
  • Strong focus on data quality, governance, and building scalable, automated workflows.
  • Experience designing self-service data platforms and infrastructure.
  • Proven ability to mentor and support others, fostering data literacy across teams.
Responsibilities:
  • Design and implement robust data solutions.
  • Enable teams to make informed, data-driven decisions.
  • Ensure data is accessible, reliable, and well-governed.
  • Play a key role in driving growth, innovation, and operational excellence.

GCP, Data engineering, CI/CD, Data modeling

Posted 23 days ago

πŸ“ Poland

🧭 Full-Time

πŸ” Cybersecurity

🏒 Company: Adaptiq (πŸ‘₯ 51-100 · Consulting, Professional Services, Software)

Requirements:
  • At least 5 years of experience in the Data Engineering domain.
  • At least 2 years of experience with GoLang.
  • Proficiency in SQL, NoSQL, Kafka/Pulsar, ELK, Redis and column store databases.
  • Experienced with big data tools such as Spark or Flink to enhance system performance and scalability.
  • Proven experience with Kubernetes (K8S) and familiarity with GTP tools.
  • Ability to work effectively in a collaborative team environment.
  • Excellent communication skills and a proactive approach to learning and development.
Responsibilities:
  • Architect, develop, and maintain robust distributed systems with complex requirements, ensuring scalability and performance.
  • Work closely with cross-functional teams to ensure the seamless integration and functionality of software components.
  • Implement and optimize scalable server systems, utilizing parallel processing, microservices architecture, and security development principles.
  • Utilize SQL, NoSQL, Kafka/Pulsar, ELK, Redis and column store databases in system design and development.
  • Leverage big data tools such as Spark or Flink to enhance system performance and scalability.
  • Demonstrate proficiency in Kubernetes (K8S) and familiarity with GTP tools for efficient deployment and management.

SQL, Kafka, Kubernetes, Data engineering, Go, Redis, NoSQL, Spark

Posted 29 days ago

πŸ“ Europe

🧭 Full-Time

πŸ” Supply Chain Risk Analytics

🏒 Company: Everstream Analytics (πŸ‘₯ 251-500 · πŸ’° $50,000,000 Series B almost 2 years ago · Productivity Tools, Artificial Intelligence (AI), Logistics, Machine Learning, Risk Management, Analytics, Supply Chain Management, Procurement)

Requirements:
  • Deep understanding of Python, including data manipulation and analysis libraries like Pandas and NumPy.
  • Extensive experience in data engineering, including ETL, data warehousing, and data pipelines.
  • Strong knowledge of AWS services, such as RDS, Lake Formation, Glue, Spark, etc.
  • Experience with real-time data processing frameworks like Apache Kafka/MSK.
  • Proficiency in SQL and NoSQL databases, including PostgreSQL, Opensearch, and Athena.
  • Ability to design efficient and scalable data models.
  • Strong analytical skills to identify and solve complex data problems.
  • Excellent communication and collaboration skills to work effectively with cross-functional teams.
Responsibilities:
  • Manage and grow a remote team of data engineers based in Europe.
  • Collaborate with Platform and Data Architecture teams to deliver robust, scalable, and maintainable data pipelines.
  • Lead and own data engineering projects, including data ingestion, transformation, and storage.
  • Develop and optimize real-time data processing pipelines using technologies like Apache Kafka/MSK or similar (see the sketch after this list).
  • Design and implement data lakehouses and ETL pipelines using AWS services like Glue or similar.
  • Create efficient data models and optimize database queries for optimal performance.
  • Work closely with data scientists, product managers, and engineers to understand data requirements and translate them into technical solutions.
  • Mentor junior data engineers and share your expertise. Establish and promote best practices.
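For illustration only, a minimal sketch of a real-time consumer of the kind this role describes, using the kafka-python client; the topic, broker address, and event fields are invented.

```python
# Illustrative real-time consumer using the kafka-python client.
# Topic name, broker address, and event fields are all hypothetical.
import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "shipment-events",                    # hypothetical topic
    bootstrap_servers="localhost:9092",   # hypothetical broker
    group_id="risk-pipeline",
    auto_offset_reset="earliest",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

for message in consumer:
    event = message.value
    # Minimal transformation step: flag delayed shipments for downstream handling.
    if event.get("delay_hours", 0) > 48:
        print(f"high-risk shipment: {event.get('shipment_id')}")
```

A production pipeline would write flagged events to a datastore and manage offsets deliberately rather than print to stdout.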

AWS, PostgreSQL, Python, SQL, ETL, Apache Kafka, NoSQL, Spark, Data modeling

Posted about 1 month ago
πŸ”₯ Senior Data Engineer

πŸ“ South Africa, Mauritius, Kenya, Nigeria

πŸ” Technology, Marketplaces

Requirements:
  • BSc degree in Computer Science, Information Systems, Engineering, or a related technical field, or equivalent work experience.
  • 3+ years related work experience.
  • Minimum of 2 years of experience building and optimizing 'big data' data pipelines and architectures, and maintaining data sets.
  • Experienced in Python.
  • Experienced in SQL (PostgreSQL, MS SQL).
  • Experienced in using cloud services: AWS, Azure or GCP.
  • Proficiency in version control, CI/CD and GitHub.
  • Understanding/experience in Glue and PySpark highly desirable.
  • Experience in managing data life cycle.
  • Proficiency in manipulating, processing and architecting large disconnected data sets for analytical requirements.
  • Ability to maintain and optimise processes supporting data transformation, data structures, metadata, dependency and workload management.
  • Good understanding of data management principles - data quality assurance and governance.
  • Strong analytical skills related to working with unstructured datasets.
  • Understanding of message queuing, stream processing, and highly scalable 'big data' datastores.
  • Strong attention to detail.
  • Good communication and interpersonal skills.
  • Suggest and implement internal process improvements that automate manual processes.
  • Implement enhancements and new features across data systems.
  • Improve and streamline processes within data systems, with support from the Senior Data Engineer.
  • Test CI/CD process for optimal data pipelines.
  • Assemble large, complex data sets that meet functional / non-functional business requirements.
  • Build highly efficient ETL processes.
  • Develop and conduct unit tests on data pipelines and ensure data consistency (a minimal sketch follows this list).
  • Develop and maintain automated monitoring solutions.
  • Support reporting and analytics infrastructure.
  • Maintain data quality and data governance as well as upkeep of overall maintenance of data infrastructure systems.
  • Maintain data warehouse and data lake metadata, data catalogue, and user documentation for internal business users.
  • Ensure best practice is implemented and maintained on database.
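For illustration only, a minimal sketch of the unit-testing bullet above: a small pandas transformation and a pytest test for it. The transform() function and its column names are invented for this example.

```python
# Illustrative pytest sketch for a small pipeline transformation.
# transform() and its column names are invented for this example.
import pandas as pd


def transform(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows with a missing id and round amounts to two decimals."""
    out = df.dropna(subset=["id"]).copy()
    out["amount"] = out["amount"].round(2)
    return out


def test_transform_drops_missing_ids_and_rounds():
    raw = pd.DataFrame({"id": [1, None, 3], "amount": [1.239, 2.0, 3.14159]})
    result = transform(raw)
    assert list(result["id"]) == [1, 3]            # row with missing id removed
    assert list(result["amount"]) == [1.24, 3.14]  # amounts rounded to 2 dp
```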

AWS, PostgreSQL, Python, SQL, ETL, Git, CI/CD

Posted about 1 month ago

πŸ“ Poland

πŸ” Financial services

🏒 Company: Capco (πŸ‘₯ 101-250 · Electric Vehicle, Product Design, Mechanical Engineering, Manufacturing)

Requirements:
  • Strong cloud-provider experience on GCP
  • Hands-on experience using Python; Scala and Java are nice to have
  • Experience in data and cloud technologies such as Hadoop, HIVE, Spark, PySpark, DataProc
  • Hands-on experience with schema design using semi-structured and structured data structures
  • Experience using messaging technologies – Kafka, Spark Streaming
  • Strong experience in SQL
  • Understanding of containerisation (Docker, Kubernetes)
  • Experience designing, building, and maintaining CI/CD pipelines
  • Enthusiasm to pick up new technologies as needed
Responsibilities:
  • Work alongside clients to interpret requirements and define industry-leading solutions
  • Design and develop robust, well-tested data pipelines
  • Demonstrate and help clients adhere to best practices in engineering and SDLC
  • Lead and mentor the team of junior and mid-level engineers
  • Contribute to security designs and have advanced knowledge of key security technologies
  • Support internal Capco capabilities by sharing insight, experience and credentials

Docker, Python, SQL, ETL, GCP, Git, Hadoop, Kafka, Kubernetes, Snowflake, Airflow, Spark, CI/CD

Posted 2 months ago
πŸ”₯ Data Engineer

πŸ“ Poland

πŸ” Consulting

🏒 Company: Infosys Consulting - Europe

Requirements:
  • Bachelor’s degree in Computer Science, Engineering, or a related field.
  • Proven experience as a Data Engineer or in a similar role on large-scale data implementations.
  • Strong experience in SQL and relational database systems (MySQL, PostgreSQL, Oracle).
  • Proficiency in programming languages such as Python, Java, or Scala.
  • Minimum 5 years of hands-on experience with ETL tools like Apache Nifi or Talend.
  • Familiarity with big data technologies like Hadoop and Spark.
  • Minimum 3 years with cloud-based data services (AWS, Azure, Google Cloud).
  • Knowledge of data modeling, database design, and architecture best practices.
  • Experience with version control (e.g., Git) and agile practices.
Responsibilities:
  • Develop, construct, test, and maintain scalable data pipelines for large data sets.
  • Integrate data from differing source systems into the data lake or warehouse.
  • Implement ETL processes and ensure data quality and integrity.
  • Design and implement database and data warehousing solutions.
  • Work with cloud platforms to set up data infrastructure.
  • Collaborate with teams and document workflows.
  • Implement data governance and compliance measures.
  • Monitor performance and continuously improve processes.
  • Automate tasks and develop tools for data management.

AWS, Docker, Leadership, PostgreSQL, Python, SQL, Agile, Business Intelligence, DynamoDB, ETL, Git, Hadoop, Java, Jenkins, Kafka, Kubernetes, Machine Learning, MongoDB, MySQL, Oracle, Strategy, Azure, Cassandra, Data engineering, Data science, NoSQL, Spark, Communication Skills, Collaboration, CI/CD, Scala, Data modeling

Posted 4 months ago

πŸ“ Poland

🧭 Full-Time

πŸ” Enterprise security products

🏒 Company: Intuition Machines, Inc. (πŸ‘₯ 51-100 · Internet, Education, Internet of Things, Machine Learning, Software)

Requirements:
  • Thoughtful, conscientious, and self-directed.
  • Experience with data engineering services on major cloud providers.
  • Minimum of 3 years in a data role involving data store design, feature engineering, and building reliable data pipelines.
  • Proven ability to independently decide on data processing strategies.
  • At least 2 years of professional software development experience outside data engineering.
  • Significant coding experience in Python.
  • Experience in building/maintaining distributed data pipelines.
  • Experience with Kafka infrastructure and applications.
  • Deep understanding of SQL and NoSQL databases (preferably Clickhouse).
  • Familiarity with public cloud providers (AWS or Azure).
  • Experience with CI/CD and orchestration platforms: Kubernetes, containerization.
Responsibilities:
  • Maintain, extend, and improve existing data/ML workflows, and implement new workflows for high-velocity data.
  • Provide systems for ML engineers and researchers to build datasets on demand.
  • Influence data storage and processing strategies.
  • Collaborate with ML, frontend, and backend teams to enhance the data platform.
  • Reduce deployment time for dashboards and ML models.
  • Establish best practices and develop pipelines for efficient dataset usage.
  • Handle large datasets under performance constraints.
  • Iterate quickly to ensure deployment of products or features to millions of users.

AWS, Python, Software Development, SQL, Kafka, Kubernetes, Strategy, Azure, ClickHouse, Data engineering, NoSQL, CI/CD

Posted 4 months ago