Senior Data Engineer

Posted about 2 months ago

💎 Seniority level: Senior, at least 5 years

📍 Location: Poland

🔍 Industry: Software development

🏢 Company: Sunscrapers sp. z o.o.

🗣️ Languages: English

⏳ Experience: At least 5 years

🪄 Skills: Python, SQL, Snowflake, Airflow, Analytical Skills, Customer service, DevOps, Attention to detail

Requirements:
  • At least 5 years of professional experience as a data engineer.
  • Undergraduate or graduate degree in Computer Science, Engineering, Mathematics, or similar.
  • Excellent command of spoken and written English, at least C1.
  • Strong professional experience with Python and SQL.
  • Hands-on experience with dbt and Snowflake.
  • Experience in building data pipelines with Airflow or alternative solutions.
  • Strong understanding of data modeling techniques like Kimball Star Schema.
  • Great analytical skills and attention to detail.
  • Creative problem-solving skills.
  • Great customer service and troubleshooting skills.
Responsibilities:
  • Modeling datasets and schemas for consistency and easy access.
  • Designing and implementing data transformations and data marts.
  • Integrating third-party systems and external data sources into the data warehouse.
  • Building data flows for fetching, aggregation, and data modeling using batch pipelines.

Related Jobs

📍 US, Europe

🧭 Full-Time

💸 175,000 - 205,000 USD per year

🔍 Cloud computing and AI services

🏢 Company: CoreWeave | 💰 $642,000,000 Secondary Market about 1 year ago | Cloud Computing, Machine Learning, Information Technology, Cloud Infrastructure

  • 5+ years of experience with Kubernetes and Helm, with a deep understanding of container orchestration.
  • Hands-on experience administering and optimizing clustered computing technologies on Kubernetes, such as Spark, Trino, Flink, Ray, Kafka, StarRocks or similar.
  • 5+ years of programming experience in C++, C#, Java, or Python.
  • 3+ years of experience scripting in Python or Bash for automation and tooling.
  • Strong understanding of data storage technologies, distributed computing, and big data processing pipelines.
  • Proficiency in data security best practices and managing access in complex systems.

  • Architect, deploy, and scale data storage and processing infrastructure to support analytics and data science workloads.
  • Manage and maintain data lake and clustered computing services, ensuring reliability, security, and scalability.
  • Build and optimize frameworks and tools to simplify the usage of big data technologies.
  • Collaborate with cross-functional teams to align data infrastructure with business goals and requirements.
  • Ensure data governance and security best practices across all platforms.
  • Monitor, troubleshoot, and optimize system performance and resource utilization.

Python, Bash, Kubernetes, Apache Kafka

Posted 6 days ago

📍 South Africa, Mauritius, Kenya, Nigeria

🔍 Technology, Marketplaces

  • BSc degree in Computer Science, Information Systems, Engineering, or related technical field or equivalent work experience.
  • 3+ years related work experience.
  • Minimum of 2 years' experience building and optimizing ‘big data’ pipelines and architectures, and maintaining data sets.
  • Experienced in Python.
  • Experienced in SQL (PostgreSQL, MS SQL).
  • Experienced in using cloud services: AWS, Azure or GCP.
  • Proficiency in version control, CI/CD and GitHub.
  • Understanding/experience in Glue and PySpark highly desirable.
  • Experience in managing data life cycle.
  • Proficiency in manipulating, processing and architecting large disconnected data sets for analytical requirements.
  • Ability to maintain and optimise processes supporting data transformation, data structures, metadata, dependency and workload management.
  • Good understanding of data management principles - data quality assurance and governance.
  • Strong analytical skills related to working with unstructured datasets.
  • Understanding of message queuing, stream processing, and highly scalable ‘big data’ datastores.
  • Strong attention to detail.
  • Good communication and interpersonal skills.

  • Suggest efficiencies and execute on implementation of internal process improvements in automating manual processes.
  • Implement enhancements and new features across data systems.
  • Improve and streamline processes within data systems with support from the Senior Data Engineer.
  • Test CI/CD process for optimal data pipelines.
  • Assemble large, complex data sets that meet functional / non-functional business requirements.
  • Highly efficient in ETL processes.
  • Develop and conduct unit tests on data pipelines and ensure data consistency.
  • Develop and maintain automated monitoring solutions.
  • Support reporting and analytics infrastructure.
  • Maintain data quality and data governance, as well as the overall upkeep of data infrastructure systems.
  • Maintain data warehouse and data lake metadata, data catalogue, and user documentation for internal business users.
  • Ensure best practice is implemented and maintained on the database.

AWS, PostgreSQL, Python, SQL, ETL, Git, CI/CD

Posted 8 days ago

📍 Poland

🔍 Financial services

🏢 Company: Capco | 👥 101-250 | Electric Vehicle, Product Design, Mechanical Engineering, Manufacturing

  • Strong cloud provider experience with GCP
  • Hands-on experience using Python; Scala and Java are nice to have
  • Experience in data and cloud technologies such as Hadoop, HIVE, Spark, PySpark, DataProc
  • Hands-on experience with schema design using semi-structured and structured data structures
  • Experience using messaging technologies – Kafka, Spark Streaming
  • Strong experience in SQL
  • Understanding of containerisation (Docker, Kubernetes)
  • Experience designing, building, and maintaining CI/CD pipelines
  • Enthusiasm to pick up new technologies as needed

  • Work alongside clients to interpret requirements and define industry-leading solutions
  • Design and develop robust, well tested data pipelines
  • Demonstrate and help clients adhere to best practices in engineering and SDLC
  • Lead and mentor the team of junior and mid-level engineers
  • Contribute to security designs and have advanced knowledge of key security technologies
  • Support internal Capco capabilities by sharing insight, experience and credentials

Docker, Python, SQL, ETL, GCP, Git, Hadoop, Kafka, Kubernetes, Snowflake, Airflow, Spark, CI/CD

Posted about 1 month ago

📍 UK, EU

🔍 Consultancy

🏢 Company: The Dot Collective | 👥 11-50 | Cloud Computing, Analytics, Information Technology

  • Advanced knowledge of distributed computing with Spark.
  • Extensive experience with AWS data offerings such as S3, Glue, Lambda.
  • Ability to build CI/CD processes including Infrastructure as Code (e.g. Terraform).
  • Expert Python and SQL skills.
  • Agile ways of working.

  • Leading a team of data engineers.
  • Designing and implementing cloud-native data platforms.
  • Owning and managing the technical roadmap.
  • Engineering well-tested, scalable, and reliable data pipelines.

AWS, Python, SQL, Agile, SCRUM, Spark, Collaboration, Agile methodologies

Posted 2 months ago

📍 Poland

🧭 Full-Time

🔍 Enterprise security products

🏢 Company: Intuition Machines, Inc. | 👥 51-100 | Internet, Education, Internet of Things, Machine Learning, Software

  • Thoughtful, conscientious, and self-directed.
  • Experience with data engineering services on major cloud providers.
  • Minimum of 3 years in a data role involving data store design, feature engineering, and building reliable data pipelines.
  • Proven ability to independently decide on data processing strategies.
  • At least 2 years of professional software development experience outside data engineering.
  • Significant coding experience in Python.
  • Experience in building/maintaining distributed data pipelines.
  • Experience with Kafka infrastructure and applications.
  • Deep understanding of SQL and NoSQL databases (preferably ClickHouse).
  • Familiarity with public cloud providers (AWS or Azure).
  • Experience with CI/CD and orchestration platforms: Kubernetes, containerization.

  • Maintain, extend, and improve existing data/ML workflows, and implement new workflows for high-velocity data.
  • Provide systems for ML engineers and researchers to build datasets on demand.
  • Influence data storage and processing strategies.
  • Collaborate with ML, frontend, and backend teams to enhance the data platform.
  • Reduce deployment time for dashboards and ML models.
  • Establish best practices and develop pipelines for efficient dataset usage.
  • Handle large datasets under performance constraints.
  • Iterate quickly to ensure deployment of products or features to millions of users.

AWS, Python, Software Development, SQL, Kafka, Kubernetes, Strategy, Azure, ClickHouse, Data engineering, NoSQL, CI/CD

Posted 3 months ago

📍 Central EU or Americas

🧭 Full-Time

🔍 Real estate investment

🏢 Company: Roofstock | 👥 501-1000 | 💰 $240,000,000 Series E almost 3 years ago | 🫂 Last layoff almost 2 years ago | Rental Property, PropTech, Marketplace, Real Estate, FinTech

  • BS or MS in a technical field: computer science, engineering or similar.
  • 8+ years technical experience working with data.
  • 5+ years strong experience building scalable data services and applications using SQL, Python, Java/Kotlin.
  • Deep understanding of microservices architecture and RESTful API development.
  • Experience with AWS services including messaging and familiarity with real-time data processing frameworks.
  • Significant experience building and deploying data-related infrastructure and robust data pipelines.
  • Strong understanding of data architecture and related challenges.
  • Experience with complex problems and distributed systems focusing on scalability and performance.
  • Strong communication and interpersonal skills.
  • Independent worker able to collaborate with cross-functional teams.

  • Improve and maintain the data services platform.
  • Deliver high-quality data services promptly, ensuring data governance and integrity while meeting objectives and maintaining SLAs.
  • Develop effective architectures and produce key code components contributing to technical solutions.
  • Integrate a diverse network of third-party tools into a cohesive, scalable platform.
  • Continuously enhance system performance and reliability by diagnosing and resolving operational issues.
  • Ensure rigorous testing of the team's work through automated methods.
  • Support data infrastructure and collaborate with the data team on scalable data pipelines.
  • Work within an Agile/Scrum framework with cross-functional teams to deliver value.
  • Influence the enterprise data platform architecture and standards.

AWS, Docker, Python, SQL, Agile, ETL, SCRUM, Snowflake, Airflow, Data engineering, gRPC, RESTful APIs, Microservices

Posted 5 months ago