
Senior Data Engineer

Posted 8 days ago


πŸ’Ž Seniority level: Senior, 3+ years related work experience

πŸ“ Location: South Africa, Mauritius, Kenya, Nigeria, SAST, NOT STATED

πŸ” Industry: Technology, Marketplaces

⏳ Experience: 3+ years related work experience

πŸͺ„ Skills: AWS, PostgreSQL, Python, SQL, ETL, Git, CI/CD

Requirements:
  • BSc degree in Computer Science, Information Systems, Engineering, or related technical field or equivalent work experience.
  • 3+ years related work experience.
  • Minimum of 2 years experience building and optimizing β€˜big data’ data pipelines, architectures and maintaining data sets.
  • Experienced in Python.
  • Experienced in SQL (PostgreSQL, MS SQL).
  • Experienced in using cloud services: AWS, Azure or GCP.
  • Proficiency in version control, CI/CD and GitHub.
  • Understanding/experience in Glue and PySpark highly desirable.
  • Experience in managing data life cycle.
  • Proficiency in manipulating, processing and architecting large disconnected data sets for analytical requirements.
  • Ability to maintain and optimise processes supporting data transformation, data structures, metadata, dependency and workload management.
  • Good understanding of data management principles - data quality assurance and governance.
  • Strong analytical skills related to working with unstructured datasets.
  • Understanding of message queuing, stream processing, and highly scalable β€˜big data’ datastores.
  • Strong attention to detail.
  • Good communication and interpersonal skills.
Responsibilities:
  • Suggest efficiencies and execute on implementation of internal process improvements in automating manual processes.
  • Implement enhancements and new features across data systems.
  • Improve streamline processes within data systems with support from Senior Data Engineer.
  • Test CI/CD process for optimal data pipelines.
  • Assemble large, complex data sets that meet functional / non-functional business requirements.
  • Highly efficient in ETL processes.
  • Develop and conduct unit tests on data pipelines as well as ensuring data consistency.
  • Develop and maintain automated monitoring solutions.
  • Support reporting and analytics infrastructure.
  • Maintain data quality and data governance as well as upkeep of overall maintenance of data infrastructure systems.
  • Maintain data warehouse and data lake metadata, data catalogue, and user documentation for internal business users.
  • Ensure best practice is implemented and maintained on database.

Related Jobs


πŸ“ Canada

🧭 Full-Time

πŸ” Technology for small businesses

🏒 Company: Jobber (πŸ‘₯ 501-1000, πŸ’° $100,000,000 Series D almost 2 years ago) - SaaS, Mobile, Small and Medium Businesses, Task Management

  • Proven ability to lead and collaborate in team environments.
  • Strong coding skills in Python and SQL.
  • Expertise in building and maintaining ETL pipelines using tools like Airflow and dbt.
  • Experience with AWS tools such as Redshift, Glue, and Lambda.
  • Familiarity with handling large datasets using tools like Spark.
  • Experience with Terraform for infrastructure management.
  • Knowledge of dimensional modelling, star schemas, and data warehousing.

  • Design, develop, and maintain batch and real-time data pipelines within cloud infrastructure (preferably AWS).
  • Develop tools that automate processes and set up monitoring systems.
  • Collaborate with teams to extract actionable insights from data.
  • Lead initiatives to propose new technologies, participate in design and code reviews, and maintain data integrity.

AWSPythonSQLApache AirflowETLSparkTerraform

Posted 1 day ago

πŸ“ US & Canada

πŸ” Fintech

🏒 Company: Mesa (πŸ‘₯ 11-50) - Product Design, Manufacturing, Professional Services, Software

  • 5+ years of software engineering and operationalizing data pipelines with large and complex datasets.
  • Experience with data modeling, ETL, and patterns for efficient data governance.
  • Experience manipulating large-scale structured and unstructured data.
  • Experience working with batch and stream processing.
  • Strong proficiency with Typescript is a must.
  • Strong SQL skills.
  • Experience using dashboarding tools like Mode, Tableau, Looker.
  • Passionate about event-driven architecture, microservices, data reliability, and observability.
  • Ability to thrive in a fast-paced startup environment and handle ambiguity.

  • Lead data engineering at Mesa by developing and operationalizing scalable and reliable data pipelines.
  • Assemble large, complex data sets that meet functional and non-functional requirements.
  • Work with product and cross functional business stakeholders to enable visualization layers for data-driven decision-making.
  • Drive technical delivery, including architectural design, development, and QA.
  • Participate in customer discovery efforts as beta users help refine the product.

PostgreSQL, SQL, ETL, TypeScript, Data engineering, Microservices, Data modeling

Posted 2 days ago

πŸ“ US, Europe

🧭 Full-Time

πŸ’Έ 175,000 - 205,000 USD per year

πŸ” Cloud computing and AI services

🏒 Company: CoreWeave (πŸ’° $642,000,000 Secondary Market about 1 year ago) - Cloud Computing, Machine Learning, Information Technology, Cloud Infrastructure

  • 5+ years of experience with Kubernetes and Helm, with a deep understanding of container orchestration.
  • Hands-on experience administering and optimizing clustered computing technologies on Kubernetes, such as Spark, Trino, Flink, Ray, Kafka, StarRocks or similar.
  • 5+ years of programming experience in C++, C#, Java, or Python.
  • 3+ years of experience scripting in Python or Bash for automation and tooling.
  • Strong understanding of data storage technologies, distributed computing, and big data processing pipelines.
  • Proficiency in data security best practices and managing access in complex systems.

  • Architect, deploy, and scale data storage and processing infrastructure to support analytics and data science workloads.
  • Manage and maintain data lake and clustered computing services, ensuring reliability, security, and scalability.
  • Build and optimize frameworks and tools to simplify the usage of big data technologies.
  • Collaborate with cross-functional teams to align data infrastructure with business goals and requirements.
  • Ensure data governance and security best practices across all platforms.
  • Monitor, troubleshoot, and optimize system performance and resource utilization.

Python, Bash, Kubernetes, Apache Kafka

Posted 6 days ago

πŸ“ US

πŸ’Έ 103,200 - 128,950 USD per year

πŸ” Genetics and healthcare

🏒 Company: Natera (πŸ‘₯ 1001-5000, πŸ’° $250,000,000 Post-IPO Equity over 1 year ago, πŸ«‚ last layoff almost 2 years ago) - Women's, Biotechnology, Medical, Genetics, Health Diagnostics

  • BS degree in computer science or a comparable program or equivalent experience.
  • 8+ years of overall software development experience, ideally in complex data management applications.
  • Experience with SQL and No-SQL databases including Dynamo, Cassandra, Postgres, Snowflake.
  • Proficiency in data technologies such as Hive, Hbase, Spark, EMR, Glue.
  • Ability to manipulate and extract value from large datasets.
  • Knowledge of data management fundamentals and distributed systems.

  • Work with other engineers and product managers to make design and implementation decisions.
  • Define requirements in collaboration with stakeholders and users to create reliable applications.
  • Implement best practices in development processes.
  • Write specifications, design software components, fix defects, and create unit tests.
  • Review design proposals and perform code reviews.
  • Develop solutions for the Clinicogenomics platform utilizing AWS cloud services.

AWS, Python, SQL, Agile, DynamoDB, Snowflake, Data engineering, Postgres, Spark, Data modeling, Data management

Posted 18 days ago

πŸ“ Brazil

🧭 Full-Time

πŸ” Government affairs technology

  • Bachelor's or Master's degree in Computer Science, Engineering, or related field.
  • 5+ years in data engineering with a proven track record.
  • Expertise in building data pipelines and architectures.
  • Experience in AWS cloud services (EC2, EMR, RDS, Redshift).
  • Proficient in big data tools (Hadoop, Spark, Kafka) and machine learning frameworks (TensorFlow, PyTorch).
  • 3+ years experience with Python.
  • Deep knowledge of SQL and NoSQL databases, workflow management tools (Azkaban, Luigi, Airflow).
  • Understanding of the machine learning model deployment cycle.
  • Experience with vector databases and RAG systems (Langchain, Pinecone, OpenAI/ChatGPT) is a plus.

  • Architect and implement highly scalable advanced Retrieval-Augmented Generation (RAG) data pipelines.
  • Design robust data pipelines for real-time processing and analysis of vast datasets.
  • Design and implement data cleansing and transformation pipelines.
  • Lead cloud-based deployments in AWS ensuring performance and security.
  • Innovate on data architecture for Quorum Copilot's evolving needs.
  • Drive build vs buy, tool selection, and analysis using engineering principles.

AWS, Python, SQL, Apache Airflow, ETL, Hadoop, Kafka, Machine Learning, PyTorch, NoSQL, Spark, TensorFlow

Posted 22 days ago
πŸ”₯ Senior Data Engineer
Posted about 1 month ago

πŸ“ United States

🧭 Full-Time

πŸ” Construction technology

🏒 Company: EquipmentShare

  • 7+ years of relevant data platform development experience.
  • Proficient with SQL and a high-order object-oriented programming language (e.g., Python).
  • Experience in designing and building distributed data architectures.
  • Experience with production-grade data pipelines using tools like Airflow, dbt, DataHub, MLFlow.
  • Experience with distributed data platforms like Kafka, Spark, Flink.
  • Familiarity with event data streaming at scale.
  • Proven ability to learn and apply new technologies quickly.
  • Experience in building observability and monitoring into data products.

  • Collaborate with Product Managers, Designers, Engineers, Data Scientists, and Data Analysts.
  • Design, build, and maintain the data platform for automation and self-service.
  • Develop data product framework for analytics features.
  • Create and manage CI/CD pipelines and automated deployment processes.
  • Implement data monitoring and alerting capabilities.
  • Document architecture and processes for collaboration.
  • Mentor peers to enhance their skills.

AWS, Python, SQL, Apache Airflow, Kafka, MLflow, Snowflake, Spark, CI/CD

πŸ”₯ Senior Data Engineer
Posted about 1 month ago

πŸ“ United States, United Kingdom, Spain, Estonia

πŸ” Identity verification

🏒 Company: Veriff (πŸ‘₯ 501-1000, πŸ’° $100,000,000 Series C almost 3 years ago, πŸ«‚ last layoff over 1 year ago) - Artificial Intelligence (AI), Fraud Detection, Information Technology, Cyber Security, Identity Management

  • Expert-level knowledge of SQL, particularly with Redshift.
  • Strong experience in data modeling with an understanding of dimensional data modeling best practices.
  • Proficiency in data transformation frameworks like dbt.
  • Solid programming skills in languages used in data engineering, such as Python or R.
  • Familiarity with orchestration frameworks like Apache Airflow or Luigi.
  • Experience with data from diverse sources including RDBMS and APIs.

  • Collaborate with business stakeholders to design, document, and implement robust data models.
  • Build and optimize data pipelines to transform raw data into actionable insights.
  • Fine-tune query performance and ensure efficient use of data warehouse infrastructure.
  • Ensure data reliability and quality through rigorous testing and monitoring.
  • Assist in migrating from batch processing to real-time streaming systems.
  • Expand support for various use cases including business intelligence and analytics.

Python, SQL, Apache Airflow, ETL, Data engineering, JSON, Data modeling


πŸ“ Poland

πŸ” Financial services

🏒 Company: Capco (πŸ‘₯ 101-250) - Electric Vehicle, Product Design, Mechanical Engineering, Manufacturing

  • Strong cloud provider’s experience on GCP
  • Hands-on experience using Python; Scala and Java are nice to have
  • Experience in data and cloud technologies such as Hadoop, HIVE, Spark, PySpark, DataProc
  • Hands-on experience with schema design using semi-structured and structured data structures
  • Experience using messaging technologies – Kafka, Spark Streaming
  • Strong experience in SQL
  • Understanding of containerisation (Docker, Kubernetes)
  • Experience in design, build and maintain CI/CD Pipelines
  • Enthusiasm to pick up new technologies as needed

  • Work alongside clients to interpret requirements and define industry-leading solutions
  • Design and develop robust, well tested data pipelines
  • Demonstrate and help clients adhere to best practices in engineering and SDLC
  • Lead and mentor the team of junior and mid-level engineers
  • Contribute to security designs and have advanced knowledge of key security technologies
  • Support internal Capco capabilities by sharing insight, experience and credentials

Docker, Python, SQL, ETL, GCP, Git, Hadoop, Kafka, Kubernetes, Snowflake, Airflow, Spark, CI/CD

Posted about 1 month ago
πŸ”₯ Senior Data Engineer
Posted about 1 month ago

πŸ“ USA

🧭 Full-Time

πŸ’Έ 165,000 - 210,000 USD per year

πŸ” E-commerce and AI technologies

🏒 Company: Wizard (πŸ‘₯ 11-50) - Customer Service, Manufacturing

  • 5+ years of professional experience in software development with a focus on data engineering.
  • Bachelor's degree in Computer Science or a related field, or equivalent practical experience.
  • Proficiency in Python with software engineering best practices.
  • Strong expertise in building ETL pipelines using tools like Apache Spark.
  • Hands-on experience with NoSQL databases like MongoDB, Cassandra, or DynamoDB.
  • Proficiency in real-time stream processing systems such as Kafka or AWS Kinesis.
  • Experience with cloud platforms (AWS, GCP, Azure) and technologies like Delta Lake and Parquet files.

  • Develop and maintain scalable data infrastructure for batch and real-time processing.
  • Build and optimize ETL pipelines for efficient data flow.
  • Collaborate with data scientists and cross-functional teams for accurate monitoring.
  • Design backend data solutions for microservices architecture.
  • Implement and manage integrations with third-party e-commerce platforms.

AWS, Python, DynamoDB, ElasticSearch, ETL, GCP, Git, Hadoop, Kafka, MongoDB, RabbitMQ, Azure, Cassandra, Redis


πŸ“ Ireland, United Kingdom

πŸ” IT, Digital Transformation

🏒 Company: Tekenable (πŸ‘₯ 51-100) - Information Technology, Enterprise Software, Software

  • Experience with the Azure Intelligent Data Platform, including Data Lakes, Data Factory, Azure Synapse, Azure SQL, and Power BI.
  • Knowledge of Microsoft Fabric.
  • Proficiency in SQL and Python.
  • Understanding of data integration and ETL processes.
  • Ability to work with large datasets and optimize data systems for performance and scalability.
  • Experience working with JSON, CSV, XML, Open API, RESTful API integration and OData v4.0.
  • Strong knowledge of SQL and experience with relational databases.
  • Experience with big data technologies like Hadoop, Spark, or Kafka.
  • Familiarity with cloud platforms such as Azure.
  • Bachelor's degree in Computer Science, Engineering, or a related field.

  • Design, develop, and maintain scalable data pipelines.
  • Collaborate with data analysts to understand their requirements.
  • Implement data integration solutions to meet business needs.
  • Ensure data quality and integrity through testing and validation.
  • Optimize data systems for performance and scalability.

Python, SQL, ETL, Hadoop, Kafka, Azure, Spark, JSON

Posted about 1 month ago