Apply

Senior Data Engineer

Posted 1 day ago


πŸ’Ž Seniority level: Senior

πŸ“ Location: Philippines, Spain, Germany, France, Italy

πŸ” Industry: Fintech, Healthcare, EdTech, Construction, Hospitality

🏒 Company: Intellectsoft (πŸ‘₯ 251-500) | Augmented Reality, Artificial Intelligence (AI), DevOps, Blockchain, Internet of Things, UX Design, Web Development, Mobile Apps, Quality Assurance, Software

Requirements:
  • Proficiency in SQL for data manipulation and querying large datasets.
  • Strong experience with Python for data processing and scripting.
  • Expertise in PySpark for distributed data processing and big data workflows.
  • Hands-on experience with Airflow for workflow orchestration and automation.
  • Deep understanding of Database Management Systems (DBMS), including design, optimization, and maintenance.
  • Solid knowledge of data modeling, ETL pipelines, and data integration.
  • Familiarity with cloud platforms such as AWS, GCP, or Azure.
Responsibilities:
  • Design, develop, and maintain scalable data pipelines and ETL processes.
  • Build and optimize large-scale data processing frameworks using PySpark.
  • Create workflows and automate processes using Apache Airflow.
  • Manage, monitor, and enhance database performance and integrity.
  • Collaborate with cross-functional teams, including data analysts, scientists, and stakeholders, to understand data needs.
  • Ensure data quality, reliability, and compliance with industry standards.
  • Troubleshoot, debug, and optimize data pipelines and workflows.
  • Continuously evaluate and integrate new tools and technologies to enhance data infrastructure.
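
To make the stack above concrete, here is a minimal, hypothetical sketch of the PySpark-plus-Airflow pattern the responsibilities describe: a daily DAG that runs a small PySpark aggregation. The paths, schedule, and aggregation logic are invented for illustration and are not Intellectsoft's actual pipeline.

```python
# Hypothetical sketch: a daily Airflow DAG that runs a small PySpark aggregation.
# Paths and table layout are invented for illustration.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_spark_aggregation():
    """Aggregate raw events into a daily summary with PySpark."""
    # Import inside the task so the DAG file parses without a Spark install.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("daily_event_summary").getOrCreate()
    events = spark.read.parquet("s3a://example-bucket/raw/events/")  # hypothetical path
    summary = (
        events.groupBy("event_date", "event_type")
        .agg(F.count("*").alias("event_count"))
    )
    summary.write.mode("overwrite").parquet("s3a://example-bucket/curated/event_summary/")
    spark.stop()


with DAG(
    dag_id="daily_event_summary",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",  # "schedule" in Airflow 2.4+
    catchup=False,
) as dag:
    aggregate = PythonOperator(
        task_id="aggregate_events",
        python_callable=run_spark_aggregation,
    )
```
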
Apply

Related Jobs

Apply

πŸ” Event Technology

Experience in data engineering, proficiency with data processing tools and languages, and knowledge of modern data architecture.

Contribute to data engineering efforts that drive innovations in event technology and improve user experiences for organizers and participants.
Posted 1 day ago
Apply

πŸ“ Canada

🧭 Full-Time

πŸ” Technology for small businesses

🏒 Company: Jobber (πŸ‘₯ 501-1000, πŸ’° $100,000,000 Series D almost 2 years ago) | SaaS, Mobile, Small and Medium Businesses, Task Management

  • Proven ability to lead and collaborate in team environments.
  • Strong coding skills in Python and SQL.
  • Expertise in building and maintaining ETL pipelines using tools like Airflow and dbt.
  • Experience with AWS tools such as Redshift, Glue, and Lambda.
  • Familiarity with handling large datasets using tools like Spark.
  • Experience with Terraform for infrastructure management.
  • Knowledge of dimensional modelling, star schemas, and data warehousing.

  • Design, develop, and maintain batch and real-time data pipelines within cloud infrastructure (preferably AWS).
  • Develop tools that automate processes and set up monitoring systems.
  • Collaborate with teams to extract actionable insights from data.
  • Lead initiatives to propose new technologies, participate in design and code reviews, and maintain data integrity.

Skills: AWS, Python, SQL, Apache Airflow, ETL, Spark, Terraform
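
For illustration only, a hedged sketch of how the AWS Glue part of this stack might be driven from Python with boto3: start a Glue job and poll it until it finishes. The job name, region, and polling interval are assumptions, not Jobber's actual setup.

```python
# Hypothetical sketch: trigger an AWS Glue job with boto3 and wait for it to finish.
import time
from typing import Optional

import boto3

glue = boto3.client("glue", region_name="us-east-1")  # region is an assumption


def run_glue_job(job_name: str, arguments: Optional[dict] = None) -> str:
    """Start a Glue job run and block until it reaches a terminal state."""
    run = glue.start_job_run(JobName=job_name, Arguments=arguments or {})
    run_id = run["JobRunId"]
    while True:
        status = glue.get_job_run(JobName=job_name, RunId=run_id)
        state = status["JobRun"]["JobRunState"]
        if state in ("SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT"):
            return state
        time.sleep(30)  # polling interval chosen arbitrarily


if __name__ == "__main__":
    print(run_glue_job("nightly_orders_etl"))  # hypothetical job name
```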

Posted 1 day ago
Apply

πŸ“ Georgia, Cyprus, Poland

πŸ” Financial services

🏒 Company: Admirals Group

  • Bachelor's degree in Mathematics, Engineering, Computer Science, or a related field.
  • 5+ years of experience in data engineering.
  • Strong knowledge of data warehouse design methodologies.
  • Extensive experience with Apache Airflow or Prefect, including writing and managing DAGs.
  • Proficiency in SQL.
  • Strong programming skills in Python.
  • Solid experience with BigQuery, Google Cloud Platform, PostgreSQL, MySQL, or similar databases.
  • Hands-on experience with dbt (Data Build Tool).
  • Excellent communication and problem-solving skills.
  • Ability to receive constructive criticism and collaborate effectively with team members.
  • Proficiency in English (at least intermediate level) and fluency in Russian.

  • Assemble large, complex data sets that meet functional and non-functional business requirements.
  • Manage and optimize data warehousing processes.
  • Implement and support changes to data models.
  • Develop new data integrations from various sources, including APIs.
  • Enhance data reliability, efficiency, and quality.
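
As a rough illustration of the BigQuery and SQL work described above, the snippet below uses the google-cloud-bigquery client to run an incremental insert into a reporting table. Project, dataset, and column names are made up; this is not Admirals Group's warehouse code.

```python
# Hypothetical sketch: incremental load of yesterday's trades into a reporting table.
from google.cloud import bigquery

client = bigquery.Client(project="example-project")  # project name is invented

query = """
    INSERT INTO `example-project.reporting.daily_trades` (trade_date, account_id, volume)
    SELECT DATE(created_at), account_id, SUM(volume)
    FROM `example-project.raw.trades`
    WHERE DATE(created_at) = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
    GROUP BY 1, 2
"""

job = client.query(query)  # starts the query job
job.result()               # waits for completion, raises on error
print(f"Rows inserted: {job.num_dml_affected_rows}")
```
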
Posted 3 days ago
Apply

πŸ“ US, Europe

🧭 Full-Time

πŸ’Έ 175,000 - 205,000 USD per year

πŸ” Cloud computing and AI services

🏒 Company: CoreWeave (πŸ’° $642,000,000 Secondary Market about 1 year ago) | Cloud Computing, Machine Learning, Information Technology, Cloud Infrastructure

  • 5+ years of experience with Kubernetes and Helm, with a deep understanding of container orchestration.
  • Hands-on experience administering and optimizing clustered computing technologies on Kubernetes, such as Spark, Trino, Flink, Ray, Kafka, StarRocks or similar.
  • 5+ years of programming experience in C++, C#, Java, or Python.
  • 3+ years of experience scripting in Python or Bash for automation and tooling.
  • Strong understanding of data storage technologies, distributed computing, and big data processing pipelines.
  • Proficiency in data security best practices and managing access in complex systems.

  • Architect, deploy, and scale data storage and processing infrastructure to support analytics and data science workloads.
  • Manage and maintain data lake and clustered computing services, ensuring reliability, security, and scalability.
  • Build and optimize frameworks and tools to simplify the usage of big data technologies.
  • Collaborate with cross-functional teams to align data infrastructure with business goals and requirements.
  • Ensure data governance and security best practices across all platforms.
  • Monitor, troubleshoot, and optimize system performance and resource utilization.

Skills: Python, Bash, Kubernetes, Apache Kafka
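
A small, hypothetical example of the kind of tooling this role builds: using the official kubernetes Python client to list pods in a data-platform namespace and flag any that are not running. The namespace and the check itself are assumptions, not CoreWeave's code.

```python
# Hypothetical monitoring sketch: report non-running pods in a (made-up) namespace.
from kubernetes import client, config

config.load_kube_config()  # use config.load_incluster_config() when running inside a pod
v1 = client.CoreV1Api()

pods = v1.list_namespaced_pod(namespace="data-platform")  # namespace is an assumption
for pod in pods.items:
    phase = pod.status.phase
    if phase != "Running":
        print(f"{pod.metadata.name}: {phase}")
```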

Posted 6 days ago
Apply

πŸ“ South Africa, Mauritius, Kenya, Nigeria

πŸ” Technology, Marketplaces

  • BSc degree in Computer Science, Information Systems, Engineering, or related technical field or equivalent work experience.
  • 3+ years related work experience.
  • Minimum of 2 years of experience building and optimizing β€˜big data’ pipelines and architectures, and maintaining data sets.
  • Experienced in Python.
  • Experienced in SQL (PostgreSQL, MS SQL).
  • Experienced in using cloud services: AWS, Azure or GCP.
  • Proficiency in version control, CI/CD and GitHub.
  • Understanding/experience in Glue and PySpark highly desirable.
  • Experience in managing data life cycle.
  • Proficiency in manipulating, processing and architecting large disconnected data sets for analytical requirements.
  • Ability to maintain and optimise processes supporting data transformation, data structures, metadata, dependency and workload management.
  • Good understanding of data management principles - data quality assurance and governance.
  • Strong analytical skills related to working with unstructured datasets.
  • Understanding of message queuing, stream processing, and highly scalable β€˜big data’ datastores.
  • Strong attention to detail.
  • Good communication and interpersonal skills.

  • Suggest efficiencies and implement internal process improvements that automate manual processes.
  • Implement enhancements and new features across data systems.
  • Streamline and improve processes within data systems, with support from the Senior Data Engineer.
  • Test CI/CD processes to keep data pipelines running optimally.
  • Assemble large, complex data sets that meet functional / non-functional business requirements.
  • Execute ETL processes with a high degree of efficiency.
  • Develop and conduct unit tests on data pipelines as well as ensuring data consistency.
  • Develop and maintain automated monitoring solutions.
  • Support reporting and analytics infrastructure.
  • Maintain data quality and data governance, and perform overall upkeep of data infrastructure systems.
  • Maintain data warehouse and data lake metadata, data catalogue, and user documentation for internal business users.
  • Ensure best practices are implemented and maintained across databases.

Skills: AWS, PostgreSQL, Python, SQL, ETL, Git, CI/CD
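
The responsibilities above call for unit tests on data pipelines; the sketch below shows one plausible shape for such a test with pytest, using an invented row-level transformation. Function and field names are hypothetical.

```python
# Hypothetical example: a pure-Python transformation plus pytest checks of its behaviour.
import pytest


def normalise_order(row: dict) -> dict:
    """Lower-case the country code and coerce the amount to a float."""
    return {
        "order_id": row["order_id"],
        "country": row["country"].strip().lower(),
        "amount": float(row["amount"]),
    }


def test_normalise_order_cleans_fields():
    raw = {"order_id": "A1", "country": " ZA ", "amount": "19.99"}
    assert normalise_order(raw) == {"order_id": "A1", "country": "za", "amount": 19.99}


def test_normalise_order_rejects_bad_amount():
    with pytest.raises(ValueError):
        normalise_order({"order_id": "A2", "country": "KE", "amount": "not-a-number"})
```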

Posted 8 days ago
Apply

🧭 Full-Time

πŸ” Healthcare and weight management

🏒 Company: Found (πŸ‘₯ 51-100, πŸ’° $45,999,997 Series C 7 months ago) | Financial Services, Banking, FinTech

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or related field.
  • 5+ years of experience in data engineering or related areas.
  • Expertise in SQL and data manipulation languages.
  • Proficiency in data pipeline tools such as Airflow, AWS Glue, Spark/PySpark, Pandas.
  • Strong programming skills in Python.
  • Experience with data storage technologies, including Snowflake, Redshift, Databricks.

  • Design, implement, and manage robust and scalable data pipelines.
  • Develop and maintain data models for business intelligence and analytics.
  • Design and implement data warehousing solutions for large data storage.
  • Develop and optimize ETL processes to ensure data accuracy.
  • Implement data quality checks to maintain data integrity.
  • Continuously monitor and optimize data pipelines for performance.
  • Collaborate with data analysts to address data needs.
  • Create and maintain comprehensive documentation of data processes.
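
As a hedged example of the data quality checks mentioned above, the snippet below validates a pandas DataFrame for nulls, duplicates, and out-of-range values. Column names and rules are assumptions, not Found's actual checks.

```python
# Hypothetical data quality check: nulls, duplicate keys, and out-of-range values.
import pandas as pd


def check_quality(df: pd.DataFrame) -> list[str]:
    """Return a list of human-readable data quality violations."""
    problems = []
    if df["user_id"].isna().any():
        problems.append("user_id contains nulls")
    if df.duplicated(subset=["user_id", "measured_at"]).any():
        problems.append("duplicate (user_id, measured_at) rows")
    if (df["weight_kg"] <= 0).any():
        problems.append("non-positive weight_kg values")
    return problems


sample = pd.DataFrame(
    {"user_id": [1, 1, None], "measured_at": ["2024-01-01"] * 3, "weight_kg": [80.5, 80.5, -1]}
)
print(check_quality(sample))  # all three checks fire on this sample
```
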
Posted 10 days ago
Apply

πŸ“ US

πŸ’Έ 103,200 - 128,950 USD per year

πŸ” Genetics and healthcare

🏒 Company: Natera (πŸ‘₯ 1001-5000, πŸ’° $250,000,000 Post-IPO Equity over 1 year ago, πŸ«‚ Last layoff almost 2 years ago) | Women's, Biotechnology, Medical, Genetics, Health Diagnostics

  • BS degree in computer science or a comparable program or equivalent experience.
  • 8+ years of overall software development experience, ideally in complex data management applications.
  • Experience with SQL and NoSQL databases, including DynamoDB, Cassandra, Postgres, and Snowflake.
  • Proficiency in data technologies such as Hive, HBase, Spark, EMR, and Glue.
  • Ability to manipulate and extract value from large datasets.
  • Knowledge of data management fundamentals and distributed systems.

  • Work with other engineers and product managers to make design and implementation decisions.
  • Define requirements in collaboration with stakeholders and users to create reliable applications.
  • Implement best practices in development processes.
  • Write specifications, design software components, fix defects, and create unit tests.
  • Review design proposals and perform code reviews.
  • Develop solutions for the Clinicogenomics platform utilizing AWS cloud services.

Skills: AWS, Python, SQL, Agile, DynamoDB, Snowflake, Data engineering, Postgres, Spark, Data modeling, Data management
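
To illustrate the NoSQL side of this stack, here is a hypothetical boto3 sketch that scans a DynamoDB table with pagination and filters completed records. The table and attribute names are invented; this is not Natera's code.

```python
# Hypothetical sketch: paginate through a DynamoDB table and keep completed samples.
import boto3

table = boto3.resource("dynamodb", region_name="us-east-1").Table("sample_results")


def scan_all(filter_status: str = "COMPLETED") -> list:
    """Scan the whole table, following pagination, and keep matching items."""
    items, start_key = [], None
    while True:
        kwargs = {"ExclusiveStartKey": start_key} if start_key else {}
        page = table.scan(**kwargs)
        items.extend(i for i in page["Items"] if i.get("status") == filter_status)
        start_key = page.get("LastEvaluatedKey")
        if not start_key:
            return items


if __name__ == "__main__":
    print(len(scan_all()))
```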

Posted 18 days ago
Apply

πŸ“ Brazil

🧭 Full-Time

πŸ” Government affairs technology

  • Bachelor's or Master's degree in Computer Science, Engineering, or related field.
  • 5+ years in data engineering with a proven track record.
  • Expertise in building data pipelines and architectures.
  • Experience in AWS cloud services (EC2, EMR, RDS, Redshift).
  • Proficient in big data tools (Hadoop, Spark, Kafka) and machine learning frameworks (TensorFlow, PyTorch).
  • 3+ years experience with Python.
  • Deep knowledge of SQL and NoSQL databases, workflow management tools (Azkaban, Luigi, Airflow).
  • Understanding of the machine learning model deployment cycle.
  • Experience with vector databases and RAG systems (LangChain, Pinecone, OpenAI/ChatGPT) is a plus.

  • Architect and implement highly scalable advanced Retrieval-Augmented Generation (RAG) data pipelines.
  • Design robust data pipelines for real-time processing and analysis of vast datasets.
  • Design and implement data cleansing and transformation pipelines.
  • Lead cloud-based deployments in AWS ensuring performance and security.
  • Innovate on data architecture for Quorum Copilot's evolving needs.
  • Drive build-versus-buy decisions, tool selection, and analysis using engineering principles.

Skills: AWS, Python, SQL, Apache Airflow, ETL, Hadoop, Kafka, Machine Learning, PyTorch, NoSQL, Spark, TensorFlow
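
Since the role centres on RAG pipelines, the sketch below shows the retrieval step in a deliberately simplified, library-free form: rank documents against a query by cosine similarity over embeddings. The embedding function is a stand-in; a real pipeline would use an embedding model and a vector database such as Pinecone, as listed above.

```python
# Hypothetical, simplified RAG retrieval step: cosine-similarity ranking over embeddings.
import numpy as np


def embed(text: str, dim: int = 64) -> np.ndarray:
    """Stand-in embedding: a pseudo-random unit vector derived from the text hash."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)


documents = [
    "Bill HR-1234 amends agricultural subsidies.",
    "Senate hearing scheduled on data privacy.",
    "New appropriations for rural broadband.",
]
doc_matrix = np.stack([embed(d) for d in documents])


def retrieve(query: str, k: int = 2) -> list:
    scores = doc_matrix @ embed(query)  # cosine similarity, since vectors are unit-length
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]


print(retrieve("What bills touch on farm subsidies?"))
```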

Posted 23 days ago
Apply

πŸ“ Brazil

🧭 Full-Time

πŸ” Data analysis and AI technology for government affairs

🏒 Company: Quorum (πŸ‘₯ 251-500, πŸ’° over 4 years ago) | CRM, Government, Politics, SaaS, Data Visualization, Software

  • Bachelor's or Master's degree in Computer Science, Engineering, or related field.
  • 5+ years in data engineering with experience in scaling data-driven products.
  • Expertise in building data pipelines and AWS services (EC2, EMR, RDS, Redshift).
  • Proficient in big data tools (Hadoop, Spark, Kafka) and machine learning frameworks (TensorFlow, PyTorch).
  • 3+ years experience with Python.
  • Deep knowledge of SQL and NoSQL databases, and workflow management tools.

  • Architect and implement highly scalable advanced Retrieval-Augmented Generation (RAG) data pipelines.
  • Design robust data pipelines for real-time processing and analysis.
  • Lead cloud-based deployments in AWS ensuring performance and security.
  • Innovate on data architecture to meet Quorum Copilot's needs.

Skills: AWS, Python, SQL, Apache Airflow, Hadoop, Kafka, PyTorch, NoSQL, Spark, TensorFlow

Posted 23 days ago
Apply
