
Data Engineer

Posted 17 days ago


💎 Seniority level: Senior, 10+ years

📍 Location: India

🏢 Company: InfyStrat

🗣️ Languages: English

⏳ Experience: 10+ years

Requirements:
  • Bachelor's degree in Computer Science, Engineering, or a related field.
  • 10+ years of experience as a Data Engineer or in a similar role.
  • Proficiency in SQL and experience with database technologies (e.g., PostgreSQL, SQL Server, MySQL).
  • Hands-on experience with data processing frameworks (e.g., Apache Spark, Apache Kafka).
  • Familiarity with cloud services (e.g., AWS, Azure, Google Cloud) and data warehousing solutions.
  • Strong programming skills, preferably in Python, Java, or Scala.
  • Excellent analytical and problem-solving skills.
  • Effective communication and collaboration abilities to work within cross-functional teams.
Responsibilities:
  • Design, build, and maintain scalable and efficient data pipelines for collecting, processing, and storing data (a minimal sketch follows this list).
  • Collaborate with data scientists and analysts to understand data requirements and ensure data availability.
  • Optimize existing data pipelines for performance, reliability, and scalability.
  • Monitor data flow and troubleshoot issues related to data quality and integrity.
  • Implement data governance best practices to ensure compliance and security.
  • Stay up-to-date with emerging technologies and propose improvements to data infrastructure.
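As a rough illustration of the pipeline work described above, here is a minimal batch extract-transform-load sketch in Python using pandas and SQLAlchemy, loading into PostgreSQL (one of the databases named in the requirements). The file name, column names, connection string, and target table are hypothetical placeholders, not details from the posting.

import pandas as pd
from sqlalchemy import create_engine

def run_pipeline(source_csv: str, db_url: str, target_table: str) -> int:
    # Extract: read raw events from a flat file (an API or queue would be handled similarly).
    df = pd.read_csv(source_csv, parse_dates=["event_time"])  # hypothetical timestamp column

    # Transform: drop duplicate events and normalize column names.
    df = df.drop_duplicates(subset=["event_id"])  # hypothetical key column
    df.columns = [c.strip().lower() for c in df.columns]

    # Load: append into a relational store such as PostgreSQL.
    engine = create_engine(db_url)  # e.g. "postgresql+psycopg2://user:pass@host/db"
    df.to_sql(target_table, engine, if_exists="append", index=False)
    return len(df)

if __name__ == "__main__":
    count = run_pipeline("events.csv", "postgresql+psycopg2://user:pass@localhost/warehouse", "raw_events")
    print(f"Loaded {count} rows")

A production version of the same flow would add logging, retries, and data-quality checks, and would typically run under an orchestrator rather than as a standalone script.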

Related Jobs


📍 Worldwide

🔍 Algorithmic Trading

  • Commercial experience with financial instruments and markets (equities, futures, options, forex, etc.), particularly an understanding of how historical data is used for algorithmic trading.
  • Familiarity with market data formats (e.g., MDP, ITCH, FIX, SWIFT, proprietary exchange APIs) and market data providers.
  • Strong programming skills in Python (Go/Rust is a nice-to-have).
  • Familiarity with ETL (Extract, Transform, Load) processes (or other data pipeline architecture) and tools to clean, normalize, and validate large datasets.
  • Commercial experience in building and maintaining large-scale time series or historical market data in the financial services industry.
  • Strong SQL proficiency: aggregations, joins, subqueries, window functions (first, last, candle, histogram), indexes, query planning, and optimization (a candle-aggregation sketch follows this list).
  • Strong problem-solving skills and attention to detail, particularly in ensuring data quality and reliability.
  • Bachelor’s degree in Computer Science, Engineering, or related field.
  • Design, develop, and maintain systems for the acquisition, storage, and retrieval of historical market data from multiple financial exchanges, brokers, and market data vendors.
  • Ensure the integrity and accuracy of historical market data, including implementing data validation, cleansing, and normalization processes.
  • Build and optimize data storage solutions, ensuring they are scalable, high-performance, and capable of managing large volumes of time-series data.
  • Develop systems for data versioning and reconciliation to ensure that changes in exchange formats or corrections to past data are properly handled.
  • Implement robust integrations with various market data providers, exchanges, and proprietary data sources to continuously collect and store historical data.
  • Build internal tools to provide easy access to historical data for research and analysis, ensuring performance, ease of use, and data integrity.
  • Work closely with quantitative researchers and traders to understand their data requirements and optimize the systems for data retrieval and analysis for backtesting and strategy development.
  • Develop scalable solutions to handle growing volumes of historical market data, including ensuring efficient queries and data retrieval for research and backtesting needs.
  • Work on optimizing data storage solutions, balancing cost-efficiency with performance, and ensuring that large datasets are managed effectively.
  • Ensure historical market data systems comply with regulatory requirements and assist in data retention, integrity, and reporting audits.
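One requirement above singles out window functions for candle aggregation. As a rough pandas equivalent, the sketch below builds OHLC candles plus traded volume from raw ticks; in SQL the same result comes from first/last/min/max aggregates over time buckets. The tick schema and sample values are hypothetical.

import pandas as pd

def ticks_to_candles(ticks: pd.DataFrame, freq: str = "1min") -> pd.DataFrame:
    # ticks: columns ["ts", "price", "size"], with ts as a timezone-aware timestamp
    ticks = ticks.set_index("ts").sort_index()
    candles = ticks["price"].resample(freq).ohlc()        # open/high/low/close per bucket
    candles["volume"] = ticks["size"].resample(freq).sum()
    return candles.dropna(subset=["open"])                # drop buckets with no trades

# Usage with a few synthetic ticks
ticks = pd.DataFrame({
    "ts": pd.to_datetime(["2024-01-02 09:30:01", "2024-01-02 09:30:15", "2024-01-02 09:31:05"], utc=True),
    "price": [100.0, 100.5, 99.8],
    "size": [10, 5, 7],
})
print(ticks_to_candles(ticks))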

Python, SQL, ETL, Algorithms, Data engineering, Data Structures, RESTful APIs, Linux, Data modeling, Data management

Posted 36 minutes ago

📍 States of São Paulo and Rio Grande do Sul, cities of Rio de Janeiro, Belo Horizonte, Florianópolis and Fortaleza

🏢 Company: TELUS Digital Brazil

  • 5+ years of relevant development experience writing high-quality code as a Data Engineer
  • Have actively participated in the design and development of data architectures
  • Hands-on experience in developing and optimizing data pipelines
  • Comprehensive understanding of data modeling, ETL processes, and both SQL and NoSQL databases
  • Experience with a general-purpose programming language such as Python or Scala
  • Experience with GCP platforms and services.
  • Experience with containerization technologies such as Docker and Kubernetes
  • Proven track record in implementing and optimizing data warehousing solutions and data lakes
  • Proficiency in DevOps practices and automation tools for continuous integration and deployment of data solutions
  • Experience with machine learning workflows and supporting data scientists in model deployment
  • Solid understanding of data security and compliance requirements in large-scale data environments
  • Strong ability to communicate effectively with teams and stakeholders, providing and receiving feedback to improve product outcomes.
  • Proficient in communicating and writing in English
  • Develop and optimize scalable, high-performing, secure, and reliable data pipelines that address diverse business needs and considerations
  • Identify opportunities to enhance internal processes, implement automation to streamline manual tasks, and contribute to infrastructure redesign
  • Help mentor and coach a product team towards shared goals and outcomes
  • Navigate difficult conversations by providing constructive feedback to teams
  • Identify obstacles in order to ensure quality, improve our user experience, and refine how we build tests
  • Be self-aware of limitations, yet curious to learn new solutions while being receptive to constructive feedback from teammates
  • Engage in ongoing research and adoption of new technologies, libraries, frameworks, and best practices to enhance the capabilities of the data team

Docker, Python, SQL, ETL, GCP, Hadoop, Kafka, Kubernetes, Machine Learning, Airflow, Data engineering, NoSQL, Spark, CI/CD, Agile methodologies, RESTful APIs, DevOps, Scala, Data visualization, Data modeling

Posted about 16 hours ago
🔥 Senior Data Engineer
Posted about 22 hours ago

📍 India

🧭 Full-Time

  • Hands-on experience in implementing, supporting, and administering modern cloud-based data solutions (Google BigQuery, AWS Redshift, Azure Synapse, Snowflake, etc.).
  • Strong programming skills in SQL, Java, and Python.
  • Experience in configuring and managing data pipelines using Apache Airflow, Informatica, Talend, SAP BODS or API-based extraction.
  • Expertise in real-time data processing frameworks.
  • Strong understanding of Git and CI/CD for automated deployment and version control.
  • Experience with Infrastructure-as-Code tools like Terraform for cloud resource management.
  • Good stakeholder management skills to collaborate effectively across teams.
  • Solid understanding of SAP ERP data and processes to integrate enterprise data sources.
  • Exposure to data visualization and front-end tools (Tableau, Looker, etc.).
  • Design and Develop Data Pipelines: Create data pipelines to extract data from various sources, transform it into a standardized format, and load it into a centralized data repository (an Airflow sketch follows this list).
  • Build and Maintain Data Infrastructure: Design, implement, and manage data warehouses, data lakes, and other data storage solutions.
  • Ensure Data Quality and Integrity: Develop data validation, cleansing, and normalization processes to ensure data accuracy and consistency.
  • Collaborate with Data Analysts and Business Process Owners: Work with data analysts and business process owners to understand their data requirements and provide data support for their projects.
  • Optimize Data Systems for Performance: Continuously monitor and optimize data systems for performance, scalability, and reliability.
  • Develop and Maintain Data Governance Policies: Create and enforce data governance policies to ensure data security, compliance, and regulatory requirements.
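Apache Airflow is one of the pipeline tools listed in the requirements, so here is a minimal Airflow 2.x DAG sketch of a daily extract-transform-load flow. The dag_id, schedule, and task bodies are placeholders invented for illustration.

from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract(**context):
    print("pull data from source systems")

def transform(**context):
    print("standardize and validate the extracted data")

def load(**context):
    print("load curated data into the central repository")

with DAG(
    dag_id="daily_warehouse_load",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    t_extract >> t_transform >> t_load

In practice each task would call tested library code rather than inline functions, and failures would surface through Airflow's retry and alerting hooks.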

AWS, Python, SQL, Apache Airflow, Cloud Computing, ETL, Git, Java, SAP, Snowflake, Data engineering, Communication Skills, CI/CD, RESTful APIs, Terraform, Data visualization, Stakeholder management, Data modeling, English communication


📍 United Kingdom

🏢 Company: AlphaSights (👥 1001-5000 employees, 💰 funding over 17 years ago; Information Services, Knowledge Management)

  • 3+ years of hands-on data engineering development experience, with deep expertise in Python, SQL, and working with SQL/NoSQL databases.
  • Skilled in designing, building, and maintaining data pipelines, data warehouses, and leveraging AWS data services.
  • Strong proficiency in DataOps methodologies and tools, including experience with CI/CD pipelines, containerized applications, and workflow orchestration using Apache Airflow.
  • Design, develop, deploy and support data infrastructure, pipelines and architectures, contributing to an architectural vision that will scale up to be the world's leading research platform.
  • Write clean, efficient, and maintainable code that powers data pipelines, workflows, and data operations in a production environment. Implement reliable, scalable, and well-tested solutions to automate data ingestion, transformation, and orchestration across systems.
  • Manage and optimise key data infrastructure components within AWS, including Amazon Redshift, Apache Airflow for workflow orchestration and other analytical tools. You will be responsible for ensuring the performance, reliability, and scalability of these systems to meet the growing demands of data pipelines and analytics workloads.
  • Oversee configuration, monitoring, troubleshooting, and continuous improvement of our infrastructure to support the delivery of high-quality insights and analytics.

Python, SQL, Apache Airflow, ETL, Amazon Web Services, RDBMS, CI/CD

Posted 1 day ago

📍 Brazil

🔍 Real Estate

🏢 Company: Grupo QuintoAndar

  • Has 7 or more years of experience in Data Engineering roles
  • Specialist in technologies, solutions, and concepts of Big Data (Spark, Hadoop, Hive, MapReduce) and multiple languages (YAML, Python)
  • Experience with Airflow, Spark, AWS and Databricks
  • Strong foundation in software engineering principles, with experience working on data-centric systems
  • Experience with columnar storage solutions and/or data lakehouse concepts
  • Proficiency in Python, or one of the main programming languages, and a passion for writing clean and maintainable code
  • Strong knowledge in optimizing SQL query performance
  • Experience in building multidimensional data models (Star and/or Snowflake schema)
  • Understanding of the data lifecycle and concepts such as lineage, governance, privacy, retention, anonymization, etc.
  • Knowledge in infrastructure areas such as containers and orchestration (Kubernetes, ECS), CI/CD strategies, infrastructure as code (Terraform), observability (Prometheus, Grafana), among others
  • Proficiency in English
  • Build and maintain a high-performance data platform
  • Create and edit data pipelines
  • Create data modeling and transformation workflows
  • Be responsible for the entire code development lifecycle (monitoring deployment, documentation, performance, security, adding metrics and alarms, ensuring SLO budget compliance, and more)
  • Investigate inconsistencies and be able to trace the source of differences (data troubleshooting)
  • Enable teams across the company to access and use data more effectively through self-service tools and well-modeled datasets
  • Align with stakeholders to understand their primary needs, while also having a holistic view of the problem and proposing extensible, scalable, and incremental solutions
  • Conduct PoCs and benchmarks to determine the best tool for a given problem, and decide whether to use an off-the-shelf solution or develop one in-house
  • Contribute to defining the strategic vision, crossing team and service boundaries to solve problems
  • Advocate for the value of data analytics and engineering within the organization and fostering a data-driven culture
  • Be a reference within the chapter on technical concepts, tools, and/or best coding practices

AWSPythonSQLHadoopKubernetesAirflowData engineeringGrafanaPrometheusSparkCI/CDTerraformData modelingSoftware EngineeringEnglish communication

Posted 1 day ago

🔍 Software Development

  • Experience across the Microsoft data stack (Power BI, Azure, SQL), with openness to alternative tools when they are more effective.
  • Strong SQL skills: You can handle complex queries with ease.
  • Cloud proficiency: Ideally with Azure; knowledge of AWS or GCP is also valuable.
  • Excellent communication skills: Clear, proactive, and comfortable with both tech and business stakeholders.
  • Design and build top-tier BI solutions tailored to clients’ specific business needs.
  • Drive innovation through improved processes, tooling, and quality standards.
  • Take ownership of data governance, quality control, and system reliability.
  • Collaborate closely with clients and internal teams to deliver impactful outcomes.
  • Mentor team members and contribute to a culture of continuous learning and knowledge sharing.
Posted 1 day ago
🔥 Big Data Engineer
Posted 1 day ago

📍 Spain

🔍 Software Development

🏢 Company: Plain Concepts (👥 251-500 employees; Consulting, Apps, Mobile Apps, Information Technology, Mobile)

  • 3 years of experience in data engineering.
  • Strong experience with Python or Scala and Spark, processing large datasets.
  • Solid experience in Cloud platforms (Azure or AWS).
  • Hands-on experience building data pipelines (CI/CD).
  • Experience with testing (unit, integration, etc.).
  • Knowledge of SQL and NoSQL databases.
  • Participating in the design and development of Data solutions for challenging projects.
  • Develop projects from scratch with minimal supervision and strong team collaboration.
  • Be a key player in fostering best practices, clean, and reusable code.
  • Develop ETLs using Spark (Python/Scala); see the PySpark sketch after this list.
  • Work on cloud-based projects (Azure/AWS).
  • Build scalable pipelines using a variety of technologies.
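Because the role centers on Spark ETLs in Python or Scala, here is a compact PySpark sketch of the basic pattern: read raw files, apply a typed transformation and filter, and write partitioned Parquet. The paths and column names are hypothetical, not taken from the posting.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: raw CSV landed in cloud or local storage.
raw = spark.read.option("header", True).csv("data/raw/orders/")

# Transform: cast the amount column and keep only completed orders.
orders = (
    raw.withColumn("amount", F.col("amount").cast("double"))
       .filter(F.col("status") == "COMPLETED")
       .withColumn("order_date", F.to_date("created_at"))
)

# Load: columnar output partitioned by day for downstream consumers.
orders.write.mode("overwrite").partitionBy("order_date").parquet("data/curated/orders/")

spark.stop()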

AWS, Python, SQL, Agile, Cloud Computing, ETL, Azure, Data engineering, NoSQL, Spark, CI/CD, Scala


🧭 Full-Time

🔍 Health & Bioinformatics

🏢 Company: Gradient AI (👥 101-250 employees, 💰 $20,000,000 Series B about 4 years ago; Artificial Intelligence (AI), Machine Learning, InsurTech, Insurance, Health Care)

  • BS in Computer Science, Bioinformatics, or another quantitative discipline with 7+ years working with and interpreting health, medical, and bioinformatics data, including real-world healthcare datasets.
  • Subject matter expertise (SME) in health and bioinformatics data, with a strong grasp of the complexities and challenges of processing medical and biological information.
  • Knowledge of healthcare data standards (e.g., FHIR, HL7) and a solid understanding of healthcare data privacy and security regulations (such as HIPAA) are highly desirable.
  • Proficiency in Python and SQL within a professional environment.
  • Hands-on knowledge of big data tools like Apache Spark (PySpark), Databricks, Snowflake, or similar platforms.
  • Skilled in using data orchestration frameworks such as Airflow, Dagster, or Prefect.
  • Comfortable working within cloud computing environments, preferably AWS, along with Linux systems.
  • Design, build, and implement data systems to support ML and AI models for our health insurance clients, ensuring strict compliance with healthcare data privacy and security regulations (e.g., HIPAA).
  • Develop tools for extracting, processing, and profiling diverse healthcare data sources, including EHRs, medical claims, pharmacy data, and genomic data.
  • Collaborate with data scientists to transform large volumes of health-related and bioinformatics data into modeling-ready formats, prioritizing data quality, integrity, and reliability in healthcare applications.
  • Build and maintain infrastructure for the extraction, transformation, and loading (ETL) of data from a variety of sources using SQL, AWS, and healthcare-specific big data technologies and analytics platforms.
  • Apply health and bioinformatics subject matter expertise to ensure data pipelines meet the unique requirements of health, medical, and bioinformatics data processing - including translating complex medical and biological concepts into actionable data requirements.
Posted 1 day ago
🔥 Sr. Data Engineer
Posted 1 day ago

🧭 Full-Time

💸 100,000 - 140,000 USD per year

🔍 Healthcare IT

  • 7+ years of professional experience as an engineer.
  • 3+ years of experience with healthcare data ecosystems (EHR systems, e-prescription workflows, pharmacy data) and HIPAA compliance.
  • Proficiency with AWS cloud services (Redshift, S3, Lambda, Glue, EMR) and data orchestration tools (Airflow, AWS Step Functions).
  • Experience developing ETL pipelines with Python and CI/CD in GitLab or similar platforms.
  • Design, develop, and maintain production-grade ETL pipelines using Python and GitLab CI/CD to process healthcare data from multiple sources.
  • Configure and optimize AWS cloud services including Redshift, S3, Lambda, Glue, and EMR to build scalable data solutions.
  • Extract, transform, and load data from various healthcare systems, including e-prescription workflows and pharmacy fill/claim data sources.
  • Address unique and complex healthcare data challenges through critical thinking, root cause analysis, and collaborative issue resolution.
  • Perform thorough code and data reviews to certify projects as 'production-ready'.
  • Document data pipelines and technical requirements.
  • Identify opportunities to leverage healthcare data in new ways.

📍 Worldwide

🔍 Algorithmic Trading

  • 7+ years building production‑grade data systems.
  • Familiarity with market data formats (e.g., MDP, ITCH, FIX, proprietary exchange APIs) and market data providers.
  • Expert‑level Python (Go and C++ nice to have).
  • Hands‑on with modern orchestration (Airflow) and event streams (Kafka).
  • Strong SQL proficiency: aggregations, joins, subqueries, window functions (first, last, candle, histogram), indexes, query planning, and optimization.
  • Designing high‑throughput APIs (REST/gRPC) and data access libraries.
  • Strong Linux fundamentals, containers (Docker) and cloud object storage (AWS S3 / GCS).
  • Proven track record of mentoring, code reviews and driving engineering excellence.
  • Architect batch + stream pipelines (Airflow, Kafka, dbt) for diverse structured and unstructured market data.
  • Implement and tune S3, column‑oriented and time‑series data storage for petabyte‑scale analytics; own partitioning, compression, TTL, versioning and cost optimisation.
  • Develop internal libraries for schema management, data contracts, validation and lineage (see the contract-check sketch after this list); contribute to shared libraries and services for internal data consumers for research, backtesting and real-time trading purposes.
  • Embed monitoring, alerting, SLAs, SLOs and CI/CD; champion automated testing, data quality dashboards and incident runbooks.
  • Partner with Data Science, Quant Research, Backend and DevOps to translate requirements into platform capabilities and evangelise best practices.
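The data-contract responsibility above is the sort of thing a small validation helper covers. The sketch below checks an incoming batch against an expected schema before it is persisted; the contract fields, dtypes, and sample batch are invented for illustration.

import pandas as pd

# Hypothetical contract for a tick batch: column name -> expected pandas dtype.
CONTRACT = {
    "ts": "datetime64[ns, UTC]",
    "symbol": "object",
    "price": "float64",
    "size": "int64",
}

def validate_batch(df: pd.DataFrame, contract: dict = CONTRACT) -> list[str]:
    """Return a list of human-readable contract violations (empty list means the batch passes)."""
    problems = []
    for column, expected_dtype in contract.items():
        if column not in df.columns:
            problems.append(f"missing column: {column}")
        elif str(df[column].dtype) != expected_dtype:
            problems.append(f"{column}: expected {expected_dtype}, got {df[column].dtype}")
    if not problems and df["ts"].isna().any():
        problems.append("ts contains nulls")
    return problems

# Usage: an empty result means the batch can be written to storage.
batch = pd.DataFrame({
    "ts": pd.to_datetime(["2024-01-02 09:30:00"], utc=True),
    "symbol": ["ABC"],
    "price": [100.0],
    "size": [10],
})
assert validate_batch(batch) == []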
Posted 1 day ago

Related Articles

Posted about 1 month ago

How to Overcome Burnout While Working Remotely: Practical Strategies for Recovery

Burnout is a silent epidemic among remote workers. The blurred lines between work and home life, coupled with the pressure to always be “on,” can leave even the most dedicated professionals feeling drained. But burnout doesn’t have to define your remote work experience. With the right strategies, you can recover, recharge, and prevent future episodes. Here’s how.



Posted 7 days ago

Top 10 Skills to Become a Successful Remote Worker by 2025

Remote work is here to stay, and by 2025, the competition for remote jobs will be tougher than ever. To stand out, you need more than just basic skills. Employers want people who can adapt, communicate well, and stay productive without constant supervision. Here’s a simple guide to the top 10 skills that will make you a top candidate for remote jobs in the near future.

Posted 9 months ago

Google is gearing up to expand its remote job listings, promising more opportunities across various departments and regions. Find out how this move can benefit job seekers and impact the market.

Posted 10 months ago

Read about the recent updates in remote work policies by major companies, the latest tools enhancing remote work productivity, and predictive statistics for remote work in 2024.

Posted 10 months ago

In-depth analysis of the tech layoffs in 2024, covering the reasons behind the layoffs, comparisons to previous years, immediate impacts, statistics, and the influence on the remote job market. Discover how startups and large tech companies are adapting, and learn strategies for navigating the new dynamics of the remote job market.