
Senior Data Engineer

Posted 2024-07-11


💎 Seniority level: Senior

📍 Location: United States, India, United Kingdom

💸 Salary: 150,000–180,000 USD per year

🔍 Industry: B2B technology

🗣️ Languages: English

⏳ Experience: Progressive experience across the areas listed in the requirements below.

🪄 Skills: Git, Airflow, ClickHouse, Spark

Requirements:
  • Four-year degree in Computer Science or related field, or equivalent experience.
  • Designing frameworks and writing efficient data pipelines, including batches and real-time streams.
  • Understanding of data strategies, data analysis, and data model design.
  • Experience with the Spark Ecosystem (YARN, Executors, Livy, etc.).
  • Experience in large scale data streaming, particularly Kafka or similar technologies.
  • Experience with data orchestration frameworks, particularly Airflow or similar.
  • Experience with columnar data stores, particularly Parquet and ClickHouse.
  • Strong SDLC principles (CI/CD, Unit Testing, git, etc.).
  • General understanding of AWS EMR, EC2, S3.
Responsibilities:
  • Help build the next generation unified data platform.
  • Solve complex data warehousing problems.
  • Ensure quality, discoverability, and accessibility of data.
  • Build batch and streaming data pipelines for ingestion, normalization, and analysis.
  • Develop standard design and access patterns.
  • Lead the unification of data from multiple products.

Related Jobs


📍 Arizona, California, Connecticut, Colorado, Florida, Georgia, Hawaii, Illinois, Maryland, Massachusetts, Michigan, Minnesota, Missouri, New Hampshire, New York, North Carolina, North Dakota, Ohio, Oregon, Pennsylvania, Rhode Island, South Carolina, Texas, Utah, Vermont, Virginia, Washington, Washington D.C. and Wisconsin

🧭 Full-Time

💸 157,791–183,207 USD per year

🔍 Nonprofit, technology for political campaigns

🏢 Company: ActBlue

Requirements:
  • 3-5 years of experience in data engineering or related roles.
  • Experience building, deploying, and running Machine Learning models in a production environment.
  • Experience maintaining and deploying server-side web applications.
  • Good collaboration skills with remote teams and a team player mentality.
  • Eagerness to learn, support teammates’ growth, and an understanding of performance, scalability, and security.

Responsibilities:
  • Implement and deliver complex, high-impact data platform projects, managing them through their full lifecycle with minimal guidance.
  • Work closely with application developers, database administrators, and data scientists to create robust infrastructure for data-driven insights.
  • Identify and understand end-user data needs, design solutions, and build scalable data pipelines.
  • Create data frameworks and services for engineers and data scientists to ensure scalability and consistency.
  • Collaborate with data scientists to advance the production-level Machine Learning platform.
  • Cultivate strong relationships with stakeholders and engineering teams to inform technical decisions.

AWS, Python, Machine Learning, Data engineering, Terraform

Posted 2024-11-14

📍 US

🧭 Full-Time

🔍 Cloud integration technology

🏢 Company: Cleo (US)

Requirements:
  • 5-7+ years of experience in data engineering focusing on AI/ML models.
  • Hands-on expertise in data transformation and building data pipelines.
  • Leadership experience in mentoring data engineering teams.
  • Strong experience with cloud platforms and big data technologies.

Responsibilities:
  • Lead the design and build of scalable, reliable, and efficient data pipelines.
  • Set data infrastructure strategy for data warehouses and lakes.
  • Hands-on data transformation for AI/ML models.
  • Build data structures and manage metadata.
  • Implement data quality controls.
  • Collaborate with cross-functional teams to meet data requirements.
  • Optimize ETL processes for AI/ML.
  • Ensure data pipelines support model training needs.
  • Define data governance practices.

Leadership, Artificial Intelligence, ETL, Machine Learning, Strategy, Data engineering, Data Structures, Mentoring

Posted 2024-11-14
🔥 Senior Data Engineer
Posted 2024-11-13

📍 United Kingdom

🔍 Payment and Financial Services

🏢 Company: Vitesse PSP

Requirements:
  • Experience with data pipeline orchestration tools such as Airflow, Luigi, or similar.
  • Experience with version control systems and CI/CD best practices using GitHub Actions.
  • Knowledge of data governance, privacy regulations (e.g., GDPR), and security best practices.
  • Proficiency with SQL and experience with distributed data processing tools such as Apache Spark.
  • Strong understanding of relational and NoSQL databases (e.g., PostgreSQL, MongoDB, Impala, Cassandra).
  • Experience with cloud infrastructure (Docker and Kubernetes, Terraform).
  • Experience in AWS platform architecture and cloud services.
  • A collaborative team member with Agile experience.
  • Familiarity with stream processing technologies (Kafka or Kinesis).
  • Nice to have: experience with machine learning frameworks and pipelines, Delta Live Tables, Great Expectations, search engines (Elasticsearch/Lucene), REST alternatives (GraphQL, AsyncAPI), and data science toolkits (Jupyter, Anaconda).

Responsibilities:
  • Design, build, and maintain scalable data pipelines and architectures to handle large volumes of structured and unstructured data.
  • Develop, enhance, and optimize ELT processes for ingesting, processing, and distributing data across multiple platforms in real time.
  • Build and manage data warehouses to support advanced analytics, reporting, and machine learning.
  • Implement data governance, quality checks, and validation processes to ensure the accuracy, consistency, observability, and security of data.
  • Optimize query performance and data storage costs through techniques like partitioning, indexing, vacuuming, and compression.
  • Build monitoring and alerting systems for data pipelines to proactively detect and resolve issues.
  • Optimize existing data pipelines for better performance, cost-efficiency, and scalability.
  • Work with data scientists, analysts, and business stakeholders to understand data needs.
  • Continuously research and integrate cutting-edge data technologies, tools, and practices to improve data engineering processes.
  • Team up with product engineers to identify, root cause, and resolve bugs.
  • Update documentation to help users navigate data products.
  • Ensure the data platform performs well and is always available for blue-chip clients.

AWS, Docker, GraphQL, PostgreSQL, SQL, Agile, Elasticsearch, Kafka, Kubernetes, MongoDB, Tableau, Airflow, Cassandra, Data engineering, NoSQL, Spark, CI/CD, Terraform, Documentation

🔥 Senior Data Engineer
Posted 2024-11-11

📍 USA

🧭 Full-Time

🔍 Energy analytics and forecasting

Requirements:
  • Senior-level data engineering experience with a primary focus on Python.
  • Experience with cloud-based infrastructure (Kubernetes/Docker) and data services (GCP, AWS, Azure, et al).
  • Experience building data pipelines with a proven track record of delivering results that impact the business.
  • Experience working on complex, large codebases with a focus on refactoring and enhancements.
  • Experience building data monitoring pipelines with a focus on scalability.

Responsibilities:
  • Rebuild systems to identify more efficient ways to process data.
  • Automate the entire forecasting pipeline, including data collection, preprocessing, model training, and deployment.
  • Continuously monitor system performance and optimize data processing workflows to reduce latency and improve efficiency.
  • Set up real-time monitoring for data feeds to detect anomalies or issues promptly.
  • Utilize distributed computing and parallel processing to handle large-scale data.
  • Design data infrastructure that scales to accommodate future growth in data volume and sources.

AWS, Docker, Python, GCP, Kubernetes, Azure, Data engineering

🔥 Senior Data Engineer
Posted 2024-11-07

📍 Canada, UK, US

🔍 Smart home technology

🏢 Company: ecobee

Requirements:
  • Proficiency in building data pipelines using Python and SQL.
  • Experience with Apache Spark, Apache Kafka, and Apache Airflow.
  • Experience with cloud-based data platforms, preferably GCP.
  • Familiarity with SQL-based operational databases.
  • Good understanding of machine learning lifecycle.
  • Strong experience in data modeling and schema design.
  • Experience with both batch and real-time data processing.
  • Excellent communication skills for collaborative work.

Responsibilities:
  • Design, build, and maintain scalable and efficient ETL/ELT pipelines.
  • Implement data extraction and processing solutions for analytics and machine learning.
  • Integrate diverse data sources into centralized data repositories.
  • Develop and maintain data warehousing solutions.
  • Monitor and optimize data workflows for performance and reliability.
  • Implement monitoring and logging for data pipelines.
  • Collaborate with cross-functional teams to understand data requirements.
  • Translate business requirements into technical specifications.
  • Implement data quality checks and cleansing procedures.
  • Create and maintain documentation for data pipelines.
  • Share knowledge and best practices within the team.
  • Architect data pipelines for massive IoT data streams.

Leadership, Python, SQL, Apache Airflow, ETL, GCP, IoT, Apache Kafka, Machine Learning, Data engineering, Spark, Communication Skills, Collaboration


📍 UK

🧭 Full-Time

🔍 Knowledge management

🏢 Company: AlphaSights

Requirements:
  • 5+ years of hands-on data engineering development.
  • Expert in Python and SQL.
  • Experience with SQL/NoSQL databases.
  • Experienced with AWS data services.
  • Proficiency in DataOps methodologies and tools.
  • Experience with CI/CD pipelines and managing containerized applications.
  • Proficiency in workflow orchestration tools such as Apache Airflow.
  • Experience in designing, building, and maintaining Data Warehouses.
  • Collaborative experience with cross-functional teams.
  • Knowledge of ETL frameworks and best practices.

Responsibilities:
  • Design, develop, deploy, and support data infrastructure, pipelines, and architectures.
  • Take ownership of reporting APIs, ensuring accuracy and timeliness for stakeholders.
  • Monitor dataflows and underlying systems, promoting necessary changes for scalability and performance.
  • Collaborate directly with stakeholders to translate business problems into data-driven solutions.
  • Mentor engineers within the technical guild and support team growth.

AWS, Python, SQL, Apache Airflow, ETL, Data engineering, NoSQL, CI/CD

Posted 2024-11-07
🔥 Senior Data Engineer
Posted 2024-11-07

📍 Mexico, Gibraltar, Colombia, USA, Brazil, Argentina

🧭 Full-Time

🔍 FinTech

🏢 Company: Bitso

Requirements:
  • Proven English fluency.
  • 3+ years of professional experience with analytics, ETL, and data systems.
  • 3+ years of experience with SQL databases, data lakes, big data, and cloud infrastructure.
  • 3+ years of experience with Spark.
  • BS or Master's in Computer Science or similar.
  • Strong proficiency in SQL, Python, and AWS.
  • Strong data modeling skills.

Responsibilities:
  • Build processes required for optimal extraction, transformation, and loading of data from various sources using SQL, Python, Spark.
  • Identify, design, and implement internal process improvements while optimizing data delivery and redesigning infrastructure for scalability.
  • Ensure data integrity, quality, and security.
  • Work with stakeholders to assist with data-related technical issues and support their data needs.
  • Manage data separation and security across multiple data sources.

AWS, Python, SQL, Business Intelligence, Machine Learning, Data engineering, Data Structures, Spark, Communication Skills


📍 US, Germany, UK

🧭 Full-Time

🔍 Music

🏢 Company: SoundCloud

Requirements:
  • Senior-level data professional with a minimum of 4 years of experience (ideally 6+ years).
  • Experience with Cloud technologies, specifically GCP (required), with AWS/Azure as a plus.
  • Experience working with BigQuery and advanced SQL knowledge.
  • Proficiency in Python and Airflow.
  • Experience with big data at terabyte/petabyte scale.
  • Data Architecture/solution design experience.
  • Familiarity with Agile methodology and Jira.
  • Experience in data warehousing and analytical data modeling.
  • Knowledge of CI/CD pipelines and Git.
  • Experience in building reliable ETL pipelines and datasets for BI tools (Looker preferred).
  • Basic statistical knowledge and ability to produce high-quality technical documentation.

Responsibilities:
  • Build and maintain a unified and standardized data warehouse, Corpus, at SoundCloud.
  • Abstract the complexity of SoundCloud’s vast data ecosystem.
  • Collaboration with business reporting, data science, and product teams.
  • Gather and refine requirements, design data architecture and solutions.
  • Build ETL pipelines using Airflow to land data in BigQuery.
  • Model and build the business-ready data layer for dashboarding tools.

Python, SQL, Agile, ETL, GCP, Git, Jira, Airflow, CI/CD

Posted 2024-11-07
🔥 Senior Data Engineer
Posted 2024-11-07

📍 United States

🔍 Data Engineering

🏢 Company: Enable Data Incorporated

Requirements:
  • Bachelor's or Master's degree in computer science, engineering, or a related field.
  • 8+ years of experience as a Data Engineer, with a focus on building cloud-based data solutions.
  • Strong experience with cloud platforms such as Azure or AWS.
  • Proficiency in Apache Spark and Databricks for large-scale data processing and analytics.
  • Experience in designing and implementing data processing pipelines using Spark and Databricks.
  • Strong knowledge of SQL and experience with relational and NoSQL databases.
  • Experience with data integration and ETL processes using tools like Apache Airflow or cloud-native orchestration services.
  • Good understanding of data modeling and schema design principles.
  • Experience with data governance and compliance frameworks.
  • Excellent problem-solving and troubleshooting skills.
  • Strong communication and collaboration skills to work effectively in a cross-functional team.
  • Relevant certifications in cloud platforms, Spark, or Databricks are a plus.

Responsibilities:
  • Design, develop, and maintain scalable and robust data solutions in the cloud using Apache Spark and Databricks.
  • Gather and analyze data requirements from business stakeholders and identify opportunities for data-driven insights.
  • Build and optimize data pipelines for data ingestion, processing, and integration using Spark and Databricks.
  • Ensure data quality, integrity, and security throughout all stages of the data lifecycle.
  • Collaborate with cross-functional teams to design and implement data models, schemas, and storage solutions.
  • Optimize data processing and analytics performance by tuning Spark jobs and leveraging Databricks features.
  • Provide technical guidance and expertise to junior data engineers and developers.
  • Stay up-to-date with emerging trends and technologies in cloud computing, big data, and data engineering.
  • Contribute to the continuous improvement of data engineering processes, tools, and best practices.

Problem Solving

🔥 Senior Data Engineer
Posted 2024-11-07

📍 United States

🔍 Data and Analytics Consulting

🏢 Company: OmniData

Requirements:
  • 5+ years of experience in Analytics and Data Warehousing on the Microsoft platform.
  • 5+ years working with Microsoft SQL Server.
  • Experience working with the Microsoft Azure stack.

Responsibilities:
  • Contribute collaboratively to team meetings.
  • Analyze complex client needs and develop data and analytical solutions.
  • Work independently toward client success while collaborating with internal teams.

SQL, Microsoft Azure, Microsoft SQL Server
