
Sr. Data Engineer

Posted 2 days ago


💎 Seniority level: Senior, 2+ years

📍 Location: United States

🔍 Industry: Software Development

🏢 Company: ge_externalsite

⏳ Experience: 2+ years

🪄 Skills: AWS, PostgreSQL, Python, SQL, Apache Airflow, Apache Hadoop, Data Analysis, Data Mining, Erwin, ETL, Hadoop HDFS, Java, Kafka, MySQL, Oracle, Snowflake, Cassandra, ClickHouse, Data engineering, Data Structures, REST API, NoSQL, Spark, JSON, Data visualization, Data modeling, Data analytics, Data management

Requirements:
  • Exposure to industry-standard data modeling tools (e.g., Erwin, ER Studio)
  • Exposure to Extract, Transform & Load (ETL) tools like Informatica or Talend
  • Exposure to industry-standard data catalog, automated data discovery, and data lineage tools (e.g., Alation, Collibra, Tamr)
  • Hands-on experience in programming languages like Java, Python, or Scala
  • Hands-on experience in writing SQL scripts for Oracle, MySQL, PostgreSQL, or HiveQL
  • Experience with Big Data / Hadoop / Spark / Hive / NoSQL database engines (e.g., Cassandra or HBase)
  • Exposure to unstructured datasets and the ability to handle XML and JSON file formats
Responsibilities:
  • Work independently as well as with a team to develop and support ingestion jobs
  • Evaluate and understand various data sources (databases, APIs, flat files, etc.) to determine optimal ingestion strategies
  • Develop a comprehensive data ingestion architecture, including data pipelines, data transformation logic, and data quality checks, considering scalability and performance requirements.
  • Choose appropriate data ingestion tools and frameworks based on data volume, velocity, and complexity
  • Design and build data pipelines to extract, transform, and load data from source systems to target destinations, ensuring data integrity and consistency
  • Implement data quality checks and validation mechanisms throughout the ingestion process to identify and address data issues
  • Monitor and optimize data ingestion pipelines to ensure efficient data processing and timely delivery
  • Set up monitoring systems to track data ingestion performance, identify potential bottlenecks, and trigger alerts for issues
  • Work closely with data engineers, data analysts, and business stakeholders to understand data requirements and align ingestion strategies with business objectives.
  • Build technical data dictionaries and support business glossaries used to analyze the datasets
  • Perform data profiling and data analysis for source systems, manually maintained data, machine generated data and target data repositories
  • Build both logical and physical data models for both Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) solutions
  • Develop and maintain data mapping specifications based on the results of data analysis and functional requirements
  • Perform a variety of data loads & data transformations using multiple tools and technologies.
  • Build automated Extract, Transform & Load (ETL) jobs based on data mapping specifications
  • Maintain metadata structures needed for building reusable Extract, Transform & Load (ETL) components.
  • Analyze reference datasets and become familiar with Master Data Management (MDM) tools.
  • Analyze the impact on downstream systems and products
  • Derive solutions and make recommendations from deep-dive data analysis.
  • Design and build the Data Quality (DQ) rules needed

Related Jobs


πŸ“ United States

πŸ’Έ 117400.0 - 190570.0 USD per year

🏒 Company: healthfirst

  • 8+ years overall IT experience
  • Enterprise experience in scripting languages, primarily Python and PySpark, building enterprise frameworks
  • Enterprise experience in data ingestion methodologies using ETL tools (Glue, dbt, or others)
  • Enterprise experience in data warehousing concepts and big data technologies like EMR and Hadoop
  • Enterprise experience in cloud infrastructure such as AWS, GCP, or Azure
  • Strong SQL expertise across different relational and NoSQL databases.
  • Designs and implements standardized data management procedures around data staging, data ingestion, data preparation, data provisioning, and data destruction (e.g., scripts, programs, automation, etc.)
  • Ensures quality of technical solutions as data moves across multiple zones and environments
  • Provides insight into the changing data environment, data processing, data storage and utilization requirements for the company, and offer suggestions for solutions
  • Ensures analytic assets are managed to support the company's strategic goals by creating and verifying data acquisition requirements and strategy
  • Develops, constructs, tests, and maintains architectures
  • Aligns architecture with business requirements and uses programming language and tools
  • Identifies ways to improve data reliability, efficiency, and quality
  • Conducts research for industry and business questions
  • Deploys sophisticated analytics programs, machine learning, and statistical methods to efficiently implement solutions
  • Prepares data for predictive and prescriptive modeling and finds hidden patterns using data
  • Uses data to discover tasks that can be automated
  • Creates data monitoring capabilities for each business process and works with data consumers on updates
  • Aligns data architecture to the solution architecture; contributes to overall solution architecture
  • Develops patterns for standardizing the environment technology stack
  • Helps maintain the integrity and security of company data
  • Additional duties as assigned or required

AWS, Python, SQL, ETL, Hadoop, Data engineering, CI/CD, DevOps, Data modeling, Scripting

Posted 6 days ago
🔥 Sr. Data Engineer
Posted 10 days ago

📍 United States

🧭 Full-Time

💸 126,100 - 168,150 USD per year

🔍 Data Engineering

🏢 Company: firstamericancareers

  • 5+ years of development experience with Python or Scala, plus SQL (we use SQL & Python), and cloud experience (Azure preferred, or AWS).
  • Hands-on experience with data security and cloud security methodologies, including configuration and management of data security to meet compliance and CISO security requirements.
  • Experience creating and maintaining data intensive distributed solutions (especially involving data warehouse, data lake, data analytics) in a cloud environment.
  • Hands-on experience in modern Data Analytics architectures encompassing data warehouse, data lake etc. designed and engineered in a cloud environment.
  • Proven professional working experience in Event Streaming Platforms and data pipeline orchestration tools like Apache Kafka, Fivetran, Apache Airflow, or similar tools
  • Proven professional working experience in any of the following: Databricks, Snowflake, BigQuery, Spark (any flavor), Hive, Hadoop, Cloudera, or Redshift.
  • Experience developing in a containerized local environment like Docker, Rancher, or Kubernetes preferred
  • Data Modeling
  • Build high-performing cloud data solutions to meet our analytical and BI reporting needs.
  • Design, implement, test, deploy, and maintain distributed, stable, secure, and scalable data intensive engineering solutions and pipelines in support of data and analytics projects on the cloud, including integrating new sources of data into our central data warehouse, and moving data out to applications and other destinations.
  • Identify, design, and implement internal process improvements, such as automating manual processes, optimizing data delivery, and re-designing infrastructure for greater scalability, etc.
  • Build and enhance a shared data lake that powers decision-making and model building.
  • Partner with teams across the business to understand their needs and develop end-to-end data solutions.
  • Collaborate with analysts and data scientists to perform exploratory analysis and troubleshoot issues.
  • Manage and model data using visualization tools to provide the company with a collaborative data analytics platform.
  • Build tools and processes to help make the correct data accessible to the right people.
  • Participate in active rotational support role for production during or after business hours supporting business continuity.
  • Engage in collaboration and decision making with other engineers.
  • Design schema and data pipelines to extract, transform, and load (ETL) data from various sources into the data warehouse or data lake.
  • Create, maintain, and optimize database structures to efficiently store and retrieve large volumes of data.
  • Evaluate data trends and model simple to complex data solutions that meet day-to-day business demand and plan for future business and technological growth.
  • Implement data cleansing processes and oversee data quality to maintain accuracy.
  • Function as a key member of the team to drive development, delivery, and continuous improvement of the cloud-based enterprise data warehouse architecture.

AWS, Docker, Python, SQL, Agile, Apache Airflow, Cloud Computing, ETL, Hadoop, Kubernetes, Snowflake, Apache Kafka, Azure, Data engineering, Spark, Scala, Data visualization, Data modeling, Data analytics

🔥 Sr Data Engineer
Posted 29 days ago

📍 United States, Europe, India

🔍 Software Development

  • Extensive experience in developing data and analytics applications in geographically distributed teams
  • Hands-on experience in using modern architectures and frameworks, structured, semi-structured and unstructured data, and programming with Python
  • Hands-on SQL knowledge and experience with relational databases such as MySQL, PostgreSQL, and others
  • Hands-on ETL knowledge and experience
  • Knowledge of commercial data platforms (Databricks, Snowflake) or cloud data warehouses (Redshift, BigQuery)
  • Knowledge of data catalog and MDM tooling (Atlan, Alation, Informatica, Collibra)
  • CI/CD pipelines for continuous deployment (e.g., CloudFormation templates)
  • Knowledge of how machine learning / A.I. workloads are implemented in batch and streaming, including preparing datasets, training models, and using pre-trained models
  • Exposure to software engineering processes that can be applied to Data Ecosystems
  • Excellent analytical and troubleshooting skills
  • Excellent communication skills
  • B.S. in Computer Science or equivalent
  • Design and develop our best-in-class cloud platform, working on all parts of the code stack from front-end, REST and asynchronous APIs, back-end application logic, SQL/NoSQL databases and integrations with external systems
  • Develop solutions across the data and analytics stack from ETL and Streaming data
  • Design and develop reusable libraries
  • Enhance strong processes in Data Ecosystem
  • Write unit and integration tests

Python, SQL, Apache Airflow, Cloud Computing, ETL, Machine Learning, Snowflake, Algorithms, Apache Kafka, Data engineering, Data Structures, Communication Skills, Analytical Skills, CI/CD, RESTful APIs, DevOps, Microservices, Data visualization, Data modeling, Data analytics, Data management

🔥 Sr. Data Engineer
Posted about 1 month ago

📍 United States

💸 150,000 - 165,000 USD per year

🔍 Healthcare

🏢 Company: Transcarent (👥 251-500, 💰 $126,000,000 Series D 10 months ago; Personal Health, Health Care, Software)

  • You are entrepreneurial and mission-driven and can present your ideas with clarity and confidence.
  • You are a high-agency person. You refuse to accept undue constraints and the status quo and will not rest until you figure things out.
  • Advanced expertise in Python and dbt for data pipelines.
  • Advanced working SQL knowledge and experience working with relational databases.
  • Experience building and optimizing big data pipelines, architectures, and data sets; healthcare experience is a definite plus.
  • Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
  • Strong analytic skills related to working with unstructured datasets.
  • Build processes supporting data transformation, data structures, metadata, dependency, and workload management.
  • A successful history of manipulating, processing, and extracting value from large disconnected datasets.
  • Working knowledge of message queuing, stream processing, and highly scalable 'big data' data stores.
  • Strong project management and organizational skills.
  • Experience supporting and working with cross-functional teams in a dynamic environment.
  • Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field.
  • Be a data champion and seek to empower others to leverage the data to its full potential.
  • Create and maintain optimal data pipeline architecture with high observability and robust operational characteristics.
  • Assemble large, complex data sets that meet functional / non-functional business requirements.
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from various sources using SQL, Python, and dbt.
  • Work with stakeholders, including the Executive, Product, Clinical, Data, and Design teams, to assist with data-related technical issues and support their data infrastructure needs.
  • Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.

Python, SQL, Apache Airflow, ETL, Kafka, Snowflake, Data engineering

🔥 Sr. Data Engineer
Posted 2 months ago

📍 USA, Canada, Mexico

🧭 Full-Time

💸 175,000 USD per year

🔍 Digital tools for hourly employees

🏢 Company: TeamSense (👥 11-50, 💰 Seed about 1 year ago; Information Services, Information Technology, Software)

  • Bachelor's or Master's degree in Computer Science, Software Engineering, or a related technical field.
  • 7+ years of professional experience in software engineering including 5+ years of experience in data engineering.
  • Proven expertise in building and managing scalable data platforms.
  • Proficiency in Python.
  • Strong knowledge of SQL, data modeling, data migration and database systems such as PostgreSQL and MongoDB.
  • Exceptional problem-solving skills optimizing data systems.
  • As a Senior Data Engineer, your primary responsibility is to contribute to the design, development, and maintenance of a scalable and reliable data platform.
  • Analyze the current database and warehouse.
  • Design and develop scalable ETL/ELT pipelines to support data migration.
  • Build and maintain robust, scalable, and high-performing data platforms, including data lakes and/or warehouses.
  • Implement data engineering best practices and design patterns.
  • Guide design reviews for new features impacting data.

PostgreSQL, Python, SQL, ETL, MongoDB, Data engineering, Data modeling


πŸ“ United States

🧭 Full-Time

πŸ’Έ 115000.0 - 145000.0 USD per year

πŸ” Media and Entertainment

  • 5+ years of relevant experience
  • Experience in Data Modeling, Data Quality, ETL
  • 2+ years with AWS tech stack
  • Experience with Python, Spark, and Scala
  • Build data pipelines for internal & external datasets
  • Educate business partners on architecture and capabilities
  • Write documentation and architecture diagrams

AWS, Python, SQL, ETL, Snowflake, Spark, Linux, Scala, Data modeling

Posted 3 months ago
🔥 Sr. Data Engineer
Posted 4 months ago

📍 United States

🧭 Full-Time

💸 115,000 - 145,000 USD per year

🔍 Data Engineering

  • 5+ years experience in data engineering
  • Strong knowledge of ETL/ELT development principles
  • Deep experience with Python and SQL
  • Experience with Airflow or similar orchestration engines
  • Understanding of CI/CD principles in data engineering
  • Design and scale data pipelines across various source systems
  • Implement data modeling and warehousing principles
  • Collaborate with teams to understand data requirements
  • Interface with technology teams for data extraction and transformation
  • Create documentation for products

Python, SQL, Apache Airflow, Cloud Computing, ETL, Machine Learning, Data engineering, CI/CD, Data modeling

🔥 Sr. Data Engineer
Posted 5 months ago

📍 United States

🧭 Contract

🏢 Company: Two95 International Inc.

  • Bachelor's degree in Computer Science, Computer Information Systems, Engineering, Statistics, or a closely related field (foreign education equivalent accepted).
  • Experience with AWS services for data and analytics.
  • 5 years of experience in data ingestion, extraction, and integration.
  • 5+ years of hands-on experience with the MarkLogic framework.
  • The role focuses on data ingestion, extraction, and integration.
  • Utilizing AWS services for data and analytics.
  • Implementing solutions using the MarkLogic framework.

AWS, Amazon Web Services, Data engineering

Posted 5 months ago