
Sr. Data Engineer

Posted 2 days ago


💎 Seniority level: Senior, 2+ years

📍 Location: United States

🔍 Industry: Software Development

🏢 Company: ge_externalsite

⏳ Experience: 2+ years

🪄 Skills: AWS, PostgreSQL, Python, SQL, Apache Airflow, Apache Hadoop, Data Analysis, Data Mining, Erwin, ETL, Hadoop HDFS, Java, Kafka, MySQL, Oracle, Snowflake, Cassandra, ClickHouse, Data engineering, Data Structures, REST API, NoSQL, Spark, JSON, Data visualization, Data modeling, Data analytics, Data management

Requirements:
  • Exposure to industry-standard data modeling tools (e.g., Erwin, ER Studio)
  • Exposure to Extract, Transform & Load (ETL) tools like Informatica or Talend
  • Exposure to industry-standard data catalog, automated data discovery, and data lineage tools (e.g., Alation, Collibra, Tamr)
  • Hands-on experience in programming languages like Java, Python, or Scala
  • Hands-on experience in writing SQL scripts for Oracle, MySQL, PostgreSQL, or HiveQL
  • Experience with Big Data / Hadoop / Spark / Hive / NoSQL database engines (e.g., Cassandra or HBase)
  • Exposure to unstructured datasets and the ability to handle XML and JSON file formats
Responsibilities:
  • Work independently as well as with a team to develop and support ingestion jobs
  • Evaluate and understand various data sources (databases, APIs, flat files, etc.) to determine optimal ingestion strategies
  • Develop a comprehensive data ingestion architecture, including data pipelines, data transformation logic, and data quality checks, considering scalability and performance requirements.
  • Choose appropriate data ingestion tools and frameworks based on data volume, velocity, and complexity
  • Design and build data pipelines to extract, transform, and load data from source systems to target destinations, ensuring data integrity and consistency
  • Implement data quality checks and validation mechanisms throughout the ingestion process to identify and address data issues
  • Monitor and optimize data ingestion pipelines to ensure efficient data processing and timely delivery
  • Set up monitoring systems to track data ingestion performance, identify potential bottlenecks, and trigger alerts for issues
  • Work closely with data engineers, data analysts, and business stakeholders to understand data requirements and align ingestion strategies with business objectives.
  • Build technical data dictionaries and support business glossaries used to analyze the datasets
  • Perform data profiling and data analysis for source systems, manually maintained data, machine generated data and target data repositories
  • Build both logical and physical data models for both Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) solutions
  • Develop and maintain data mapping specifications based on the results of data analysis and functional requirements
  • Perform a variety of data loads & data transformations using multiple tools and technologies.
  • Build automated Extract, Transform & Load (ETL) jobs based on data mapping specifications
  • Maintain metadata structures needed for building reusable Extract, Transform & Load (ETL) components.
  • Analyze reference datasets and become familiar with Master Data Management (MDM) tools.
  • Analyze the impact on downstream systems and products
  • Derive solutions and make recommendations from deep-dive data analysis.
  • Design and build the Data Quality (DQ) rules needed

Related Jobs


πŸ“ United States

πŸ’Έ 117400.0 - 190570.0 USD per year

🏒 Company: healthfirst

  • 8+ years overall IT experience
  • Enterprise experience in scripting languages, primarily Python and PySpark, building enterprise frameworks
  • Enterprise experience in data ingestion methodologies using ETL tools (Glue, dbt, or others)
  • Enterprise experience in data warehousing concepts and big data technologies like EMR and Hadoop
  • Enterprise experience in cloud infrastructure such as AWS, GCP, or Azure
  • Strong SQL expertise across different relational and NoSQL databases.
  • Designs and implements standardized data management procedures around data staging, data ingestion, data preparation, data provisioning, and data destruction (e.g., scripts, programs, automation, etc.)
  • Ensures quality of technical solutions as data moves across multiple zones and environments
  • Provides insight into the changing data environment, data processing, data storage and utilization requirements for the company, and offer suggestions for solutions
  • Ensures analytic assets are managed to support the company's strategic goals by creating and verifying data acquisition requirements and strategy
  • Develops, constructs, tests, and maintains architectures
  • Aligns architecture with business requirements and uses programming language and tools
  • Identifies ways to improve data reliability, efficiency, and quality
  • Conducts research for industry and business questions
  • Deploys sophisticated analytics programs, machine learning, and statistical methods to efficiently implement solutions
  • Prepares data for predictive and prescriptive modeling and finds hidden patterns using data
  • Uses data to discover tasks that can be automated
  • Creates data monitoring capabilities for each business process and works with data consumers on updates
  • Aligns data architecture to the solution architecture; contributes to overall solution architecture
  • Develops patterns for standardizing the environment technology stack
  • Helps maintain the integrity and security of company data
  • Additional duties as assigned or required

AWS, Python, SQL, ETL, Hadoop, Data engineering, CI/CD, DevOps, Data modeling, Scripting

Posted 6 days ago
🔥 Sr. Data Engineer
Posted 10 days ago

📍 United States

🧭 Full-Time

💸 126,100 - 168,150 USD per year

🔍 Data Engineering

🏢 Company: firstamericancareers

  • 5+ years of development experience with Python or Scala, plus SQL (we use SQL & Python), and cloud experience (Azure preferred, or AWS).
  • Hands-on experience with data security and cloud security methodologies, including configuration and management of data security to meet compliance and CISO security requirements.
  • Experience creating and maintaining data intensive distributed solutions (especially involving data warehouse, data lake, data analytics) in a cloud environment.
  • Hands-on experience in modern Data Analytics architectures encompassing data warehouse, data lake etc. designed and engineered in a cloud environment.
  • Proven professional working experience in Event Streaming Platforms and data pipeline orchestration tools like Apache Kafka, Fivetran, Apache Airflow, or similar tools
  • Proven professional working experience in any of the following: Databricks, Snowflake, BigQuery, Spark (any flavor), Hive, Hadoop, Cloudera, or Redshift.
  • Experience developing in a containerized local environment like Docker, Rancher, or Kubernetes preferred
  • Data Modeling
  • Build high-performing cloud data solutions to meet our analytical and BI reporting needs.
  • Design, implement, test, deploy, and maintain distributed, stable, secure, and scalable data intensive engineering solutions and pipelines in support of data and analytics projects on the cloud, including integrating new sources of data into our central data warehouse, and moving data out to applications and other destinations.
  • Identify, design, and implement internal process improvements, such as automating manual processes, optimizing data delivery, and re-designing infrastructure for greater scalability, etc.
  • Build and enhance a shared data lake that powers decision-making and model building.
  • Partner with teams across the business to understand their needs and develop end-to-end data solutions.
  • Collaborate with analysts and data scientists to perform exploratory analysis and troubleshoot issues.
  • Manage and model data using visualization tools to provide the company with a collaborative data analytics platform.
  • Build tools and processes to help make the correct data accessible to the right people.
  • Participate in active rotational support role for production during or after business hours supporting business continuity.
  • Engage in collaboration and decision making with other engineers.
  • Design schema and data pipelines to extract, transform, and load (ETL) data from various sources into the data warehouse or data lake.
  • Create, maintain, and optimize database structures to efficiently store and retrieve large volumes of data.
  • Evaluate data trends and model simple to complex data solutions that meet day-to-day business demand and plan for future business and technological growth.
  • Implement data cleansing processes and oversee data quality to maintain accuracy.
  • Function as a key member of the team to drive development, delivery, and continuous improvement of the cloud-based enterprise data warehouse architecture.

AWS, Docker, Python, SQL, Agile, Apache Airflow, Cloud Computing, ETL, Hadoop, Kubernetes, Snowflake, Apache Kafka, Azure, Data engineering, Spark, Scala, Data visualization, Data modeling, Data analytics

🔥 Sr Data Engineer
Posted 29 days ago

📍 United States, Europe, India

🔍 Software Development

  • Extensive experience in developing data and analytics applications in geographically distributed teams
  • Hands-on experience in using modern architectures and frameworks, structured, semi-structured and unstructured data, and programming with Python
  • Hands-on SQL knowledge and experience with relational databases such as MySQL, PostgreSQL, and others
  • Hands-on ETL knowledge and experience
  • Knowledge of commercial data platforms (Databricks, Snowflake) or cloud data warehouses (Redshift, BigQuery)
  • Knowledge of data catalog and MDM tooling (Atlan, Alation, Informatica, Collibra)
  • CI/CD pipelines for continuous deployment (e.g., CloudFormation templates)
  • Knowledge of how machine learning / A.I. workloads are implemented in batch and streaming, including preparing datasets, training models, and using pre-trained models
  • Exposure to software engineering processes that can be applied to Data Ecosystems
  • Excellent analytical and troubleshooting skills
  • Excellent communication skills
  • B.S. in Computer Science or equivalent
  • Design and develop our best-in-class cloud platform, working on all parts of the code stack from front-end, REST and asynchronous APIs, back-end application logic, SQL/NoSQL databases and integrations with external systems
  • Develop solutions across the data and analytics stack from ETL and Streaming data
  • Design and develop reusable libraries
  • Enhance strong processes in Data Ecosystem
  • Write unit and integration tests

Python, SQL, Apache Airflow, Cloud Computing, ETL, Machine Learning, Snowflake, Algorithms, Apache Kafka, Data engineering, Data Structures, Communication Skills, Analytical Skills, CI/CD, RESTful APIs, DevOps, Microservices, Data visualization, Data modeling, Data analytics, Data management

🔥 Sr. Data Engineer
Posted about 1 month ago

📍 United States

💸 150,000 - 165,000 USD per year

🔍 Healthcare

🏢 Company: Transcarent (👥 251-500, 💰 $126,000,000 Series D 10 months ago; Personal Health, Health Care, Software)

  • You are entrepreneurial and mission-driven and can present your ideas with clarity and confidence.
  • You are a high-agency person. You refuse to accept undue constraints and the status quo and will not rest until you figure things out.
  • Advanced expertise in Python and dbt for data pipelines.
  • Advanced working SQL knowledge and experience working with relational databases.
  • Experience building and optimizing big data pipelines, architectures, and data sets; healthcare experience is a definite plus.
  • Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
  • Strong analytic skills related to working with unstructured datasets.
  • Build processes supporting data transformation, data structures, metadata, dependency, and workload management.
  • A successful history of manipulating, processing, and extracting value from large disconnected datasets.
  • Working knowledge of message queuing, stream processing, and highly scalable 'big data' data stores.
  • Strong project management and organizational skills.
  • Experience supporting and working with cross-functional teams in a dynamic environment.
  • Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field.
  • Be a data champion and seek to empower others to leverage the data to its full potential.
  • Create and maintain optimal data pipeline architecture with high observability and robust operational characteristics.
  • Assemble large, complex data sets that meet functional / non-functional business requirements.
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from various sources using SQL, Python, and dbt.
  • Work with stakeholders, including the Executive, Product, Clinical, Data, and Design teams, to assist with data-related technical issues and support their data infrastructure needs.
  • Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.

Python, SQL, Apache Airflow, ETL, Kafka, Snowflake, Data engineering

🔥 Sr. Data Engineer
Posted 2 months ago

📍 USA, Canada, Mexico

🧭 Full-Time

💸 175,000 USD per year

🔍 Digital tools for hourly employees

🏢 Company: TeamSense (👥 11-50, 💰 Seed about 1 year ago; Information Services, Information Technology, Software)

  • Bachelor's or Master's degree in Computer Science, Software Engineering, or a related technical field.
  • 7+ years of professional experience in software engineering including 5+ years of experience in data engineering.
  • Proven expertise in building and managing scalable data platforms.
  • Proficiency in Python.
  • Strong knowledge of SQL, data modeling, data migration and database systems such as PostgreSQL and MongoDB.
  • Exceptional problem-solving skills optimizing data systems.
  • As a Senior Data Engineer, your primary responsibility is to contribute to the design, development, and maintenance of a scalable and reliable data platform.
  • Analyze the current database and warehouse.
  • Design and develop scalable ETL/ELT pipelines to support data migration.
  • Build and maintain robust, scalable, and high-performing data platforms, including data lakes and/or warehouses.
  • Implement data engineering best practices and design patterns.
  • Guide design reviews for new features impacting data.

PostgreSQL, Python, SQL, ETL, MongoDB, Data engineering, Data modeling


πŸ“ United States

🧭 Full-Time

πŸ’Έ 115000.0 - 145000.0 USD per year

πŸ” Media and Entertainment

  • 5+ years of relevant experience
  • Experience in Data Modeling, Data Quality, ETL
  • 2+ years with AWS tech stack
  • Experience with Python, Spark, and Scala
  • Build data pipelines for internal & external datasets
  • Educate business partners on architecture and capabilities
  • Write documentation and architecture diagrams

AWS, Python, SQL, ETL, Snowflake, Spark, Linux, Scala, Data modeling

Posted 3 months ago
🔥 Sr. Data Engineer
Posted 4 months ago

📍 United States

🧭 Full-Time

💸 115,000 - 145,000 USD per year

🔍 Data Engineering

  • 5+ years experience in data engineering
  • Strong knowledge of ETL/ELT development principles
  • Deep experience with Python and SQL
  • Experience with Airflow or similar orchestration engines
  • Understanding of CI/CD principles in data engineering
  • Design and scale data pipelines across various source systems
  • Implement data modeling and warehousing principles
  • Collaborate with teams to understand data requirements
  • Interface with technology teams for data extraction and transformation
  • Create documentation for products

Python, SQL, Apache Airflow, Cloud Computing, ETL, Machine Learning, Data engineering, CI/CD, Data modeling

🔥 Sr. Data Engineer
Posted 5 months ago

📍 United States

🧭 Contract

🏢 Company: Two95 International Inc.

  • Bachelor's degree in Computer Science, Computer Information Systems, Engineering, Statistics, or a closely related field (foreign education equivalent accepted).
  • Experience with AWS services for data and analytics.
  • 5 years of experience in data ingestion, extraction, and integration.
  • 5+ years of hands-on experience with the MarkLogic framework.
  • The role focuses on data ingestion, extraction, and integration.
  • Utilizing AWS services for data and analytics.
  • Implementing solutions using the MarkLogic framework.

AWS, Amazon Web Services, Data engineering

Posted 5 months ago