Senior Data Engineer

Posted about 2 months ago

💎 Seniority level: Senior, 8+ years

📍 Location: United States

🏢 Company: Avalore, LLC

🗣️ Languages: English

⏳ Experience: 8+ years

🪄 Skills: Python, SQL, Apache Airflow, Artificial Intelligence, ETL, Machine Learning, API testing, Data engineering

Requirements:
  • Master’s or PhD in statistics, mathematics, computer science, or related field.
  • 8+ years of experience as a Data Engineer within the Intelligence Community (IC).
  • Outstanding communication skills, influencing abilities, and client focus.
  • Professional proficiency in English is required.
  • Current, active Top Secret security clearance.
  • Applicants must be currently authorized to work in the United States on a full-time basis.
Responsibilities:
  • Develops and documents data pipelines for ingest, transformation, and preparation of data for AI applications.
  • Designs scalable technologies such as streaming and transformation, joining disparate data sets for predictive analytics.
  • Develops API interfaces for accessibility.
  • Leads technical efforts and guides development teams.

Related Jobs

📍 United States, Canada

🧭 Regular

💸 125,000 - 160,000 USD per year

🔍 Digital driver assistance services

🏢 Company: Agero

👥 1001-5000

💰 $4,750,000 over 2 years ago

Automotive, InsurTech, Information Technology, Insurance

  • Bachelor's degree in a technical field and 5+ years or Master's degree with 3+ years of industry experience.
  • Extensive experience with Snowflake or other cloud-based data warehousing solutions.
  • Expertise in ETL/ELT pipelines using tools like Airflow, dbt, Fivetran.
  • Proficiency in Python for data processing and advanced SQL for managing databases.
  • Solid understanding of data modeling techniques and cost management strategies.
  • Experience with data quality frameworks and deploying data solutions in the cloud.
  • Familiarity with version control systems and implementing CI/CD pipelines.
  • Develop and maintain ETL/ELT pipelines to ingest data from diverse sources.
  • Monitor and optimize cloud costs while performing query optimization in Snowflake.
  • Establish modern data architectures including data lakes and warehouses.
  • Apply dimensional modeling techniques and develop transformations using dbt or Spark.
  • Write reusable and efficient code, and develop data-intensive UIs and dashboards.
  • Implement data quality frameworks and observability solutions.
  • Collaborate cross-functionally and document data flows, processes, and architecture.

AWS, Python, SQL, Apache Airflow, DynamoDB, ETL, Flask, MongoDB, Snowflake, FastAPI, Pandas, CI/CD, Data modeling

Posted 1 day ago

📍 United States of America

🧭 Full-Time

💸 110,000 - 160,000 USD per year

🔍 Insurance industry

🏢 Company: Verikai_External

  • Bachelor's degree or above in Computer Science, Data Science, or a related field.
  • At least 5 years of relevant experience.
  • Proficient in SQL, Python, and data processing frameworks such as Spark.
  • Hands-on experience with AWS services including Lambda, Athena, Dynamo, Glue, Kinesis, and Data Wrangler.
  • Expertise in handling large datasets using technologies like Hadoop and Spark.
  • Experience working with PII and PHI under HIPAA constraints.
  • Strong commitment to data security, accuracy, and compliance.
  • Exceptional ability to communicate complex technical concepts to stakeholders.
  • Design, build, and maintain robust ETL processes and data pipelines for large-scale data ingestion and transformation.
  • Manage third-party data sources and customer data to ensure clean and deduplicated datasets.
  • Develop scalable data storage systems using cloud platforms like AWS.
  • Collaborate with data scientists and product teams to support data needs.
  • Implement data validation and quality checks, ensuring accuracy and compliance with regulations.
  • Integrate new data sources to enhance the data ecosystem and document data strategies.
  • Continuously optimize data workflows and research new tools for the data infrastructure.

AWS, Python, SQL, DynamoDB, ETL, Spark

Posted 8 days ago

📍 South Africa, Mauritius, Kenya, Nigeria

🔍 Technology, Marketplaces

  • BSc degree in Computer Science, Information Systems, Engineering, or related technical field or equivalent work experience.
  • 3+ years related work experience.
  • Minimum of 2 years experience building and optimizing ‘big data’ data pipelines, architectures and maintaining data sets.
  • Experienced in Python.
  • Experienced in SQL (PostgreSQL, MS SQL).
  • Experienced in using cloud services: AWS, Azure or GCP.
  • Proficiency in version control, CI/CD and GitHub.
  • Understanding/experience in Glue and PySpark highly desirable.
  • Experience in managing data life cycle.
  • Proficiency in manipulating, processing and architecting large disconnected data sets for analytical requirements.
  • Ability to maintain and optimise processes supporting data transformation, data structures, metadata, dependency and workload management.
  • Good understanding of data management principles - data quality assurance and governance.
  • Strong analytical skills related to working with unstructured datasets.
  • Understanding of message queuing, stream processing, and highly scalable ‘big data’ datastores.
  • Strong attention to detail.
  • Good communication and interpersonal skills.
  • Suggest efficiencies and execute on implementation of internal process improvements in automating manual processes.
  • Implement enhancements and new features across data systems.
  • Improve and streamline processes within data systems with support from the Senior Data Engineer.
  • Test CI/CD process for optimal data pipelines.
  • Assemble large, complex data sets that meet functional / non-functional business requirements.
  • Highly efficient in ETL processes.
  • Develop and conduct unit tests on data pipelines as well as ensuring data consistency.
  • Develop and maintain automated monitoring solutions.
  • Support reporting and analytics infrastructure.
  • Maintain data quality and data governance as well as upkeep of overall maintenance of data infrastructure systems.
  • Maintain data warehouse and data lake metadata, data catalogue, and user documentation for internal business users.
  • Ensure best practice is implemented and maintained on database.

AWS, PostgreSQL, Python, SQL, ETL, Git, CI/CD

Posted 21 days ago
🔥 Senior Data Engineer
Posted about 1 month ago

📍 US

💸 103,200 - 128,950 USD per year

🔍 Genetics and healthcare

🏢 Company: Natera

👥 1001-5000

💰 $250,000,000 Post-IPO Equity over 1 year ago

🫂 Last layoff almost 2 years ago

Women's, Biotechnology, Medical, Genetics, Health Diagnostics

  • BS degree in computer science or a comparable program or equivalent experience.
  • 8+ years of overall software development experience, ideally in complex data management applications.
  • Experience with SQL and NoSQL databases including Dynamo, Cassandra, Postgres, Snowflake.
  • Proficiency in data technologies such as Hive, HBase, Spark, EMR, Glue.
  • Ability to manipulate and extract value from large datasets.
  • Knowledge of data management fundamentals and distributed systems.
  • Work with other engineers and product managers to make design and implementation decisions.
  • Define requirements in collaboration with stakeholders and users to create reliable applications.
  • Implement best practices in development processes.
  • Write specifications, design software components, fix defects, and create unit tests.
  • Review design proposals and perform code reviews.
  • Develop solutions for the Clinicogenomics platform utilizing AWS cloud services.

AWS, Python, SQL, Agile, DynamoDB, Snowflake, Data engineering, Postgres, Spark, Data modeling, Data management

🔥 Senior Data Engineer
Posted about 2 months ago

📍 United States, United Kingdom, Spain, Estonia

🔍 Identity verification

🏢 Company: Veriff

👥 501-1000

💰 $100,000,000 Series C about 3 years ago

🫂 Last layoff over 1 year ago

Artificial Intelligence (AI), Fraud Detection, Information Technology, Cyber Security, Identity Management

  • Expert-level knowledge of SQL, particularly with Redshift.
  • Strong experience in data modeling with an understanding of dimensional data modeling best practices.
  • Proficiency in data transformation frameworks like dbt.
  • Solid programming skills in languages used in data engineering, such as Python or R.
  • Familiarity with orchestration frameworks like Apache Airflow or Luigi.
  • Experience with data from diverse sources including RDBMS and APIs.
  • Collaborate with business stakeholders to design, document, and implement robust data models.
  • Build and optimize data pipelines to transform raw data into actionable insights.
  • Fine-tune query performance and ensure efficient use of data warehouse infrastructure.
  • Ensure data reliability and quality through rigorous testing and monitoring.
  • Assist in migrating from batch processing to real-time streaming systems.
  • Expand support for various use cases including business intelligence and analytics.

Python, SQL, Apache Airflow, ETL, Data engineering, JSON, Data modeling

🔥 Senior Data Engineer
Posted about 2 months ago

📍 USA

🧭 Full-Time

💸 190,000 - 220,000 USD per year

🔍 B2B data / Data as a Service (DaaS)

🏢 Company: People Data Labs

👥 101-250

💰 $45,000,000 Series B about 3 years ago

Database, Artificial Intelligence (AI), Developer APIs, Machine Learning, Analytics, B2B, Software

  • 5-7+ years industry experience with strategic technical problem-solving.
  • Strong software development fundamentals.
  • Experience with Python.
  • Expertise in Apache Spark (Java, Scala, or Python-based).
  • Proficiency in SQL.
  • Experience building scalable data processing systems.
  • Familiarity with data pipeline orchestration tools (e.g., Airflow, dbt).
  • Knowledge of modern data design and storage patterns.
  • Experience working in Databricks.
  • Familiarity with cloud computing services (e.g., AWS, GCP, Azure).
  • Experience in data warehousing technologies.
  • Understanding of modern data storage formats and tools.
  • Build infrastructure for ingestion, transformation, and loading of data using Spark, SQL, AWS, and Databricks.
  • Create an entity resolution framework for merging billions of entities into clean datasets.
  • Develop CI/CD pipelines and anomaly detection systems to enhance data quality.
  • Provide solutions to undefined data engineering problems.
  • Assist Engineering and Product teams with data-related technical issues.

AWS, Python, SQL, Kafka, Airflow, Data engineering, Pandas, CI/CD


📍 Paris, New York, San Francisco, Sydney, Madrid, London, Berlin

🔍 Communication technology

  • Passionate about data engineering.
  • Experience in designing and developing data infrastructure.
  • Technical skills to solve complex challenges.
  • Play a crucial role in designing, developing, and maintaining data infrastructure.
  • Collaborate with teams across the company to solve complex challenges.
  • Improve operational efficiency and lead business towards strategic goals.
  • Contribute to engineering efforts that enhance customer journey.

AWS, PostgreSQL, Python, SQL, Apache Airflow, ETL, Data engineering

Posted 3 months ago

📍 US

🧭 Full-Time

🔍 Cloud integration technology

🏢 Company: Cleo (US)

  • 5-7+ years of experience in data engineering focusing on AI/ML models.
  • Hands-on expertise in data transformation and building data pipelines.
  • Leadership experience in mentoring data engineering teams.
  • Strong experience with cloud platforms and big data technologies.
  • Lead the design and build of scalable, reliable, and efficient data pipelines.
  • Set data infrastructure strategy for data warehouses and lakes.
  • Hands-on data transformation for AI/ML models.
  • Build data structures and manage metadata.
  • Implement data quality controls.
  • Collaborate with cross-functional teams to meet data requirements.
  • Optimize ETL processes for AI/ML.
  • Ensure data pipelines support model training needs.
  • Define data governance practices.

Leadership, Artificial Intelligence, ETL, Machine Learning, Strategy, Data engineering, Data Structures, Mentoring

Posted 3 months ago

📍 ANY STATE

🔍 Data and technology

  • 5+ years of experience making contributions in the form of code.
  • Experience with algorithms and data structures and knowing when to apply them.
  • Experience with machine learning techniques to develop better predictive and clustering models.
  • Experience working with high-scale systems.
  • Experience creating powerful machine learning tools for experimentation and productionalization at scale.
  • Experience in data engineering and warehousing to develop ingestion engines, ETL pipelines, and organizing data for consumption.
  • Be a senior member of the team by contributing to the architecture, design, and implementation of EMS systems.
  • Mentor junior engineers and promote their growth.
  • Lead technical projects and manage planning, execution, and success of complex technical projects.
  • Collaborate with other engineering, product, and data science teams to ensure optimal product development.

Python, SQL, ETL, GCP, Kubeflow, Machine Learning, Algorithms, Data engineering, Data science, Data Structures, Tensorflow, Collaboration, Scala

Posted 3 months ago

📍 USA

🧭 Full-Time

🔍 Energy analytics and forecasting

  • Senior level experience within data engineering with primary focus using Python.
  • Experience with cloud-based infrastructure (Kubernetes/Docker) and data services (GCP, AWS, Azure, etc.).
  • Proven track record of delivering results that impact the business through building data pipelines.
  • Experience working on complex large codebases with a focus on refactoring and enhancements.
  • Experience building data monitoring pipelines with a focus on scalability.
  • Rebuilding systems to identify more efficient ways to process data.
  • Automate the entire forecasting pipeline, including data collection, preprocessing, model training, and deployment.
  • Continuously monitor system performance and optimize data processing workflows to reduce latency and improve efficiency.
  • Set up real-time monitoring for data feeds to detect anomalies or issues promptly.
  • Utilize distributed computing and parallel processing to handle large-scale data.
  • Design your data infrastructure to be scalable to accommodate future growth in data volume and sources.

AWS, Docker, Python, GCP, Kubernetes, Azure, Data engineering

Posted 3 months ago