Data Engineer

Posted 18 days ago

πŸ“ Location: United States, Canada

πŸ” Industry: Healthcare AI

🏒 Company: Flagler HealthπŸ‘₯ 11-50πŸ’° Non-equity Assistance over 1 year agoLife ScienceHealth Care

πŸ—£οΈ Languages: English

πŸͺ„ Skills: PythonSQLETLGitMongoDBAPI testingData engineeringNosqlSparkCI/CDData modeling

Requirements:
  • Proven experience working with Databricks and Spark compute.
  • Proficient in Python, including object-oriented programming and API development.
  • Familiarity with NoSQL (MongoDB preferred), including querying, data modeling, and optimization.
  • Strong problem-solving skills and ability to debug and optimize data processing tasks.
  • Experience with large-scale data processing and distributed systems.
Responsibilities:
  • Develop, manage, and optimize data pipelines on the Databricks platform.
  • Debug and troubleshoot Spark applications to ensure reliability and performance.
  • Implement best practices for Spark compute and optimize workloads.
  • Write clean, efficient, and reusable Python code using object-oriented programming principles.
  • Design and build APIs to support data integration and application needs.
  • Develop scripts and tools to automate data processing and workflows.
  • Integrate, query, and manage data within MongoDB.
  • Ensure efficient storage and retrieval processes tailored to application requirements.
  • Optimize MongoDB performance for large-scale data handling.
  • Work closely with data scientists, analysts, and other stakeholders to understand data needs and deliver solutions.
  • Proactively identify and address technical challenges related to data processing and system design.
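
To give a concrete flavor of the Databricks/Spark pipeline work this role describes, here is a minimal PySpark sketch; it is an illustration only, and the table and column names (raw_events, patient_id, event_ts) are hypothetical.

```python
# Minimal sketch of a Spark aggregation pipeline; all names are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("daily-events-rollup").getOrCreate()

# Read a raw source table (on Databricks this would typically be a Delta table).
events = spark.read.table("raw_events")

# Clean and aggregate: drop malformed rows, then count events per patient per day.
daily = (
    events
    .filter(F.col("patient_id").isNotNull())
    .withColumn("event_date", F.to_date("event_ts"))
    .groupBy("patient_id", "event_date")
    .agg(F.count("*").alias("event_count"))
)

# Write back out, partitioned by date for efficient downstream reads.
daily.write.mode("overwrite").partitionBy("event_date").saveAsTable("daily_event_counts")
```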

Related Jobs

πŸ“ United States

🧭 Contract

πŸ” Software Development

🏒 Company: Velir

  • Proven experience as a Data Engineer or related role, with a focus on deploying data infrastructure in AWS.
  • Strong programming skills in Python, SQL, and Terraform.
  • Deep knowledge of data warehousing and ETL/ELT processes.
  • Intermediate/expert proficiency with common data integration/orchestration platforms (e.g., Fivetran, Azure Data Factory, Apache Airflow).
  • Hands-on experience with cloud data warehouses; Databricks experience is preferred but not required.
  • Deep familiarity with the AWS Well-Architected Framework and how to deploy infrastructure that adheres to the framework.
  • Experience managing data infrastructure via Terraform or similar IaC framework.
  • Data Architecture Design
  • Data Warehousing
  • Data Quality Assurance
  • Scalability and Performance Optimization
  • Data Security
  • Recommending and implementing data engineering tools and technologies
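
As an illustration of the orchestration work listed above, a minimal Airflow DAG sketch using the TaskFlow API (assumes Airflow 2.4+); the DAG id, schedule, and task bodies are placeholders.

```python
# Minimal Airflow DAG sketch; DAG id, schedule, and task bodies are placeholders.
from datetime import datetime
from airflow.decorators import dag, task

@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def nightly_elt():
    @task
    def extract() -> list[dict]:
        # A real task would pull from an API or source database (e.g., via Fivetran).
        return [{"id": 1, "amount": 42.0}]

    @task
    def load(rows: list[dict]) -> None:
        # A real task would COPY the rows into the warehouse.
        print(f"loading {len(rows)} rows")

    load(extract())

nightly_elt()
```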

AWS, Python, SQL, Apache Airflow, ETL, Kafka, Data engineering, Spark, Terraform, Data modeling

Posted about 6 hours ago
πŸ”₯ Lead Data Engineer

πŸ“ United States

🧭 Full-Time

πŸ’Έ 152500.0 - 178000.0 USD per year

πŸ” Software Development

  • 10+ years of professional software development or data engineering experience (10+ with a STEM B.S. or 8+ with a relevant Master's degree)
  • Strong proficiency in Python and familiarity with Java and Bash scripting
  • Hands-on experience implementing database technologies, messaging systems, and stream computing software (e.g., PostgreSQL, PostGIS, MongoDB, DuckDB, KsqlDB, RabbitMQ)
  • Experience with data fabric development using publish-subscribe models (e.g., Apache NiFi, Apache Pulsar, Apache Kafka and Kafka-based data service architecture)
  • Proficiency with containerization technologies (e.g., Docker, Docker-Compose, RKE2, Kubernetes, and Microk8s)
  • Experience with version control systems (e.g., Git), CI/CD tools (e.g., Jenkins), and collaborative development workflows
  • Strong knowledge of data modeling and database optimization techniques
  • Familiarity with data serialization languages (e.g., JSON, GeoJSON, YAML, XML)
  • Excellent problem-solving and analytical skills that have been applied to high visibility, important data engineering projects
  • Strong communication skills and ability to lead the work of other engineers in a collaborative environment
  • Demonstrated experience in coordinating team activities, setting priorities, and managing tasks to ensure balanced workloads and effective team performance
  • Experience managing and mentoring development teams in an Agile environment
  • Ability to make effective architecture decisions and document them clearly
  • Must be a US Citizen and eligible to obtain and maintain a US Security Clearance
  • Develop and continuously improve a data service that underpins cloud-based applications
  • Support data and database modeling efforts
  • Contribute to the development and maintenance of reusable component libraries and shared codebase
  • Participate in the entire software development lifecycle, including requirement gathering, design, development, testing, and deployment, using an agile, iterative process
  • Collaborate with developers, designers, testers, project managers, product owners, and project sponsors to integrate the data service to end user applications
  • Communicate task estimates and progress regularly to a development lead and product owner through appropriate tools
  • Ensure seamless integration between database and messaging systems and the frontend / UI they support
  • Ensure data quality, reliability, and performance through code reviews and effective testing strategies
  • Write high-quality code, applying best practices, coding standards, and design patterns
  • Team with other developers, fostering a culture of continuous learning and professional growth
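
For illustration of the publish-subscribe data-fabric pattern named above, a minimal sketch using the kafka-python client (one client choice among several); the broker address and topic name are placeholders.

```python
# Minimal Kafka publish-subscribe sketch; broker and topic are placeholders.
import json
from kafka import KafkaProducer, KafkaConsumer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
# Publish a GeoJSON-style event to a topic.
producer.send("platform.events", {"type": "Feature", "properties": {"id": 1}})
producer.flush()

consumer = KafkaConsumer(
    "platform.events",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)
for message in consumer:
    print(message.value)  # Downstream services would react to each event.
    break
```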

AWS, Docker, Leadership, PostgreSQL, Python, Software Development, SQL, Agile, Bash, Cloud Computing, Git, Java, Jenkins, Kubernetes, MongoDB, RabbitMQ, Apache Kafka, Data engineering, Communication Skills, CI/CD, Problem Solving, RESTful APIs, Mentoring, Terraform, Microservices, JSON, Data visualization, Team management, Ansible, Data modeling, Software Engineering, Data analytics, Data management

Posted about 10 hours ago

πŸ“ United States

πŸ’Έ 126100.0 - 168150.0 USD per year

🏒 Company: firstamericancareers

  • 5+ years of development experience with Python or Scala, plus SQL (we use SQL & Python), and cloud experience (Azure preferred, or AWS).
  • Hands-on experience with data security and cloud security methodologies.
  • Experience in configuration and management of data security to meet compliance and CISO security requirements.
  • Experience creating and maintaining data-intensive distributed solutions (especially involving data warehouses, data lakes, and data analytics) in a cloud environment.
  • Hands-on experience with modern data analytics architectures encompassing data warehouses, data lakes, etc., designed and engineered in a cloud environment.
  • Proven professional experience with event streaming platforms and data pipeline orchestration tools such as Apache Kafka, Fivetran, or Apache Airflow.
  • Proven professional experience with any of the following: Databricks, Snowflake, BigQuery, Spark in any flavor, Hive, Hadoop, Cloudera, or Redshift.
  • Experience developing in a containerized local environment such as Docker, Rancher, or Kubernetes is preferred.
  • Build high-performing cloud data solutions to meet our analytical and BI reporting needs.
  • Design, implement, test, deploy, and maintain distributed, stable, secure, and scalable data-intensive engineering solutions and pipelines in support of data and analytics projects on the cloud, including integrating new sources of data into our central data warehouse and moving data out to applications and other destinations.
  • Identify, design, and implement internal process improvements, such as automating manual processes, optimizing data delivery, and re-designing infrastructure for greater scalability.
  • Build and enhance a shared data lake that powers decision-making and model building.
  • Partner with teams across the business to understand their needs and develop end-to-end data solutions.
  • Collaborate with analysts and data scientists to perform exploratory analysis and troubleshoot issues.
  • Manage and model data using visualization tools to provide the company with a collaborative data analytics platform.
  • Build tools and processes to help make the correct data accessible to the right people.
  • Participate in a rotational production-support role during or after business hours to support business continuity.
  • Engage in collaboration and decision making with other engineers.
  • Design schema and data pipelines to extract, transform, and load (ETL) data from various sources into the data warehouse or data lake.
  • Create, maintain, and optimize database structures to efficiently store and retrieve large volumes of data.
  • Evaluate data trends and model simple to complex data solutions that meet day-to-day business demand and plan for future business and technological growth.
  • Implement data cleansing processes and oversee data quality to maintain accuracy.
  • Function as a key member of the team to drive development, delivery, and continuous improvement of the cloud-based enterprise data warehouse architecture.
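
As a sketch of one warehouse load step of the kind described above (Snowflake shown, though the posting accepts several engines); the connection details, stage, and table names are placeholders.

```python
# Illustrative stage-and-merge load into Snowflake; all names are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="etl_user", password="***",
    warehouse="LOAD_WH", database="ANALYTICS", schema="STAGING",
)
try:
    cur = conn.cursor()
    # Bulk-load staged files, then merge into the curated table.
    cur.execute("COPY INTO staging_orders FROM @orders_stage FILE_FORMAT = (TYPE = CSV)")
    cur.execute("""
        MERGE INTO orders AS t
        USING staging_orders AS s ON t.order_id = s.order_id
        WHEN MATCHED THEN UPDATE SET t.status = s.status
        WHEN NOT MATCHED THEN INSERT (order_id, status) VALUES (s.order_id, s.status)
    """)
finally:
    conn.close()
```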

AWS, Docker, Python, SQL, Agile, Apache Airflow, Cloud Computing, ETL, Hadoop, Kubernetes, Snowflake, Apache Kafka, Azure, Data engineering, Spark, Scala, Data visualization, Data modeling, Data analytics

Posted 2 days ago

πŸ“ United States, Canada

πŸ” Software Development

🏒 Company: OverstoryπŸ‘₯ 1-10E-Commerce

  • Approximately 5 years of experience in data engineering, with at least one role in a startup environment
  • Product-minded and able to demonstrate significant impact you have had on a business through the application of technology
  • Proven data engineering experience across the following (or similar) technologies: Python, data orchestration platforms (Airflow, Luigi, Dagster, etc.), data quality frameworks, and data lakes/warehouses
  • Ability to design and implement scalable and resilient data systems
  • Excellent communication skills and ability to collaborate effectively in a cross-functional team environment
  • Passion for learning and staying updated with evolving technologies and industry trends
  • Owning day-to-day operational responsibility for delivering our analyses to customers
  • Developing data-driven solutions to customer problems that our products aren't solving for yet
  • Building new and improving existing technologies such as:
  • Automation of the analysis for all customers, leading to faster implementation of Overstory's recommendations
  • Metrics that identify time bottlenecks in the current analysis flow, helping all Overstory teams spot areas for improvement
  • Visualization of status and progress of the analysis for internal use
  • Working on performance and scalability of our pipelines, ensuring that our tech can handle our growth
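
As a sketch of the bottleneck-metrics idea mentioned above: time each stage of an analysis run so the slow steps become visible. The stage names here are hypothetical stand-ins.

```python
# Hypothetical per-stage timing metrics for an analysis pipeline.
import time
from contextlib import contextmanager

timings: dict[str, float] = {}

@contextmanager
def timed(stage: str):
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start

with timed("ingest"):
    time.sleep(0.1)  # stand-in for loading customer data
with timed("analyze"):
    time.sleep(0.2)  # stand-in for the analysis itself

# Report the slowest stages first so teams can target improvements.
for stage, seconds in sorted(timings.items(), key=lambda kv: -kv[1]):
    print(f"{stage}: {seconds:.2f}s")
```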

Python, SQL, Cloud Computing, GCP, Amazon Web Services, Data engineering, Communication Skills, Analytical Skills, RESTful APIs, Data visualization, Data modeling

Posted 3 days ago

πŸ“ USA

🧭 Full-Time

πŸ” Software Development

🏒 Company: VirtuousπŸ‘₯ 11-50Information and Communications Technology (ICT)Information ServicesInformation Technology

  • 5+ years of experience in data engineering, including ETL processing, relational database tools, data warehousing, database architecture, SQL Server
  • Expert in SQL and a data manipulation language, preferably Python
  • Comprehensive understanding of the ETL process
  • Organize information and manage tasks and projects to support the business needs of our nonprofit customers, pre- and/or post-sale

Python, SQL, ETL, Data engineering, CI/CD, RESTful APIs, Data modeling, Data analytics, Data management

Posted 3 days ago

πŸ“ United States, Canada

🧭 Full-Time

πŸ” Software Development

🏒 Company: CollegeVineπŸ‘₯ 51-100πŸ’° $24,000,000 over 2 years agoHigher EducationArtificial Intelligence (AI)SaaSGenerative AIEnterprise Software

  • 10+ years of software engineering experience (at least 5 in data)
  • Proficiency in Python, Scala, and of course SQL
  • Deep expertise with Spark or a similar distributed processing framework, having built and tuned production workloads in the framework
  • Experience designing, communicating, and implementing data platforms (ideally from the ground up)
  • Extremely comfortable managing complex data projects and stakeholder expectations (you'll frequently be working directly with CollegeVine's C-level).
  • Own problems end-to-end, deliver results quickly, and be willing to pick up whatever knowledge you're missing to get the job done
  • Partner with subject matter experts across the company including security, product, design, and customer success
  • Drive our data architecture and engineering decisions, bringing your strong experience and knowledge to bear
  • Focus on solving the problems at hand, not just writing code (although that's typically how CollegeVine delivers solutions)
  • Mentor other engineers (data or otherwise) on effective data/distributed systems engineering practices
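
To illustrate the kind of Spark workload tuning this role calls for, a brief PySpark sketch; the config values and table names are placeholders, not recommendations.

```python
# Illustrative Spark tuning: sized shuffle partitions, AQE, and a broadcast join.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder.appName("tuned-join")
    .config("spark.sql.shuffle.partitions", "64")   # sized for a modest dataset
    .config("spark.sql.adaptive.enabled", "true")   # let AQE adapt at runtime
    .getOrCreate()
)

facts = spark.read.table("events")
dims = spark.read.table("users")

# Broadcast the small dimension table to avoid shuffling the large fact table.
joined = facts.join(F.broadcast(dims), "user_id")
joined.write.mode("overwrite").saveAsTable("events_enriched")
```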

Leadership, Python, SQL, Cloud Computing, ETL, Machine Learning, Algorithms, Data engineering, Data Structures, RDBMS, REST API, Spark, CI/CD, Problem Solving, Mentoring, Scala, Data visualization, Data modeling, Software Engineering

Posted 4 days ago

πŸ“ United States

πŸ’Έ 136000.0 - 190000.0 USD per year

πŸ” Crypto

🏒 Company: GeminiπŸ‘₯ 501-1000πŸ’° $1,000,000 Secondary Market over 2 years agoπŸ«‚ Last layoff about 2 years agoCryptocurrencyWeb3Financial ServicesFinanceFinTech

  • 5+ years experience in data engineering with data warehouse technologies
  • 5+ years experience in custom ETL design, implementation and maintenance
  • 5+ years experience with schema design and dimensional data modeling
  • Experience building real-time data solutions and processes
  • Advanced skills with Python and SQL are a must
  • Experience with one or more MPP databases (Redshift, BigQuery, Snowflake, etc.)
  • Experience with one or more ETL tools (Informatica, Pentaho, SSIS, Alooma, etc.)
  • Strong computer science fundamentals including data structures and algorithms
  • Strong software engineering skills in any server-side language, preferably Python
  • Experienced in working collaboratively across different teams and departments
  • Strong technical and business communication
  • Design, architect and implement best-in-class Data Warehousing and reporting solutions
  • Lead and participate in design discussions and meetings
  • Mentor data engineers and analysts
  • Design, automate, build, and launch scalable, efficient and reliable data pipelines into production using Python
  • Build real-time data and reporting solutions
  • Design, build and enhance dimensional models for Data Warehouse and BI solutions
  • Research new tools and technologies to improve existing processes
  • Develop new systems and tools to enable the teams to consume and understand data more intuitively
  • Partner with engineers, project managers, and analysts to deliver insights to the business
  • Perform root cause analysis and resolve production and data issues
  • Create test plans, test scripts and perform data validation
  • Tune SQL queries, reports and ETL pipelines
  • Build and maintain data dictionary and process documentation
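
As a sketch of one of the data-validation tasks above, comparing row counts between a source extract and the warehouse load; sqlite3 stands in here for the warehouse driver (Redshift, BigQuery, Snowflake, etc.), and the table names are hypothetical.

```python
# Row-count reconciliation check; sqlite3 is a stand-in for the warehouse driver.
import sqlite3

def row_count(conn, table: str) -> int:
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_trades (id INTEGER);
    CREATE TABLE dw_trades  (id INTEGER);
    INSERT INTO src_trades VALUES (1), (2), (3);
    INSERT INTO dw_trades  VALUES (1), (2), (3);
""")

src, dst = row_count(conn, "src_trades"), row_count(conn, "dw_trades")
assert src == dst, f"load mismatch: {src} source rows vs {dst} loaded"
print(f"validation passed: {dst} rows")
```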

AWS, Python, SQL, Apache Airflow, Cloud Computing, ETL, Kafka, Snowflake, Algorithms, Data engineering, Data Structures, CI/CD, Data modeling

Posted 4 days ago

πŸ“ United States, Canada

🧭 Full-Time

πŸ” SaaS

🏒 Company: MangomintπŸ‘₯ 51-100πŸ’° $35,000,000 Series B 6 months agoManagement Information SystemsBeautySoftware

  • 3+ years of experience in data engineering or a related role
  • Proficiency in SQL and Python for data pipelines and automation
  • Experience with dbt (or similar data modeling tools)
  • Familiarity with Snowflake (or other cloud data warehouses)
  • Knowledge of APIs and experience integrating various data sources
  • Experience with CRM and business systems (Salesforce, Outreach, Stripe, etc.)
  • Strong problem-solving skills and ability to take ownership of projects
  • Ability to work independently in a small, fast-paced startup environment
  • Effective communication skills to translate business needs into technical solutions
  • Design, develop, and maintain ETL/ELT data pipelines using Snowflake, dbt, Prefect, and other modern data tools
  • Automate data workflows to improve efficiency and reliability
  • Integrate CRM and other business systems to support cross-functional needs
  • Develop data enrichment pipelines to power our sales process
  • Build internal data tools to drive data-driven decision making
  • Work directly with stakeholders to define requirements and implement data solutions that support business objectives
  • Ensure data integrity, governance, and security best practices are upheld
  • Support analytics and reporting efforts by building dashboards and data models in dbt and Sigma
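
A minimal Prefect sketch of the pipeline automation described above; the posting names Prefect, but the task bodies and CRM source shown here are placeholders.

```python
# Minimal Prefect flow; task bodies and the CRM source are placeholders.
from prefect import flow, task

@task(retries=2)
def pull_crm_accounts() -> list[dict]:
    # A real task would page through the CRM API (e.g., Salesforce).
    return [{"account_id": "A1", "plan": "pro"}]

@task
def upsert_to_warehouse(rows: list[dict]) -> None:
    # A real task would MERGE into Snowflake; dbt models build on top.
    print(f"upserting {len(rows)} accounts")

@flow
def crm_sync():
    upsert_to_warehouse(pull_crm_accounts())

if __name__ == "__main__":
    crm_sync()
```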

AWS, Python, SQL, ETL, Snowflake, CRM, Data modeling

Posted 4 days ago

πŸ“ USA

🧭 Full-Time

πŸ’Έ 160000.0 - 182000.0 USD per year

πŸ” Adtech

🏒 Company: tvScientificπŸ‘₯ 11-50πŸ’° $9,400,000 Convertible Note about 1 year agoInternetAdvertising

  • 7+ years of experience in data engineering.
  • Proven experience building data infrastructure using Spark with Scala.
  • Familiarity with data lakes, cloud warehouses, and storage formats.
  • Strong proficiency in AWS services.
  • Expertise in SQL for data manipulation and extraction.
  • Bachelor's degree in Computer Science or a related field.
  • Design and implement robust data infrastructure using Spark with Scala.
  • Collaborate with our cross-functional teams to design data solutions that meet business needs.
  • Build out our core data pipelines, store data in optimal engines and formats, and feed our machine learning models.
  • Leverage and optimize AWS resources.
  • Collaborate closely with the Data Science team.
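
The posting calls for Spark with Scala; the PySpark sketch below shows the same shape of work (columnar storage formats, partitioned layout) for brevity. Paths and column names are placeholders.

```python
# Illustrative raw-to-curated Spark job writing partitioned Parquet.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("impressions-to-parquet").getOrCreate()

# Placeholder S3 paths; a real job would also configure credentials.
impressions = spark.read.json("s3://example-bucket/raw/impressions/")

# Normalize timestamps, then store columnar data partitioned by day so
# downstream ML feature jobs scan only the dates they need.
(
    impressions
    .withColumn("event_date", F.to_date(F.col("ts")))
    .write.mode("append")
    .partitionBy("event_date")
    .parquet("s3://example-bucket/curated/impressions/")
)
```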

AWS, SQL, Cloud Computing, ETL, Machine Learning, Data engineering, Spark, Scala, Data modeling

Posted 4 days ago

πŸ“ United States, Canada

🧭 Full-Time

πŸ’Έ 110000.0 - 130000.0 USD per year

πŸ” Software Development

🏒 Company: CerosπŸ‘₯ 101-250πŸ’° $100,000,000 Private over 4 years agoAdvertisingContent CreatorsContent MarketingGraphic DesignSoftware

  • 5+ years of experience in data engineering, focusing on AWS Redshift and ETL pipeline development.
  • Strong expertise in SQL performance tuning, schema management, and query optimization.
  • Experience designing and maintaining ETL pipelines using AWS Glue, Matillion, or similar tools.
  • Proficiency in JavaScript/TypeScript, with experience building custom ETL workflows and integrations.
  • Hands-on experience with Python for data automation and scripting.
  • Strong understanding of data warehousing best practices, ensuring high-quality, scalable data models.
  • Experience with data monitoring and alerting tools such as AWS CloudWatch and New Relic.
  • Ability to work independently in a fast-paced environment, collaborating across teams to support data-driven initiatives.
  • Own and lead the management of AWS Redshift, ensuring optimal performance, disk usage, and cost efficiency.
  • Design and maintain scalable ETL pipelines using AWS Glue, Lambda, and Matillion to integrate data from Mixpanel, CRM platforms, and customer engagement tools.
  • Optimize SQL-based data transformations and Redshift queries to improve performance and reliability.
  • Automate data offloading and partition management, leveraging AWS services like S3 and external schemas.
  • Ensure version control and documentation of all Redshift queries, ETL processes, and AWS configurations through a centralized GitHub repository.
  • Develop monitoring and alerting for data pipelines using CloudWatch and other observability tools to ensure high availability and early issue detection.
  • Implement and maintain data quality checks and governance processes to ensure accuracy and consistency across foundational tables.
  • Collaborate with AI engineers and business stakeholders to enhance data accessibility and reporting for internal teams.
  • Maintain and optimize BI dashboards in Metabase and HubSpot, ensuring accuracy and efficiency of business reporting.
  • Manage key integrations between Redshift and external platforms, including Mixpanel, HubSpot, and Census, optimizing data accessibility and performance.
  • Administer AWS infrastructure supporting Redshift, ensuring efficient resource utilization, IAM security, and cost management.
  • Automate repetitive data tasks using Python and scripting to enhance data processes and improve team efficiency.
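
As a sketch of the monitoring and alerting work above, a boto3 example that alarms when a pipeline's freshness metric goes stale; the metric, namespace, and SNS topic names are hypothetical, not Ceros conventions.

```python
# Hedged boto3 sketch: alarm when no warehouse load has landed recently.
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

cloudwatch.put_metric_alarm(
    AlarmName="redshift-etl-freshness",
    Namespace="DataPipelines",          # custom namespace emitted by the ETL jobs
    MetricName="MinutesSinceLastLoad",
    Statistic="Maximum",
    Period=300,
    EvaluationPeriods=3,
    Threshold=90,                       # alert if no load for ~90 minutes
    ComparisonOperator="GreaterThanThreshold",
    TreatMissingData="breaching",       # a silent pipeline is also a failure
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:data-alerts"],
)
```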

AWS, Python, SQL, ETL, Git, JavaScript, TypeScript, Amazon Web Services, API testing, Data engineering, REST API, CI/CD, Ansible, Data modeling, Data analytics, Data management

Posted 4 days ago