
Data Engineer

Posted 12 days ago


💎 Seniority level: Middle, 3+ years

πŸ“ Location: India

🏢 Company: Weekday · 👥 1-10 · 💰 over 3 years ago · E-Commerce, Fashion

⏳ Experience: 3+ years

Requirements:
  • Proficiency in SQL, Python, or Scala for data processing.
  • Hands-on experience with Apache Airflow, Kafka, or dbt.
  • Knowledge of cloud-based data solutions like AWS Redshift, Google BigQuery, or Azure Synapse.
  • Strong understanding of data modeling, warehousing, and database optimization.
  • Ability to work independently and manage multiple projects in a fast-paced startup environment.
Responsibilities:
  • Design, develop, and maintain scalable ETL (Extract, Transform, Load) pipelines for efficient data processing (see the pipeline sketch after this list).
  • Optimize data architectures for reliability, scalability, and performance.
  • Work closely with data scientists, analysts, and software engineers to integrate data solutions into products and services.
  • Implement monitoring and validation processes to ensure data integrity and quality.
  • Work with cloud-based data platforms (AWS, GCP, Azure) and database technologies (SQL, NoSQL, data warehouses).
  • Develop and maintain comprehensive documentation for data workflows and processes.
  • Contribute to the continuous improvement of data engineering best practices.
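
For a concrete picture of the Airflow-style pipeline work described above, here is a minimal sketch of a daily ETL DAG using Airflow's TaskFlow API (2.4+ for the `schedule` argument). It is illustrative only: the DAG name, sample data, and schedule are assumptions, not details from the posting.

```python
# A minimal Airflow 2.x TaskFlow DAG sketching an extract-transform-load flow.
# All names and the inline sample data are hypothetical placeholders.
from datetime import datetime

from airflow.decorators import dag, task

@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def orders_etl():
    @task
    def extract() -> list:
        # A real task would pull from an API or a source database.
        return [{"order_id": 1, "amount": "19.99"}, {"order_id": 2, "amount": "5.00"}]

    @task
    def transform(rows: list) -> list:
        # Cast amounts to float; drop rows that lack an amount.
        return [{**r, "amount": float(r["amount"])} for r in rows if "amount" in r]

    @task
    def load(rows: list) -> None:
        # A real task would write to a warehouse such as Redshift or BigQuery.
        print(f"loading {len(rows)} rows")

    load(transform(extract()))

orders_etl()
```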

Related Jobs

🔥 Sr. Data Engineer

πŸ” Software Development

  • Experience with AWS Big Data services
  • Experience with Snowflake
  • Experience with dbt
  • Experience with Python
  • Experience with distributed systems
  • Design and build scalable data systems that support advanced analytics and business intelligence.
  • Develop and maintain data pipelines.
  • Implement data management best practices.
  • Work closely with Product Managers, Data Analysts, and Software Developers to support data-driven decision-making.
Posted about 2 hours ago

πŸ“ Lithuania

💸 4,000 - 6,000 EUR per month

πŸ” Software Development

🏢 Company: Softeta

  • 4+ years of experience as a Data Engineer
  • Experience with Azure (Certifications are a Plus)
  • Experience with Databricks, Azure Data Lake, Data Factory and Apache Airflow
  • Experience with CI/CD or infrastructure as code
  • Knowledge of Medallion or multi-hop architecture
  • Experience developing and administering ETL processes in the Cloud (Azure, AWS or GCP) environment
  • Strong programming skills in Python and SQL
  • Strong problem-solving and analytical skills
  • Design, develop, and maintain data pipelines and ETL processes
  • Perform data modeling and data cleansing
  • Automating data processing workflows using tools such as Airflow or other workflow management tools
  • Optimizing the performance of databases, including designing and implementing data structures and using indexes appropriately (see the index sketch after this list)
  • Implement data quality and data governance processes
  • Act as a data advocate, helping unlock business value through data
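
The index bullet above is easiest to see in miniature. The sketch below uses Python's standard-library sqlite3 module (chosen purely for portability; the posting itself involves cloud databases) to show an index turning a full table scan into an index search:

```python
# Sketch: how an index changes a query plan, shown with stdlib sqlite3.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, ts TEXT)")
conn.executemany(
    "INSERT INTO events (user_id, ts) VALUES (?, ?)",
    [(i % 100, f"2024-01-{i % 28 + 1:02d}") for i in range(10_000)],
)

query = "SELECT * FROM events WHERE user_id = 42"

# Without an index the planner falls back to a full table scan.
print(conn.execute(f"EXPLAIN QUERY PLAN {query}").fetchall())  # ... SCAN events

# Index the filtered column and the plan becomes an index search.
conn.execute("CREATE INDEX idx_events_user ON events (user_id)")
print(conn.execute(f"EXPLAIN QUERY PLAN {query}").fetchall())  # ... USING INDEX
```

The same principle, indexing (or clustering/partitioning) the columns your filters touch, carries over to warehouse engines.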

Python, SQL, Apache Airflow, ETL, Azure, Data engineering, CI/CD, Data modeling

Posted about 2 hours ago
🔥 Lead/Senior Data Engineer

πŸ“ United States, Latin America, India

πŸ” Software Development

🏢 Company: phData · 👥 501-1000 · 💰 $2,499,997 Seed about 7 years ago · Information Services, Analytics, Information Technology

  • 4+ years as a hands-on Data Engineer and/or Software Engineer
  • Experience with software development life cycle, including unit and integration testing
  • Programming expertise in Java, Python and/or Scala
  • Experience with core cloud data platforms including Snowflake, AWS, Azure, Databricks and GCP
  • Experience using SQL and the ability to write, debug, and optimize SQL queries
  • Client-facing written and verbal communication skills
  • Design and implement data solutions
  • Help ensure performance, security, scalability, and robust data integration
  • Develop end-to-end technical solutions and carry them into production
  • Multitask, prioritize, and work across multiple projects at once
  • Create and deliver detailed presentations
  • Produce detailed solution documentation (e.g., POCs, roadmaps, sequence diagrams, class hierarchies, logical system views)

AWS, Python, Software Development, SQL, Cloud Computing, Data Analysis, ETL, GCP, Java, Kafka, Snowflake, Azure, Data engineering, Spark, Communication Skills, CI/CD, Problem Solving, Agile methodologies, RESTful APIs, Documentation, Scala, Data modeling

Posted about 4 hours ago
🔥 Senior Data Engineer

πŸ“ United States

💸 144,000 - 180,000 USD per year

πŸ” Software Development

🏢 Company: Hungryroot · 👥 101-250 · 💰 $40,000,000 Series C almost 4 years ago · Artificial Intelligence (AI), Food and Beverage, E-Commerce, Retail, Consumer Goods, Software

  • 5+ years of experience in ETL development and data modeling
  • 5+ years of experience in both Scala and Python
  • 5+ years of experience in Spark
  • Excellent problem-solving skills and the ability to translate business problems into practical solutions
  • 2+ years of experience working with the Databricks Platform
  • Develop pipelines in Spark (Python + Scala) on the Databricks Platform (see the PySpark sketch after this list)
  • Build cross-functional working relationships with business partners in Food Analytics, Operations, Marketing, and Web/App Development teams to power pipeline development for the business
  • Ensure system reliability and performance
  • Deploy and maintain data pipelines in production
  • Set an example of code quality, data quality, and best practices
  • Work with Analysts and Data Engineers to enable high quality self-service analytics for all of Hungryroot
  • Investigate datasets to answer business questions, ensuring data quality and business assumptions are understood before deploying a pipeline
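
As a rough sketch of the Spark pipeline work listed above: a small PySpark batch job that filters, aggregates, and writes partitioned Parquet. The paths, columns, and aggregation are invented for illustration, not taken from the posting.

```python
# Hypothetical PySpark batch job: filter, derive a date, aggregate, write Parquet.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily_revenue").getOrCreate()

# Hypothetical input location and schema.
orders = spark.read.json("/data/raw/orders/")

daily_revenue = (
    orders
    .filter(F.col("status") == "completed")
    .withColumn("order_date", F.to_date("created_at"))
    .groupBy("order_date")
    .agg(F.sum("amount").alias("revenue"), F.count("*").alias("order_count"))
)

# Partitioned Parquet keeps downstream reads cheap.
daily_revenue.write.mode("overwrite").partitionBy("order_date").parquet(
    "/data/curated/daily_revenue/"
)
```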

AWS, Python, SQL, Apache Airflow, Data Mining, ETL, Snowflake, Algorithms, Amazon Web Services, Data engineering, Data Structures, Spark, CI/CD, RESTful APIs, Microservices, JSON, Scala, Data visualization, Data modeling, Data analytics, Data management

Posted about 5 hours ago

πŸ“ United States

💸 135,000 - 155,000 USD per year

πŸ” Software Development

🏢 Company: Jobgether · 👥 11-50 · 💰 $1,493,585 Seed about 2 years ago · Internet

  • 8+ years of experience as a data engineer, with a strong background in data lake systems and cloud technologies.
  • 4+ years of hands-on experience with AWS technologies, including S3, Redshift, EMR, Kafka, and Spark.
  • Proficient in Python or Node.js for developing data pipelines and creating ETLs.
  • Strong experience with data integration, using tools and languages like Informatica and Python/Scala.
  • Expertise in creating and managing AWS services (EC2, S3, Lambda, etc.) in a production environment.
  • Solid understanding of Agile methodologies and software development practices.
  • Strong analytical and communication skills, with the ability to influence both IT and business teams.
  • Design and develop scalable data pipelines that integrate enterprise systems and third-party data sources.
  • Build and maintain data infrastructure to ensure speed, accuracy, and uptime.
  • Collaborate with data science teams to build feature engineering pipelines and support machine learning initiatives.
  • Work with AWS cloud technologies like S3, Redshift, and Spark to create a world-class data mesh environment.
  • Ensure proper data governance and implement data quality checks and lineage at every stage of the pipeline.
  • Develop and maintain ETL processes using AWS Glue, Lambda, and other AWS services (see the Lambda sketch after this list).
  • Integrate third-party data sources and APIs into the data ecosystem.
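
To make the Glue/Lambda ETL bullet concrete, here is a hedged sketch of one small ETL step as an S3-triggered AWS Lambda handler. The bucket layout, field names, and output format are assumptions, not details from the posting.

```python
# Hypothetical S3-triggered Lambda: read a raw CSV, normalize types, write JSONL.
import csv
import io
import json

import boto3  # provided in the AWS Lambda Python runtime

s3 = boto3.client("s3")

def handler(event, context):
    # S3 put events carry the bucket and key of the new object.
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = record["object"]["key"]

    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
    rows = [
        {"user_id": int(r["user_id"]), "amount": float(r["amount"])}
        for r in csv.DictReader(io.StringIO(body))
    ]

    out = "\n".join(json.dumps(r) for r in rows)
    s3.put_object(Bucket=bucket, Key=f"curated/{key}.jsonl", Body=out.encode("utf-8"))
    return {"rows": len(rows)}
```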

AWS, Node.js, Python, SQL, ETL, Kafka, Data engineering, Spark, Agile methodologies, Scala, Data modeling, Data management

Posted about 7 hours ago

💸 120,000 - 150,250 USD per year

πŸ” Software Development

🏢 Company: Spring Health · 👥 1001-5000 · 💰 $100,000,000 Series E 7 months ago · Mental Health, Artificial Intelligence (AI), mHealth, Wellness, Health Care

  • Proficiency with SQL and Python
  • Knowledge of, preferably experience with, orchestration platforms (like dbt, Airflow, Dagster, etc.), git, Terraform, cloud databases (like Snowflake, Redshift, Postgres, etc.), and cloud infrastructure platforms (like AWS, GCP, Azure, etc.)
  • 2+ years of experience working on a platform, infrastructure, analytics engineering or data engineering team
  • Leaves a trail: has a habit of documenting and tracking work using tools like JIRA, Asana, Monday, etc.
  • Ability to work as part of a team to distill problems into composable chunks
  • Learn to build & maintain our data platform stack, including Airflow, dbt, Snowflake, AWS, CircleCI, DataDog, git, and Terraform
  • Write extensible & maintainable code to support data engineering & data platform use cases like dimensional modeling, data pipelines, access controls, and platform configuration & deployment pipelines.
  • Help drive shared understanding & best practices on the Data Foundations team around infrastructure-as-code
  • Work with key internal stakeholders on the Security and IT teams to improve & fulfill operational infrastructure needs, like role-based access and data masking
  • Maintain & improve CI/CD pipelines that ensure efficient, effective, and secure development for the Data Foundations team
Posted about 13 hours ago

🧭 Part-Time

💸 700,000,000 - 900,000,000 COP per year

πŸ” Software Development

🏢 Company: Newrich Network

  • 4+ years of experience in Data Engineering
  • 3+ years of experience working on Apache Spark applications using Python (PySpark) or Scala
  • Experience creating Spark jobs that operate on at least 1 billion records
  • Strong knowledge of ETL architecture and standards
  • Software development experience working with Apache Airflow, Spark, MongoDB, MySQL
  • Strong SQL knowledge
  • Strong command of Python
  • Experience creating data pipelines in a production system
  • Proven experience in building/operating/maintaining fault tolerant and scalable data processing integrations using AWS
  • Experience using Docker or Kubernetes is a plus
  • Ability to identify and resolve problems associated with production grade large scale data processing workflows
  • Experience with crafting and maintaining unit tests and continuous integration (see the pytest sketch after this list)
  • Passion for crafting intelligent data pipelines that teams love to use
  • A strong capacity to handle numerous projects is a must
  • Collaborate with Data architects, Enterprise architects, Solution consultants and Product engineering teams to gather customer data integration requirements, conceptualize solutions & build required technology stack
  • Collaborate with enterprise customers' engineering teams to identify data sources, profile and quantify their quality, and develop tools to prepare data and build pipelines that integrate customer and third-party data sources.
  • Develop new features and improve existing data integrations with customer data ecosystem
  • Encourage the team to think out-of-the-box and overcome engineering obstacles while incorporating new innovative design principles.
  • Collaborate with a Project Manager to bill and forecast time for product owner solutions
  • Building data pipelines
  • Reconciling missed data
  • Acquire datasets that align with business needs
  • Develop algorithms to transform data into useful, actionable information
  • Build, test, and maintain database pipeline architectures
  • Collaborate with management to understand company objectives
  • Create new data validation methods and data analysis protocols
  • Ensure compliance with data governance and security policies
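
For the unit-testing and CI bullet above, a minimal pytest-style sketch; the transformation under test is invented for illustration:

```python
# Hypothetical pure transformation plus pytest tests, runnable with `pytest`.
import pytest

def normalize_amount(row: dict) -> dict:
    """Return a copy of the row with 'amount' cast to float."""
    return {**row, "amount": float(row["amount"])}

def test_normalize_amount_casts_strings():
    assert normalize_amount({"id": 1, "amount": "9.50"})["amount"] == 9.5

def test_normalize_amount_rejects_garbage():
    with pytest.raises(ValueError):
        normalize_amount({"id": 2, "amount": "not-a-number"})
```

Keeping transformations pure, with I/O at the edges, is what makes pipeline code this easy to test in CI.
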
Posted 1 day ago

πŸ“ Thailand, Philippines

πŸ” Fintech

🏢 Company: Envisso · 👥 11-50 · Credit, Compliance, Transaction Processing, Financial Services

  • 5+ years of work experience in data engineering.
  • Strong skills in SQL and Python.
  • Experience designing, building and maintaining data models and data pipelines.
  • Experience working with cloud based architecture.
  • Great communication skills for working with a diverse team of varying technical ability.
  • Create and maintain scalable data pipelines to ingest, transform and serve global payments and risk data.
  • Manage and maintain the data platform, including data pipelines and environments.
  • Collaborate with cross-functional teams of data scientists, software engineers, product managers and business leads, to understand requirements and deliver appropriate solutions.
  • Take ownership of a data area, building subject matter expertise and cultivating trust with stakeholders.
  • Mentor junior members, and grow a strong data culture across the team and organisation.

Python, SQL, Cloud Computing, ETL, Data engineering, Communication Skills, Data modeling

Posted 1 day ago

πŸ“ United States

🧭 Full-Time

💸 100,000 - 120,000 USD per year

πŸ” IT

🏢 Company: Adswerve, Inc

  • Extensive experience with Google Cloud Platform (GCP) services, including BigQuery, Dataform, Cloud Storage, Cloud Functions, Cloud Composer, Dataflow, and Pub/Sub
  • In-depth understanding of SQL, including the optimization of queries and data transformations
  • Comfortable with the JavaScript and Python programming languages
  • Excellent communication and collaboration skills
  • Solid understanding of data warehouse, data modeling and data design concepts
  • Experience with ELT processes and data transformation techniques
  • Experience with version control systems (e.g., Git)
  • Strong analytical and problem-solving skills as well as the ability to decompose complex problems
  • Proven track record of managing workloads to consistently meet project deadlines
  • Develop scalable and efficient data architectures to support enterprise applications, analytics, and reporting.
  • Design, develop, and maintain efficient and scalable ELT pipelines on the Google Cloud Platform (see the BigQuery sketch after this list)
  • Partner with business stakeholders to understand their business needs and data requirements, and translate those needs into clear, actionable technical solutions.
  • Create and maintain detailed documentation, including architecture diagrams, standards, and models.
  • Ensure data security, integrity, and availability across the organization.
  • Manage, maintain, and develop custom-built ELT pipelines for systems unsupported by the organization's integration platform
  • Manage, maintain and optimize the data infrastructure on the Google Cloud Platform
  • Implement and manage data validation and quality checks to ensure accuracy, consistency, and completeness of data across pipelines.
  • Set up and manage monitoring and alerting for data pipelines and infrastructure to proactively identify failures and performance issues.
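
As a small illustration of the BigQuery/ELT bullets above: a parameterized query via the official google-cloud-bigquery client. The project, dataset, and column names are hypothetical assumptions.

```python
# Hypothetical parameterized BigQuery query using the official Python client.
import datetime

from google.cloud import bigquery

client = bigquery.Client(project="example-project")  # hypothetical project

query = """
    SELECT order_date, SUM(amount) AS revenue
    FROM `example-project.analytics.orders`
    WHERE order_date >= @start_date
    GROUP BY order_date
    ORDER BY order_date
"""
job_config = bigquery.QueryJobConfig(
    query_parameters=[
        bigquery.ScalarQueryParameter("start_date", "DATE", datetime.date(2024, 1, 1))
    ]
)

for row in client.query(query, job_config=job_config).result():
    print(row.order_date, row.revenue)
```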

Python, SQL, GCP, Git, JavaScript, Data engineering, RESTful APIs, Data visualization, Data modeling

Posted 1 day ago

πŸ“ United States

πŸ” Software Development

🏢 Company: ge_externalsite

  • Exposure to industry standard data modeling tools (e.g., ERWin, ER Studio, etc.).
  • Exposure to Extract, Transform & Load (ETL) tools like Informatica or Talend
  • Exposure to industry standard data catalog, automated data discovery, and data lineage tools (e.g., Alation, Collibra, TAMR, etc.)
  • Hands-on experience in programming languages like Java, Python or Scala
  • Hands-on experience in writing SQL scripts for Oracle, MySQL, PostgreSQL or HiveQL
  • Experience with Big Data / Hadoop / Spark / Hive / NoSQL database engines (e.g., Cassandra or HBase)
  • Exposure to unstructured datasets and the ability to handle XML and JSON file formats
  • Work independently as well as with a team to develop and support ingestion jobs
  • Evaluate and understand various data sources (databases, APIs, flat files, etc.) to determine optimal ingestion strategies
  • Develop a comprehensive data ingestion architecture, including data pipelines, data transformation logic, and data quality checks, considering scalability and performance requirements.
  • Choose appropriate data ingestion tools and frameworks based on data volume, velocity, and complexity
  • Design and build data pipelines to extract, transform, and load data from source systems to target destinations, ensuring data integrity and consistency
  • Implement data quality checks and validation mechanisms throughout the ingestion process to identify and address data issues (see the validation sketch after this list)
  • Monitor and optimize data ingestion pipelines to ensure efficient data processing and timely delivery
  • Set up monitoring systems to track data ingestion performance, identify potential bottlenecks, and trigger alerts for issues
  • Work closely with data engineers, data analysts, and business stakeholders to understand data requirements and align ingestion strategies with business objectives.
  • Build technical data dictionaries and support business glossaries to analyze the datasets
  • Perform data profiling and data analysis for source systems, manually maintained data, machine generated data and target data repositories
  • Build both logical and physical data models for both Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) solutions
  • Develop and maintain data mapping specifications based on the results of data analysis and functional requirements
  • Perform a variety of data loads & data transformations using multiple tools and technologies.
  • Build automated Extract, Transform & Load (ETL) jobs based on data mapping specifications
  • Maintain metadata structures needed for building reusable Extract, Transform & Load (ETL) components.
  • Analyze reference datasets and become familiar with Master Data Management (MDM) tools.
  • Analyze the impact on downstream systems and products
  • Derive solutions and make recommendations from deep dive data analysis.
  • Design and build the Data Quality (DQ) rules needed
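
The data-quality bullets above boil down to rules applied per row. A self-contained sketch follows; the field names and rules are invented for illustration, not taken from the posting:

```python
# Hypothetical row-level data-quality checks: each field gets a predicate,
# and rows failing any predicate can be quarantined instead of loaded.
from datetime import datetime

def is_iso_timestamp(value) -> bool:
    try:
        datetime.fromisoformat(value)
        return True
    except (TypeError, ValueError):
        return False

RULES = {
    "user_id": lambda v: isinstance(v, int) and v > 0,
    "email": lambda v: isinstance(v, str) and "@" in v,
    "created_at": is_iso_timestamp,
}

def failed_fields(row: dict) -> list:
    """Names of fields that are missing or fail their rule (empty = clean)."""
    return [f for f, rule in RULES.items() if not rule(row.get(f))]

rows = [
    {"user_id": 1, "email": "a@example.com", "created_at": "2024-05-01T10:00:00"},
    {"user_id": -3, "email": "broken", "created_at": "not-a-date"},
]
clean = [r for r in rows if not failed_fields(r)]
print(f"{len(clean)} of {len(rows)} rows passed")
```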

AWS, PostgreSQL, Python, SQL, Apache Airflow, Apache Hadoop, Data Analysis, Data Mining, Erwin, ETL, Hadoop HDFS, Java, Kafka, MySQL, Oracle, Snowflake, Cassandra, Clickhouse, Data engineering, Data Structures, REST API, NoSQL, Spark, JSON, Data visualization, Data modeling, Data analytics, Data management

Posted 2 days ago

Related Articles

Posted 12 days ago

Why is remote work such a nice opportunity?

Why is remote work so nice? Let's find out!

Posted 7 months ago

Insights into the evolving landscape of remote work in 2024 reveal the importance of certifications and continuous learning. This article breaks down emerging trends, sought-after certifications, and provides practical solutions for enhancing your employability and expertise. What skills will be essential for remote job seekers, and how can you navigate this dynamic market to secure your dream role?

Posted 7 months ago

Explore the challenges and strategies of maintaining work-life balance while working remotely. Learn about unique aspects of remote work, associated challenges, historical context, and effective strategies to separate work and personal life.

Posted 7 months ago

Google is gearing up to expand its remote job listings, promising more opportunities across various departments and regions. Find out how this move can benefit job seekers and impact the market.

Posted 7 months ago

Learn about the importance of pre-onboarding preparation for remote employees, including checklist creation, documentation, tools and equipment setup, communication plans, and feedback strategies. Discover how proactive pre-onboarding can enhance job performance, increase retention rates, and foster a sense of belonging from day one.