Apache Hadoop Jobs

Find remote positions requiring Apache Hadoop skills. Browse opportunities where you can apply your expertise and grow your career.

Apache Hadoop
11 jobs found.

Set alerts to receive daily emails with new job openings that match your preferences.

Apply

📍 Ukraine

  • 4+ years of experience in software/data engineering, data architecture, or a related field.
  • Strong programming skills in at least one language: Java, Scala, Python, or Go.
  • Experience with SQL and data modeling.
  • Hands-on experience with Apache big data frameworks such as Hadoop, Hive, Spark, and Airflow.
  • Proficiency in AWS cloud services.
  • Strong understanding of distributed systems, large-scale data processing, and data storage/retrieval.
  • Experience with data governance, security, and compliance is a plus.
  • Familiarity with CI/CD and DevOps practices is a plus.
  • Excellent communication and problem-solving skills.

  • Design, build, and maintain scalable and reliable data storage solutions.
  • Optimize and scale the platform for increasing data volumes and user requests.
  • Improve data storage, retrieval, query performance, and overall system performance.
  • Collaborate with data scientists, analysts, and stakeholders for tailored solutions.
  • Ensure proper integration of data pipelines, analytics tools, and ETL processes (a minimal orchestration sketch follows this list).
  • Troubleshoot and resolve platform issues in a timely manner.
  • Develop monitoring and alerting systems to ensure platform reliability.
  • Participate in code reviews and design discussions.
  • Evaluate new technologies to enhance the data platform.
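By way of illustration only, here is a minimal sketch of the daily pipeline orchestration this role describes, assuming Airflow 2.x; the DAG ID, schedule, and the extract-and-load stub are hypothetical, not part of the posting.

```python
# Minimal daily ETL DAG sketch (Airflow 2.x assumed).
# The dag_id, task logic, and names are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_and_load():
    # Placeholder: pull a day of events from the source system
    # and load them into warehouse staging.
    print("extracting and loading daily partition")


with DAG(
    dag_id="daily_events_etl",     # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",             # Airflow >= 2.4; older versions use schedule_interval
    catchup=False,
) as dag:
    PythonOperator(task_id="extract_and_load", python_callable=extract_and_load)
```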

AWS, Python, SQL, Apache Airflow, Apache Hadoop, Kafka, Kubernetes, Data engineering, Scala, Data modeling

Posted 8 days ago
Apply

📍 Estonia

🧭 Internship

🔍 Communications

🏢 Company: Twilio - University Programs

  • To be working towards a Bachelor's, Master's, or PhD degree in computer science, computer engineering, or a related field.
  • To have a keen interest in software development, with several side projects, and perhaps be part of the open source community.
  • To have a hungry, entrepreneurial, 'can-do' spirit, demonstrated by an interest in learning new technologies.
  • To have explored writing code in languages such as Python, Java, JavaScript, PHP, C, or C++.

  • Be a Software Engineer, not just an 'intern'.
  • Ship many different projects during your summer.
  • Learn from passionate engineers at Twilio who solve problems in distributed computing, real-time DSP, virtualization performance, distributed messaging, and more.
  • Take responsibility for core features and services that ship to users.
  • Embrace challenges, learn fast, and deliver great results.
  • Participate in code reviews, bug tracking, and project management.

PHP, Python, Software Development, Agile, Apache Hadoop, Java, JavaScript, C++, Spark

Posted 9 days ago
Apply

📍 US

💸 163,000 - 189,000 USD per year

🔍 Life Sciences

🏢 Company: Domino Data Lab 👥 251-500 💰 Series F over 2 years ago | Artificial Intelligence (AI), Big Data, Machine Learning, Analytics, Enterprise Applications, Data Mining, Enterprise Software, Software

  • 5+ years in a software engineering individual-contributor role.
  • Experience in building scalable systems and managing high-performance back-end systems.
  • Cross-functional collaboration skills, integrating back-end systems with front-end and third-party services.
  • API development experience, particularly with RESTful APIs and gRPC.
  • Performance optimization abilities in cloud environments and with Docker and Kubernetes.
  • Testing and CI/CD experience: building robust testing frameworks and maintaining pipelines.
  • Familiarity with distributed computing frameworks such as Apache Hadoop, Spark, and Ray.
  • Proficiency in back-end programming languages such as Python, Java, Scala, or Go.
  • Experience with service-oriented architectures and modular system design.
  • Knowledge of machine learning workflows and infrastructure is a plus.

  • Enhance Domino Datasets functionality to allow access via APIs (an illustrative endpoint sketch follows this list).
  • Improve Git repository and version control support within Domino Projects and Workspaces.
  • Design and implement regulatory compliance features aligned with FDA requirements.
  • Build end-to-end audit logging for comprehensive workflow transparency.
  • Automate external data version tracking for third-party data sources.
  • Optimize workspace performance focusing on Kubernetes-based compute-on-demand IDE functionality.
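For flavor, a minimal sketch of a dataset-access REST endpoint in the style this role describes, using FastAPI; the route, model, and in-memory store are hypothetical and not Domino's actual API.

```python
# Hypothetical dataset-access REST endpoint sketch; not Domino's actual API.
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()

# Stand-in for real dataset metadata storage.
_DATASETS = {"sales-2024": {"id": "sales-2024", "snapshot": 3, "size_bytes": 10_485_760}}


class Dataset(BaseModel):
    id: str
    snapshot: int
    size_bytes: int


@app.get("/datasets/{dataset_id}", response_model=Dataset)
def get_dataset(dataset_id: str) -> Dataset:
    # Return metadata for one dataset, or 404 if unknown.
    if dataset_id not in _DATASETS:
        raise HTTPException(status_code=404, detail="dataset not found")
    return Dataset(**_DATASETS[dataset_id])
```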

Docker, Python, Apache Hadoop, Git, Hadoop, Java, Kubernetes, Machine Learning, Go, gRPC, Spark, Collaboration, CI/CD, RESTful APIs, Compliance

Posted about 2 months ago
Apply
🔥 Data Architect

📍 Mexico

🧭 Full-Time

🔍 AI consulting and custom software development

🏢 Company: Creai

  • Proven experience as a Data Architect with expertise in Microsoft Azure data services.
  • Strong understanding of data management principles, data modeling, and data integration techniques.
  • Hands-on experience with Azure Data Factory, Azure SQL, Cosmos DB, Azure Synapse Analytics, and other Azure cloud-based data tools.
  • Proficiency in building and maintaining data pipelines, data lakes, and data warehouses.
  • Experience with ETL processes and tools to automate and optimize data flows.
  • Strong knowledge of SQL, as well as experience with NoSQL databases.
  • Familiarity with data governance, security standards, and best practices for data architecture in cloud environments.
  • Excellent problem-solving and analytical skills, with the ability to work in a fast-paced environment.
  • Bachelor’s degree in Computer Science, Data Science, Information Technology, or a related field, or equivalent experience.

  • Design and implement data architectures that support AI and machine learning solutions, ensuring scalability, reliability, and performance.
  • Lead data integration efforts, including data pipelines, data warehouses, and data lakes, using Microsoft Azure services.
  • Work closely with cross-functional teams to ensure that data architecture meets business and technical requirements.
  • Optimize database performance, troubleshoot issues, and ensure data security and governance in compliance with industry standards.
  • Implement ETL processes, and manage data storage solutions such as Azure SQL, Cosmos DB, or Data Lake Storage.
  • Leverage Azure services like Azure Data Factory, Azure Synapse Analytics, and Azure Databricks to build and maintain robust data pipelines (see the sketch after this list).
  • Maintain and document data models, architectural guidelines, and best practices to ensure consistent data architecture across projects.
  • Monitor and optimize data architectures to improve efficiency and cost-effectiveness in Azure environments.
  • Stay updated on new Azure data services and industry best practices.
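As one illustration of the Azure Data Factory work described, a minimal sketch of triggering a pipeline run from Python, assuming the `azure-identity` and `azure-mgmt-datafactory` packages; the subscription, resource group, factory, and pipeline names are placeholders.

```python
# Hypothetical sketch: trigger an Azure Data Factory pipeline run.
# Requires azure-identity and azure-mgmt-datafactory; all names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

credential = DefaultAzureCredential()
client = DataFactoryManagementClient(credential, "<subscription-id>")

# Kick off a run of an existing pipeline and print its run ID.
run = client.pipelines.create_run(
    resource_group_name="rg-data-platform",  # placeholder
    factory_name="adf-analytics",            # placeholder
    pipeline_name="copy_sales_to_lake",      # placeholder
    parameters={"load_date": "2024-01-01"},
)
print(run.run_id)
```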

SQL, Apache Airflow, Apache Hadoop, Artificial Intelligence, ETL, Hadoop, Machine Learning, Microsoft Azure, Airflow, Azure, Data science, NoSQL, Spark, Analytical Skills

Posted 3 months ago
Apply

📍 United States

🧭 Full-Time

🔍 Cyber-security

🏢 Company: Shuvel Digital 👥 11-50 | SEO, UX Design, Web Development, Web Apps, Web Design

  • Experience in machine learning and data science.
  • Knowledge of cyber-security data and statistical analysis.
  • Understanding of ETL data hygiene methods.
  • Familiarity with machine learning algorithms and frameworks.
  • Hands-on experience with Python, SQL, and data analysis tools.

  • Research, develop, architect, and integrate ML models and algorithms.
  • Collaborate with data scientists and other teams to address specific problems.
  • Design and implement data processing and ETL algorithms.
  • Analyze structured cyber-security data for insights (see the sketch after this list).
  • Deploy ML solutions into production environments.
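As a small illustration of analyzing structured security data, a minimal anomaly-detection sketch using scikit-learn's IsolationForest; the feature names and toy values are hypothetical.

```python
# Minimal anomaly-detection sketch over structured security events;
# feature names and values are hypothetical stand-ins.
import pandas as pd
from sklearn.ensemble import IsolationForest

# Toy structured log features: bytes transferred and failed logins per host.
events = pd.DataFrame(
    {"bytes_out": [1200, 950, 1100, 980_000, 1050],
     "failed_logins": [0, 1, 0, 42, 1]}
)

model = IsolationForest(contamination=0.2, random_state=0)
# fit_predict returns -1 for outliers and 1 for inliers.
events["anomaly"] = model.fit_predict(events[["bytes_out", "failed_logins"]])
print(events[events["anomaly"] == -1])
```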

AWS, Docker, Python, SQL, Apache Hadoop, Cloud Computing, Data Analysis, ETL, Hadoop, Java, Kafka, Keras, Kubernetes, Machine Learning, NumPy, PyTorch, Algorithms, Azure, Data science, Go, Rust, Spark, TensorFlow, Analytical Skills, Collaboration, Problem Solving

Posted 3 months ago
Apply
🔥 Software Engineer (DE)

📍 India

🔍 Product engineering, data engineering, B2B SaaS, IoT & Machine Learning

🏢 Company: Velotio Technologies

  • 1+ years of data engineering or equivalent knowledge and ability.
  • 1+ years of software engineering or equivalent knowledge and ability.
  • Strong proficiency in Python, Scala, or Java.
  • Experience designing and maintaining databases.
  • Good understanding of star/snowflake schema designs.
  • Experience with big data technologies like Spark and Hadoop.
  • Experience building ETL/ELT pipelines.
  • Hands-on experience with batch and stream data processing applications.

  • Design and build scalable data infrastructure for growing data needs.
  • Build applications for optimal extraction, cleaning, transformation, and loading of data.
  • Develop ETL/ELT pipelines and work with data infrastructure components (see the sketch after this list).
  • Implement monitoring processes for data quality.
  • Collaborate with teams on data platform architecture.
  • Establish operational excellence in data engineering.
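A minimal PySpark batch ETL sketch of the extract-clean-load work this role describes, assuming the `pyspark` package; the paths and column names are hypothetical.

```python
# Minimal PySpark batch ETL sketch; paths and columns are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("events_etl").getOrCreate()

# Extract: read raw JSON events, drop incomplete rows,
# and derive a date column for partitioning.
raw = spark.read.json("s3a://raw-bucket/events/")      # placeholder path
clean = (
    raw.dropna(subset=["user_id", "event_ts"])         # placeholder columns
       .withColumn("dt", F.to_date("event_ts"))
)

# Load: write a partitioned Parquet table for downstream analytics.
clean.write.mode("overwrite").partitionBy("dt").parquet("s3a://lake/events/")
spark.stop()
```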

AWS, Python, Apache Airflow, Apache Hadoop, ETL, GCP, Java, Kafka, Snowflake, Azure, Spark, Scala

Posted 4 months ago
Apply

📍 Japan

🔍 Telecommunications, IoT

🏢 Company: SORACOM

  • Experience implementing business logic in Java.
  • Ability to understand the features the business needs, define appropriate specifications together with the business side, and implement and test them.
  • 3+ years of experience designing, implementing, and operating business systems.
  • A hands-on attitude: willing to try things and learn by doing.
  • A commitment to implementing with an understanding of the overall system and of the libraries and frameworks in use.
  • Japanese and English skills sufficient to work with the company's global internal teams.

  • Design, build, and operate the billing and payment systems for SORACOM services offered in Japan and globally.
  • Design, build, and operate direct-sales and shipping systems for devices and communications equipment.
  • Design, build, and operate internal business systems.
  • Continuously improve and release systems based on user feedback.
  • Support SORACOM's business growth through these systems.

Backend Development, Software Development, Apache Hadoop, Hadoop, Java, Spring, Spring Boot, Spark

Posted 4 months ago
Apply

📍 Canada

🔍 Multicloud solutions and technology services

🏢 Company: Rackspace 👥 1001-5000 💰 Private over 7 years ago 🫂 Last layoff almost 2 years ago | IaaS, Big Data, Cloud Computing, Cloud Infrastructure

  • Proven track record in designing and implementing scalable ML inference systems.
  • Hands-on experience with deep learning frameworks such as TensorFlow, Keras, or Spark MLlib.
  • Solid foundation in machine learning algorithms, natural language processing, and statistical modeling.
  • Strong understanding of computer science concepts including algorithms and distributed systems.
  • Proficiency in and recent experience with Java are required.
  • Experience with the Apache Hadoop ecosystem (Oozie, Pig, Hive, MapReduce).
  • Expertise in public cloud services, particularly GCP and Vertex AI.
  • Understanding of LLM architectures and model optimization techniques.

  • Architect and optimize existing data infrastructure for machine learning and deep learning models.
  • Collaborate with cross-functional teams to translate business objectives into engineering solutions.
  • Own development and operation of high-performance inference systems for various models (see the sketch after this list).
  • Provide technical leadership and mentorship to the engineering team.
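As a small illustration of the Vertex AI work mentioned above, a minimal sketch of calling a deployed endpoint with the `google-cloud-aiplatform` SDK; the project, region, endpoint ID, and feature names are placeholders.

```python
# Hypothetical sketch: online prediction against a deployed Vertex AI endpoint.
# Requires google-cloud-aiplatform; project/region/endpoint values are placeholders.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

endpoint = aiplatform.Endpoint(
    "projects/my-project/locations/us-central1/endpoints/1234567890"
)

# Send one feature vector and print the model's prediction.
response = endpoint.predict(instances=[{"feature_a": 1.0, "feature_b": 2.0}])
print(response.predictions)
```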

Leadership, Python, Apache Hadoop, GCP, Hadoop, Java, Keras, Machine Learning, C++, Algorithms, Data Structures, Spark, TensorFlow, C (Programming language)

Posted 5 months ago
Apply
🔥 Senior MLOps Engineer

🧭 Full-Time

🔍 Multicloud solutions

🏢 Company: Rackspace 👥 1001-5000 💰 Private over 7 years ago 🫂 Last layoff almost 2 years ago | IaaS, Big Data, Cloud Computing, Cloud Infrastructure

  • Proven track record in designing and implementing cost-effective and scalable ML inference systems.
  • Hands-on experience with leading deep learning frameworks such as TensorFlow, Keras, or Spark MLlib.
  • Solid foundation in machine learning algorithms, natural language processing, and statistical modeling.
  • Strong grasp of fundamental computer science concepts like algorithms, distributed systems, data structures, and database management.
  • Experience with the Apache Hadoop ecosystem (Oozie, Pig, Hive, MapReduce).
  • Expertise in public cloud services, particularly in GCP and Vertex AI.
  • Proficient in applying model optimization techniques such as distillation, quantization, and hardware acceleration (see the sketch after this list).
  • Recent experience in Java.
  • In-depth understanding of LLM architectures, parameter scaling, and deployment trade-offs.
  • Technical degree: Bachelor's degree in Computer Science or Master's degree with relevant industry experience.
  • Specialization in Machine Learning is preferred.
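One of the optimization techniques named above, post-training dynamic quantization, in a minimal PyTorch sketch; the two-layer model is a hypothetical stand-in for a real network.

```python
# Minimal post-training dynamic quantization sketch in PyTorch;
# the toy model is a hypothetical stand-in for a real network.
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(128, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 8),
).eval()

# Replace Linear layers with int8 dynamically quantized equivalents,
# shrinking weights and speeding up CPU inference.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
print(quantized)
```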

  • Architect and optimize existing data infrastructure for machine learning and deep learning models.
  • Collaborate with cross-functional teams to translate business objectives into engineering solutions.
  • Own end-to-end development and operation of high-performance, cost-effective inference systems.
  • Provide technical leadership and mentorship to the engineering team.

Leadership, Python, Apache Hadoop, GCP, Hadoop, Java, Keras, Machine Learning, C++, Algorithms, Data Structures, Spark, TensorFlow, Communication Skills, C (Programming language)

Posted 5 months ago
Apply

📍 US

💸 116,100 - 198,440 USD per year

🔍 Cloud Solutions

🏢 Company: Rackspace 👥 1001-5000 💰 Private over 7 years ago 🫂 Last layoff almost 2 years ago | IaaS, Big Data, Cloud Computing, Cloud Infrastructure

  • Proficiency in the Hadoop ecosystem, including MapReduce, Oozie, Hive, Pig, HBase, and Storm.
  • Strong programming skills with Java, Python, and Spark.
  • Knowledge in public cloud services, particularly in GCP.
  • Experience in Infrastructure and Applied DevOps principles, including CI/CD and IaC like Terraform.
  • Ability to tackle complex challenges with innovative solutions.
  • Effective communication skills in a remote work setting.

  • Develop scalable and robust code for large-scale batch processing systems using Hadoop, Oozie, Pig, Hive, MapReduce, Spark (Java), Python, and HBase (see the sketch after this list).
  • Develop, manage, and maintain batch pipelines supporting Machine Learning workloads.
  • Leverage GCP for scalable big data processing and storage solutions.
  • Implement automation/DevOps best practices for CI/CD and IaC.
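A classic sketch of the MapReduce batch style this role describes: a Hadoop Streaming word count in pure Python, with the mapper and reducer in one script. The file name and the `hadoop jar` invocation shown in the comment are illustrative.

```python
# Hadoop Streaming word-count sketch: one script doubling as mapper and reducer.
# Illustrative invocation: hadoop jar hadoop-streaming.jar \
#   -mapper "wc.py map" -reducer "wc.py reduce" -input <in> -output <out>
import sys
from itertools import groupby


def mapper():
    # Emit "word<TAB>1" for every word on stdin.
    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")


def reducer():
    # Hadoop delivers mapper output sorted by key; sum counts per word.
    pairs = (line.rstrip("\n").split("\t", 1) for line in sys.stdin)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        print(f"{word}\t{sum(int(count) for _, count in group)}")


if __name__ == "__main__":
    mapper() if sys.argv[1] == "map" else reducer()
```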

Python, Apache Hadoop, GCP, Hadoop, Java, Machine Learning, Spark, Terraform

Posted 5 months ago
Apply
Showing 10 of 11 jobs.