Apache Hadoop Jobs

Find remote positions requiring Apache Hadoop skills. Browse opportunities where you can apply your expertise and grow your career.

13 jobs found.

Set alerts to receive daily emails with new job openings that match your preferences.


📍 USA

💸 176,000 - 207,000 USD per year

🔍 Cybersecurity

🏢 Company: Abnormal Security 👥 501-1000 💰 $250,000,000 Series D (6 months ago) · Artificial Intelligence (AI) · Email · Information Technology · Cyber Security · Network Security

  • 5+ years of experience as a data engineer or similar role, with hands-on experience in building data-focused solutions.
  • Expertise in ETL, data pipeline design, and data engineering tools and technologies (e.g., Apache Spark, Hadoop, Airflow, Kafka).
  • Experience with maintaining real-time and near real-time data pipelines or streaming services at high scale.
  • Experience with maintaining large scale distributed systems on cloud platforms such as AWS, GCP, or Azure.
  • Background in implementing data quality frameworks, including validation, monitoring, and anomaly detection.
  • Proven ability to collaborate effectively with cross-functional teams.
  • Excellent problem-solving skills and ability to work independently in a fast-paced environment.
  • Architect, design, build, and deploy backend ETL jobs and infrastructure that support a world-class Detection Engine.
  • Own projects that enable us to meet ambitious goals, including scaling components of Detection’s Data Pipeline by 10x.
  • Own real-time and near real-time streaming pipelines and online feature-serving services (see the sketch after this list).
  • Collaborate closely with MLE and Data Science teams, distilling feedback and executing strategy.
  • Coach and mentor junior engineers through 1:1s, pair programming, code reviews, and design reviews.
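
For context, a minimal sketch of the kind of near real-time pipeline this listing describes, using Spark Structured Streaming over Kafka. The broker address, topic name, and event schema are illustrative assumptions, not details from the posting:

```python
# Hypothetical pipeline sketch: read events from a Kafka topic with
# Spark Structured Streaming, parse JSON, and write to a sink.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("detection-etl").getOrCreate()

# Assumed event schema for messages on the topic.
event_schema = StructType([
    StructField("message_id", StringType()),
    StructField("sender", StringType()),
    StructField("received_at", TimestampType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")  # placeholder broker
    .option("subscribe", "detection-events")              # assumed topic name
    .load()
    .select(from_json(col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Console sink keeps the sketch self-contained; a real job would write
# to a feature store or warehouse instead.
events.writeStream.format("console").outputMode("append").start().awaitTermination()
```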

AWS · Apache Airflow · Apache Hadoop · ETL · GCP · Kafka · Azure · Data engineering

Posted 9 days ago
Apply

📍 United States

🧭 Full-Time

💸 50 - 90 USD per hour

🔍 Government, Defense, Technology

🏢 Company: INTECON

  • Master’s degree in Data Science, Computer Science, Statistics, Applied Mathematics, Engineering, or a related field.
  • Bachelor’s degree in a related field.
  • 5+ years of experience in data science, machine learning, or advanced analytics.
  • Expertise in data analysis, database management, and statistical modeling techniques.
  • Strong programming skills in Python, R, SQL, and other data analytics languages.
  • Experience in machine learning, predictive analytics, and AI-driven decision-making.
  • Proficiency in data visualization tools such as Tableau or Power BI.
  • Hands-on experience with big data technologies and cloud platforms.
  • Knowledge of data governance and DoD data policies.
  • Familiarity with AFWERX and SBIR/STTR programs is a plus.
  • Analyze large datasets to extract meaningful insights and trends.
  • Develop statistical models, machine learning algorithms, and AI-driven insights.
  • Perform quantitative and qualitative research for venture funding and commercialization initiatives.
  • Utilize Python, R, and SQL for data manipulation.
  • Design data pipelines and ETL processes.
  • Create interactive dashboards using Tableau or Power BI.
  • Develop predictive analytics models for assessments and strategies (see the sketch after this list).
  • Collaborate with stakeholders to align data insights with objectives.
  • Translate complex data findings into actionable reports.
  • Ensure data governance and compliance.
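
For context, a minimal sketch of the predictive-modeling work listed above, using scikit-learn. The source file, feature set, and label column are hypothetical:

```python
# Hypothetical predictive model: train a classifier on tabular data
# and report held-out performance. File and column names are invented.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

df = pd.read_csv("program_outcomes.csv")   # assumed source file
X = df.drop(columns=["funded"])            # assumed numeric features
y = df["funded"]                           # assumed binary label

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```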

AWS · Python · SQL · Apache Hadoop · Data Analysis · ETL · Machine Learning · Tableau · Azure · Spark · Data visualization

Posted 9 days ago
Apply

📍 United States, India

🧭 Contract, Part-Time, Full-Time

🔍 Life sciences

🏢 Company: ValGenesis 👥 501-1000 💰 $24,000,000 Private (almost 4 years ago) · Pharmaceutical · Medical Device · Software

  • Bachelor’s or Master’s in Computer Science, Data Science, or related field.
  • 8+ years in AI/ML solution development.
  • Proven software development experience in life sciences or regulated industries.
  • Strong analytical thinking and problem-solving skills.
  • Excellent communication and collaboration abilities.
  • Knowledge of life sciences validation processes and regulatory compliance.
  • Build scalable AI/ML models for document classification, intelligent search, and predictive analytics (see the sketch after this list).
  • Implement image processing solutions for visual inspections and anomaly detection.
  • Define AI architecture and select technologies from open-source and commercial offerings.
  • Deploy AI/ML solutions in cloud-based environments with a focus on high availability and security.
  • Mentor a team of AI/ML engineers, fostering collaborative research and development.
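
A minimal sketch of the document-classification task this role describes, using a TF-IDF pipeline in scikit-learn. The training documents and labels are invented for illustration:

```python
# Hypothetical document classifier: TF-IDF features feeding a linear
# model. The documents and labels below are invented examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

docs = [
    "installation qualification protocol for packaging line",
    "deviation report for batch 42 temperature excursion",
    "operational qualification checklist for autoclave",
    "CAPA investigation summary and corrective actions",
]
labels = ["IQ", "deviation", "OQ", "CAPA"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(docs, labels)
print(clf.predict(["draft qualification protocol for new autoclave"]))
```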

AWS · Docker · PostgreSQL · Python · SQL · Apache Hadoop · Artificial Intelligence · Cloud Computing · Git · Image Processing · Jenkins · Kubernetes · Machine Learning · MongoDB · NumPy · OpenCV · PyTorch · Tableau · Azure · Pandas · Spark · TensorFlow · CI/CD · Compliance · Data visualization

Posted 12 days ago
Apply

📍 Philippines

🧭 Full-Time

🔍 Healthcare and Technology

🏢 Company: Theoria Medical 👥 1001-5000 · Electronic Health Record (EHR) · Hospital · Health Care · Home Health Care

  • Bachelor's or Master's degree in Data Science, Computer Science, or a related field.
  • 10+ years of experience in data science and analytics, including team leadership.
  • Proficiency in tools like Python, R, SQL, and big data platforms.
  • Strong communication and stakeholder management skills.
  • Must be punctual and adhere to the company's Time and Attendance policy.
  • Must be able to remain sitting for the majority of their shift.
  • Lead, mentor, and develop a high-performing team of data scientists, analysts, and engineers.
  • Guide the team in working on data requests and business intelligence initiatives.
  • Foster a culture of collaboration, continuous learning, and innovation within the team.
  • Collaborate closely with department heads and executive leadership to identify data-driven opportunities and priorities.
  • Design and implement scalable processes and frameworks for data collection, storage, and analysis.
  • Develop and execute a comprehensive data science and analytics strategy to support long-term business objectives.
  • Establish and enforce data governance policies to ensure security, privacy, and compliance with regulations.
  • Define and track key performance indicators (KPIs) for the data science and analytics function (see the sketch after this list).
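
For a concrete flavor of the KPI-tracking responsibility above, a small sketch that computes request volume and median turnaround per team. The table, columns, and SQLite stand-in are hypothetical:

```python
# Hypothetical KPI computation: request volume and median turnaround
# per team, read from a stand-in SQLite database. Table and column
# names are invented.
import sqlite3

import pandas as pd

conn = sqlite3.connect("analytics.db")  # placeholder for the real warehouse
requests = pd.read_sql(
    "SELECT team, opened_at, closed_at FROM data_requests", conn
)
requests["turnaround_days"] = (
    pd.to_datetime(requests["closed_at"]) - pd.to_datetime(requests["opened_at"])
).dt.days

kpis = requests.groupby("team").agg(
    volume=("turnaround_days", "size"),
    median_turnaround_days=("turnaround_days", "median"),
)
print(kpis)
```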

Python · SQL · Apache Hadoop · Business Intelligence · ETL · Machine Learning · Data science · Data analytics

Posted 15 days ago
Apply

📍 United States

🧭 Full-Time

💸 200,000 - 255,000 USD per year

🔍 Blockchain intelligence and financial crime prevention

🏢 Company: TRM Labs 👥 101-250 💰 $70,000,000 Series B (about 2 years ago) · Cryptocurrency · Compliance · Blockchain · Big Data

  • Academic background in a quantitative field such as Computer Science, Mathematics, Engineering, or Physics.
  • Strong knowledge of algorithm design and data structures with practical application experience.
  • Experience optimizing large-scale distributed data processing systems like Apache Spark, Apache Hadoop, Dask, and graph databases.
  • Experience in converting academic research into products with a history of collaborating on feature releases.
  • Strong programming experience in Python and SQL.
  • Excellent communication skills for technical and non-technical audiences.
  • Delivery-oriented with the ability to lead feature development from start to finish.
  • Autonomous ownership of work, capable of moving swiftly and efficiently.
  • Knowledge of basic graph theory concepts.
  • Designing and implementing graph algorithms that analyze large cryptocurrency transaction networks at multi-blockchain scale (see the sketch after this list).
  • Researching new graph-native technology to evaluate benefit to data science and data engineering teams at TRM.
  • Collaborating with cryptocurrency investigators to identify key user stories and requirements for new graph algorithms and features.
  • Understanding and refining TRM’s risk models to assign risk scores to addresses.
  • Communicating complex implementation details to various audiences from investigators to data engineers.
  • Integrating with diverse data inputs ranging from raw blockchain data to model outputs.
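
A small sketch of the kind of graph algorithm this role involves: propagating a risk score outward from flagged addresses, decaying per hop. The graph, seed addresses, and decay factor are illustrative assumptions, not TRM's actual risk model:

```python
# Hypothetical risk propagation over a transaction graph: each node
# takes the highest decayed score reachable from any flagged seed.
import networkx as nx

g = nx.DiGraph()
g.add_edges_from([
    ("addr_a", "addr_b"),
    ("addr_b", "addr_c"),
    ("addr_a", "addr_d"),
])

def propagate_risk(graph, seeds, decay=0.5, max_hops=3):
    """Breadth-first propagation: score decays by `decay` per hop."""
    risk = {s: 1.0 for s in seeds}
    frontier = set(seeds)
    for _ in range(max_hops):
        nxt = set()
        for node in frontier:
            for nbr in graph.successors(node):
                score = risk[node] * decay
                if score > risk.get(nbr, 0.0):
                    risk[nbr] = score
                    nxt.add(nbr)
        frontier = nxt
    return risk

print(propagate_risk(g, seeds=["addr_a"]))
# -> {'addr_a': 1.0, 'addr_b': 0.5, 'addr_d': 0.5, 'addr_c': 0.25}
```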

Python · SQL · Apache Hadoop · Algorithms · Data engineering · Data science · Data Structures

Posted 26 days ago
Apply
🔥 Data Platform Engineer

📍 Ukraine

  • 4+ years of experience in software/data engineering, data architecture, or a related field.
  • Strong programming skills in at least one language: Java, Scala, Python, or Go.
  • Experience with SQL and data modeling.
  • Hands-on experience with Apache Big Data frameworks such as Hadoop, Hive, Spark, Airflow, etc.
  • Proficiency in AWS cloud services.
  • Strong understanding of distributed systems, large-scale data processing, and data storage/retrieval.
  • Experience with data governance, security, and compliance is a plus.
  • Familiarity with CI/CD and DevOps practices is a plus.
  • Excellent communication and problem-solving skills.
  • Design, build, and maintain scalable and reliable data storage solutions.
  • Optimize and scale the platform for increasing data volumes and user requests.
  • Improve data storage, retrieval, query performance, and overall system performance.
  • Collaborate with data scientists, analysts, and stakeholders for tailored solutions.
  • Ensure proper integration of data pipelines, analytics tools, and ETL processes (see the DAG sketch after this list).
  • Troubleshoot and resolve platform issues in a timely manner.
  • Develop monitoring and alerting systems to ensure platform reliability.
  • Participate in code reviews and design discussions.
  • Evaluate new technologies to enhance the data platform.
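
A minimal Airflow DAG sketch for the kind of batch pipeline this role maintains, assuming Airflow 2.x; the DAG name, schedule, and task bodies are placeholders:

```python
# Hypothetical daily batch DAG: extract -> transform -> load, with
# placeholder task bodies (Airflow 2.x style).
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull raw data from source systems")

def transform():
    print("clean and model the extracted data")

def load():
    print("write curated tables to the warehouse")

with DAG(
    dag_id="platform_daily_etl",       # assumed DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                 # Airflow >= 2.4 argument name
    catchup=False,
) as dag:
    extract_t = PythonOperator(task_id="extract", python_callable=extract)
    transform_t = PythonOperator(task_id="transform", python_callable=transform)
    load_t = PythonOperator(task_id="load", python_callable=load)
    extract_t >> transform_t >> load_t
```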

AWS · Python · SQL · Apache Airflow · Apache Hadoop · Kafka · Kubernetes · Data engineering · Scala · Data modeling

Posted 29 days ago
Apply

📍 Estonia

🧭 Internship

🔍 Communications

🏢 Company: Twilio - University Programs

  • To be working towards a Bachelor’s, Master’s, or PhD degree in computer science, computer engineering, or a related field.
  • To have a keen interest in software development, demonstrated by several side projects and perhaps participation in the open-source community.
  • To have a hungry, entrepreneurial, ‘can do’ spirit and an interest in learning new technologies.
  • To have explored writing code in languages such as Python, Java, JavaScript, PHP, C, or C++.
  • Be a Software Engineer, not just an 'intern'.
  • Ship many different projects during your summer.
  • Learn from passionate engineers at Twilio who solve problems in distributed computing, real-time DSP, virtualization performance, distributed messaging, and more.
  • Take responsibility for core features and services that ship to users.
  • Embrace challenges, learn fast, and deliver great results.
  • Participate in code reviews, bug tracking, and project management.

PHP · Python · Software Development · Agile · Apache Hadoop · Java · JavaScript · C++ · Spark

Posted about 1 month ago
Apply

📍 US

💸 163,000 - 189,000 USD per year

🔍 Life Sciences

🏢 Company: Domino Data Lab 👥 251-500 💰 Series F (over 2 years ago) · Artificial Intelligence (AI) · Big Data · Machine Learning · Analytics · Enterprise Applications · Data Mining · Enterprise Software · Software

  • 5+ years of experience in a software engineering individual contributor role.
  • Experience in building scalable systems and managing high-performance back-end systems.
  • Cross-functional collaboration skills, integrating back-end systems with front-end and third-party services.
  • API development experience, particularly with RESTful APIs and gRPC.
  • Performance optimization abilities in cloud environments and with Docker and Kubernetes.
  • Experience building robust testing frameworks and maintaining CI/CD pipelines.
  • Familiarity with distributed computing frameworks like Apache Hadoop, Spark, Ray.
  • Proficiency in back-end programming languages such as Python, Java, Scala, or Go.
  • Experience with service-oriented architectures and modular system design.
  • Knowledge of machine learning workflows and infrastructure is a plus.
  • Enhance Domino Datasets functionality to allow access via APIs.
  • Improve Git repository and version control support within Domino Projects and Workspaces.
  • Design and implement regulatory compliance features aligned with FDA requirements.
  • Build end-to-end audit logging for comprehensive workflow transparency (see the sketch after this list).
  • Automate external data version tracking for third-party data sources.
  • Optimize workspace performance focusing on Kubernetes-based compute-on-demand IDE functionality.
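
To illustrate the audit-logging responsibility above, a small FastAPI sketch that writes one structured log line per request; the endpoint and log format are assumptions, not Domino's actual API:

```python
# Hypothetical audit-logging middleware: one structured log line per
# request (method, path, status). Endpoint names are invented.
import logging

from fastapi import FastAPI, Request

logging.basicConfig(level=logging.INFO)
audit = logging.getLogger("audit")

app = FastAPI()

@app.middleware("http")
async def audit_log(request: Request, call_next):
    response = await call_next(request)
    audit.info("%s %s -> %s", request.method, request.url.path,
               response.status_code)
    return response

@app.get("/datasets/{dataset_id}")
async def read_dataset(dataset_id: str):
    # Placeholder handler standing in for real dataset access.
    return {"dataset_id": dataset_id, "status": "ok"}
```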

Docker · Python · Apache Hadoop · Git · Hadoop · Java · Kubernetes · Machine Learning · Go · gRPC · Spark · Collaboration · CI/CD · RESTful APIs · Compliance

Posted 3 months ago
Apply
🔥 Data Architect

📍 Mexico

🧭 Full-Time

🔍 AI consulting and custom software development

🏢 Company: Creai

  • Proven experience as a Data Architect with expertise in Microsoft Azure data services.
  • Strong understanding of data management principles, data modeling, and data integration techniques.
  • Hands-on experience with Azure Data Factory, Azure SQL, Cosmos DB, Azure Synapse Analytics, and other Azure cloud-based data tools.
  • Proficiency in building and maintaining data pipelines, data lakes, and data warehouses.
  • Experience with ETL processes and tools to automate and optimize data flows.
  • Strong knowledge of SQL, as well as experience with NoSQL databases.
  • Familiarity with data governance, security standards, and best practices for data architecture in cloud environments.
  • Excellent problem-solving and analytical skills, with the ability to work in a fast-paced environment.
  • Bachelor’s degree in Computer Science, Data Science, Information Technology, or a related field, or equivalent experience.
  • Design and implement data architectures that support AI and machine learning solutions, ensuring scalability, reliability, and performance.
  • Lead data integration efforts, including data pipelines, data warehouses, and data lakes, using Microsoft Azure services.
  • Work closely with cross-functional teams to ensure that data architecture meets business and technical requirements.
  • Optimize database performance, troubleshoot issues, and ensure data security and governance in compliance with industry standards.
  • Implement ETL processes and manage data storage solutions such as Azure SQL, Cosmos DB, or Data Lake Storage (see the sketch after this list).
  • Leverage Azure services like Azure Data Factory, Azure Synapse Analytics, and Azure Databricks to build and maintain robust data pipelines.
  • Maintain and document data models, architectural guidelines, and best practices to ensure consistent data architecture across projects.
  • Monitor and optimize data architectures to improve efficiency and cost-effectiveness in Azure environments.
  • Stay updated on new Azure data services and industry best practices.
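
A minimal ETL sketch in the spirit of the pipelines described above: extract a CSV, derive a column, and load the result into Azure SQL via SQLAlchemy. The source file, schema, and connection string are placeholders:

```python
# Hypothetical extract-transform-load: CSV in, derived column, Azure
# SQL out via SQLAlchemy. File, schema, and connection are placeholders.
import pandas as pd
from sqlalchemy import create_engine

orders = pd.read_csv("orders.csv")                             # assumed source
orders["revenue"] = orders["quantity"] * orders["unit_price"]  # assumed columns

engine = create_engine(
    "mssql+pyodbc://user:password@myserver.database.windows.net/mydb"
    "?driver=ODBC+Driver+18+for+SQL+Server"  # placeholder connection string
)
orders.to_sql("fact_orders", engine, if_exists="append", index=False)
```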

SQL · Apache Airflow · Apache Hadoop · Artificial Intelligence · ETL · Hadoop · Machine Learning · Microsoft Azure · Airflow · Azure · Data science · NoSQL · Spark · Analytical Skills

Posted 4 months ago
Apply

📍 United States

🧭 Full-Time

🔍 Cyber-security

🏢 Company: Shuvel Digital 👥 11-50 · SEO · UX Design · Web Development · Web Apps · Web Design

  • Experience in machine learning and data science.
  • Knowledge of cyber-security data and statistical analysis.
  • Understanding of ETL data hygiene methods.
  • Familiarity with machine learning algorithms and frameworks.
  • Hands-on experience with Python, SQL, and data analysis tools.
  • Research, develop, architect, and integrate ML models and algorithms.
  • Collaborate with data scientists and other teams to address specific problems.
  • Design and implement data processing and ETL algorithms.
  • Analyze structured cyber-security data for insights (see the sketch after this list).
  • Deploy ML solutions into production environments.
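
A short sketch of anomaly detection over structured security data, one common approach to the analysis described above, using scikit-learn's IsolationForest; the columns and values are hypothetical:

```python
# Hypothetical anomaly detection on network-flow features with
# IsolationForest; -1 marks predicted outliers. Data is invented.
import pandas as pd
from sklearn.ensemble import IsolationForest

flows = pd.DataFrame({
    "bytes_out":  [120, 150, 130, 98000, 140],
    "conn_count": [3, 4, 2, 250, 3],
})

model = IsolationForest(contamination=0.2, random_state=0)
flows["anomaly"] = model.fit_predict(flows[["bytes_out", "conn_count"]])
print(flows[flows["anomaly"] == -1])  # the high-volume flow stands out
```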

AWS · Docker · Python · SQL · Apache Hadoop · Cloud Computing · Data Analysis · ETL · Hadoop · Java · Kafka · Keras · Kubernetes · Machine Learning · NumPy · PyTorch · Algorithms · Azure · Data science · Go · Rust · Spark · TensorFlow · Analytical Skills · Collaboration · Problem Solving

Posted 4 months ago
Apply
Showing 10 of 13 jobs.