Spark Jobs

Find remote positions requiring Spark skills. Browse opportunities where you can apply your expertise and grow your career.

Spark
184 jobs found.

Set alerts to receive daily emails with new job openings that match your preferences.

πŸ”₯ Staff Applied Scientist
Posted about 3 hours ago

πŸ“ United States, Canada

🧭 Full-Time

πŸ’Έ 162,400 - 223,300 CAD per year

πŸ” Software Development

  • Masters or PhD in Computer Science or other quantitative field (e.g., Applied Math, Engineering, Computer Science, Physics)
  • 8+ years experience as a Scientist or Machine Learning Engineer
  • Proficiency in self-serving with data for experiments and model training at scale
  • Proficient with Spark, Ray, or a similar framework
  • Coding in python or similar
  • Strong functional knowledge of the iterative machine learning product development process
  • Experienced in developing and shipping production code
  • Ability to distill informal or ambiguous customer and business requirements into crisp problem definitions
  • Proven ability to communicate verbally and in writing to technical peers and leadership teams with various levels of technical knowledge
  • Experience coaching and mentoring scientists
  • Lead design and implementation of critical AI product initiatives
  • Develop both tactical AI solutions as well as more strategic and longer term research
  • Work with petabyte-scale data from customer operations including text, transactions, diagnostics, sensor, camera, and location data
  • Partner across business units to explore and prototype new AI experiences
  • Stay connected to industry and academic research and adopt novel technology that suits Samsara’s needs.
  • Champion, role model, and embed Samsara’s cultural principles (Focus on Customer Success, Build for the Long Term, Adopt a Growth Mindset, Be Inclusive, Win as a Team) as we scale globally and across new offices

AWS, Leadership, Project Management, Python, SQL, Artificial Intelligence, Cloud Computing, Data Analysis, Data Mining, Git, Image Processing, Machine Learning, Numpy, Algorithms, Data science, Data Structures, REST API, Pandas, Spark, Tensorflow, Communication Skills, Analytical Skills, Collaboration, CI/CD, Problem Solving, Mentoring, Linux, DevOps, Written communication, Microservices, Data visualization, Data modeling, Software Engineering, Customer Success

Apply
πŸ”₯ Staff Applied Scientist
Posted about 3 hours ago

πŸ“ United States, Canada

🧭 Full-Time

πŸ’Έ 165,200 - 295,000 USD per year

πŸ” IoT, AI

  • Masters or PhD in Computer Science or other quantitative field
  • 8+ years experience as a Scientist or Machine Learning Engineer
  • Proficiency in self-serving with data for experiments and model training at scale
  • Proficient with Spark, Ray, or a similar framework
  • Strong functional knowledge of the iterative machine learning product development process
  • Lead design and implementation of critical AI product initiatives
  • Develop both tactical AI solutions and longer-term research
  • Work with petabyte-scale data from customer operations

Python, Machine Learning, Spark

Apply

πŸ“ Canada

🧭 Full-Time

πŸ’Έ 98,400 - 137,800 CAD per year

πŸ” Software Development

  • A degree in Computer Science or Engineering, and senior-level experience in developing and maintaining software or an equivalent level of education or work experience, and a track record of substantial contributions to software projects with high business impact.
  • Experience planning and leading a team using Scrum agile methodology ensuring timely delivery and continuous improvement.
  • Experience liaising with various business stakeholders to understand their data requirements and convey the technical solutions.
  • Experience with data warehousing and data modeling best practices.
  • Passionate interest in data engineering and infrastructure; ingestion, storage and compute in relational, NoSQL, and serverless architectures
  • Experience developing data pipelines and integrations for high volume, velocity and variety of data.
  • Experience writing clean code that performs well at scale; ideally experienced with languages like Python, Scala, SQL and shell script.
  • Experience with various types of data stores, query engines and data frameworks, e.g. PostgreSQL, MySQL, S3, Redshift, Presto/Athena, Spark and dbt.
  • Experience working with message queues such as Kafka and Kinesis
  • Experience with ETL and pipeline orchestration such as Airflow, AWS Glue
  • Experience with JIRA in managing sprints and roadmaps
  • Lead development and maintenance of scalable and efficient data pipeline architecture
  • Work within cross-functional teams, including Data Science, Analytics, Software Development, and business units, to deliver data products and services.
  • Collaborate with business stakeholders and translate requirements into scalable data solutions.
  • Monitor and communicate project statuses while mitigating risk and resolving issues.
  • Work closely with the Senior Manager to align team priorities with business objectives.
  • Assess and prioritize the team's work, appropriately delegating to others and encouraging team ownership.
  • Proactively share information, actively solicit feedback, and facilitate communication, within teams and other departments.
  • Design, write, test, and deploy high quality scalable code.
  • Maintain high standards of security, reliability, scalability, performance, and quality in all delivered projects.
  • Contribute to shape our technical roadmap as we scale our services and build our next generation data platform.
  • Build, support and lead a high performance, cohesive team of developers, in close partnership with the Senior Manager, Data Analytics.
  • Participate in the hiring process, with an aim of attracting and hiring the best developers.
  • Facilitate ongoing development conversations with your team to support their learning and career growth.

AWS, Leadership, PostgreSQL, Project Management, Python, Software Development, SQL, Agile, ETL, Kafka, Kubernetes, MySQL, SCRUM, Jira, Airflow, Algorithms, Data engineering, Data Structures, Spark, Communication Skills, CI/CD, Mentoring, Coaching, Scala, Team management, Data modeling, Data analytics

Apply

πŸ“ United States, Canada

🧭 Full-Time

πŸ” Software Development

  • 4+ years of backend development experience
  • Proficiency in Python or Typescript
  • Strong experience with AWS services
  • Familiarity with data engineering tools
  • Solid understanding of ELT/ETL processes
  • Build new features for the REST API
  • Work with product management, designers, and QA team
  • Optimize application performance
  • Develop features using AWS tools
  • Automate deployments and CI/CD pipelines

AWS, Backend Development, Node.js, Apache Airflow, DynamoDB, ETL, Kafka, TypeScript, REST API, Spark, CI/CD, Terraform

Posted about 6 hours ago
Apply
πŸ”₯ Staff Data Engineer
Posted about 6 hours ago

πŸ“ United States

πŸ” Software Development

  • 5+ years of work experience as a data engineer/full stack engineering, coding in Python.
  • 5+ years of experience building web scraping tools in python, using Beautiful Soup, Scrapy, Selenium, or similar tooling
  • 3-5 years of deployment experience with CI/CD
  • Strong experience of HTML, CSS, JavaScript, and browser behavior.
  • Experience with RESTful APIs and JSON/XML data formats.
  • Knowledge of cloud platforms and containerization technologies (e.g., Docker, Kubernetes).
  • Advanced understanding of how at least one big data processing technology works under the hood (e.g. Spark / Hadoop / HDFS / Redshift / BigQuery / Snowflake)
  • Use modern tooling to build robust, extensible, and performant web scraping platform
  • Build thoughtful and reliable data acquisition and integration solutions to meet business requirements and data sourcing needs.
  • Deliver best in class infrastructure solutions for flexible and repeatable applications across disparate sources.
  • Troubleshoot, improve and scale existing data pipelines, models and solutions
  • Build upon data engineering's CI/CD deployments, and infrastructure-as-code for provisioning AWS and 3rd party (Apify) services.

AWS, Docker, Python, SQL, Apache Airflow, HTML, CSS, Javascript, Kubernetes, Data engineering, Selenium, Spark, CI/CD, RESTful APIs, JSON

Apply
πŸ”₯ Staff Data Engineer
Posted about 6 hours ago

πŸ“ United States, Canada

🧭 Full-Time

πŸ’Έ 177,000 - 239,000 USD per year

πŸ” Software Development

  • 8+ years of professional software engineering experience
  • 7+ years building data processing applications
  • Hands-on experience with Java, Scala, Python
  • Experience with Big Data engines like Hive, Spark
  • Familiarity with AWS, Azure, or GCP
  • Design, develop, and automate data processing systems
  • Build data engineering strategy
  • Mentor Analytics & Data Engineers
  • Establish architectural roadmaps

AWS, Python, Java, MySQL, Snowflake, Apache Kafka, Data engineering, Postgres, Spark, Scala, Data modeling

Apply

πŸ“ United States

🧭 Full-Time

πŸ’Έ 190,000 - 215,000 USD per year

πŸ” SaaS Security

  • 8+ years of professional software engineering experience
  • Hands-on development in Python or Go
  • Experience with streaming platforms like Kafka
  • Proficiency in cloud-native development
  • Strong understanding of SaaS and security operations is a plus
  • Architect & implement high-throughput data pipelines
  • Drive technical direction and system design
  • Lead end-to-end execution of complex projects
  • Mentor and coach junior and mid-level engineers
  • Champion engineering excellence and reliability

Python, Cloud Computing, Kafka, Software Architecture, Clickhouse, Data engineering, Go, Spark, CI/CD, Microservices, SaaS

Posted about 23 hours ago
Apply
πŸ”₯ Data Engineer
Posted 1 day ago

πŸ“ United States

πŸ’Έ 201,571.50 - 240,000 USD per year

🏒 Company: Mercari (πŸ‘₯ 101-250, πŸ’° $46,933,777 raised almost 7 years ago, πŸ«‚ last layoff 7 months ago). Industries: Internet, Marketplace, E-Commerce, Mobile.

  • Bachelor of Science degree in Computer Science or closely related field of study and five (5) years of experience as a Data Engineer, Data Specialist, or related occupation where required experience gained.
  • 5 years of experience in the following: Confluence; JIRA; GIT; CI/CD pipelines; Agile methodologies; Java; Python; Data Modeling or Data Warehouse; ETL: Apache Airflow; Container: Docker or Kubernetes; API: gRPC, Tensorflow Serving, or Flask (REST); Database: MySQL, Postgres, Oracle, SqlServer, or Google Spanner; Distributed Processing: Apache Beam or Apache Spark; Machine Learning: Tensorflow, Keras, or Scikit-Learn, etc.; and Cloud: Google Cloud (BigQuery, Google Dataflow, or Google Dataproc, etc.).
  • Design, build, and operate ETL pipelines at scale.
  • Automate processes related to data products and machine learning products.
  • Design data structure for data products.
  • Build knowledge graphs, flow charts, and system diagrams for problem analysis.
  • Develop and operate API/tools related to data products and machine learning products.
  • Provide technical solutions using Big Data technologies and create technical design documents.
  • Design and develop Data Platform using Python, Spark, and BigQuery.
  • Build Devops platform for Continuous Integration/Continuous deployment stack for ETL applications teams.
  • Profile, debug, and optimize apps.

Docker, Python, Apache Airflow, Cloud Computing, ETL, Flask, Git, Java, Keras, Kubernetes, Machine Learning, MySQL, Oracle, Jira, Data engineering, gRPC, Postgres, Spark, Tensorflow, CI/CD, Agile methodologies, DevOps, Data modeling, Confluence

Apply

πŸ“ Latin America

🧭 Full-Time

πŸ” Insurance Industry

🏒 Company: Nearsure (πŸ‘₯ 501-1000). Industries: Staffing Agency, Outsourcing, Software.

  • Bachelor's Degree in Computer Science or related field
  • 5+ years experience with Python and Scala for data engineering
  • 5+ years experience with AWS or GCP
  • 3+ years experience with Kubernetes
  • Expert in SQL programming
  • Design, develop, maintain, and enhance data engineering solutions
  • Build scalable data pipelines with focus on quality
  • Ingest and transform structured and unstructured data
  • Automate existing code and processes

AWS, Python, SQL, Apache Airflow, ETL, Kubernetes, Spark, Scala

Posted 1 day ago
Apply

πŸ“ United States

πŸ’Έ 152,000 - 213,000 USD per year

πŸ” Financial Services

🏒 Company: Gemini (πŸ‘₯ 501-1000, πŸ’° $1,000,000 Secondary Market over 2 years ago, πŸ«‚ last layoff about 2 years ago). Industries: Cryptocurrency, Web3, Financial Services, Finance, FinTech.

  • 4+ years of work experience in analytics and data science domain focusing on financial services-related business problems.
  • 3+ years of experience deploying statistical and machine learning models in production.
  • 2+ years of experience in integrating data science models into applications.
  • Proven experience in developing and deploying ML models at scale, with a deep understanding of model lifecycle management.
  • Knowledge and experience of crypto exchange trading, financial markets, or banking.
  • Extensive knowledge of ML frameworks (Sagemaker or ML Flow) , libraries, data structures, data modeling, and software architecture.
  • Advanced skills with SQL are a must.
  • Proficient in Python.
  • Experience with one or more big data tools and technologies like Snowflake, Databricks, S3, Hadoop, Spark.
  • Experienced in working collaboratively across different teams and departments.
  • Strong technical and business communication.
  • Design and develop Trust & Safety machine learning and AI models to optimize across fraud, crypto exchange trading, and anti money laundering.
  • Distill complex models and analysis into compelling insights for our stakeholders and executives.
  • Analyze large and complex datasets to identify patterns for feature engineering, trends, and anomalies and develop predictive models that can be used for decision-making.
  • Collaborate with software developers to design and implement machine learning systems that can improve the speed and accuracy of the machine learning models.
  • Monitor and analyze the performance of our machine learning models and systems and make necessary improvements to ensure their effectiveness.
  • Stay up-to-date with data science tools and methodologies in technology and financial domain.
  • Perform root cause analysis and resolve production and data issues.

AWS, Python, SQL, Data Analysis, Git, Machine Learning, MLFlow, Snowflake, Software Architecture, Algorithms, Data science, Data Structures, Spark, Communication Skills, Analytical Skills, Problem Solving, RESTful APIs, Data modeling

Posted 1 day ago
Apply
Shown 10 out of 184