Prometheus Jobs

Find remote positions requiring Prometheus skills. Browse through opportunities where you can utilize your expertise and grow your career.

167 jobs found.

Set alerts to receive daily emails with new job openings that match your preferences.


πŸ“ India

πŸ” Data and cloud solutions

  • A minimum of 2 years of experience in a similar role.
  • Ability to operate in a 24x7 operational environment, including shifts, weekends, and holidays.
  • GCP background experience in a Linux environment.
  • Familiarity with Windows, Linux, and Network administration concepts.
  • Ability to handle stressful situations calmly.
  • Excellent English communication skills, both verbal and written.
  • Ability to multitask and respond quickly to issues.
  • A curious mindset to diagnose problems.
  • SQL Server knowledge is an asset.
  • Experience with monitoring tools such as ICINGA, Zabbix, Prometheus, Grafana.
  • Knowledge of Shell Scripting/Programming for system analysis and process improvements.
  • Relevant courses and certifications like MCSE, CCNA.

  • Network Operations Center (NOC) support in a 24x7 environment.
  • Monitor application performance and system alerts using various monitoring tools.
  • Respond to incidents and escalate issues according to established protocols.
  • Trigger operational procedures for various types of production incidents.
  • Participate actively in customer incidents by providing data, communication and updates.
  • Maintain knowledge of ITSM Tools and operational procedures.
  • Coordinate and administer solutions based on needs.
  • Recommend best practices for operational improvements.
  • Configure and maintain servers at database and infrastructure level.
  • Troubleshoot operational problems and perform daily checks.
  • Communicate status and planning with customers and team members.
  • Participate in rotational shift pattern and on-call coverage.
  • Learn database technologies and provide support.

GCP, Zabbix, Grafana, Prometheus, Linux

Posted 44 minutes ago

πŸ“ Spain

🧭 Full-Time

💸 90,000 - 110,000 USD per year

🔍 Machine Learning

🏒 Company: Constructor

  • Experience in designing, developing & maintaining high-load distributed real-time services (in cloud).
  • Proficiency in Infrastructure as Code (IaC) tools like CloudFormation or Terraform.
  • Experience with MLOps for delivering, loading, and serving ML models.
  • Hands-on experience with CI/CD pipelines.
  • Proficiency with Python and familiarity with compiled languages like C, Rust, or Go.
  • Experience in server-side coding for web services and knowledge of API design principles.
  • Skilled in observability tools like Prometheus and Grafana.
  • Familiarity with Service-Oriented Architecture and communication protocols like Protobuf.
  • Experience with NoSQL and relational databases, distributed systems, and caching solutions.
  • Experience with major public cloud platforms like AWS, Azure, GCP.

  • Design, deliver & maintain high-load real-time web services in collaboration with Ranking team engineers.
  • Build, deploy, and support robust machine learning-based real-time systems for search and browse experiences.
  • Collaborate with business partners to develop and update ranking functionalities.
  • Optimize performance of the ranking service and ensure quick processing of requests.
  • Enhance signals delivery and retrieval for machine learning model inference.
  • Communicate with stakeholders within and outside the team.

AWS, Python, FastAPI, Go, Grafana, Prometheus, REST API, Redis, NoSQL, Rust, CI/CD, Terraform

Posted about 19 hours ago
🔥 DevOps Engineer

πŸ“ Slovakia, Ukraine

πŸ” EdTech, Fintech, eCommerce, Pharma

🏒 Company: Altamira.ai

  • Experience: Minimum 4 years in DevOps, SRE, or similar roles.
  • Cloud Expertise: Hands-on experience with AWS, Azure, and GCP.
  • Automation Skills: Proficiency in scripting languages like Python, Bash, or Shell.
  • Containerization: In-depth knowledge of Docker and Kubernetes.
  • IaC Tools: Practical experience with Terraform, Ansible, and CloudFormation.
  • Networking: Strong understanding of networking concepts.
  • Monitoring Expertise: Experience configuring observability solutions.
  • Security: Proven track record in implementing security measures.
  • Collaboration: Strong interpersonal and communication skills.

  • Cloud Infrastructure Management: Build, deploy, and manage cloud-based infrastructure.
  • Automation and CI/CD: Develop and optimize CI/CD pipelines for AI workloads.
  • Infrastructure as Code (IaC): Utilize tools for consistent infrastructure deployments.
  • Containerization and Orchestration: Implement containerized environments using Docker and Kubernetes.
  • Monitoring and Observability: Set up and manage monitoring, logging, and alerting solutions.
  • Security and Compliance: Implement best practices in security and compliance.
  • Collaboration: Partner with cross-functional teams to align infrastructure with project needs.
  • Continuous Improvement: Research and adopt emerging DevOps tools and methodologies.

AWS, Docker, Bash, GCP, Kubernetes, Azure, Grafana, Prometheus, CI/CD, Terraform, Networking, Ansible

Posted about 21 hours ago

πŸ“ Slovakia, Ukraine, Middle East

πŸ” EdTech, Fintech, eCommerce, Pharma

  • Minimum 4 years in DevOps, SRE, or similar roles with a focus on cloud environments.
  • Hands-on experience with AWS, Azure, and GCP, including Kubernetes and AI/ML services.
  • Proficiency in scripting languages like Python, Bash, or Shell, and tools like Jenkins.
  • In-depth knowledge of Docker and Kubernetes for orchestration.
  • Practical experience with Terraform, Ansible, and CloudFormation.
  • Strong networking knowledge, including DNS and load balancing.
  • Experience with observability solutions like Grafana or Datadog.
  • Track record in implementing security measures for cloud and CI/CD.

  • Build, deploy, and manage cloud-based infrastructure on platforms like AWS, Azure, and GCP for large-scale AI applications.
  • Develop and optimize CI/CD pipelines for automated software delivery.
  • Utilize IaC tools like Terraform and Ansible for consistent infrastructure deployments.
  • Implement Docker and Kubernetes environments for efficient resource utilization.
  • Set up monitoring and observability solutions to maintain system performance.
  • Enforce security and compliance best practices across infrastructure and processes.
  • Collaborate with AI, data science, and development teams for project alignment.
  • Research and adopt new DevOps tools and methodologies.

AWS, Docker, Python, Bash, GCP, Kubernetes, Azure, Grafana, Prometheus, CI/CD, Terraform, Networking, Ansible

Posted 1 day ago

πŸ“ US, Ireland

πŸ” E-commerce

  • Proven experience leading high-impact projects with cross-functional teams and setting architectural direction.
  • Extensive experience designing, optimizing, and implementing scalable APIs, services, and applications.
  • Exceptional skills in verbal, written, and interpersonal communication.
  • Proficiency in server-side programming languages (e.g., Go, Python, Java) and database design.
  • Experience with MVC frameworks (e.g., Django, .NET, Spring) and relevant tools.

  • Lead the design and deployment of scalable, resilient software services that handle millions of requests daily, ensuring they meet SLAs and complex business needs.
  • Optimize label generation and streamline refund processes.
  • Mentor team members while driving technical excellence.
  • Participate in peer reviews and testing, contributing to high-quality standards through automated test suites.
  • Provide effective on-call support to address system incidents.

AWS, Docker, Python, Django, Kubernetes, Go, Grafana, gRPC, Prometheus, REST API, CI/CD, Microservices

Posted 1 day ago
🔥 DevOps Engineer

πŸ“ India

πŸ” Software Testing

  • Strong knowledge in Python and Bash (or similar Unix shell).
  • Working experience with Ansible, Terraform, Docker, Kubernetes, Prometheus, and AWS.
  • Nice to have: experience with virtualization tools such as KVM or ESXi.
  • Good knowledge of Linux operating systems and networking concepts.
  • Drive to understand complex infrastructure environments.
  • Aggressive problem diagnosis and creative problem-solving skills.
  • Startup mentality, high willingness to learn, hardworking attitude.
  • 1-3 years of experience.

  • Work on AWS Kubernetes to manage global clusters.
  • Identify areas of improvement in frameworks and tools and evolve reporting systems.
  • Lead incident response efforts with cross-functional teams to resolve issues.
  • Participate in on-call rotation for infrastructure issue resolution.
  • Collaborate with stakeholders to implement effective solutions.
  • Support customers during onboarding to optimize performance and reliability.
  • Enhance platform capabilities with a focus on reliability and scalability.

AWS, Docker, Python, Bash, GCP, Kubernetes, Prometheus, CI/CD, Linux, Terraform, Ansible

Posted 2 days ago

πŸ“ US

🧭 Full-Time

πŸ’Έ 175000.0 - 215000.0 USD per year

πŸ” CTV advertising

🏒 Company: tvScientificπŸ‘₯ 11-50πŸ’° $9,400,000 Convertible Note 11 months agoInternetAdvertising

  • Strong proficiency in automation tools and scripting languages (Python, Bash).
  • Extensive experience with cloud platforms (especially AWS) and infrastructure as code (Terraform, CloudFormation).
  • In-depth knowledge of CI/CD tools (Jenkins, GitHub Actions) and practices.
  • Familiarity with containerization technologies (Docker, Kubernetes) and orchestration.
  • Familiarity with deployment methodologies (e.g. Blue-Green, Canary) and tools (e.g. Argo Rollouts).
  • Experience with monitoring and logging tools (CloudWatch, OpenTelemetry, Grafana, Prometheus, ELK stack).
  • Solid understanding of networking concepts and security best practices.
  • Excellent problem-solving and troubleshooting skills.
  • Strong communication and collaboration abilities, especially in cross-functional teams.
  • Bachelor's degree in Computer Science, Engineering, or a related field.

  • Collaborate across teams to optimize software development.
  • Implement and maintain scalable, reliable, and secure infrastructure as code practices.
  • Design, build, and manage CI/CD pipelines for automated testing and deployment.
  • Evaluate and recommend tools for improved DevOps efficiency.
  • Provide technical leadership and mentorship for continuous improvement.
  • Proactively address bottlenecks, performance issues, and security vulnerabilities.
  • Participate in on-call rotations and incident response for system reliability.
  • Monitor system metrics and troubleshoot issues for optimal performance.

AWS, Docker, Python, Bash, Jenkins, Kubernetes, Grafana, Prometheus, CI/CD, Terraform

Posted 2 days ago
🔥 DevOps Engineer

πŸ“ Americas

πŸ’Έ 2000.0 - 8000.0 USD per month

🏒 Company: AltScore

  • 2+ years of experience in DevOps, system administration, or cloud engineering.
  • Expertise in cloud platforms, including provisioning and managing resources.
  • Experience with containerization tools like Docker and Kubernetes.
  • Strong understanding of infrastructure automation.
  • Familiarity with CI/CD tools for deployment pipelines, preferably GitHub Actions.
  • Proficient in scripting and automation using Bash and Python.
  • Strong understanding of network configurations, firewalls, load balancers, and DNS management.
  • Experience in monitoring, alerting, and logging systems, such as Prometheus, Grafana, ELK stack.
  • Knowledge of security practices related to cloud infrastructure, including IAM and encryption.
  • Strong problem-solving, debugging, and performance optimization skills.
  • Excellent communication skills in Spanish and English.
  • Ability to work independently and remotely in a fast-paced environment.

  • Design, implement, and maintain scalable infrastructure for cloud-based applications.
  • Automate deployments and CI/CD pipelines to ensure seamless integration and continuous delivery.
  • Collaborate with development teams to define infrastructure requirements and best practices.
  • Manage cloud resources ensuring optimal cost, performance, and security.
  • Troubleshoot issues related to system performance, network configurations, and uptime.
  • Implement monitoring, alerting, and logging solutions for system reliability.
  • Ensure infrastructure security and compliance with best practices.
  • Develop disaster recovery plans and redundancy measures.
  • Optimize workflows to improve efficiency and reduce manual intervention.
  • Participate in on-call rotations for operational support.
  • Stay updated with industry trends to improve DevOps practices.

AWS, Docker, Python, Bash, GCP, Kubernetes, Azure, Grafana, Prometheus, CI/CD, Terraform, Networking, Troubleshooting

Posted 4 days ago

πŸ“ United States

🧭 Contract

πŸ’Έ 50.0 - 60.0 USD per hour

πŸ” Cloud Infrastructure

🏒 Company: Third Eye SoftwareπŸ‘₯ 11-50ConsultingInformation TechnologyRecruitingSoftware

  • 3-5 years of hands-on professional experience in a Cloud, Infrastructure, or Systems Engineering role.
  • Proficiency with Google Cloud Platform (GCP) services, including deployment and management of resources.
  • Expertise with Kubernetes for deploying, managing, and maintaining production clusters.
  • Strong proficiency with Terraform for infrastructure-as-code practices.
  • Experience with monitoring and logging tools such as Prometheus and Grafana.
  • Familiarity with CI/CD tools like GitHub Actions.
  • Knowledge of networking concepts and protocols such as network setup, IPs, and namespaces.
  • Strong problem-solving skills and attention to detail.
  • Outstanding communication skills for effective teamwork.
  • Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent experience).

  • Design, set up, and maintain cloud-based infrastructure including clusters, namespaces, networks, and IP management.
  • Support the development and optimization of internal tools, improving developer onboarding and automating workflows.
  • Contribute to backend automation, CI/CD pipelines, and tools to enhance productivity and reliability.
  • Work closely with cross-functional teams to address technical challenges and support project deliverables.
  • Provide expertise in GCP deployments and ensure smooth migration processes.
  • Troubleshoot and resolve issues with GCP services, Kubernetes deployments, Terraform configurations, and other cloud technologies.
  • Create and maintain documentation for best practices, troubleshooting procedures, and internal training.
  • Collaborate with team leads to align infrastructure strategies with project goals.

GCP, Kubernetes, Grafana, Prometheus, CI/CD, Terraform, Networking

Posted 4 days ago

πŸ“ Armenia

🧭 Full-Time

πŸ” Technology

  • Minimum 5 years of experience as a DevOps or SRE engineer.
  • Proven experience with Azure cloud architectures.
  • Proficiency in Kubernetes and Docker/Linux services.
  • Familiarity with monitoring tools: Prometheus, Grafana, OpsGenie.
  • Experience with .NET Core and ASP.NET Core applications.
  • Strong knowledge of Cosmos DB (both Mongo API & SQL API) and MS SQL Server.
  • Expertise in Terraform.
  • Experience with CI/CD tools and Azure Networking concepts.
  • Excellent communication skills, ability to manage tasks and projects independently.
  • Experience with Azure IoT Hub and EventHub is an added advantage.

  • Design, implement, and maintain highly scalable and available systems across Azure cloud architectures.
  • Regularly test and implement disaster recovery (DR) plans.
  • Configure and enhance monitoring and alerting processes using Prometheus, Grafana, and OpsGenie.
  • Develop dashboards to visualize system performance and reliability metrics.
  • Use Terraform for infrastructure provisioning and management.
  • Support the development team in ongoing projects.
  • Communicate with the customer’s DevOps team to discuss requirements and collaborate on implementations.
  • Enhance release management and CI/CD processes.
  • Improve system security based on security team recommendations.
  • Document system support processes and design, write and test runbooks for operational tasks and incident response.

Docker, Kubernetes, Microsoft SQL Server, Azure, Grafana, .NET Core, Prometheus, CI/CD, Terraform

Posted 4 days ago
Showing 10 of 167 jobs.