Senior Site Reliability Engineer (SRE)

Posted 3 months ago

💎 Seniority level: Senior, 5-7 years

📍 Location: United States, Canada, U.S. time zones

🔍 Industry: Site Reliability Engineering

🗣️ Languages: English

⏳ Experience: 5-7 years

🪄 Skills: Python, Debugging

Requirements:
  • 5-7 years in Site Reliability Engineering
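  • Experience with reliability methodologies: Design for Reliability (DFR), Failure Mode and Effects Analysis (FMEA), and Mean Time Between Failures (MTBF)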
  • Proficiency with monitoring tools such as Datadog and PagerDuty
  • Strong coding skills in languages used in SRE
Responsibilities:
  • Identify and resolve complex bugs
  • Write and maintain code for system reliability
  • Investigate complex system issues
  • Design and build fault-tolerant systems
  • Develop and maintain reliability standards

Related Jobs

📍 US, Portugal

🧭 Full-Time

🔍 Health Technology

Requirements:
  • Proficiency in programming languages such as Python, Go, and JavaScript.
  • 5+ years of experience with cloud platforms such as AWS, Google Cloud, or Azure.
  • Strong understanding of Linux/Unix systems and networking.
  • Familiarity with containerization and orchestration tools (e.g., Docker, Kubernetes).
  • Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
  • Knowledge of CI/CD pipelines and tools (e.g., Jenkins, GitLab CI).
  • Proficiency with relational and NoSQL databases (e.g., MySQL, PostgreSQL, Redis, Elasticsearch).
  • Willingness to collaborate and share knowledge with colleagues.
  • Ability to take responsibility for work and demonstrate accountability.
Responsibilities:
  • Develop and maintain monitoring and alerting solutions.
  • Respond to incidents, troubleshoot issues, and perform root cause analysis.
  • Automate repetitive tasks and improve deployment processes.
  • Develop and maintain tools to support infrastructure and applications.
  • Analyze system performance and implement optimizations to improve efficiency and reduce latency.
  • Ensure systems are secure and compliant with relevant standards and regulations.
  • Maintain comprehensive documentation of systems and processes.
  • Share knowledge and best practices with team members.
  • Ensure the reliability, performance, and scalability of databases.
  • Perform database optimization, maintenance, and troubleshooting.

AWS, Docker, PostgreSQL, Python, Elasticsearch, JavaScript, Jenkins, Kubernetes, MySQL, Azure, Go, Grafana, Prometheus, Redis, NoSQL, CI/CD

Posted 4 months ago