Senior Site Reliability Engineer (SRE)

Posted 4 months ago

💎 Seniority level: Senior, 5+ years

📍 Location: US, Portugal

🔍 Industry: Health Technology

🗣️ Languages: English

⏳ Experience: 5+ years

🪄 Skills: AWS, Docker, PostgreSQL, Python, Elasticsearch, JavaScript, Jenkins, Kubernetes, MySQL, Azure, Go, Grafana, Prometheus, Redis, NoSQL, CI/CD

Requirements:
  • Proficiency in programming languages such as Python, Go, or JavaScript.
  • 5+ years of experience with cloud platforms such as AWS, Google Cloud, or Azure.
  • Strong understanding of Linux/Unix systems and networking.
  • Familiarity with containerization and orchestration tools (e.g., Docker, Kubernetes).
  • Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
  • Knowledge of CI/CD pipelines and tools (e.g., Jenkins, GitLab CI).
  • Proficiency with relational and NoSQL databases (e.g., MySQL, PostgreSQL, Redis, Elasticsearch).
  • Willingness to collaborate and share knowledge with colleagues.
  • Ability to take responsibility for work and demonstrate accountability.
Responsibilities:
  • Develop and maintain monitoring and alerting solutions.
  • Respond to incidents, troubleshoot issues, and perform root cause analysis.
  • Automate repetitive tasks and improve deployment processes.
  • Develop and maintain tools to support infrastructure and applications.
  • Analyze system performance and implement optimizations to improve efficiency and reduce latency.
  • Ensure systems are secure and compliant with relevant standards and regulations.
  • Maintain comprehensive documentation of systems and processes.
  • Share knowledge and best practices with team members.
  • Ensure the reliability, performance, and scalability of databases.
  • Perform database optimization, maintenance, and troubleshooting.

Related Jobs

📍 United States, Canada

🧭 Contract

🔍 Site Reliability Engineering

  • 5-7 years in Site Reliability Engineering
  • Experience with DFR, FMEA, MTBF methodologies
  • Proficiency with monitoring and incident-response tools such as Datadog and PagerDuty
  • Strong coding skills in languages used in SRE
  • Identify and resolve complex bugs
  • Write and maintain code for system reliability
  • Investigate complex system issues
  • Design and build fault-tolerant systems
  • Develop and maintain reliability standards

Python, Debugging

Posted 3 months ago