Senior Site Reliability Engineer (SRE)
Posted 4 months ago
💎 Seniority level: Senior, 5+ years
📍 Location: US, Portugal
🔍 Industry: Health Technology
🗣️ Languages: English
⏳ Experience: 5+ years
🪄 Skills: AWS, Docker, PostgreSQL, Python, Elasticsearch, JavaScript, Jenkins, Kubernetes, MySQL, Azure, Go, Grafana, Prometheus, Redis, NoSQL, CI/CD
Requirements:
- Proficiency in programming languages such as Python, Go, and JavaScript.
- 5+ years of experience with cloud platforms such as AWS, Google Cloud, or Azure.
- Strong understanding of Linux/Unix systems and networking.
- Familiarity with containerization and orchestration tools (e.g., Docker, Kubernetes).
- Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
- Knowledge of CI/CD pipelines and tools (e.g., Jenkins, GitLab CI).
- Proficiency with relational and NoSQL databases (e.g., MySQL, PostgreSQL, Redis, Elasticsearch).
- Willingness to collaborate and share knowledge with colleagues.
- Ability to take responsibility for work and demonstrate accountability.
Responsibilities:
- Develop and maintain monitoring and alerting solutions.
- Respond to incidents, troubleshoot issues, and perform root cause analysis.
- Automate repetitive tasks and improve deployment processes.
- Develop and maintain tools to support infrastructure and applications.
- Analyze system performance and implement optimizations to improve efficiency and reduce latency.
- Ensure systems are secure and compliant with relevant standards and regulations.
- Maintain comprehensive documentation of systems and processes.
- Share knowledge and best practices with team members.
- Ensure the reliability, performance, and scalability of databases.
- Perform database optimization, maintenance, and troubleshooting.
Related Jobs
📍 United States, Canada
🧭 Contract
🔍 Site Reliability Engineering
- 5-7 years in Site Reliability Engineering
- Experience with DFR, FMEA, MTBF methodologies
- Proficiency with monitoring tools like DataDog, PagerDuty
- Strong coding skills in languages used in SRE
- Identify and resolve complex bugs
- Write and maintain code for system reliability
- Investigate complex system issues
- Design and build fault-tolerant systems
- Develop and maintain reliability standards
Python, Debugging
Posted 3 months ago