Monitoring Support Engineer
Posted 29 days ago
Requirements:
- Experience querying SQL databases for analytical and diagnostic purposes.
- Experience building and maintaining modern monitoring, analysis, and diagnostics systems, e.g. Prometheus, Zabbix, Nagios, Splunk, ELK, Grafana.
- Scripting experience, e.g. Bash, Python, or similar, for writing new Nagios plugins.
- Experience with monitoring tools like Splunk, Datadog, OpsGenie or similar platforms.
- Strong knowledge of AWS CloudWatch and AWS Lambda for setting up alerts and monitoring workflows.
- Extensive hands-on experience with Prometheus and Grafana for advanced monitoring and dashboard management.
- Proficiency in OpenSearch, Kibana, or equivalent log management tools like Graylog.
- Expertise in Kubernetes (K8s) monitoring, preferably with a Certified Kubernetes Administrator (CKA) certification.
- AWS Solutions Architect (AWS SA) certification preferred.
Responsibilities:
- Monitor ACS and Authentication systems and alerts to identify potential incidents.
- Respond promptly to alerts and investigate incidents.
- Deliver timely analysis and investigation, and present key insights to internal stakeholders.
- Collaborate with internal teams to resolve escalated issues quickly and efficiently.
- Assist in building and refining system alerts and dashboards to improve detection capabilities.
- Contribute to enhancing incident response workflows and escalation protocols.
- Provide feedback to internal teams to improve system reliability and reduce false positives.