Senior Reliability Engineer

Posted 2024-10-08

💎 Seniority level: Senior, 7+ years

📍 Location: Canada

🔍 Industry: Financial technology

🏢 Company: Flinks

⏳ Experience: 7+ years

🪄 Skills: Docker, Project Management, GCP, Kibana, Kubernetes, C#, Strategy, Grafana, .NET, Prometheus, Documentation

Requirements:
  • Operationally focused with expertise in incident management and live production issue resolution.
  • Strong debugging and troubleshooting skills, particularly in performance optimization of large-scale applications.
  • Proven experience in building and maintaining monitoring and alerting systems.
  • 7+ years of experience using .NET Framework (C#) to maintain production stability.
  • Strong knowledge of Kubernetes, Docker, and cloud platforms like GCP.
  • Proficiency with monitoring tools such as Prometheus, Grafana, and Kibana.
  • Experience with incident ticketing/documentation tools like FreshDesk and Confluence.
  • Critical thinking ability to identify system weaknesses and innovate solutions.
  • Strong project management skills focused on scalability and stability.
  • ITIL Service Management certification (or equivalent) is highly desired.
  • Experience with PowerBI, web scraping, or Golang is a plus.
Responsibilities:
  • Provide live operational support for multiple client software applications, ensuring rapid restoration of services.
  • Develop and maintain code to quickly resolve production issues.
  • Own and resolve incidents, adhering to client SLA and internal SLO timelines.
  • Troubleshoot complex incidents and implement solutions to prevent recurrence.
  • Utilize data-driven approaches to prepare detailed analyses and reports.
  • Conduct deep technical analyses of product deficiencies and address client pain points.
  • Develop monitoring systems and implement robust alert mechanisms.
  • Provide guidance on improving operational system stability.
  • Lead initiatives that automate processes for operational efficiency.
  • Facilitate postmortem meetings following incidents.
  • Collaborate with cross-functional teams for rapid resolution of production issues.
  • Lead and motivate project teams to ensure quality standards.
  • Mentor reliability engineers and track their progress.
  • Participate in after-hours on-call support.