Senior DevOps Engineer

Checkmate
Restaurant Technology
India, 2 PM to 11 PM IST
Full-Time
Senior
Salary not disclosed

Job Details

Languages
English
Experience
8+ years
Required Skills
AWS, Bash, CI/CD, Linux, DevOps, Terraform, Ansible, GitHub Actions, Datadog

Requirements

  • 8+ years of professional DevOps, infrastructure, or platform engineering experience in production environments
  • Hands-on proficiency with Terraform for infrastructure provisioning — writing modules, managing state, and working across environments
  • Deep familiarity with AWS — including compute (EC2, ECS), storage (S3, RDS), networking (VPC, Route 53, CloudFront), and IAM
  • Experience with Ansible for configuration management and automation across server fleets or container environments
  • Strong understanding of CI/CD principles and hands-on experience building or maintaining pipelines (GitHub Actions, GitLab CI, CircleCI, or equivalent)
  • Experience with Linux system administration, shell scripting (Bash), and general infrastructure debugging
  • Demonstrated ability to work within an established infrastructure — understanding existing design decisions, following conventions, and improving incrementally rather than replacing wholesale
  • Solid grasp of security fundamentals: IAM least-privilege, secrets management, network access controls, and patching hygiene
  • Strong written and verbal communication skills in English — able to collaborate asynchronously across time zones and document work clearly
  • BSc in Computer Science, Engineering, or a related field — or equivalent professional experience
  • Must be comfortable working from 2 PM to 11 PM IST

Responsibilities

  • Design, implement, and maintain cloud infrastructure on AWS using Terraform and Ansible, following existing conventions and extending them thoughtfully.
  • Manage and support AWS services across our stack including EC2, ECS, RDS, S3, IAM, VPC, CloudFront, and related services.
  • Maintain and improve infrastructure-as-code practices, ensuring consistency, reproducibility, and auditability across environments.
  • Participate in capacity planning and cost optimization, identifying opportunities to improve resource efficiency without compromising reliability.
  • Build, maintain, and improve CI/CD pipelines (GitHub Actions or equivalent) to support reliable, automated delivery across development, staging, and production environments.
  • Work with engineering teams to improve build speed, deployment safety, and rollback capabilities.
  • Support blue/green and canary deployment strategies as appropriate for our platform needs.
  • Participate in the on-call rotation and own production incidents end-to-end, from detection through root cause analysis, resolution, and post-mortem.
  • Use observability tooling (Datadog, CloudWatch, or equivalent) to monitor system health, establish alerting thresholds, and proactively surface issues before they impact customers.
  • Contribute to runbooks, incident documentation, and process improvements that reduce mean time to resolution over time.
  • Apply security best practices across infrastructure — IAM policy scoping, secrets management, network segmentation, vulnerability patching, and access controls.
  • Support compliance and audit requirements by maintaining clear documentation and ensuring infrastructure changes are tracked and reviewable.
  • Work closely with the senior engineer on the team to learn existing systems deeply and contribute to architectural improvements over time.
  • Proactively identify areas for improvement — tooling, automation gaps, manual processes, reliability risks — and raise them constructively with the team.
  • Document infrastructure clearly so that other engineers can understand and operate the systems they depend on.