Senior DevOps Engineer

Checkmate, Restaurant Technology

India, 2 PM to 11 PM IST | Full-Time | Senior

Salary not disclosed

Job Details

Languages: English
Experience: 8+ years
Required Skills: AWS, Bash, CI/CD, Linux, DevOps, Terraform, Ansible, GitHub Actions, Datadog

Requirements

8+ years of professional DevOps, infrastructure, or platform engineering experience in production environments
Hands-on proficiency with Terraform for infrastructure provisioning — writing modules, managing state, and working across environments
Deep familiarity with AWS — including compute (EC2, ECS), storage (S3, RDS), networking (VPC, Route 53, CloudFront), and IAM
Experience with Ansible for configuration management and automation across server fleets or container environments
Strong understanding of CI/CD principles and hands-on experience building or maintaining pipelines (GitHub Actions, GitLab CI, CircleCI, or equivalent)
Experience with Linux system administration, shell scripting (Bash), and general infrastructure debugging
Demonstrated ability to work within an established infrastructure — understanding existing design decisions, following conventions, and improving incrementally rather than replacing wholesale
Solid grasp of security fundamentals: IAM least-privilege, secrets management, network access controls, and patching hygiene
Strong written and verbal communication skills in English — able to collaborate asynchronously across time zones and document work clearly
BSc in Computer Science, Engineering, or a related field — or equivalent professional experience
Must be comfortable working a 2 PM to 11 PM IST shift

Responsibilities

Design, implement, and maintain cloud infrastructure on AWS using Terraform and Ansible, following existing conventions and extending them thoughtfully.
Manage and support AWS services across our stack including EC2, ECS, RDS, S3, IAM, VPC, CloudFront, and related services.
Maintain and improve infrastructure-as-code practices, ensuring consistency, reproducibility, and auditability across environments.
Participate in capacity planning and cost optimization, identifying opportunities to improve resource efficiency without compromising reliability.
Build, maintain, and improve CI/CD pipelines (GitHub Actions or equivalent) to support reliable, automated delivery across development, staging, and production environments.
Work with engineering teams to improve build speed, deployment safety, and rollback capabilities.
Support blue/green and canary deployment strategies as appropriate for our platform needs.
Participate in the on-call rotation and own production incidents end-to-end, from detection through root cause analysis, resolution, and post-mortem.
Use observability tooling (Datadog, CloudWatch, or equivalent) to monitor system health, establish alerting thresholds, and proactively surface issues before they impact customers.
Contribute to runbooks, incident documentation, and process improvements that reduce mean time to resolution over time.
Apply security best practices across infrastructure — IAM policy scoping, secrets management, network segmentation, vulnerability patching, and access controls.
Support compliance and audit requirements by maintaining clear documentation and ensuring infrastructure changes are tracked and reviewable.
Work closely with the senior engineer on the team to learn existing systems deeply and contribute to architectural improvements over time.
Proactively identify areas for improvement — tooling, automation gaps, manual processes, reliability risks — and raise them constructively with the team.
Document infrastructure clearly so that other engineers can understand and operate the systems they depend on.