Senior DevOps Engineer (remote) - AWS Cloud Hosting Platform
Platinumlist | Event Ticketing
India, Pakistan, Nigeria, Belarus, Kenya | Full-Time | Senior
Salary not disclosed
Job Details
- Languages: English
- Experience: 10+ years
- Required Skills: AWS, Docker, PHP, Python, SQL, Amazon RDS, Bash, Cloud Computing, MySQL, Nginx, CI/CD, Linux, DevOps, Terraform, Networking, Troubleshooting
Requirements
- 10+ years of experience in a similar role.
- Strong hands-on AWS experience in production (VPC, IAM, EC2, ALB/NLB, Auto Scaling, S3, CloudFront, Route53, CloudWatch/CloudTrail, WAF; Aurora/RDS).
- Proven experience designing/operating high-load web systems with strict uptime requirements.
- IaC and automation mindset (Terraform/CloudFormation/CDK plus scripting in Bash/Python).
- Production MySQL on AWS (Aurora/RDS): backups & restores, read replicas, monitoring, performance troubleshooting.
- Ability to troubleshoot production web stacks (Nginx + PHP-FPM) and identify bottlenecks.
- Containers and deployment automation (ECS/EKS, Docker; scaling and rollout patterns).
- Solid Linux and networking fundamentals (DNS, TLS, routing, load balancing, troubleshooting).
- Observability practices and incident management experience.
- Must be reachable for critical production incidents; occasional after-hours support may be required.
Responsibilities
- Own production reliability on AWS (availability, latency, throughput, capacity, incident response).
- Architect and operate scalable infrastructure (multi-AZ, DR strategy).
- Build and maintain Infrastructure as Code (Terraform/CloudFormation/CDK) and Git workflows.
- Improve CI/CD pipelines and deployment strategies.
- Implement strong observability (metrics, logs, traces, alerting, dashboards, SLO/SLI).
- Own database operations on AWS (Aurora/RDS MySQL: backups, restores, performance troubleshooting, capacity planning).
- Improve caching and traffic handling.
- Harden security posture (IAM, secrets management, patching, WAF, audit trails).
- Drive adoption of relevant AWS managed services.
- Drive cloud cost efficiency (FinOps).
- Lead post-incident reviews (RCA, corrective actions, prevention).