Staff DevOps Security Engineer
Fully remote role across Brazil and other LATAM countries
Full-Time
Staff
Salary not disclosed
Job Details
- Languages: Fluent English (B2+ or higher)
- Experience: 8+ years
- Required Skills: AWS, GCP, Kubernetes, Terraform, GitLab, Datadog, MLOps
Requirements
- 8+ years of experience in DevOps, Cloud Engineering, or SRE roles in SaaS or data-intensive environments
- Strong expertise in AWS, plus working knowledge of GCP or the ability to ramp up on it quickly
- Proven experience implementing SRE principles including SLOs, SLIs, on-call practices, and incident management
- Deep experience with CI/CD pipelines, especially GitLab CI/CD, and strong proficiency in Infrastructure as Code (Terraform preferred)
- Solid understanding of observability tooling such as Datadog, Prometheus, Grafana, and OpenTelemetry
- Hands-on experience with Kubernetes and container orchestration in production environments
- Experience building or supporting MLOps pipelines and infrastructure for ML workloads is highly valued
- Strong background in DevSecOps practices and cloud security implementation, including compliance frameworks like SOC 2
- AI-forward mindset with active use of AI tools to improve engineering efficiency and automation
- Strong communication skills with ability to document architecture decisions and align cross-functional teams
- Fluent English (B2+ or higher) required for daily collaboration
- Experience in AdTech, MarTech, or high-volume data platforms is a strong plus
Responsibilities
- Architect and scale multi-cloud infrastructure across AWS (primary) and GCP, supporting large-scale AI and data workloads
- Lead DevSecOps execution by implementing security controls, SOC 2 compliance requirements, and cloud security best practices
- Define and drive SRE practices, including SLO/SLI monitoring, incident response, postmortems, and error budget frameworks
- Build and optimize CI/CD pipelines using GitLab and GitOps methodologies to ensure safe and efficient deployments
- Design and implement observability solutions using metrics, logs, traces, and monitoring tools to improve system reliability
- Support and optimize MLOps infrastructure for machine learning pipelines and model deployment across platforms such as Vertex AI and SageMaker
- Manage containerized workloads using Kubernetes (EKS/GKE) and ECS with a focus on scalability and self-healing systems
- Collaborate across engineering, product, and data teams to ensure clear execution, dependency alignment, and architectural clarity
- Automate infrastructure processes and integrate AI-driven tools to reduce operational toil and improve delivery speed