Site Reliability Engineer, Cloud Cost Utilization
GitLab hires new team members in countries around the world. All of our roles are remote.
Full-Time
Salary not disclosed
Job Details
- Required Skills: AWS, GCP, Grafana, Prometheus, Terraform, Ansible
Requirements
- Hands-on experience with cloud cost management in GCP and/or AWS.
- Familiarity with cloud billing data, pricing models, and optimization approaches.
- Experience implementing cloud resource tagging and labeling strategies.
- Experience with infrastructure as code, specifically Terraform and Ansible.
- Familiarity with observability tooling, including Grafana.
- Experience working across technical and business functions (Engineering, Finance).
- Ability to explain technical cost data to non-engineering audiences.
- Familiarity with or interest in adopting the FinOps FOCUS specification.
- Self-directed work style suitable for an asynchronous, fully remote environment.
Responsibilities
- Design and maintain cloud resource tagging and labeling strategies across GCP and AWS.
- Develop tooling and pipelines to ingest, normalize, and report on cloud billing data using the FOCUS specification.
- Automate cost anomaly detection, forecasting, and alerting to help engineering teams respond to infrastructure spend changes.
- Contribute to observability and monitoring stacks (Prometheus, Loki, Grafana, Tempo, Mimir, ELK) with a focus on cost efficiency signals.
- Partner with Finance and Engineering leadership to support cloud cost forecasting and budget discussions.
- Serve as a subject matter expert for cloud cost attribution and tagging strategy.
- Collaborate with Finance and Compliance on audits and financial reporting related to cloud infrastructure.
- Contribute to infrastructure-as-code (Terraform, Ansible) to embed cost controls into provisioning.