Site Reliability Engineer, Cloud Cost Utilization
GitLab | DevSecOps
Remote, US | Full-Time | Middle
Salary not disclosed
Job Details
Required Skills
- AWS, GCP, Grafana, Prometheus, Terraform, Ansible
Requirements
- Hands-on experience with cloud cost management in GCP and/or AWS, including billing data and pricing models.
- Familiarity with or interest in adopting the FinOps FOCUS specification for multi-cloud cost analysis.
- Experience designing or implementing cloud resource tagging and labeling strategies.
- Comfort working across technical and business functions including Engineering and Finance.
- Proficiency with infrastructure as code, specifically Terraform and Ansible.
- Familiarity with observability tooling such as Grafana, and an understanding of the connection between reliability and cost signals.
- Ability to explain technical cost data clearly to non-engineering audiences.
- Ability to work effectively in a fully remote and asynchronous environment.
Responsibilities
- Design and maintain cloud resource tagging and labeling strategies across GCP and AWS to support accurate cost attribution.
- Develop tooling and pipelines to ingest, normalize, and report on cloud billing data using the FOCUS specification.
- Automate cost anomaly detection, forecasting, and alerting to enable engineering teams to respond to infrastructure spend changes.
- Contribute to observability and monitoring stacks (Prometheus, Loki, Grafana, Tempo, Mimir, ELK) with a focus on surfacing cost efficiency signals.
- Partner with Finance and Engineering leadership to support cloud cost forecasting and budget discussions.
- Act as a subject matter expert for cloud cost attribution and tagging strategy.
- Collaborate with Finance and Compliance teams on audits, certifications, and financial reporting.
- Contribute to infrastructure-as-code (Terraform, Ansible) to build cost controls into provisioning workflows.