Software Engineer, Production Support
Open to remote work for candidates located in the EST or CST time zones in the US
Full-Time, Entry level
Salary not disclosed
Job Details
- Experience: 2+ years of experience in software development, technical support, DevOps, or a related engineering role.
- Required Skills: AWS, PostgreSQL, GCP, MySQL, Ruby on Rails, Communication Skills, Problem Solving, Linux, Debugging, Datadog
Requirements
- Bachelor’s degree in Computer Science, Software Engineering, or equivalent practical experience.
- 2+ years of experience in software development, technical support, DevOps, or a related engineering role.
- Hands-on experience with Ruby on Rails (academic, internship, or professional).
- Comfort using a Rails console and understanding of Rails application structure.
- Basic working knowledge of relational databases (MySQL or PostgreSQL), including querying data.
- Strong problem-solving skills and the ability to debug issues methodically.
- Ability to learn new systems quickly and work effectively in a production environment.
- Clear written and verbal communication skills, especially during incident response.
- Willingness to explore and adopt AI tools responsibly to enhance productivity and innovation in your role.
Responsibilities
- Troubleshoot and resolve production incidents across Rails-based services and cloud infrastructure, working from alerts, logs, metrics, and user-reported issues.
- Use interactive application access tools, such as a Rails console, safely and effectively to inspect application state, diagnose issues, and validate fixes.
- Investigate and validate data directly in MySQL/PostgreSQL databases using read-only and controlled write access where appropriate.
- Create and maintain scripts, Rake tasks, and internal tools to streamline incident response, data verification, and operational workflows.
- Assist in incident response, including triage, escalation, documentation, and post-incident follow-ups.
- Collaborate with senior engineers and DevOps to identify root causes and propose long-term fixes.
- Build or enhance internal tools and dashboards that improve visibility into system health, data integrity, and operational risks.
- Monitor system health, key metrics, and operational risks using dashboards and APM tools such as Datadog, New Relic, and CloudWatch.
- Help improve runbooks, documentation, and operational playbooks for recurring issues.
- Gradually contribute to application code changes and bug fixes outside of active incident work.