Principal Platform Engineer

Posted 1 day ago

Seniority level: Principal, 7+ years

Location: United States

Salary: $145,000 - $185,000 USD per year

Industry: Software Development

🏒 Company: DomainToolsπŸ‘₯ 11-50Web HostingSecurityInformation TechnologyCyber Security

Languages: English

Experience: 7+ years

Skills: AWS, Python, Bash, Cloud Computing, Jenkins, Kafka, Kubernetes, Go, Prometheus, CI/CD, Linux, DevOps, Terraform, Networking, Ansible, Scripting, SaaS

Requirements:
  • 7+ years of experience in Linux systems engineering roles supporting bare metal servers and virtualization/container platforms
  • 3+ years' Kubernetes administration experience on Red Hat OpenShift
  • Experience building and managing infrastructure in both public cloud and physical data center environments using IaC tools
  • 5+ years' experience with enterprise monitoring and logging solutions like Prometheus, ELK, or similar
  • Proven ability to automate the right things in the simplest way possible (scripts, config management tools, CI pipelines, RHOS Operators, etc.)
  • Solid understanding of networking fundamentals and storage technologies
  • Competency in at least one high-level programming language (e.g., Golang or Python)
  • Experience supporting customer-facing SaaS products
Responsibilities:
  • Translates high-level platform design into low-level technical design; implements, administers, supports, and patches the corresponding platforms.
  • Installs, configures, and monitors applications and services in the OpenShift cluster.
  • Continually assesses technical components to recommend platform improvements, translating high-level design and RHOS best practices into low-level technical configuration.
  • Ensures the ongoing stability, availability, performance, and security compliance of the platform to meet customer SLAs; authors and executes test cases to validate
  • Collaborates with software delivery teams and architects to build and support self-service mechanisms, CI/CD pipelines, and k8s operators that simplify and accelerate service delivery, in accordance with DevOps and Agile frameworks
  • Maintains the catalog of services for the platform in collaboration with Engineering.
  • Instruments and optimizes application, system, and cluster performance.
  • Forecasts and plans capacity increases to ensure resource availability for engineering teams while meeting budget targets.
  • Helps build and implement the Disaster Recovery / Business Continuity plan; conducts related testing of recovery procedures.
  • Helps determine the platform roadmap and manage projects and ticket-based work; ensures these are clearly communicated to stakeholders at all levels.
  • Provides thought leadership on DevOps and Platform Engineering-centric system and process design, giving constructive input to engineers and leaders on proposals and best practices.
  • Builds internal documentation and artifacts describing the mechanisms used for deployment, monitoring, and operators.
  • Leads by showing: mentors and helps develop engineers in a highly demonstrative and collaborative way
  • Participates in an on-call rotation with fellow team members

Related Jobs

Location: North America

Employment type: Full-Time

Industry: Blockchain Technology

Company: Helius

Requirements:
  • A minimum of 8 years of experience in a DevOps or Site Reliability Engineering role, preferably in a high-performance, low-latency environment.
  • Experience managing and optimizing bare-metal server environments.
  • Expert scripting and programming skills (e.g., Bash, Python, Go).
  • Experience in Rust, Golang, Java, or a similar language.
  • Proficiency with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
  • Strong knowledge of automation tools and frameworks (e.g., Ansible, Terraform, Puppet, Chef).
  • Expertise in CI/CD tools and practices (e.g., Jenkins, GitLab CI, CircleCI).
  • Excellent problem-solving skills and the ability to troubleshoot complex issues.
  • Strong communication skills and the ability to collaborate effectively with cross-functional teams.
  • Ability to work independently and take ownership of projects from start to finish.
Responsibilities:
  • Design, implement, and manage automated systems for deploying, monitoring, and maintaining our bare-metal servers and services.
  • Develop and maintain CI/CD pipelines to streamline the deployment process.
  • Enhance the security of our infrastructure and networks by implementing best practices and proactive measures.
  • Monitor system performance; identify and resolve issues to ensure high availability and reliability.
  • Lead incident response and root cause analysis for system outages and issues.
  • Implement robust security measures to safeguard sensitive data and protect against cyber threats and attacks.
  • Collaborate with the engineering team to optimize performance and scalability of our services.
  • Establish and enforce policies and procedures to ensure compliance with industry standards and regulations.

Skills: AWS, Python, Bash, Blockchain, ElasticSearch, Jenkins, Kubernetes, *Nix, Go, Grafana, Prometheus, Rust, CI/CD, RESTful APIs, DevOps, Terraform, Ansible, Scripting

Posted about 1 month ago

Location: United States

Employment type: Full-Time

Industry: Software Development

Company: Global InfoTek, Inc.

Requirements:
  • Bachelor's degree in Computer Science, Mathematics, or equivalent experience
  • 3+ years developing production software in modern languages
  • 1+ years developing containerized services on orchestration platforms
  • 5+ years leading teams to build containerized services
  • 5+ years building developer platform services and APIs
Responsibilities:
  • Design and implement hybrid-infrastructure developer platforms
  • Develop platform APIs for scalability and repeatability
  • Maintain underlying services for containerized applications
  • Mentor junior developers and engineers

Skills: AWS, PostgreSQL, Python, Software Development, Agile, ElasticSearch, Git, Java, Kubernetes, Microsoft Azure, MongoDB, MySQL, Amazon Web Services, Azure, Go, Communication Skills

Posted 5 months ago