Kafka Platform Engineer

This is a fully remote opportunity within the continental United States | Full-Time | Senior

Salary not disclosed

Job Details

Experience: 5+ years
Required Skills: Python, Bash, Apache Kafka, Go, Grafana, Prometheus, Terraform, Ansible, Datadog

Qualifications

Bachelor’s degree in Computer Science, Engineering, or a related technical field.
5+ years of hands-on experience operating Apache Kafka or Confluent Platform in production environments.
Deep understanding of Kafka internals including partitions, replication, ISRs, and consumer groups.
Strong expertise in Kafka security practices including SASL, mTLS, ACLs, and RBAC.
Experience with Kafka Connect, Schema Registry, Kafka Streams, or ksqlDB in enterprise environments.
Strong scripting and automation skills using Python, Bash, or Go.
Experience with Infrastructure as Code tools such as Terraform and Ansible.
Knowledge of observability and monitoring solutions for distributed systems and streaming platforms.
Familiarity with high availability, disaster recovery, and multi-region streaming architectures.
Excellent troubleshooting, communication, and documentation abilities.

Responsibilities

Architect, deploy, and maintain large-scale Apache Kafka and Confluent Platform environments across cloud and on-premises infrastructure.
Design scalable partitioning, replication, and topic management strategies to optimize throughput, durability, and operational efficiency.
Implement and manage platform security using SASL, mTLS, ACLs, RBAC, and identity provider integrations.
Operate and optimize ecosystem components such as Schema Registry, Kafka Connect, ksqlDB, and Kafka Streams for production-grade streaming workloads.
Develop CI/CD and GitOps workflows for topic management, connectors, and infrastructure automation.
Build high-availability and disaster recovery strategies including multi-region replication and failover patterns.
Implement observability and monitoring solutions using tools such as Prometheus, Grafana, Datadog, and related platforms.
Collaborate with application teams to define best practices, onboarding standards, and reusable streaming patterns.
Lead incident response, troubleshooting, and post-incident reviews to improve operational resilience and platform reliability.
Mentor engineers through technical reviews, knowledge sharing, and engineering best practices while maintaining detailed technical documentation.
