📍 United States, BC, ON, Canada
🧭 Full-Time
💸 $139,000 - $248,000 per year
🔍 Web development
Requirements:
- 5+ years of experience as a Data Infrastructure Engineer or in related roles like Platform Engineer, SRE, DevOps, or Backend Engineer.
- Strong experience with provisioning and managing data infrastructure components like Kafka, Spark, and Airflow.
- Proficiency with cloud services and environments (compute, storage, networking, identity management, infrastructure as code, etc.).
- Experience with containerization technologies like Docker and Kubernetes.
- Expertise in infrastructure as code tools like Terraform and Pulumi.
- Solid understanding of networking concepts and configurations, including VPCs, load balancers, and endpoints.
- Experience with monitoring and logging tools.
- Strong problem-solving skills and attention to detail.
- Excellent communication and collaboration skills.
Responsibilities:
- Provision and deploy infrastructure using Pulumi for Kafka, Spark, Airflow, Athena, and other critical systems on AWS.
- Manage and maintain clusters, ensuring optimal performance and reliability, including implementing auto-scaling and right-sizing instances.
- Configure and manage VPCs, load balancers, and VPC endpoints for secure communication between internal and external services.
- Manage IAM roles, apply security patches, plan and execute version upgrades, and ensure compliance with regulations such as GDPR.
- Design and implement high-availability solutions across multiple zones and regions, including backups, multi-region replication, and disaster recovery plans.
- Oversee S3 data lake management, including file size management, compaction, encryption, and compression to maximize storage efficiency.
- Implement caching strategies, indexing, and query optimization to ensure efficient data retrieval and processing.
- Spearhead initiatives in performance optimization, capacity planning, fault tolerance, and failure recovery across all infrastructure components.
- Implement monitoring and logging using tools like Datadog, CloudWatch, and OpenSearch.
- Develop services, tools, and automation to simplify infrastructure complexity for other engineering teams, enabling them to focus on building great products.
- Participate in all engineering activities including incident response, interviewing, designing and reviewing technical specifications, code review, and releasing new functionality.
- Mentor, coach, and inspire a team of engineers of various levels.
AWS, Docker, Kafka, Kubernetes, Airflow, Spark, Collaboration, Problem Solving, Terraform
Posted 2024-09-25