We are seeking a highly skilled Senior Engineer - PlatformOps to join our team and take responsibility for providing exceptional infrastructure support. As a Senior Engineer, you will play a crucial role in debugging technical issues within a complex stack involving virtualization, containers, microservices, and more. Your expertise will be instrumental in ensuring the smooth operation of our cloud infrastructure and providing timely support to our customers.
What will you do? 🤔
Debug technical issues in a complex stack, including virtualization, containers, and microservices, to ensure the stability and performance of our platform.
Take ownership of customer cloud issues, effectively communicating with workgroups on behalf of end customers to resolve problems promptly.
Proactively manage service levels and abandon rates during business-critical periods, ensuring a seamless user experience.
Handle major incidents by coordinating with multiple teams, leading from the front as a Priority Incident Manager.
Implement proactive measures to manage outage events, drive incident response, and conduct blameless postmortems to foster continuous improvement.
Generate scripts and templates to automate resource provisioning, streamlining operational efficiency.
Stay up to date with the latest developments and challenges related to cloud infrastructure, providing insights to enhance our platform's performance and reliability.
What makes you a match for us? 😍
Expertise in one of the major cloud platforms (AWS, Azure, or GCP), with a deep understanding of its services and best practices.
Strong knowledge of Kubernetes and hands-on experience in managing Kubernetes clusters.
Working experience with one or more of the following tools: Argo CD, Argo Workflows, Loft vCluster, or Prometheus/Grafana.
Proficiency in one or more programming languages such as Java, Python, or Go, with the ability to create efficient scripts and automation.
Familiarity with databases like Cassandra, Elasticsearch, PostgreSQL, or Redis is a plus.
Experience with multi-tenancy on Kubernetes, enabling seamless management of resources for multiple clients.
Exceptional verbal, presentation, and written communication skills to effectively convey technical information to different audiences.
Availability to work during EST/EDT hours (7 PM to 4 AM IST) to align with the needs of our global operations.