Cloud Operations Engineer 2

Job Description

About The Role
As a Cloud Operations Engineer 2, you'll play a pivotal role in ensuring the reliability, scalability, and performance of our systems. You will work closely with cross-functional teams to identify and resolve infrastructure issues, optimize system performance, and implement automation solutions that streamline processes. This role is an opportunity for individuals with 4+ years of experience in site reliability engineering to further develop their skills and make a significant impact in a fast-paced environment.

This is a hybrid role based in Bangalore, with 3 days in the office and 2 days working from home.

What You'll Do

  • Troubleshoot and resolve issues: identify and address technical problems, such as performance bottlenecks or system failures, to minimize downtime and maintain service continuity.
  • Automate and script: develop and implement automation scripts and tools to streamline operational tasks, improve efficiency, and enhance scalability.
  • Collaborate with software development and operations teams to design, implement, and maintain highly available and scalable infrastructure solutions.
  • Monitor system performance, troubleshoot issues, and implement proactive measures to prevent downtime or service disruptions.
  • Develop and maintain tools for automation, monitoring, and deployment to improve efficiency and reliability.
  • Participate in incident response and post-mortem analysis to identify root causes and implement preventive measures.
  • Continuously evaluate and optimize system performance through capacity planning, performance tuning, and infrastructure upgrades.
  • Stay updated with industry best practices and emerging technologies to drive innovation and improve system reliability.
  • Work in a 24/7/365 service environment with on-call responsibilities.
  • Work in 2 shifts to ensure coverage and overlap with both European and United States time zones. Flexibility in working hours may be required to participate in on-call rotations and address critical issues outside of regular business hours.

What You'll Bring

  • Bachelor's degree in Computer Science, Engineering, or a related field.
  • 4+ years of experience in site reliability engineering, infrastructure operations, or related roles.
  • Proficiency in scripting and automation using languages such as Python, Shell, or Go.
  • Hands-on experience with cloud platforms such as AWS, Azure, or GCP.
  • Strong understanding of Linux/Unix systems and networking concepts.
  • Experience with containerization technologies like Docker and orchestration tools like Kubernetes.
  • Knowledge of monitoring and logging tools such as Prometheus, Grafana, ELK Stack, or similar.
  • Excellent problem-solving skills and the ability to thrive in a fast-paced, collaborative environment.
  • Strong communication skills with the ability to effectively interact with cross-functional teams.

Bonus Points

  • Certification in cloud technologies (e.g., AWS Certified Solutions Architect, Azure Administrator Associate).
  • Experience with a container orchestration platform (e.g., Kubernetes).
  • Experience with infrastructure as code tools such as Terraform, Ansible, or Puppet.
  • Familiarity with CI/CD pipelines and version control systems like Git.

Skills

  • Python
  • Shell Scripting
  • Go
  • Cloud platform
  • Docker
  • SRE

Education

  • Master's Degree
  • Bachelor's Degree

Job Information

Job Posted Date

Oct 18, 2024

Experience

4 to 8 Years

Compensation (Annual in Lacs)

₹ Market Standard

Work Type

Permanent

Type Of Work

8 hour shift

Category

Information Technology

Copyright © 2022 All Rights Reserved. Saas Talent