Saas Talent

Senior Site Reliability Engineer

Job Description

About Us
Diligent is the global leader in modern governance, providing SaaS solutions across governance, risk, compliance, audit and ESG. We empower more than 1 million users and 700,000 board members and leaders with a holistic view of their organization’s GRC practices so they can make better decisions, faster, no matter the challenge.

At Diligent, you are an agent of positive change. You are joining a team of passionate, smart, creative people who not only want to help build the software company of the future, but who want to make the world a more sustainable, equitable and better place. Be a part of a global community on a mission to make a real impact.

Learn more at diligent.com.

Position Overview:
We are seeking a dynamic and experienced IT Service Manager to enhance our Site Reliability Engineering (SRE) team. In this pivotal role, you will be instrumental in managing and refining our IT service delivery, with a strong focus on Incident, Change, and Problem management. Your primary responsibilities will involve leading the swift resolution of IT incidents, guiding the change management process, and proactively identifying and addressing recurrent IT issues. This will ensure the ongoing reliability and efficiency of our IT services.

Your role will also encompass vendor management, contributing to the development and maintenance of business continuity plans, and ensuring these strategies are aligned with our organizational risk tolerance. Effective communication with various stakeholders, collaborative decision-making, and creative problem-solving are key aspects of this position, as they are vital for maintaining service continuity and restoring normal operations promptly.

As a central figure in our SRE team, you will report directly to the Director of Site Reliability Engineering and play a crucial role in aligning IT service management with our broader organizational goals.

Key Responsibilities:

Incident Management:

Lead the response to IT incidents, ensuring timely and effective resolution. Coordinate across teams to minimize impact and restore service swiftly.
Develop and refine incident response protocols, ensuring they align with business needs and industry best practices.

Problem Management:

Proactively identify and analyse recurring IT issues. Work with teams to implement long-term solutions to prevent future incidents and enhance system reliability.
Collaborate with technical teams to understand root causes and track problem resolution progress.

Change Management:

Oversee the IT change management process, ensuring all changes are assessed, approved, implemented, and reviewed in a controlled manner.
Facilitate change advisory board meetings to evaluate the impact of proposed changes and make informed decisions.

Vendor Management:

Manage relationships with IT service vendors, ensuring that renewals and cancellations happen within the appropriate time windows.
Evaluate vendor performance regularly and work with procurement to negotiate terms that align with organizational objectives and IT strategies.

Business Continuity Planning:

Review and maintain comprehensive business continuity plans and procedures. Ensure these are up-to-date and aligned with organizational risk tolerance.
Conduct regular business impact analyses and lead drills to test and refine business continuity strategies.

Required Experience/Skills:

Proven Experience in Service Management: 3-5 years or more of experience in service management, preferably within a medium to large, complex IT environment.
Incident and Problem Management Skills: Demonstrated ability in managing and resolving IT incidents and problems. Experience in developing strategies to prevent future incidents.
Change Management Expertise: Solid experience in overseeing IT change management processes. Ability to facilitate change advisory board meetings and guide change initiatives.
Vendor Management Experience: Skills in managing vendor relationships, including contract negotiations and performance evaluations.
Business Continuity Planning: Knowledge and experience in developing and maintaining business continuity plans and procedures. Ability to conduct business impact analyses and lead testing drills.
Strong Analytical and Problem-Solving Skills: Aptitude for analysing complex issues, using data-driven insights for service improvement, and identifying creative solutions.
Effective Communication and Collaboration Skills: Excellent communication skills, including the ability to convey technical information to non-technical stakeholders, and the ability to work effectively in cross-functional teams, fostering cooperative and productive working relationships.
ITIL Certification: ITIL v3 Foundation or higher certification is preferred, with preference given to candidates who have updated to, or are familiar with, ITIL 4 principles.
Experience in an ITIL-based Service Management Environment: Familiarity with ITIL methodologies, particularly in the context of incident, problem, and change management processes.

Skills

Service Management
IT Management
Vendor Management
ITIL
Analytical Skills
Problem Solving

Education

Master's Degree
Bachelor's Degree

Job Information

Job Posted Date

Mar 13, 2025

Experience

3 to 7 Years

Compensation (Annual in Lacs)

Best in the Industry

Work Type

Permanent

Type Of Work

8 hour shift