SaaS Talent

Senior Principal Engineer- SRE @ Accion Labs

14 Years of Experience

Pune, Maharashtra, India

90969*****

Expected Salary

47

Current Salary

38

Notice Period

60 Days

About

I am a passionate and highly experienced Principal Software Engineer (SRE) and Solution Architect with over 13 years of expertise spanning the IT, Media, and Telecom domains. My educational background includes an MBA in International Business Management and a Bachelor of Engineering (BE) in Electronics and Telecommunication. As an SRE/CloudOps professional, I deploy code changes across diverse environments, including both cloud-based and on-premises servers, prioritizing smooth transitions with minimal disruption. My core responsibilities include ensuring service availability, optimizing latency, improving efficiency, and meeting high performance standards. I am also deeply involved in proactive monitoring, emergency response readiness, and strategic capacity planning. A core skill of mine is defining and carefully measuring Service Level Objectives (SLOs), a cornerstone of delivering high quality and reliability. My work philosophy centers on automating tasks and data collection, which drives efficiency and supports continuous improvement. I also focus on making full use of data so that comprehensive information and actionable insights are consistently available. Resource optimization is central to my methodology, keeping deployments efficient and ensuring the required resources are always within reach. Additionally, I gather pertinent logs and metrics from strategic sources and extract and present meaningful insights to drive informed decision-making.
My background demonstrates my ability to architect technical solutions that align with a wide range of business demands, reflecting an adaptable and innovative approach. I follow the latest industry trends, with a focus on media workflows spanning encoding, transcoding, adaptive bitrate streaming, packaging and origin for HLS and DASH, multi-DRM, dynamic ad insertion, content delivery, content security, and video players.

Senior Principal Engineer- SRE

Accion Labs, Others, Information Technology & Services

Past Company 2

Exotel

Past Company 3

MediaKind

Companies Worked:

Accion Labs, Exotel, MediaKind, Ericsson, Gospell Digital Technology Pvt. Ltd., NETWORK 18 MEDIA & INVTS LTD, Signet Digital Pvt. Ltd.

Work History:

Job Title : Senior Principal Engineer- SRE
Company name : Accion Labs
Period : January 2023 - Present
Summary : This role includes:
- Led and contributed to the design, implementation, and optimization of a client company's site reliability engineering (SRE) practices, ensuring high system availability and reliability.
- Spearheaded the transition to SRE best practices, resulting in a 99.99% uptime rate and improved user experience.
- Collaborated closely with cross-functional teams, including development, operations, and product management, to align SRE goals with business objectives.
- Worked collaboratively with development teams to introduce SLOs that improved system performance and availability.
- Proactively monitored, maintained, and scaled production systems, addressing incidents, and reducing downtime through effective incident management and post-incident analysis.
- Established an incident response framework that reduced mean time to resolution (MTTR) by 30%, minimising service disruption.
- Implemented and maintained automated monitoring and alerting systems to proactively identify performance bottlenecks and anomalies, ensuring quick issue resolution.
- Developed custom alerting thresholds that significantly reduced false positives, enhancing operational efficiency.
- Designed and executed disaster recovery plans and participated in fault tolerance and failover testing.
- Conducted quarterly disaster recovery drills, validating system resilience and data integrity under various failure scenarios.
- Managed and maintained infrastructure as code (IaC) using tools like Terraform and Ansible, enabling infrastructure scalability and consistency.
- Established Git-based version control for infrastructure code, enhancing collaboration and traceability.
- Worked on continuous integration and continuous deployment (CI/CD) pipelines, automating deployments and ensuring code reliability.
- Managing and deploying over 2,500 VPN servers while meeting SLAs for security and network latency.
- Using tools such as QuickSight, Opsgenie, Grafana, and Prometheus to keep the infrastructure healthy.
Location : Pune, Maharashtra, India

Job Title : Site Reliability Operations Manager
Company name : Exotel
Period : June 2022 - January 2023
Summary : This role involved:
- Managed over 10 million messages per day, meeting SLAs for delivery rates and latency.
- Led site reliability and DevOps teams responsible for software release deployments, SLAs, and uptime.
- Increased revenue by 15% by scaling traffic for major customers, integrating them via major telcos such as Airtel, Vodafone, Jio, and BSNL, among others.
- Reduced TAT and improved MTTR and MTTA, ensuring high availability at all stages of the SDLC.
- Managed site reliability operations, ensuring 99.99% uptime for messaging apps & API gateways.
- Collaborated with DevOps SREs and developer teams for the deployment of major releases.
- Used Ansible, Jenkins, and Docker to deploy new releases in both staging and production environments.
- Possessed extensive experience with Logstash pipelines and the ELK stack.
- Ensured that KPIs and critical business metrics remained available for 6-12 months using ELK and AWS CloudWatch.
- Ensured that the loads on the Singapore and Mumbai AWS Regions were equally distributed.
- Assisted key customers in creating client-specific dashboards showing the number of messages delivered versus failed and latencies associated with P90, P95, and P99 percentiles.
- Ensured that database queries were minimised and presented through AWS CloudWatch dashboards.

Achievements:
- Led the integration of Exotel's messaging apps with major telcos in India.
- Improved MTTA, MTTR, and TAT for all types of incidents and outages.
- Enhanced reliability by reducing latency and failure rates.
Location : Bengaluru, Karnataka, India

Job Title : Specialist (Service DevOps)
Company name : MediaKind
Period : February 2019 - May 2022
Summary : This role involved:
- Worked as an IPTV/OTT site reliability engineer handling production issues.
- Created audit methodologies to find RCAs for VSPP incidents.
- Handled support incidents and worked as an on-call engineer in shifts.
- Integrated streaming, storage, encoding, play-out, ingest, and transcoding.
- Developed several modules as part of the system metrics monitoring development team.
- Interfaced with customers, stakeholders, multiple vendors, and upper management.
- Working experience with Jenkins for continuous integration through Git.
- Experience with Confluence, Jira, and PagerDuty (PD), and with creating runbooks.
- Worked on MSSQL, MySQL, Varnish, solidDB, PostgreSQL, and Nginx.
- Handled upgrades, patches, migrations, troubleshooting, backups, and disaster recovery.
- Performance monitoring and fine-tuning on UNIX/Red Hat Linux systems.
- Working experience with deployments using Ansible.
- Monitoring tools: Nagios, Prometheus, Grafana, Kibana, New Relic, ELK.
- Designed and implemented server architecture for development, staging, and production environments.
- Continuously improved the performance and reliability of the streaming platforms.
- Worked closely with customer support and engineering to deliver second-to-none SaaS and PaaS services.
- Maintained and enhanced cloud management and monitoring tools.
- Participated in an on-call operations support rotation and in capacity planning.

Achievements:
- Assisted in developing an ELK-based monitoring solution.
- Created Python scripts for Django- and Elasticsearch-based monitoring.
- Successfully met all SLAs.
- Created many KPIs to improve support and serviceability.
Location : Bengaluru, Karnataka

Job Title : Specialist
Company name : Ericsson
Period : April 2018 - January 2019
Summary : This job involved:
- Leading the analysis and resolution of technical issues arising with Ericsson Media VSPP Integrations.
- Troubleshooting system integration, compatibility, product level and customer ecosystem faults.
- Planning and implementing upgrade and expansion procedures.
- Monitoring the deployed system as neces

Certifications:

Title : EC-Council Certified Security Analyst
Period : December 2017 - November 2021
Summary : ECC97989904854
Issuing Authority : EC-Council

Title : RHCE
Period : March 2017 - Present
Summary : 160-135-448
Issuing Authority : Red Hat

Title : RHCSA
Period : June 2016 - Present
Summary : 160-135-448, redhat.com, https://www.redhat.com/wapps/certifications/tx/certificationList
Issuing Authority : Red Hat

Title : CCNA
Period : July 2010 - December 2010
Issuing Authority : RSTFORUM

Languages:

English (Full Professional), Hindi (Full Professional), Marathi, Urdu

Honors and awards:

Award : Employee of the Year 2014
Issuer : Gospell Digital Technology Pvt. Ltd.
Date : March 2014
Summary : Awarded for completing multiple digitization projects.

Skills

Docker Products

Infrastructure

Troubleshooting

Continuous Delivery (CD)

Amazon Web Services (AWS)

Linux

Ansible

Jenkins

Proactive Monitoring

DevOps

Shell Scripting

Networking

Cloud Storage

Linux Server

Windows Server

Site Reliability Engineering

Service-Level Agreements (SLA)

SLO

MTTR

MTTA
