TechBlocks

Senior Site Reliability Engineer

Hyderabad Ahmedabad
GCP GKE Load Balancing VPN IAM Prometheus Grafana ELK Datadog Kubernetes Docker Terraform Helm PagerDuty OpsGenie Cloud Monitoring Cloud Armor MongoDB
Description

Sr. Site Reliability Engineer (SRE)

Location: Hyderabad; Ahmedabad

Department: Z-Michaels-DOE

Experience: 6+ Years

Job Title: Senior Site Reliability Engineer (SRE)
Location: Hyderabad / Ahmedabad
Employment Type: Full-Time
Work Model: 3 days from office

Job Overview 
We are looking for dynamic, motivated individuals to deliver exceptional solutions for the production resiliency of our systems. The role combines software engineering, operations, and DevOps skills to find efficient ways of managing and operating applications. It requires a high level of responsibility and accountability for delivering technical solutions.

Summary:
As a Senior SRE, you will ensure platform reliability, incident management, and performance optimization. You'll define SLIs/SLOs, contribute to robust observability practices, and drive proactive reliability engineering across services.

Experience Required:
6–10 years of SRE or infrastructure engineering experience in cloud-native environments.

Mandatory:
• Cloud: GCP (GKE, Load Balancing, VPN, IAM)
• Observability: Prometheus, Grafana, ELK, Datadog
• Containers & Orchestration: Kubernetes, Docker
• Incident Management: On-call, RCA, SLIs/SLOs
• IaC: Terraform, Helm
• Incident Tools: PagerDuty, OpsGenie

Nice to Have:
• GCP Monitoring, SkyWalking
• Service Mesh, API Gateway
• GCP Spanner, MongoDB (basic) 

Scope:
• Drive operational excellence and platform resilience
• Reduce MTTR, increase service availability
• Own incident and RCA processes

Roles and Responsibilities:

• Define and measure Service Level Indicators (SLIs) and Service Level Objectives (SLOs), and manage error budgets across services.
• Lead incident management for critical production issues – drive root cause analysis (RCA) and postmortems.
• Create and maintain runbooks and standard operating procedures for high availability services.
• Design and implement observability frameworks using ELK, Prometheus, and Grafana; drive telemetry adoption.
• Coordinate cross-functional war-room sessions during major incidents and maintain response logs.
• Develop and improve automated system recovery, alert suppression, and escalation logic.
• Use GCP tools like GKE, Cloud Monitoring, and Cloud Armor to improve performance and security posture.
• Collaborate with DevOps and Infrastructure teams to build highly available and scalable systems.
• Analyze performance metrics and conduct regular reliability reviews with engineering leads.
• Participate in capacity planning, failover testing, and resilience architecture reviews.