dunnhumby

Senior Site Reliability Engineer

Gurgaon
Terraform GCP Azure Prometheus Grafana Elastic New Relic Splunk
Description

Senior Engineer - SRE

Location: Gurgaon

Department: Demand - Engineering

dunnhumby is the global leader in Customer Data Science, partnering with the world’s most ambitious retailers and brands to put the customer at the heart of every decision. We combine deep insight, advanced technology, and close collaboration to help our clients grow, innovate, and deliver measurable value for their customers. 

dunnhumby employs nearly 2,500 experts in offices throughout Europe, Asia, Africa, and the Americas working for transformative, iconic brands such as Tesco, Coca-Cola, Nestlé, Unilever and Metro.

Tesco Media is building a world-class, self-serve B2B advertising platform that enables retailers and brands to plan, activate, and measure omnichannel retail media campaigns.

Retail Media is transforming how advertisers connect with consumers through personalized and targeted campaigns across retailers' digital and physical touchpoints. Retail Media Measurement plays a pivotal role in ensuring the effectiveness of these campaigns, driving value for advertisers, retailers, and consumers alike.

We are looking for a Senior Site Reliability Engineer (SRE) to guide and improve the reliability, performance, and operations of our large, distributed systems.
In this senior role, you will help define our reliability strategy, drive automation, strengthen system resilience, and coach engineering teams on advanced SRE practices.

You will partner with platform, backend, security, and product teams to ensure our services are highly available, well‑monitored, and fault‑tolerant. You will lead improvements in observability, incident response, and system performance using tools like Prometheus, Grafana, Elastic, and New Relic.

Key Responsibilities

  • Design and implement monitoring and observability strategies across services and infrastructure.
  • Develop dashboards, alerts, and metrics to improve system visibility and reliability.
  • Define alerting standards to ensure alerts represent real service impact.
  • Lead troubleshooting efforts during complex incidents and support root cause analysis.
  • Improve monitoring coverage for critical services and platform components.
  • Partner with service owners to align monitoring with SLAs, KPIs, and operational goals.
  • Lead post-incident reviews and drive improvements in monitoring and detection.
  • Automate operational workflows and reliability processes.
  • Mentor engineers and promote operational excellence within the team.
  • Collaborate with engineering, operations, and product teams to improve platform reliability.
  • Maintain infrastructure provisioning practices using Terraform and Infrastructure as Code.

Required Experience

  • 6-8 years of experience in Site Reliability Engineering, platform operations, or infrastructure reliability.
  • Proven experience leading technical initiatives or guiding reliability practices within engineering teams.
  • Strong experience managing production environments and large-scale distributed systems.
  • Experience implementing Infrastructure as Code using Terraform.
  • Experience working with cloud platforms such as GCP or Azure.
  • Experience supporting high-availability systems in 24/7 operational environments.
  • Strong communication and stakeholder management skills.

Preferred Experience

  • Experience with observability platforms such as Grafana, Prometheus, Splunk, or New Relic.
  • Experience supporting media, streaming, or SaaS platforms at scale.
  • Exposure to advanced monitoring practices such as predictive monitoring or AIOps.

What you can expect from us

We won’t just meet your expectations. We’ll defy them. So you’ll enjoy the comprehensive rewards package you’d expect from a leading technology company. But also, a degree of personal flexibility you might not expect. Plus, thoughtful perks, like flexible working hours and your birthday off.

You’ll also benefit from an investment in cutting-edge technology that reflects our global ambition. But with a nimble, small-business feel that gives you the freedom to play, experiment and learn.

And we don’t just talk about diversity and inclusion. We live it every day – with thriving networks including dh Gender Equality Network, dh Proud, dh Family, dh One, dh Enabled and dh Thrive as the living proof. We want everyone to have the opportunity to shine and perform at their best throughout our recruitment process. Please let us know how we can make this process work best for you.

Our approach to Flexible Working

At dunnhumby, we value and respect difference and are committed to building an inclusive culture by creating an environment where you can balance a successful career with your commitments and interests outside of work.

We believe that you will do your best at work if you have a work-life balance. Some roles lend themselves to flexible options more than others, so if this is important to you, please raise it with your recruiter, as we are open to discussing agile working opportunities during the hiring process.

For further information about how we collect and use your personal information, please see our Privacy Notice.
