PepsiCo

Site Reliability Engineer

Hyderabad, Telangana
Python, SQL, MySQL, MongoDB, Cassandra, PostgreSQL, AppDynamics, ELK Stack, Grafana, Splunk, Dynatrace, Kafka, HTML, CSS, JavaScript, React, Angular, Vue.js, Java, Spring Boot, Azure, AWS
Description

SRE Design & Support Engineer

Location: Hyderabad, Telangana, India

Overview

We are looking for a self-driven SRE engineer with a software engineering mindset to:

• Drive new shift-left activities that apply Site Reliability Engineering (SRE) and quality assurance principles within the application design and project roadmap, enabling resilient outcomes.
• Take a pre-emptive approach in production that minimizes business impact, using SRE-driven orchestration to connect all components of the ecosystem, diagnose anomalies before users are affected, and remediate through automation.

The SRE Design & Support Engineer is an integral part of the global team. The role's main purpose is to provide a delightful customer experience for users of the global consumer, commercial, supply chain and enablement functions in the PepsiCo digital products application portfolio of 260+ applications, enabling a full SRE practice built on an incident prevention / proactive resolution model.

The scope of this role is focused on cloud-architected, full-stack application development, B2B pepsiconnect, Direct to Customer, and other S&T roadmap applications. The role ensures that PepsiCo DPA applications deliver the service performance, reliability and availability expected by our customers and internal groups. It requires a blend of technical expertise in SRE tools, modern cloud application architecture (i.e. full stack), IT operations experience, and analytics and influencing skills.

Responsibilities

• Engage and influence product and engineering teams during the design and development phases to embed reliability and operability into new services, defining and enforcing event, logging, monitoring, and observability standards across applications.
• Ensure non-functional requirements (NFRs), including SLAs/SLOs/SLIs and error budgets, are embedded early into the product’s offerings as part of the engineering solution.
• Act as a proactive SRE support engineer: prevent P1, P2 and potential P3 incidents, diagnose anomalies before they reach any user, and drive the necessary remediations across the teams involved in the end-to-end availability, performance and consumption of the cloud-architected application ecosystem, leveraging SRE orchestration solutions.
• Collaborate with engineering and support teams, including participation in escalations and blameless postmortems.
• Work closely with customer-facing support teams to empower them with SRE insights and tooling.
• Observe, diagnose and improve the end-to-end ecosystem performance of the modern-architected application portfolio, i.e. build a technical understanding of the interactions within a full-stack application, alongside peer SRE team members.
• Continuously optimize L2/support operations work via SRE workflow automation.
• Shape the SRE orchestration platform design with input from Production Operations, business usage, and product and engineering teams.
• Actively engage in and drive AI Ops adoption across teams.

Qualifications

• 8-11 years of work experience evolving into an SRE engineer, with 3-5 years of experience in continuously improving and transforming IT operations ways of working.
• Bachelor’s degree in Computer Science, Information Technology or a related field.
• Proven experience as an SRE in designing event diagnostics, performance measures and alerting solutions to meet SLAs/SLOs/SLIs.
• Highly quantitative, with great judgment, able to connect the dots across ecosystems and to work efficiently across functions and teams to ensure SRE orchestration solutions meet customer/end-user expectations.
• A pragmatic approach to resolving incidents, including the ability to systematically triangulate root causes and work effectively with external and internal teams to meet objectives.
• Strong expertise in SRE (Site Reliability Engineering) and IT Service Management (ITSM) processes, with a track record of improving service offerings: proactively resolving incidents, providing a seamless customer/end-user experience, and proactively identifying and mitigating areas of risk.
• Hands-on experience in Python, SQL/NoSQL (MySQL, MongoDB, Cassandra, PostgreSQL), AppDynamics, ELK Stack, Grafana, Splunk, Dynatrace, Kafka and any SRE Ops toolsets.
• A firm understanding of cloud architecture for distributed environments.
• Front-end technologies: HTML, CSS, JavaScript, and frameworks like React, Angular, or Vue.js.
• Back-end technologies: server-side languages and frameworks (Java, Spring Boot, and related technologies) that build the server-side logic, APIs, and database interaction with MySQL, MongoDB, Cassandra, Couchbase.
• Infrastructure: Azure/AWS cloud platforms and/or client/server environments.
• Prior experience in shaping transformation and developing SRE solutions would be a plus.