Procter & Gamble Company

Senior Site Reliability Engineer (Manufacturing IT Operations)

Philippines
Description

Job Location

Taguig City

Job Description

Overview of the job

As the Senior SRE Lead in the Manufacturing IT Operations – Incident Response Team, you will lead incident response efforts, ensuring swift and effective resolution of critical system issues. You will also play a critical role in ensuring the reliability, scalability, and performance of our systems and services. Collaborating with cross-functional teams, you will design, implement, and automate robust systems, monitoring tools, and processes. In addition, you will lead the SRE team, manage schedules, SLOs and SLIs, and reporting, and report directly to the IT Operations Director.

Your team

You will lead the SRE – Incident Response team, providing guidance, support, and mentorship to the team members as they navigate their roles. Collaborating closely with technically skilled professionals, including software engineers, DevOps specialists, Subject Matter Experts, and other SREs, you will foster a culture of technical expertise, continuous learning, and knowledge sharing, while encouraging innovation and embracing new ideas. In addition, you will directly collaborate with our site customers and users, ensuring their needs and expectations are met through reliable and high-performing systems. You will report directly to the IT Operations Director.

What success looks like

Success as an SRE Lead spans several areas of the role: incident response, monitoring and reliability, effective collaboration with customers and users, and additional leadership responsibilities:

  • Lead the swift response to and resolution of critical incidents, ensuring minimal impact on system availability and user experience, while driving continuous improvement in incident management processes.
  • Ensure high system availability and reliability through robust monitoring, optimization of system architecture, and cross-functional collaboration to design and implement resilient systems.
  • Lead comprehensive monitoring solutions to gain real-time insights into system performance, enabling proactive incident response and continuous improvement of system visibility and resource optimization.
  • Collaborate directly with customers and users to understand their needs, proactively address concerns, and provide exceptional customer support to ensure reliable and performant systems that meet their expectations.
  • Lead the SRE team, providing guidance, support, and mentorship to team members, fostering a culture of technical excellence and continuous learning.
  • Manage time/schedule effectively to ensure coverage and support across the week, maintaining the reliability and availability of our systems.
  • Manage SLOs and SLIs, ensuring that the defined service level objectives and indicators are met or exceeded.
  • Oversee reporting, providing accurate and timely updates on incident response, system performance, and other relevant metrics to stakeholders.
  • Report directly to the IT Operations Director, providing insights, recommendations, and collaborating on strategic initiatives.

Responsibilities of the role

Team Leadership:

  • Lead the SRE team, providing guidance, support, and mentorship to foster a culture of technical excellence and continuous learning.
  • Manage time/schedule effectively to ensure coverage and support across the week, maintaining the reliability and availability of systems.
  • Oversee reporting, providing accurate and timely updates on incident response, system performance, and other relevant metrics to stakeholders.
  • Foster a collaborative and inclusive team culture, promoting effective communication, knowledge sharing, and professional development.

Incident Response:

  • Lead incident response efforts, swiftly resolving critical incidents to minimize downtime and user impact.
  • Implement effective incident management processes, ensuring clear communication, coordination, and documentation.
  • Conduct root cause analysis, implement preventive measures, and drive continuous improvement.

Reliability:

  • Ensure high system availability through robust monitoring, alerting, and automated incident response systems.
  • Optimize system architecture and configurations for improved performance, scalability, and fault tolerance.
  • Collaborate cross-functionally to design and implement resilient systems using industry best practices.

Monitoring:

  • Implement comprehensive monitoring solutions, providing real-time insights into system performance and health.
  • Configure and manage monitoring tools, ensuring accurate and actionable alerts for proactive incident response.
  • Continuously evaluate and enhance monitoring strategies to improve system visibility and resource optimization.

Upskilling:

  • Stay updated with industry trends, technologies, and best practices in Site Reliability Engineering.
  • Continuously develop technical skills in system architecture, automation, cloud technologies, and incident response.
  • Share knowledge, mentor team members, and foster a culture of learning and upskilling.

Managing Customers:

  • Collaborate directly with users and customers to understand their needs and pain points.
  • Proactively address customer/user concerns, ensuring reliable and performant systems.
  • Provide exceptional customer support, communicate updates and resolutions, and gather feedback for continuous improvement.

Job Qualifications

Role Requirements

Technical Expertise and Experience:

  • Knowledge of or familiarity with system administration, including Linux/Unix environments and cloud platforms (such as AWS, Azure, or GCP).
  • Experience with configuration management tools and infrastructure-as-code frameworks (e.g., Terraform).
  • Proficiency in at least one programming language (e.g., Python, C#) and experience with scripting for automation tasks.
  • Understanding of networking protocols, network infrastructures, load balancing, and DNS management.
  • Familiarity with containerization and orchestration technologies (e.g., Docker, Kubernetes).
  • Familiarity with databases and proficiency in writing SQL queries.
  • Experience or familiarity with monitoring and observability tools (e.g., Prometheus, Grafana).
  • Knowledge of incident response methodologies, root cause analysis, and implementing preventive measures.
  • Understanding of security best practices and experience with implementing secure systems.
  • Experience with Manufacturing Execution Systems (e.g., Proficy) or Manufacturing Operations is a plus.
  • At least 7 years of experience in the industry, preferably in Software Engineering, Software Development, SRE, DevOps, or Technical Consulting.

Soft Skills:

  • Strong problem-solving and troubleshooting skills, with the ability to analyze complex issues and devise effective solutions.
  • Excellent communication and collaboration skills to work effectively with cross-functional teams, stakeholders, and customers.
  • Strong leadership skills to guide and mentor the SRE team, fostering technical excellence and continuous learning.
  • Ability to manage time and schedule effectively, ensuring coverage and support across the week.
  • Strong attention to detail and commitment to delivering high-quality work.
  • Proactive and self-motivated, with a continuous learning mindset and a drive for staying updated with industry trends and technologies.
  • Ability to thrive under pressure and effectively manage incidents, ensuring timely resolutions and minimizing downtime.

This role requires a commitment to work a standard 5-day workweek, with 4 weekdays and at least one weekend day (Saturday or Sunday). The nature of the SRE Lead position necessitates coverage and support across the week, ensuring the reliability and availability of our systems. We value work-life balance and will strive to provide a predictable and manageable schedule within this framework, while still meeting the needs of our customers and maintaining the stability of our services.

About us

We produce globally recognized brands and we grow the best business leaders in the industry. With a portfolio of trusted brands as diverse as ours, it is paramount that our leaders are able to lead with courage across the vast array of brands, categories and functions. We serve consumers around the world with one of the strongest portfolios of trusted, quality, leadership brands, including Always®, Ariel®, Gillette®, Head & Shoulders®, Herbal Essences®, Oral-B®, Pampers®, Pantene®, Tampax® and more. Our community includes operations in approximately 70 countries worldwide.

Visit http://www.pg.com to learn more.

We are an equal opportunity employer and value diversity at our company. We do not discriminate against individuals on the basis of race, color, gender, age, national origin, religion, sexual orientation, gender identity or expression, marital status, citizenship, disability, HIV/AIDS status, or any other legally protected factor.

Job Schedule

Full time

Job Number

R000114678

Job Segmentation

Experienced Professionals