App Sustain & Ops Engineer
Location: MIGUEL HIDALGO, Distrito Federal, Mexico
Overview

We Are PepsiCo

Join PepsiCo and Dare for Better! We are the perfect place for curious people, thinkers and change agents. From leadership to front lines, we're excited about the future and working together to make the world a better place.

Being part of PepsiCo means being part of one of the largest food and beverage companies in the world, with our iconic brands consumed more than a billion times a day in more than 200 countries. Our product portfolio, which includes 22 of the world's most iconic brands, such as Sabritas, Gamesa, Quaker, Pepsi, Gatorade and Sonrics, has been a part of Mexican homes for more than 116 years.

A career at PepsiCo means working in a culture where all people are welcome. Here, you can dare to be you. No matter who you are, where you're from, or who you love, you can always influence the people around you and make a positive impact in the world.

Know more: PepsiCoJobs

Join PepsiCo, dare for better.

Responsibilities

The Opportunity

This role is responsible for ensuring the overall stability of production applications, including the reliability, availability, scalability, and efficiency of our production systems and platforms. The Operations Engineer will collaborate with cross-functional teams (including Software Engineering, Service Reliability, Infrastructure, and Business Operations) to streamline processes, manage day-to-day operations, monitor system health, and quickly resolve incidents.

Your Impact

As App Sustain & Ops Engineer, your scope will consist of:

1. System Reliability & Availability
o Ensure production systems, applications, and infrastructure are reliable, performant, and available within agreed SLAs/OLAs.

2. Incident & Problem Management
o Lead troubleshooting of critical incidents and drive timely resolution as part of Incident Management. Ensure that Root Cause Analysis is performed and help coordinate the timely implementation of permanent fixes.
o Analyze priority incidents to generate insights and identify gaps in the alerting mechanisms.
o Analyze market-specific issues and conduct comparative studies to determine why certain problems occur only in specific markets.

3. Monitoring & Alerting
o Partner with the Service Reliability Engineering (SRE) team to identify, develop, and maintain proactive monitoring, alerting, and health checks to detect and prevent issues before business impact.
o Assist the SRE team in identifying critical health checks for the order flow, order journey, and user journeys to enable dedicated notifications for key steps.

4. Deployment & Change Operations
o Partner with the Software Engineering team to support safe, efficient deployments and configuration changes, ensuring minimal disruption to business operations.
o Provide insights on system performance and capacity trends, and recommend improvements for scalability and efficiency to the Software Engineering team.

5. Automation & Continuous Improvement
o Identify manual operational tasks and automate processes to increase efficiency, reduce errors, and improve response times.
o Identify recurring data anomalies through analysis and assist in determining effective technical and process-related solutions.
o Review the L2 team's manual processes to uncover automation opportunities and implement technology-specific solutions aimed at improving productivity.

6. Collaboration with Engineering & Product Teams
o Partner with development, infrastructure, and reliability engineering teams to design and deliver operable, scalable, and resilient solutions.

7. Operational Excellence & Documentation
o Maintain runbooks, SOPs, and technical documentation; uphold IT controls, compliance, and audit readiness.

8. Risk & Security Management
o Enforce operational security best practices, support vulnerability remediation, and contribute to disaster recovery and business continuity planning.

Qualifications

• Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field (or equivalent experience).
• 5+ years of experience in operations engineering, site reliability engineering, or systems administration.
• Fluent in English and Spanish.
• Strong knowledge of Linux/Unix and/or Windows server environments.
• Experience with monitoring and alerting tools (e.g., Prometheus, Grafana, Datadog, Splunk, Nagios, AppDynamics, FullStory, Ignio).
• Proficiency in at least one scripting/programming language (e.g., Python, Bash, PowerShell).
• Familiarity with CI/CD pipelines, deployment automation, and configuration management (e.g., Jenkins, Ansible, Puppet, Chef).
• Databases: MySQL, MongoDB, Cassandra, Couchbase.
• Understanding of networking fundamentals (DNS, TCP/IP, load balancing, firewalls).
• Hands-on experience with cloud platforms (AWS, Azure, GCP).
• Experience working with ServiceNow.

Who Are We Looking For?

If this is an opportunity that interests you, we encourage you to apply even if you do not meet 100% of the requirements.

What can you expect from us:

• Opportunities to learn and develop every day through a wide range of programs: internal digital platforms that promote self-learning, development programs tailored to leadership skills, specialized training according to the role, and learning experiences with internal and external providers.
• We love to celebrate success, which is why we have recognition programs for seniority, behavior, leadership, and moments of life, among others.
• Financial wellness programs that will help you reach your goals in all stages of life.
• A flexibility program that will allow you to balance your personal and work life, adapting your working day to your lifestyle.
• And because your family is also important to us, they can also enjoy benefits such as our Wellness Line, thousands of Agreements and Discounts, Scholarship programs for your children, and Aid Plans for different moments of life, among others.

We are an equal opportunity employer and value diversity at our company. We do not discriminate based on race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. We respect and value diversity as a force for work and innovation in the organization.
