Pure Storage

Observability and Site Reliability Engineer

Santa Clara, CA US
USD 207k - 312k
R Python SQL Shell GCP Go AWS Azure Kubernetes
Description

BE PART OF BUILDING THE FUTURE.

What do NASA and emerging space companies have in common with COVID vaccine R&D teams or with Roblox and the Metaverse? 

The answer is data: all fast-moving, fast-growing industries rely on data for a competitive edge. And the most advanced companies are realizing the full data advantage by partnering with Pure Storage. Pure’s vision is to redefine the storage experience and empower innovators by simplifying how people consume and interact with data. With 11,000+ customers, including 58% of the Fortune 500, we’ve only scratched the surface of our ambitions.

Pure is blazing trails and setting records:

  • For ten straight years, Gartner has named Pure a leader in the Magic Quadrant 
  • Our customer-first culture and unwavering commitment to innovation have earned us a certified Net Promoter Score in the top 1% of B2B companies globally
  • Industry analysts and press applaud Pure’s leadership across these dimensions
  • And, our 5,000+ employees are emboldened to make Pure a faster, stronger, smarter company as we go

If you, like us, say “bring it on” to exciting challenges that change the world, we have endless opportunities where you can make your mark.

SHOULD YOU ACCEPT THIS CHALLENGE…

Our team is dedicated to maintaining the reliability, performance, and operational excellence of our product and platforms. The Fleet Reliability team is where you will be at the frontline of ensuring seamless customer experiences, especially during incidents or escalations. We work closely with engineering and support teams to proactively prevent issues, resolve customer escalations, and improve our monitoring and response processes. The work is cross-disciplinary, and each team member develops understanding and expertise in many functional areas of our products and technologies.

As an Observability and SRE Engineer, you’ll be responsible for managing and enhancing the observability of our systems, troubleshooting complex issues, and leading post-incident reviews. Your work will directly impact our ability to respond swiftly to incidents, minimize downtime, and improve customer satisfaction. You’ll focus on building a resilient infrastructure while also owning the process and tooling to resolve escalations effectively.

Key Responsibilities:

  1. Customer Escalation Management
    • Act as the primary technical resource for high-impact customer escalations, working to diagnose, troubleshoot, and resolve incidents.
    • Coordinate with customer support and engineering teams to ensure issues are resolved quickly and accurately.
    • Serve as a technical point of contact during incidents, communicating status and resolution plans to relevant stakeholders.
  2. Observability and Monitoring
    • Develop and maintain dashboards, alerts, and logging systems to track product performance.
    • Improve the observability and visibility of features through enhancements to monitoring, logging, and alerting.
    • Establish SLAs, SLIs, and SLOs to measure and ensure the reliability of the product and proactively prevent escalations and Sev-1 incidents.
    • Identify trends in features that cause reliability issues.
  3. Collaboration and Communication
    • Work cross-functionally with development, product, and support teams to enhance system reliability and customer experience.
    • Provide feedback to development teams on areas of improvement for code stability and reliability.
    • Mentor other engineers on best practices in observability and reliability engineering.

WHAT YOU’LL NEED TO BRING TO THIS ROLE...

  • Experience: 7+ years in SRE or a related field, with a strong focus on observability and customer-facing incident response.
  • Technical Skills: Proficiency in monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack, Datadog, New Relic, Splunk).
  • Programming and Scripting: Solid knowledge of languages like Python, Go, and SQL, and experience with shell scripting for automation.
  • Cloud Infrastructure: Experience with cloud platforms (e.g., AWS, GCP, Azure) and container orchestration tools (e.g., Kubernetes).
  • Problem-Solving: Strong analytical skills to diagnose and troubleshoot complex systems and identify root causes quickly.
  • Communication: Excellent verbal and written communication skills, with experience in handling high-stakes customer interactions.
  • Incident Management: Familiarity with incident management frameworks and tools (e.g., PagerDuty, Opsgenie, or similar) is a plus.
  • Certification in cloud platforms (e.g., AWS Certified Solutions Architect, Google Cloud SRE Professional).
  • We are primarily an in-office environment; therefore, you will be expected to work from the Santa Clara, CA office in compliance with Pure’s policies, unless you are on PTO, work travel, or other approved leave.

The annual base salary range is: $207,000 – $312,000. Salary ranges are determined based on role, level and location. For positions open to candidates in multiple geographical locations, the base salary range is reflective of the labor market across the applicable locations. 

This role may be eligible for incentive pay and/or equity. 

And because we understand the value of bringing your full and best self to work, we offer a variety of perks to manage a healthy balance, including flexible time off, wellness resources, and company-sponsored team events. Check out purebenefits.com for more information.

BE YOU—CORPORATE CLONES NEED NOT APPLY.

 

Pure is where you ask big questions, think differently, and make an impact. This is not just a job, but a place where you have a voice and can accelerate your career. We value unique thoughts and celebrate individuality, and with ample opportunity to learn, develop yourself, and expand into different roles, joining Pure is an investment in your career journey.

 

Through our Pure Equality program, which supports a flourishing field of employee resource groups, we nourish the personal and professional lives of our team members. And our Pure Good Foundation gives back to local and global communities through volunteering and grants.

 


PURE IS COMMITTED TO EQUALITY.

Research shows that in order to apply for a job, women feel they need to meet 100% of the criteria while men usually apply after meeting about 60%. Regardless of how you identify, if you believe you can do the job and are a good match, we encourage you to apply.

Pure is proud to be an equal opportunity and affirmative action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, or any other characteristic legally protected by the laws of the jurisdiction in which you are being considered for hire.

 

If you need assistance or an accommodation due to a disability, you may contact us at TA-Ops@purestorage.com.

 

APPLICANT & CANDIDATE PERSONAL INFORMATION PRIVACY NOTICE.

If you're wondering how or why Pure collects or uses information you provide, we invite you to check out our Applicant & Candidate Personal Information Protection Notice.

DEEMED EXPORT LICENSE NOTICE.

Some positions may require a deemed export license for compliance with applicable laws and regulations. Please note: Pure does not currently sponsor deemed export license applications, so we are unable to proceed with applicants requiring such sponsorship.
