This job is closed.

Description

Company Overview

At Zuora, we do Modern Business. We’re helping people subscribe to new ways of doing business that are better for people, companies and ultimately the planet. It’s an approach resulting from the shift to the Subscription Economy that puts customers first by building recurring relationships instead of one-time product sales and focuses on sustainable growth. Through our leading expertise and multi-product suite, we are transforming all industries and working with the world’s most innovative companies to monetize new business models, nurture subscriber relationships and optimize their digital experiences.

THE TEAM

The Site Reliability Operations Engineer at Zuora plays a critical and visible role in delivering and supporting our platform. We are responsible for scaling and optimizing the reliability, availability, and performance of our infrastructure and platform services, and for partnering with Engineering teams to build highly available and performant services. We work with amazing developer teams on the design, provisioning, integration, configuration, monitoring, and incident response of large-scale distributed applications and platform services. We deliver awesome SaaS.

Responsible For:

Service operations and restoration of service-impacting issues
Driving Command Center incident bridges for customer issues through to resolution
Responding to observability alerts and alarms
Responding to escalated issues from Customer Support
Writing and automating runbooks, and reducing alerts, incidents, and service requests through automation
Being a liaison for a service and partnering with the service owner to make the service rock solid and efficient

WHAT YOU’LL ACHIEVE

As an SRO, you will be a member of a team that understands the configuration, technical dependencies, and overall behavioral characteristics of production services. In partnership with developers, you have the responsibility to ensure services are designed and delivered with a focus on security, resiliency, scale, and performance. SROs are the ultimate authority and are accountable for end-to-end performance and operability of the services they own.

Champion service reliability operations and incident prevention

You will be part of the team whose mission is the shared ownership of a collection of services and technology areas, in partnership with developer teams.
You are a key escalation point for issues that have been documented as Standard Operating Procedures (SOPs) or issues that need in-depth troubleshooting and analysis. You will help maintain up-to-date documentation on deployments, processes and SOP runbooks.
You are a key escalation point in leading incidents and working with Subject Matter Experts (SMEs) to perform real-time incident handling tasks in support of operations. You will help develop and implement the incident management process.
You will have the deep understanding of service topologies and their dependencies required to troubleshoot issues and define mitigations. Once you have expertly mitigated an incident, you will immediately work with the SME on how to resolve the issue more quickly next time, with the goal of preventing the problem from recurring. You will help develop and implement the problem management process.
You will manage the full lifecycle of infrastructure and change management, including planned maintenance and standard, normal, and emergency changes. You will help develop and implement change management processes to ensure developers and SROs can easily manage system configurations, deploy new code quickly and fix incidents faster.

Service design and implementation

You will partner with development SCRUM teams in defining and implementing improvements to service architecture, both current and future. You will be an expert at articulating technical characteristics of services and their dependencies, and guide development teams to engineer highly reliable and performant services.
You will frequently partner with developer SCRUM teams and actively participate in the execution of tasks required to meet milestones and deliverables set by the team throughout a release cycle.

Operations Engineering

You will work with a large-scale centralized monitoring and logging system to help maintain uptime and troubleshoot problems. You will understand and be able to communicate the capacity, scale, security, and performance attributes and requirements of the services you own. You will lead the design and implementation of monitoring, alerts, and responses for all infrastructure and applications.
You will implement SRO automation, develop automation across service reliability operations, and optimize operations hours by reducing manual work. You will apply an engineering mindset and development skills to site reliability operations.
You will take part in a shared on-call rotation that won’t cripple your life or kill your soul.

Job Involves:

Resolution of complex and critical issues, and participation in major incidents as an SME
Acting as a service expert and ensuring that expertise is reflected in shared SOP documentation
Instrumentation and metrics that clearly describe the service behaviors
Scaling requirements and patterns
Resiliency and recoverability, ensuring that backup / restore and disaster recovery capabilities are implemented, tested and maintained
Driving and escalating gaps in automation, solutions and documentation

WHAT YOU’LL NEED TO BE SUCCESSFUL

SROs are a rare mix of sysadmins and development engineers, and as such you have the ability to understand and explain the effect of product architecture decisions on the ability to run as distributed systems. You are driven by professional curiosity and a desire to develop a deep understanding of the services and the technologies they depend upon.

You demonstrate competence in shell scripting, high-level programming languages, and infrastructure-as-code tools such as Bash, Python, Ansible, and Terraform, and in low-code / no-code tools and solutions such as Google Apps Script, Jenkins Pipeline Groovy scripts, Jira Automation, and Rundeck.

You are proactive, self-motivated, customer-focused, organized, and a good communicator.

You have over 4 years of experience running large-scale, customer-facing web services with a solid understanding of:

REST APIs
Linux/Unix system internals
Load balancing technologies, including L7 routing, DNS, and CDN
Networking and TCP/IP
Off-the-shelf observability (monitoring, metrics, alerting, tracing) solutions (Grafana, LogicMonitor, Pingdom) or open-source ones (Prometheus)
Log analysis and troubleshooting using Kibana
Standard Internet services, such as DNS, HTTP, etc.
Cloud computing patterns
Configuration management using Puppet, Chef, Ansible, or similar
IT Security and compliance
Container-based orchestration platforms such as Kubernetes/EKS/AKS and ECS at scale
CI/CD pipelines using tools such as Git, Jenkins, Spinnaker, Terraform, and Ansible
RDBMS and messaging fundamentals; experience with MySQL, Oracle, and Kafka is preferred
Programming with Python

You demonstrate practical knowledge of various aspects of distributed service design, including messaging protocols, caching strategies, persistence technologies, and queuing.

You have experience with AWS Services like EC2, ELB, ElastiCache, DynamoDB, SQS, SNS, RDS, S3.

You are passionate about automation.

Your head is full of customer-delighting ideas for the next hackathon.

An ideal candidate will also have experience with:

Container and container management technologies, such as Docker and Kubernetes
Databases and big data stores
Defining and documenting the technical architecture of complex and highly scalable products
ITIL-based incident, problem, and change management
Working with large global teams and coordinating well within and across various development teams

#ZEOLife at Zuora

As an industry pioneer, our work is constantly evolving and challenging us in new ways that require us to think differently, iterate often and learn constantly. It’s exciting. Our people, whom we refer to as “ZEOs”, are empowered to take on a mindset of ownership and make a bigger impact here. Our teams collaborate deeply, exchange different ideas openly and together we’re making what’s next possible for our customers, community and the world.

As part of our commitment to building an inclusive, high-performance culture where ZEOs feel inspired, connected and valued, we support ZEOs with:

Competitive compensation, corporate bonus program and performance rewards, company equity and retirement programs
Medical, dental and vision insurance
Generous, flexible time off
Paid holidays, “wellness” days and a company-wide end-of-year break
6 months fully paid parental leave
Learning & Development stipend
Opportunities to volunteer and give back, including charitable donation match
Free resources and support for your mental wellbeing

Specific benefits offerings may vary by country and can be viewed in more detail during your interview process.

Location & Work Arrangements

Organizations and teams at Zuora are empowered to design efficient and flexible ways of working, being intentional about scheduling, communication, and collaboration strategies that help us achieve our best results. In our dynamic, globally distributed company, this means balancing flexibility and responsibility — flexibility to live our lives to the fullest, and responsibility to each other, to our customers, and to our shareholders. For most roles, we offer the flexibility to work both remotely and at Zuora offices.

Our Commitment to an Inclusive Workplace

Think, be and do you! At Zuora, different perspectives, experiences and contributions matter. Everyone counts. Zuora is proud to be an Equal Opportunity Employer committed to creating an inclusive environment for all.

Zuora does not discriminate on the basis of, and considers individuals seeking employment with Zuora without regard to, race, religion, color, national origin, sex (including pregnancy, childbirth, reproductive health decisions, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, political views or activity, or other applicable legally protected characteristics.

We encourage candidates from all backgrounds to apply. Applicants in need of special assistance or accommodation during the interview process or in accessing our website may contact us by sending an email to assistance@zuora.com.

Zuora

Site Reliability Operations Engineer III
