ESSENTIAL DUTIES AND RESPONSIBILITIES
- Take a purist SRE approach to shared multi-tenant infrastructure for resilient, microservice-based, containerized SaaS systems, in addition to customer-centric application environments
- Oversee and automate the team’s growing presence in AWS
- Contribute to core infrastructure systems development with features, bug fixes, reliability improvements, etc.
- Platform reliability engineering of a complex single sign-on SAML/OAuth-based central authentication platform
- Creatively build and develop tooling to aid in driving 24x7x365 follow-the-sun operations of critical production systems
- Automate deployment tasks for core product and infrastructure tools and maintain automation infrastructure
- Create system documentation and training materials to empower and educate our fellow team members
- Build and maintain observability tooling, metrics, and dashboarding for a global platform product infrastructure
- Improve our incident management lifecycle to identify, mitigate, and learn from reliability risks and issues
- Enhance platform observability by helping create a self-healing approach to platform reliability
- Collaborate with engineering teams, providing product feedback and, where necessary, contributing code to the product
Education and Work Experience
- Bachelor’s Degree in Computer Science or related field
- Software engineering and task automation skills with Bash, Python, and/or Go are a must
- Solid understanding of agile software development methodologies (Scrum, Kanban, etc.)
- Deep background with Linux systems and engineering
- Highly experienced with engineering and automating on Amazon Web Services (AWS)
- Experience supporting web applications running on Java / Apache / Tomcat in a live production environment
- Prior experience with IaC tools like Terraform/Terragrunt/Terraspace
- Prior experience with DevOps/GitOps tools (Git, Bitbucket, Flux CD, TeamCity) for gate promotions
- Production-at-scale support background in a heavily microservice-based world
- Hands-on engineering and ops expertise in containerization (Docker, Helm, Kubernetes/EKS, CNI and Ingress networking)
- Strong understanding of single sign-on, SAML, and OAuth (bonus for hands-on experience with Okta)
- Seasoned expertise in X.509 certificate technology and basic concepts of encryption
- Experience working with relational databases such as Aurora Postgres and/or Oracle RDS
- Advanced exposure to application development, web UI (design and development), JSON, application architecture
- Strong experience using observability tools (logging/APM) like Datadog, CloudWatch, and PagerDuty
- Familiarity with event store/stream-processing technologies like Kafka or AWS SQS
- Understanding of Open Application Model systems such as KubeVela or Crossplane
Personal Qualities and Soft Skills
- You greatly prefer writing code to clicking through a GUI
- You enjoy teaching, being a mentor to others, and working across boundaries
- Outstanding troubleshooting skills; ability to think critically and display an aptitude for problem solving
- Strong analytical mind with a penchant for process development and enhancement
- A highly positive, can-do attitude and a desire to be a team player
- Great communication skills and ability to explain complex technical concepts to a varied audience
- Strong follow-through and work ethic; you consistently keep and meet commitments
- Ability to champion a culture of reliability within the product team, promoting practices like blameless postmortems, SLO tracking, and continuous learning from incidents.
Other Requirements
- Ability to read, write, and speak English
- Ability to speak in public settings and to interface confidently with customers, partners, and vendors
- Travel – Up to 25% of the job will require travel, approximately a week a month