What You'll Do
- Collaborate with development teams to troubleshoot and solve problems, reducing customer impact.
- Develop automated runbooks and implement measures to handle issues proactively.
- Apply sound engineering principles and mature automation to our operating environments.
- Monitor, maintain, and enhance the reliability and performance of cloud-based applications.
- Leverage your automation and software engineering expertise to optimize systems.
- Document and examine incidents to improve processes and continuously prevent future occurrences.
- Stay up to date with the latest industry trends, tools, and best practices in site reliability engineering.
- Contribute to a culture of innovation, learning, and continuous improvement.
What You'll Bring
- Proven experience as a Full Stack Developer or similar role, with a track record of improving system reliability
- Strong problem-solving skills and the ability to analyze complex systems and devise effective solutions
- Excellent collaboration and communication abilities to work cross-functionally and clearly document processes
- Experience with automation, monitoring, and performance optimization tools and techniques
- Dedication to maximizing uptime, scalability, and delivering an exceptional end-user experience
- A passion for technology and a strong desire to continuously learn and grow your skills
- Alignment with Guidewire's mission to leverage technology to help protect and support others
- Good command of the English language (reading, writing, and speaking)
Required Skills & Experience
- Proven experience leveraging application monitoring and telemetry tools to troubleshoot and diagnose problems
- Proven experience triaging and debugging distributed systems on cloud infrastructure
- Proven experience in designing and engineering CI/CD pipelines within Kubernetes (K8S) and legacy ecosystems
- Proven experience in designing and engineering monitors, dashboards, and synthetic transactions in Datadog
- Proven experience in building, deploying, and running scalable infrastructure within AWS and Kubernetes ecosystems and other cloud-native approaches
- Proven experience in managing infrastructure configuration at scale using multiple approaches and/or tools such as GitOps, Puppet, or Ansible
- Good understanding of AWS cloud networking and security with hands-on experience remediating infrastructure vulnerabilities at scale
- Good understanding of SLIs, SLOs, and Error Budgets
- Comfortable with Linux system administration, with the ability to program/script using Python, Go, Java, shell, or equivalent
- Participate in mandatory on-call rotations to ensure service availability and reliability, responding to incidents and alerts outside regular hours, including weekends and holidays. Candidates must be willing and able to fulfill this critical responsibility.
Preferred Skills
- Full Stack Developer certified in multiple categories
- AWS certified in multiple categories
- Proficiency with SQL, database administration, data pipelines, performance tuning, and schema design
- Proficiency with multiple pipelining tools such as TeamCity, Bitbucket Pipelines, Jenkins, and GitHub Actions
- Familiarity with distributed data processing frameworks such as Hadoop, Apache Spark, AWS Redshift, etc.