American Family Insurance

Lead Resiliency Engineer

Madison, WI; Boston, MA
USD 131k - 220k
API DevOps
Description

Location: Madison, WI; Boston, MA

Remote Type: Hybrid

Time Type: Full time

Job Description

American Family Insurance is seeking a Lead Resiliency Engineer to strengthen the resiliency, reliability, and availability of our technology platforms and applications. This role leads a team focused on solutions, frameworks, and process optimization, with an emphasis on reducing incidents, supporting business continuity, and lowering MTTR. You will combine technical expertise with leadership to analyze technologies, design robust systems, and implement best practices and solutions that ensure seamless operations and rapid recovery from disruptions.

This role does not have approval for sponsorship now or in the future. This also includes OPT / F1 visa status.

Position Compensation Range:

$131,000.00 - $220,000.00

Pay Rate Type:

Salary

Compensation may vary based on the job level and your geographic work location. Relocation support is offered for eligible candidates.

Primary Accountabilities

  • Lead and mentor a team in implementing solutions to enhance overall resiliency.

  • Help define, document, and champion comprehensive resiliency engineering principles and practices tailored to the organization’s environment.

  • Participate in the selection and implementation of tools for monitoring, alerting, chaos engineering, and automated recovery.

  • Guide the analysis of ITSM workflows (incident, change, problem) to identify and prioritize high-impact automation opportunities.

  • Architect and direct the development of automation solutions leveraging ITSM tools and integrating with other enterprise systems.

  • Ensure the deployment and integration of automated workflows, maintaining seamless data flow and effective reporting.

  • Establish and monitor metrics to track the effectiveness of resiliency and automation initiatives.

  • Collaborate with Enterprise DevOps, Integration Platform DevOps, and other teams to review CI/CD pipelines, identify bottlenecks, and drive improvements.

  • Participate in the development of a comprehensive automation strategy encompassing build, test, security scanning, and deployment processes to improve system resiliency.

  • Participate in defining and standardizing operating environments and infrastructure configurations, partnering with architecture teams to implement Infrastructure-as-Code (IaC) frameworks.

  • Implement and manage automated testing and rollback mechanisms for infrastructure and application changes.

  • Lead the design, development, enhancement, and maintenance of tools, systems, and software solutions to support resiliency objectives.

  • Direct incident triage efforts, determining scope, urgency, and potential impact, and coordinate effective response and recovery.

  • Lead technology evaluations and re-engineering activities to support strategic direction and continuous improvement.

  • Transform business requirements into technical specifications, ensuring alignment with resiliency and automation goals.

  • Foster cross-functional coordination and ensure a partnership-focused approach to balancing service priorities with business needs, managing risk-based exceptions as necessary.

Specialized Knowledge & Skills Requirements

  • Experience in designing fault-tolerant architectures, including automated failover, multi-region redundancy, and graceful degradation strategies enabled by chaos engineering.

  • Deep understanding of complex distributed systems, such as microservices orchestration, service meshes, and eventual consistency models.

  • Demonstrated experience delivering customer-driven solutions, support, or service in high-availability environments.

  • Extensive knowledge of software engineering architectures, system/software design, and system deployments.

  • Proven experience across multiple IT domains, including development, testing, configuration, deployment, and monitoring.

  • Strong understanding of infrastructure technologies and application development methodologies.

  • Demonstrated experience in system administration activities (configuration, installations, patch management, server maintenance) and network management (firewalls, proxies, IP management, routing, DNS).

  • Experience in the utilization and support of integration and communication protocols between applications, databases, and technology platforms.

  • Solid foundation in building scalable frameworks and providing specifications for APIs that support enterprise fulfillment processes.

  • Strong analytical and problem-solving skills, with the ability to diagnose and resolve complex technical issues.

  • Excellent communication and collaboration skills, with experience engaging stakeholders and cross-functional teams.

  • Proven ability to lead and mentor technical teams, driving continuous improvement and innovation in resiliency engineering.

Additional Information

  • Offer to selected candidate will be made contingent on the results of applicable background checks

  • Offer to selected candidate is contingent on signing a non-disclosure agreement for proprietary information, trade secrets, and inventions

  • Sponsorship will not be considered for this position unless specified in the posting

In this flex office/home role, you will be expected to work a minimum of 10 days per month from one of the following office locations: Madison, WI 53783; Boston, MA 02110
Candidates must reside within a 50-mile radius of the office location (or 35-mile radius for Boston). #LI-Hybrid

Internal candidates are encouraged to apply; however, if you are not in the same job family as the posted role, you may have to relocate to the Madison headquarters.

We provide benefits that support your physical, emotional, and financial wellbeing. You will have access to comprehensive medical, dental, vision, and wellbeing benefits that enable you to take care of your health. We also offer a competitive 401(k) contribution, a pension plan, an annual incentive, 9 paid holidays, and a paid time off program (23 days accrued annually for full-time employees). In addition, our student loan repayment program and paid family leave are available to support our employees and their families. Interns and contingent workers are not eligible for American Family Insurance Group benefits.

We are an equal opportunity employer. It is our policy to comply with all applicable federal, state and local laws pertaining to non-discrimination, non-harassment and equal opportunity. We also consider qualified applicants with criminal histories, consistent with applicable federal, state and local law.

American Family Insurance is committed to the full inclusion of all qualified individuals. If a reasonable accommodation is needed to participate in the job application or interview process, to perform essential job functions, and/or to receive other benefits and privileges of employment, please email [email protected] to request a reasonable accommodation.
