Description

We are looking for a passionate and accomplished leader to join our Reliability Engineering org. The ideal candidate will possess deep expertise in Site Reliability Engineering (SRE) practices, performance engineering, cloud technologies, distributed systems architecture, and SQL/NoSQL databases.

You have the drive to proactively learn and stay up to date in a fast-changing environment, including but not limited to evolving the org to meet strategic challenges, evaluating and adopting new technology, managing significant changes in existing systems, and addressing security and compliance needs.
You have the ability to understand our platform, which powers hundreds of eCommerce websites for our sports fans globally and involves moving atoms, not just bits, with order management, fulfilment, and manufacturing systems on the backend.
You have strong execution and collaboration skills, driving alignment across the org on key initiatives and delivering on Objectives and Key Results (OKRs).
You are known for sharing knowledge, building trust, admitting mistakes, and fostering a safe and inclusive team environment.

Responsibilities

Manage a team of software engineers and performance engineers.
Drive and contribute to reliability and performance engineering initiatives across the tech org, building new capabilities and introducing new practices.
Build and manage platform tooling that provides standardisation on Service Level Objectives (SLOs), Operational Readiness Reviews (ORRs), incident metrics, availability and performance, etc.
Develop and maintain working relationships with engineering leads across our distributed teams and evangelise SRE best practices and tooling. Develop paved-path solutions and drive adoption of standard tools.
Manage planning, scheduling and resourcing to deliver on our OKRs and Keeping The Lights On (KTLO) deliverables.
Own the scale-in and scale-out of critical services during high-traffic days and events, with a focus on keeping costs optimal.
Work closely with production support and incident management teams in supporting our websites, order intake systems, fulfilment systems, analytics and reporting workflows, infrastructure across on-prem and AWS, corporate tools, vendor integrations and other third-party software.
Be proactive in analysing fan-impacting incidents related to availability and performance, and help develop automations and tooling to reduce Mean Time to Detect (MTTD) and Mean Time to Repair (MTTR) for those incidents.
Find opportunities to apply new technologies such as GenAI to increase productivity and improve operational efficiency.

Qualifications

12+ years of overall experience in Information Technology, with proven experience in engineering leadership.
Ability to mentor, develop talent and drive technical excellence.
A solid foundation in software development, with 10+ years in two or more of Go, Java, Python, React and Node.js.
5+ years of experience in building in-house tools or setting up vendor tools for monitoring, alerting, alert correlation, on-call management, auto-remediation, chaos engineering, SLIs/SLOs/Error Budget tracking, performance testing, profiling, incident management, change management and reporting dashboards.
5+ years of experience in designing and implementing solutions on AWS.
5+ years of experience in implementing tools for reliability and performance engineering practices in medium to large organisations.
Experience in SQL and NoSQL DBs, e.g., SQL Server, MySQL, Cassandra, and Scylla.
Experience in building globally responsible and autonomous teams in Global Capability Centres (GCCs) in India.
Good understanding of DNS, networking and service discovery is a plus.
Experience with Kubernetes or OpenShift is a plus.
Experience in developing Slack apps/bots is a plus.
Experience in eCommerce and/or supply chain systems is a plus.

Fanatics
