TikTok

Major Incident Manager, Eng Support-Incident Management Team - USDS

Mountain View, CA
Description
About TikTok U.S. Data Security
TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. U.S. Data Security (“USDS”) is a subsidiary of TikTok in the U.S. This new, security-first division was created to bring heightened focus and governance to our data protection policies and content assurance protocols to keep U.S. users safe. Our focus is on providing oversight and protection of the TikTok platform and U.S. user data, so millions of Americans can continue turning to TikTok to learn something new, earn a living, express themselves creatively, or be entertained. The teams within USDS that deliver on this commitment daily span across Trust & Safety, Security & Privacy, Engineering, User & Product Ops, Corporate Functions and more.

Why Join Us
Creation is the core of TikTok's purpose. Our platform is built to help imaginations thrive. This is doubly true of the teams that make TikTok possible.
Together, we inspire creativity and bring joy - a mission we all believe in and aim towards achieving every day.
To us, every challenge, no matter how difficult, is an opportunity: to learn, to innovate, and to grow as one team. Status quo? Never. Courage? Always.
At TikTok, we create together and grow together. That's how we drive impact - for ourselves, our company, and the communities we serve.
Join us.

About the Team
USDS Tech and Product at TikTok provides core product platforms and services with leading infrastructure and applications. The Incident Management team plays a critical role in ensuring business continuity by addressing and mitigating high-priority incidents effectively. This role offers the opportunity to collaborate across functions to minimize impact, improve processes, and enhance the reliability of TikTok’s platforms and services.

About the Role
The Incident & Problem Manager will oversee the resolution of high-priority incidents, ensuring minimal disruption and swift resolution. This includes owning incident escalations, documenting processes, and collaborating with cross-functional teams to identify root causes and implement short-term and long-term solutions.

To enhance collaboration and cross-functional partnerships, among other things, our organization currently follows a hybrid work schedule that requires employees to work in the office 3 days a week, or as directed by their manager/department. We regularly review our hybrid work model, and the specific requirements may change at any time.

Responsibilities
- Serve as a subject matter expert in incident management, leading the resolution of critical incidents to minimize customer/business impact.
- Partner with SRE teams and service owners to ensure timely resolution of high-severity incidents and create high-quality RCAs.
- Act as an escalation point for critical incidents and lead crisis response processes as required.
- Prioritize incidents based on customer and operational impact, ensuring optimal resource allocation for swift resolution.
- Monitor, evaluate, and report on incident management programs, identifying trends and areas for improvement.
- Drive process improvements to minimize incident frequency and severity while enhancing efficiency.
- Implement automated procedures to capture incident data consistently, supporting data-driven decision-making.
- Lead post-incident reviews with cross-functional teams, identifying actionable insights and process optimizations.
- Partner with senior leaders to facilitate incident management communications and project delivery.
- Generate communications tailored for technical and non-technical audiences, including customer-facing updates.
- Collaborate with cross-functional teams to ensure effective containment and remediation strategies.
- Work Sunday to Thursday, from 5 PM PT to 2 AM PT.
- Provide rotational on-call support (24x7x365) to ensure incidents are handled promptly and effectively.
- Stay updated on infrastructure dependencies and emerging technologies to proactively mitigate risks.

Minimum Qualifications:
- Bachelor’s degree in Computer Science, Information Technology, or a related field, or equivalent work experience.
- 2+ years of experience in Incident Management, including leadership of high-severity incidents.
- Experience with monitoring solutions and applications such as Grafana.
- Technical knowledge of cloud architecture and design.
- Proficiency in troubleshooting techniques and problem-solving in a 24x7x365 environment.
- Strong oral and written communication skills, with the ability to effectively communicate with diverse audiences.
- Must be willing to be flexible with working hours depending on the needs of the business.

Preferred Qualifications:
- Proven ability to lead incident response calls confidently, driving toward resolution and minimizing downtime.
- Experience analyzing incident trends and operational metrics to inform prevention strategies.
- Expertise in microservices architecture and Linux environments, with foundational knowledge of Kubernetes.
- Demonstrated success in process improvement, including conducting root cause analyses and implementing efficient solutions.
- Strong interpersonal and influencing skills to collaborate effectively across teams without direct authority.
- Familiarity with leading investigations in a large-scale enterprise environment.

Candidates for this position must be legally authorized to work in the United States. This position is not eligible for visa sponsorship or support.

TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.

TikTok is committed to providing reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs or other reasons protected by applicable laws. If you need assistance or a reasonable accommodation, please reach out to us at https://shorturl.at/ktJP6

This role requires the ability to work with and support systems designed to protect sensitive data and information. As such, this role will be subject to strict national security-related screening.
