Description
About TikTok U.S. Data Security
TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. U.S. Data Security (“USDS”) is a subsidiary of TikTok in the U.S. This new, security-first division was created to bring heightened focus and governance to our data protection policies and content assurance protocols to keep U.S. users safe. Our focus is on providing oversight and protection of the TikTok platform and U.S. user data, so millions of Americans can continue turning to TikTok to learn something new, earn a living, express themselves creatively, or be entertained. The teams within USDS that deliver on this commitment daily span Trust & Safety, Security & Privacy, Engineering, User & Product Ops, Corporate Functions and more.

Why Join Us
Creation is the core of TikTok's purpose. Our platform is built to help imaginations thrive. This is doubly true of the teams that make TikTok possible. Together, we inspire creativity and bring joy - a mission we all believe in and aim towards achieving every day. To us, every challenge, no matter how difficult, is an opportunity: to learn, to innovate, and to grow as one team. Status quo? Never. Courage? Always. At TikTok, we create together and grow together. That's how we drive impact - for ourselves, our company, and the communities we serve. Join us.

About the Team
USDS Tech and Product at TikTok provides core product platforms and services with leading infrastructure and applications. The Incident Management team plays a critical role in ensuring business continuity by addressing and mitigating high-priority incidents effectively. This role offers the opportunity to collaborate across functions to minimize impact, improve processes, and enhance the reliability of TikTok’s platforms and services.

About the Role
The Incident & Problem Manager will oversee the resolution of high-priority incidents, ensuring minimal disruption and swift resolution.
This includes owning incident escalations, documenting processes, and collaborating with cross-functional teams to identify root causes and implement short-term and long-term solutions.

To enhance collaboration and cross-functional partnerships, among other things, our organization currently follows a hybrid work schedule that requires employees to work in the office 3 days a week, or as directed by their manager/department. We regularly review our hybrid work model, and the specific requirements may change at any time.

Responsibilities
- Serve as a subject matter expert in incident management, leading the resolution of critical incidents to minimize customer/business impact.
- Partner with SRE teams and service owners to ensure timely resolution of high-severity incidents and create high-quality RCAs.
- Act as an escalation point for critical incidents and lead crisis response processes as required.
- Prioritize incidents based on customer and operational impact, ensuring optimal resource allocation for swift resolution.
- Monitor, evaluate, and report on incident management programs, identifying trends and areas for improvement.
- Drive process improvements to minimize incident frequency and severity while enhancing efficiency.
- Implement automated procedures to capture incident data consistently, supporting data-driven decision-making.
- Lead post-incident reviews with cross-functional teams, identifying actionable insights and process optimizations.
- Partner with senior leaders to facilitate incident management communications and project delivery.
- Generate communications tailored for technical and non-technical audiences, including customer-facing updates.
- Collaborate with cross-functional teams to ensure effective containment and remediation strategies.
- Ability to work Sunday to Thursday, from 5 PM PT to 2 AM PT.
- Provide rotational on-call support (24x7x365) to ensure incidents are handled promptly and effectively.
- Stay updated on infrastructure dependencies and emerging technologies to proactively mitigate risks.

Minimum Qualifications:
- Bachelor’s degree in Computer Science, Information Technology, or a related field, or equivalent work experience.
- 2+ years of experience in Incident Management, including leadership of high-severity incidents.
- Experience with monitoring solutions and applications such as Grafana.
- Technical knowledge of cloud architecture and design.
- Proficiency in troubleshooting techniques and problem-solving in a 24x7x365 environment.
- Strong oral and written communication skills, with the ability to effectively communicate with diverse audiences.
- Must be willing to be flexible with working hours depending on the needs of the business.

Preferred Qualifications:
- Proven ability to lead incident response calls confidently, driving toward resolution and minimizing downtime.
- Experience analyzing incident trends and operational metrics to inform prevention strategies.
- Expertise in microservices architecture and Linux environments, with foundational knowledge of Kubernetes.
- Demonstrated success in process improvement, including conducting root cause analyses and implementing efficient solutions.
- Strong interpersonal and influencing skills to collaborate effectively across teams without direct authority.
- Familiarity with leading investigations in a large-scale enterprise environment.

Candidates for this position must be legally authorized to work in the United States. This position is not eligible for visa sponsorship or support.

TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy.
To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.

TikTok is committed to providing reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs or other reasons protected by applicable laws. If you need assistance or a reasonable accommodation, please reach out to us at https://shorturl.at/ktJP6.

This role requires the ability to work with and support systems designed to protect sensitive data and information. As such, this role will be subject to strict national security-related screening.