Chaos Platform is an SRE team in our Resilience Engineering organization whose mission is to enable engineering teams at Datadog to improve and maintain the resilience of their services. Embracing a mindset of experimentation and the practices of Chaos Engineering, we offer engineers a library of failure scenarios that they can use to verify that their systems operate as expected. Simulating realistic failures in a complex ecosystem like Datadog while safely limiting the blast radius requires our engineers to be strong software engineers with a good awareness of how distributed systems can break at scale.
As an Engineering Manager, you will help us realize this mission of enabling Chaos Engineering by continuing to build out our Chaos Platform and scaling the team by hiring and growing our engineering talent. You will play an instrumental role in helping us build more adoption and impact on company-wide projects, and actively drive collaboration between the team and key stakeholders.
At Datadog, we place value in our office culture - the relationships that it builds, the creativity it brings to the table, and the collaboration of being together. We operate as a hybrid workplace to ensure our employees can create a work-life harmony that best fits them.
What You’ll Do
- Lead and mentor a team of experienced SWEs around the globe who are passionate about building a culture of reliability at Datadog. Help engineers grow to the next level and continuously provide them opportunities to develop.
- Build exciting new features for our self-service Chaos Engineering platform to solve real user problems and ensure our platform remains relevant and well-integrated within the wider Datadog ecosystem.
- Advocate for and build out Chaos Engineering as a practice by focusing the team’s efforts on empowering engineers at Datadog to verify their resilience to failures, including in production.
- Work with stakeholders across Datadog to build consensus on shared topics around resilience – for example, collaborating with infrastructure teams to ensure we have consistent graceful degradation and disaster recovery solutions, or with security teams to ensure adversary testing works for engineers.
- Build a culture of continuous learning by incentivizing engineers to invest in chaos experiments, team-level game days, and ways to proactively discover failures before they happen in production.
Who You Are
- 2-3 years of experience as a people manager or as a technical leader with strong mentorship skills. Ideally, candidates will have experience in career development, performance management, tracking and optimizing team velocity, sprint planning, OKRs, and hiring.
- 2-3 years of experience in SRE, Resilience, Chaos Engineering, or any domain that adopts a mindset of proactively breaking software in order to learn. Although we create impact through our self-service platforms, our ultimate goal is to enable the company to build resilience.
- Technical pragmatism and an ability to help the team reason about trade-offs around implementation. You will often review the decisions and RFCs from senior engineers, and you will need to blend both your technical and business acumen to do this.
- Strong distributed systems knowledge, especially around Kubernetes and how controllers are designed, implemented and operated. We collaborate a lot with our infrastructure partners, so a solid understanding of Linux internals (e.g. control groups, namespaces and networking stack) will serve you well leading a technical team.
- Strong stakeholder management skills. This will involve using your empathy, collaboration, and communication skills in English to work remotely with people across teams. Because our programs touch lots of different parts of the business, we frequently collaborate with stakeholders across Datadog and need to motivate teams to work together towards a shared goal.
Datadog offers a competitive salary and equity package, and may include variable compensation. Actual compensation is based on factors such as the candidate's skills, qualifications, and experience. In addition, Datadog offers a wide range of best-in-class, comprehensive, and inclusive employee benefits for this role, including healthcare, dental, parental planning, and mental health benefits, a 401(k) plan and match, paid time off, fitness reimbursements, and a discounted employee stock purchase plan.
About Datadog:
Datadog (NASDAQ: DDOG) is a global SaaS business, delivering a rare combination of growth and profitability. We are on a mission to break down silos and solve complexity in the cloud age by enabling digital transformation, cloud migration, and infrastructure monitoring of our customers’ entire technology stacks. Built by engineers, for engineers, Datadog is used by organizations of all sizes across a wide range of industries. Together, we champion professional development, diversity of thought, innovation, and work excellence to empower continuous growth. Join the pack and become part of a collaborative, pragmatic, and thoughtful people-first community where we solve tough problems, take smart risks, and celebrate one another. Learn more about #DatadogLife on Instagram, LinkedIn, and Datadog Learning Center.
Equal Opportunity at Datadog:
Datadog is an Affirmative Action and Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. Here are our Candidate Legal Notices for your reference.
Your Privacy:
Any information you submit to Datadog as part of your application will be processed in accordance with Datadog’s Applicant and Candidate Privacy Notice.