Synthesia

Senior ML Platform Engineer - DataOps

London, UK
R AWS Python Docker
Description

Who are we?

On a mission to make video easy for anyone...

Synthesia is the world's #1 AI video generation platform. Well, it's actually a video production studio in a browser. As in, no cameras or film crews at all. You simply choose an avatar, enter your script in one of 60 languages, and your video is ready in minutes. In Synthesia, you can build personalised on-the-fly videos, give your chatbot a human face or run 24/7 weather channels in different languages, to name just a few of the possibilities. 🎬

We believe the future of media is synthetic, and we are on a mission to turn cameras into code and make everyone a creator. Not sure what we're talking about? Check out our brand video, which explains what we're doing at Synthesia in a way that lets even our grandparents *kind of* understand what this AI video stuff is all about.

 

About the role

We are looking for a Senior ML Platform Engineer - DataOps to help R&D manage large datasets of audio-video data at Synthesia. We are creating a new ML Platform team that will support 7+ teams developing cutting-edge solutions in generative video synthesis. You will join us to set up a world-class data function, managing a PB-scale data lake and building complex audio/visual data pipelines to bring order and make data consumption simple. You are going to super-charge our research.

🔬 You are someone who loves DevOps, you love Data, and you want to work at Scale. You pay close attention to detail, and you create and communicate clear, well-defined processes. You love to support and help others. Your happiest day is when you hear "it was so easy, just 1 click and everything worked". You love to build systems that unblock others and unlock scale.

👩‍💼 You will join a group of more than 30 Researchers and Engineers in the R&D department. This is an open, collaborative and highly supportive environment. We are all working together to build something big - the future of synthetic media and programmable video through Generative AI. We are proud of the culture, as well as the impact of the technology we are building.

 

What will you be doing?

🚀 In this position, you will set up and provide data management for our ML teams in R&D at Synthesia. You will help set up our audio-video data pipeline for the Video team and our speech data pipeline for the Voice team. You will be responsible for:

  • Data storage - our data lake for large scale audio-visual datasets
  • Data sources - set up our ingest process, working with external data providers
  • Data annotation - manage data verification and annotation, working with external providers
  • Data pipelines - deploy custom ML data transformations, working with our ML teams
  • Data access - create transient datasets on demand to support ML model training
  • Data tracking - usage tracking and monitoring across all data sources

 

Who are you?

We are looking for candidates who can own the DataOps function. You will have:

  • A minimum of 3 years' experience in Data Engineering / Data Ops / Data Science.
  • Been involved in managing large-scale datasets with continuous data collection, not just one-off data collection tasks.
  • Been responsible for setting up data ops (ingest / storage / transform / access) end-to-end for multiple teams.
  • Worked with audio/video data and understood how to manage it at PB scale.
  • Strong understanding of Data Ops with dataset management, versioning, usage tracking, monitoring and logging.
  • In-depth experience working with AWS for data and compute. You will work side-by-side with DevOps to define our infra.
  • Experience supporting deep tech teams working with Python and containerised development with Docker.
  • Outstanding communication skills.

 

Nice to have...

It would be great if you have seen large-scale data management and data governance, multi-modal datasets, multi-stage data transform pipelines, and large model training with 10,000s to 100,000s of hours of content. If you have worked with ML Ops to provide data sources to support world-class research teams, spanning tech planned to go direct to product as well as foundational research for top-tier academic conferences, then we would love to talk to you! We'd also love to talk to you if this is what you dream of doing. 😎

 

The good stuff...

💸 You will be compensated well (salary + stock options + bonus)

📍 You will work in a hybrid setting with an office in London

🏝 You get 25 days of annual leave + public holidays

🥳 You will join an established company culture with regular socials and company retreats

🤩 You get 4 weeks of paid sabbatical after 4 years at the company + $10,000!!

🍼 You get paid parental leave

👉 You can participate in a generous referral scheme

💻 You get a brand new computer of your choice (if that still counts as a benefit in 2023 🤔)

🚀 You will have huge opportunities for your career growth

You can see more about Who we are and How we work here: https://www.synthesia.io/careers
