Distillation Lead
Team: Autonomy & Algorithms
Location: San Francisco, CA, Toronto, ON, Pittsburgh, PA, Remote US & Canada
Commitment: Full-time
Workplace Type: hybrid
You will…
- Define and drive the technical strategy for model distillation and compression across Waabi's AI stack — spanning perception, world models, and planning — with an eye toward both onboard deployment and simulation use-cases.
- Design, implement, and scale state-of-the-art distillation and efficiency pipelines, which may include:
-
Distillation for generative models (diffusion, autoregressive, flow-matching, video models)
-
Quantization-aware training (QAT) and post-training quantization (PTQ)
-
Knowledge distillation (feature-level, response-based, and relation-based)
-
Structured and unstructured pruning and sparsification
-
Low-rank factorization and efficient architecture design
-
Speculative decoding and other inference-time efficiency techniques
- Collaborate closely with ML Platform, Infrastructure, Onboard, Autonomy, and Simulation teams to integrate compressed models into production pipelines and meet latency, memory, and throughput targets across deployment contexts.
- Define rigorous benchmarks and evaluation frameworks to characterize efficiency vs. quality trade-offs across models and hardware targets.
- Mentor and guide researchers and engineers working in the distillation and model efficiency space, setting a high technical bar and fostering a culture of rigorous experimentation.
- Champion best practices for model compression across the organization; disseminate knowledge through internal design reviews, documentation, and technical talks.
- Stay at the cutting edge of model efficiency research; contribute to the broader scientific community through publications and open-source contributions.
Qualifications:
- Deep distillation expertise: You have extensive hands-on experience designing and implementing distillation, quantization, pruning, and model compression techniques for large-scale neural networks, with demonstrated impact in production settings.
- Strong research and engineering foundation: A Bachelor's or Master's degree in Machine Learning, Computer Vision, Robotics, or a related field, or equivalent industry experience; relevant hands-on experience in model distillation and efficiency is what matters most. Expert Python and PyTorch (or JAX) skills with experience in large-scale distributed training.
- Technical leadership: You have a proven track record of setting technical direction and driving projects from conception to production. You inspire and elevate those around you through deep technical expertise and mentorship.
- Cross-functional collaboration: You have experience working closely with infrastructure, platform, and autonomy teams to deploy compressed models under real engineering constraints.
- Clear communicator: You can communicate complex technical trade-offs clearly to diverse audiences and drive alignment across research and engineering teams.
Bonus:
- Experience with hardware-aware optimization (TensorRT, ONNX, custom CUDA kernels, hardware-specific quantization).
- Publications at top-tier ML/CV venues (NeurIPS, ICML, CVPR, ICLR, ECCV) in model compression, efficient deep learning, or related areas.
- Experience distilling large generative models (diffusion models, LLMs, VLMs, or video models).
- Background in autonomous vehicles or robotics.
There are more than 50,000 engineering jobs:
Subscribe to membership and unlock all jobs
Engineering Jobs
60,000+ jobs from 4,500+ well-funded companies
Updated Daily
New jobs are added every day as companies post them
Refined Search
Use filters like skill, location, etc to narrow results
Become a member
🥳🥳🥳 452 happy customers and counting...
Overall, over 80% of customers chose to renew their subscriptions after the initial sign-up.
To try it out
For active job seekers
For those who are passive looking
Cancel anytime
Frequently Asked Questions
- We prioritize job seekers as our customers, unlike bigger job sites, by charging a small fee to provide them with curated access to the best companies and up-to-date jobs. This focus allows us to deliver a more personalized and effective job search experience.
- We've got over 200,000 jobs from 15,000+ vetted companies. No fake or sleazy jobs here!
- We aggregate jobs from 15,000+ companies' career pages, so you can be sure that you're getting the most up-to-date and relevant jobs.
- We're the only job board *for* software engineers, *by* software engineers… in case you needed a reminder! We add thousands of new jobs daily and offer powerful search filters just for you. 🛠️
- Every single hour! We add 2,000-3,000 new jobs daily, so you'll always have fresh opportunities. 🚀
- Typically, job searches take 3-6 months. EchoJobs helps you spend more time applying and less time hunting. 🎯
- Check daily! We're always updating with new jobs. Set up job alerts for even quicker access. 📅
What Fellow Engineers Say
