1X

Multimodal Research Engineer, AI Companion

Deep Learning PyTorch Streaming

Job description

About the Role

NEO is a home robot that handles chores and provides personalized assistance. It's meant to be controlled naturally: you talk, it understands, it responds, it acts.

As a Research Engineer, you'll own the speech and language stack that makes NEO feel like a calm, capable presence: streaming ASR, low-latency TTS, speech-to-speech interaction, and task-level NLU, all tuned for robotics constraints (latency, noise, far-field audio, on-device budgets, safety, and reliability).

Job requirements

In this role, you will:

  • Build and deploy production speech models (ASR, TTS, and/or speech-to-speech) optimized for NEO's on-device compute and real-time latency requirements.

  • Own the full pipeline from data collection and model training through deployment and monitoring.

  • Solve hard acoustic problems that come with putting a robot in someone's living room: far-field recognition, noise robustness, barge-in handling, and multi-speaker environments.

  • Design voice interaction that feels human: natural prosody, appropriate turn-taking, responses that match context and emotional tone.

  • Build evaluation infrastructure, define quality metrics that actually correlate with user experience, and use real-world feedback to prioritize improvements.

  • Collaborate with hardware and robotics teams to integrate speech capabilities with NEO's vision, memory, and physical behavior systems.


You might thrive in this role if you:

  • Have 3+ years of experience building speech systems, especially systems currently running in production.

  • Have deep expertise in at least one of: automatic speech recognition, text-to-speech synthesis, spoken language understanding, or speech-to-speech modeling.

  • Are fluent in modern deep learning (transformers, diffusion models, autoregressive generation) and can train, debug, and optimize models end-to-end in PyTorch or JAX.

  • Have deployed models under real constraints: on-device inference, latency budgets, memory limits, or edge hardware.

  • Have published at top venues (ICASSP, Interspeech, NeurIPS, ICML) or built equivalent systems at companies known for voice AI.


