RoboForce

Senior / Staff AI Research Engineer, Real-Time Inference

Milpitas, CA
Python, C++, CUDA, TensorRT, ONNX Runtime, TVM, Triton, NVIDIA Jetson, Deep Learning, Machine Learning, AI
Description

Location: Milpitas, CA

Department: AI

Why RoboForce

RoboForce is an AI robotics company developing Physical AI–powered Robo-Labor for dull, dirty, and dangerous work. The company's robots are engineered for demanding industrial environments, with a focus on real-world deployment and scalability.
 
We are looking for a Senior / Staff AI Research Engineer, Real-Time Inference to make embodied AI practical on the edge. In this role, you will drive the full stack of model optimization — from CUDA kernel engineering to quantization and compression — to deploy high-performance AI models on edge compute platforms powering RoboForce robots in the field.
 
Responsibilities
  • Develop and optimize inference pipelines for embodied AI models (VLA, perception, world models) targeting real-time execution on edge hardware such as NVIDIA Jetson platforms.
  • Implement CUDA-level optimizations including custom kernels, memory layout tuning, and hardware-aware graph compilation to minimize model latency.
  • Apply and advance model compression techniques — quantization (INT8/FP16/INT4), pruning, distillation, and structured sparsity — to achieve production-grade throughput on constrained devices.
  • Profile and debug end-to-end inference stacks using tools such as Nsight, TensorRT, and Triton to identify and eliminate performance bottlenecks.
  • Collaborate with ML research and robotics teams to co-design model architectures that meet real-time control-loop latency requirements.
  • Establish benchmarking frameworks to evaluate model performance across latency, throughput, power consumption, and accuracy tradeoffs on target hardware.
Requirements
  • Master's degree in Computer Science, Electrical Engineering, or a related field with 4+ years of experience, or a PhD.
  • Deep expertise in CUDA programming, GPU architecture, and low-level kernel optimization, including custom kernel authoring with tools such as Triton.
  • Hands-on experience with model quantization, pruning, distillation, and deployment using frameworks such as TensorRT, ONNX Runtime, TVM, or Triton.
  • Proficiency in C++ and Python; strong systems programming and performance profiling skills.
  • Experience deploying ML models on edge or embedded hardware (e.g., NVIDIA Jetson, Orin, or equivalent ARM/GPU SoCs).
  • This role requires in-office collaboration with the teams 5 days per week.
Bonus Qualifications
  • Familiarity with embodied AI models — VLA, multimodal transformers, or diffusion-based policies — and their inference characteristics.
  • Familiarity with compiler-based optimization pipelines such as XLA, torch.compile, or MLIR for graph-level model acceleration.
  • Understanding of robotics system constraints such as control-loop timing, sensor fusion latency, and memory bandwidth limits on edge SoCs.
  • Publication or production work in efficient deep learning or on-device ML systems.
Benefits
  • Competitive stock options/equity programs.
  • Health, dental, and vision insurance, plus a 401(k) plan.
  • Visa sponsorship and green card support for qualified candidates.
  • Lunches and dinners, a fully stocked kitchen, and regular team-building events.