Staff AI Software Engineer, Edge Model Optimization & Deployment
Team: Autonomy
Location: Seattle, WA
Commitment: Full-time
Workplace Type: Onsite
What You’ll Do:
- Convert and optimize 2D/3D CNNs and Transformer-based models (PyTorch/TensorFlow → ONNX → TensorRT/Triton) for real-time inference on Jetson/Orin platforms.
- Apply model compression techniques (quantization, pruning, distillation, weight sharing) to meet strict constraints on latency, memory, bandwidth, and power.
- Develop custom TensorRT plugins and CUDA kernels for performance-critical components.
- Integrate optimized models into the broader robotic system using ROS nodes and interfaces.
- Build benchmarks, profile and debug end-to-end inference pipelines, and validate performance in real-world robotic scenarios.
- Collaborate closely with AI researchers, robotics engineers, and hardware teams to translate cutting-edge research into robust, deployable edge solutions.
- Ensure the reliability, robustness, and stability of deployed models operating continuously in challenging, resource-constrained environments.
What You Have:
- 5+ years of professional experience developing and deploying deep learning models for edge, embedded, or real-time systems.
- PhD in Computer Science, Robotics, Electrical or Computer Engineering, or a closely related technical field.
- Strong proficiency in PyTorch, C++, Python, and CUDA for AI/ML development and model optimization.
- Hands-on experience with TensorRT, ONNX, and Triton, including authoring custom plugins for TensorRT.
- Proven experience applying model optimization techniques such as quantization, pruning, and distillation in production systems.
- Deep understanding of hardware constraints and performance tuning on Jetson/ARM platforms, GPUs, and embedded Linux systems.
- Experience integrating AI models into ROS-based robotic systems.
- Ability to work independently while collaborating effectively in a fast-paced, cross-functional engineering environment.
The Extras That Set You Apart:
- Experience with ROS2.
- Experience writing and optimizing custom CUDA kernels and low-level GPU performance tuning.
- Familiarity with Triton, ML compilers, or compiler-level optimizations for GPU inference.
- Experience with JAX or additional ML frameworks beyond PyTorch.
- Background deploying AI systems on real robots operating in the field, not just offline or in simulation.
- Familiarity with NVIDIA’s edge and robotics ecosystem (e.g., Isaac ROS, DeepStream, JetPack).
