Machine Learning Systems Engineer
Location: Menlo Park, CA
Department: Engineering
Location Type: In-office
Employment Type: Full-time
- Design and maintain high-performance ML pipelines for training, evaluation, and inference of LLMs and retrieval-augmented systems, with a focus on hardware efficiency and throughput
- Optimize core transformer operations at the kernel level, designing and tuning custom kernels and low-level implementations for GPU-accelerated workloads
- Implement and integrate low-precision computation techniques to reduce memory footprint and accelerate inference with minimal accuracy degradation
- Build and maintain inference engines for on-premises deployments
- Architect distributed training and inference systems
- Collaborate closely with researchers and infra teams to bring cutting-edge model innovations into production
- Interface directly with enterprise hardware environments, tuning performance based on real-world deployment constraints
- Languages: Expertise in C, C++, or Rust
- Design and Optimize CUDA Kernels for LLMs: Develop and fine-tune custom CUDA kernels to accelerate core transformer operations
- Implement Low-Precision Computation Techniques: Apply quantization methods like AWQ and GPTQ to reduce model size and inference latency. Ensure minimal accuracy loss while maximizing throughput on GPU architectures. This calls for familiarity with concepts like GGUF and GGML
- Develop and Maintain High-Performance Inference Systems: Build, improve, and maintain inference engines such as vLLM, SGLang, and TensorRT with a focus on low latency and high throughput
- Architect Distributed Training and Inference Solutions: Design systems that support model parallelism (tensor, pipeline, expert, etc.) to enable efficient training and inference across multiple GPUs and nodes
- Integrate Research into Production Systems: Translate cutting-edge research findings into robust, production-ready systems. Ensure that innovations in model architectures and optimization techniques are effectively deployed.
- Monitor and Optimize System Performance: Implement monitoring tools to track system metrics, identify bottlenecks, and optimize performance
- Some background in hardware/electronics, gained through professional, academic, or personal projects
- Contributions to open-source initiatives
- Notable awards or publications in leading journals/conferences
- Experience thriving in a fast-paced, hyper-growth startup environment
- Unlimited PTO: Recharge when you need it, no questions asked.
- Comprehensive Health Coverage: Medical, dental, and vision insurance for you and your dependents.
- Free Meals and Snacks: Daily lunches, dinners, and snacks in the office.
- Professional Growth: We invest in your continuous learning and offer opportunities to expand your skills.
- Visa Sponsorship: We welcome global talent and provide visa sponsorship to support qualified candidates.
