Machine Learning Operations
Location: Menlo Park, CA
Department: Engineering
Location Type: In office
Employment Type: Full-time
- Design, build, and maintain scalable ML pipelines for training, evaluation, and deployment of LLMs and retrieval-augmented systems, optimized for performance, traceability, and reproducibility
- Operationalize evaluation workflows using both synthetic and human-labeled datasets to monitor model quality at scale across multiple downstream tasks and customer deployments
- Automate the ML development lifecycle by implementing robust data versioning, model tracking, and CI/CD pipelines using modern ML Ops tooling
- Optimize model training and inference, focusing on reducing latency, maximizing throughput, and controlling cost across heterogeneous hardware environments
- Collaborate cross-functionally with research, infrastructure, and product teams to productionize foundation models and integrate them into customer-facing AI products
- Deploy and manage both open-source and proprietary models within stringent constraints on latency, security, and compliance, balancing reliability with innovation
- Implement real-time monitoring and alerting systems to detect model/data drift, quality regressions, and infrastructure bottlenecks in live environments
- Work directly with enterprise customers, supporting deployment strategies, ensuring production readiness, and creating tight feedback loops from real-world usage to continuous model improvement
- Software Engineering Expertise: Proven experience in building reliable and scalable systems, with a strong foundation in software engineering principles and expertise with Python, Go, or Rust
- ML Ops Platforms: Hands-on experience with ML Ops platforms such as MLflow, Kubeflow, SageMaker, Vertex AI, or Apache Airflow, facilitating efficient model lifecycle management
- Cloud-Native Tools: Proficiency in cloud-native tools including Docker, Kubernetes, and storage optimizers, plus experience with major cloud providers such as AWS and GCP, or other compute providers, for deploying and managing ML workloads with a focus on cost optimization
- Experiment Tracking & Model Management: Proficiency in tools like Weights & Biases, MLflow, and/or CometML for tracking experiments, managing model metadata, and facilitating collaboration
- Infrastructure: Familiarity with infrastructure-as-code tools like Terraform and Pulumi, and with Chronosphere for monitoring and alerting
- Demonstrated ability to translate research findings into robust, production-ready systems, bridging the gap between experimentation and deployment.
- Some background in hardware/electronics, gained through professional, academic, or personal projects
- Contributions to open-source initiatives
- Notable awards or publications in leading journals/conferences
- Experience thriving in a fast-paced, hyper-growth startup environment
- Unlimited PTO: Recharge when you need it, no questions asked.
- Comprehensive Health Coverage: Medical, dental, and vision insurance for you and your dependents.
- Free Meals and Snacks: Daily lunches, dinners, and snacks in the office.
- Professional Growth: We invest in your continuous learning and offer opportunities to expand your skills.
- Visa Sponsorship: We welcome global talent and provide visa sponsorship to support qualified candidates.
