Senior Machine Learning Infrastructure Engineer
Team: Machine Learning and Data Engineering
Location: Santa Clara, CA
Commitment: Full-time
Workplace Type: Hybrid
Responsibilities:
- Design and develop scalable, high-performance systems for ML model training, inference, deployment, and monitoring.
- Build and maintain efficient data pipelines, model versioning systems, and experiment tracking frameworks.
- Collaborate with cross-functional teams, including ML researchers and engineers, to identify bottlenecks and improve platform usability.
- Implement distributed systems and storage solutions optimized for machine learning workloads.
- Drive improvements in CI/CD workflows for ML models and infrastructure.
- Ensure high availability and reliability of the ML platform by implementing robust monitoring, logging, and alerting systems.
- Stay current with industry trends and integrate relevant tools and frameworks to enhance the platform.
- Mentor junior engineers and contribute to a culture of technical excellence.
- Ensure that your work is performed in accordance with the company’s Quality Management System (QMS) requirements and contribute to continuous improvement efforts.
- Ensure team compliance with QMS, monitor quality, and drive process improvements.
Required Skills:
- PhD or MS in Computer Science, Electrical Engineering, or a related field
- Good oral and written communication skills
- New PhD graduate, or Master's degree with 3+ years of software engineering experience focused on ML infrastructure or distributed systems.
- Proficiency in Python, C++, and SQL
- Deep understanding of containerization, orchestration technologies, distributed ML workloads, and experiment tracking tools (e.g., Docker, Kubernetes, multiprocessing, Kubeflow, and MLflow)
- Experience deploying and managing resources across cloud and on-prem environments (e.g., AWS, GCP)
- Proficiency in at least one deep learning framework, such as PyTorch, and in data pipeline tools (e.g., Apache Airflow, Prefect).
- Strong knowledge of distributed systems, databases, and storage solutions.
- Extensive software design and development skills.
- Ability to learn and adapt to new technologies and contribute in a productive environment.
Preferred Skills:
- Familiarity with fundamental deep learning architectures, such as Convolutional Neural Networks (CNNs) and Transformer models
- Experience in building large-scale ML datasets, MLOps pipelines, and distributed computing frameworks like Ray
- Experience working with autonomous vehicles or robotics
Salary Range:
- $160,000 - $200,000 a year
