Senior MLOps Engineer
Team: Engineering
Location: Toronto, Ontario
Commitment: Full-time
Workplace Type: Hybrid
About Us
Opportunity
Ideal Candidate
Key Responsibilities
- Maintain and improve cloud infrastructure (GCP) using Infrastructure-as-Code tools (Terraform).
- Manage IAM, RBAC, and permission policies across cloud environments.
- Own and evolve CI/CD pipelines (CircleCI, GitHub Actions) and ensure best practices are followed across the engineering and ML teams.
- Administer and support workflow orchestration platforms (e.g., Seqera/Nextflow, Argo, Kubeflow).
- Operate and configure ML experiment tracking and registry tooling (e.g., W&B, MLflow).
- Build and maintain containerized environments (Docker) and manage Kubernetes clusters.
- Manage GPU resources – provisioning, scheduling, and debugging hardware and driver issues.
- Write and maintain Python tooling, scripts, and integrations that support ML infrastructure.
- Help deploy ML models to production environments and monitor their performance.
Basic Qualifications
- 4+ years of experience operating production infrastructure.
- Proficiency with cloud platforms (GCP preferred; AWS/Azure acceptable) and Infrastructure-as-Code (Terraform).
- Extensive hands-on experience with Kubernetes and containerization (Docker).
- Solid background in CI/CD systems (CircleCI, GitHub Actions, or similar).
- Experience managing GPU compute (provisioning, debugging, driver management).
- Familiarity with Python package and environment management (e.g., pip, conda, pixi).
- Strong Python programming skills.
- Self-motivated problem solver with excellent communication skills.
Preferred Qualifications
- Understanding of ML frameworks (e.g., PyTorch, PyTorch Lightning), ML workflows (training, inference, evaluation), and the model lifecycle.
- Familiarity with MLOps tooling (e.g., W&B, Ray, VertexAI) and distributed compute patterns (e.g., DDP, realtime/batch inference, multi-node training).
- Familiarity with Kubernetes CRDs and batch/gang schedulers (e.g., Volcano, Kueue).
- Experience working with large-scale datasets (storage, versioning, efficient access patterns).
- Experience working directly with scientists and researchers in an interdisciplinary setting.
- Knowledge of biology and/or machine learning science.
- Familiarity with data compliance and governance frameworks (e.g., HIPAA, SOC 2).
- Previous startup experience.
What We Offer
- A collaborative and innovative environment at the frontier of computational biology, machine learning, and drug discovery.
- Highly competitive compensation, including meaningful stock ownership.
- Comprehensive benefits, including health, vision, and dental coverage for employees and their families, and an employee and family assistance program.
- Flexible work environment, including flexible hours, extended long weekends, a holiday shutdown, and unlimited personal days.
- Maternity and parental leave top-up coverage, as well as new parent paid time off.
- Focus on learning and growth for all employees, including a learning and development budget and lunch-and-learns.
- Facilities located in the heart of Toronto - the epicenter of machine learning and AI research and development, and in Kendall Square, Cambridge, Mass. - a global center of biotechnology and life sciences.
