Millennium Management

ML Infrastructure Engineer

London, UK
Machine Learning, AWS, GCP, Chef, Ansible, PyTorch, Kubernetes, Terraform, Python, TensorFlow

This role sits within the AI/ML Infrastructure Engineering team and is dedicated to implementing and supporting AI/ML infrastructure solutions in cloud and on-premise environments. The engineer will work directly with infrastructure teams and may also interface with data scientists, machine learning engineers, application developers, and quantitative analysts in two capacities: as a solutions architect, helping them implement their own AI/ML solutions, and as a professional services engineer, implementing solutions for them on platforms such as AWS, GCP, and Kubernetes.

This is a hands-on developer role. Candidates ideally have experience deploying and supporting their own production-ready AI/ML models in cloud environments, as well as automating the build and management of a broad range of cloud infrastructure using tools like Terraform. Candidates should be familiar with developing unit and functional tests, have experience designing and implementing CI/CD tooling with infrastructure-as-code pipelines, and have knowledge of Linux systems administration, containerization, networking, security, automated configuration and state management, cross-system orchestration, logging, metrics, monitoring, and alerting.

Principal Responsibilities:

• Architect, develop, and maintain internal AI/ML infrastructure components, frameworks, and offerings

• Architect, develop, and maintain AI/ML solutions for customers in cloud environments

• Help customers architect, develop, and maintain their own AI/ML solutions in cloud environments

• Implement CI/CD pipelines which include application tests, security tests, and gates

• Implement availability, security, and performance monitoring and alerting for AI/ML solutions

• Automate data resiliency and replication for AI/ML models

• Manage multiple environments and promote code between them

• Automate systems configuration and orchestration using tools such as Terraform, Chef, Ansible, or Salt

• Automate creation of machine images and containers

Required Qualifications/Skills:

• 6+ years of experience designing and supporting production cloud environments

• Experience consulting with customers to develop AI/ML solutions

• Experience developing collaboratively, including infrastructure as code, preferably in Python

• Systems engineering knowledge, including understanding of Linux, security, and networking

• Experience with cloud templating tools such as Terraform

• Experience with AI/ML frameworks (e.g., TensorFlow, PyTorch)

• Experience with distributed computing tools (e.g., Ray, Dask)

• Experience with model serving tools (e.g., vLLM, KFServing)

• Experience with building, monitoring, and alerting on logs and metrics

• Knowledge of cloud networking, including connectivity, routing, DNS, VPCs, proxies, and load balancers

• Knowledge of cloud security, including IAM, certificate management, and key management

• Excellent written and verbal communication skills

• Excellent troubleshooting and analytical skills

• Self-starter able to execute independently, on a deadline, and under pressure
