Trimble

AI MLOps Hosting Engineer

Chennai, India
Machine Learning, PowerShell, Python, Docker, Kubernetes, Terraform, Azure
Description

Job Summary

We are seeking an experienced Site Reliability Engineer with AI MLOps expertise to support the development and optimization of our ERP product, primarily in Azure and Windows environments. This role combines MLOps expertise with Site Reliability Engineering (SRE) principles to ensure the reliable, scalable, and cost-efficient deployment of AI models. The ideal candidate will focus on improving security, compliance, and operational efficiency, collaborating with North American and global teams to meet business objectives.

Key Responsibilities

  • AI MLOps Pipeline: Build and optimize CI/CD pipelines to automate the training, testing, and deployment of AI models on Azure, with a strong emphasis on improving efficiency and reducing costs.

  • Azure Infrastructure Management: Manage and maintain scalable, secure infrastructure using Azure services like Azure Machine Learning, AKS, and Virtual Machines. Continuously optimize resource usage and implement cost-saving measures.

  • Windows Server Management: Oversee Windows-based servers hosted on Azure, ensuring they meet performance, security, and compliance requirements, while also identifying and executing cost-saving opportunities.

  • Cost Optimization: Analyze and manage infrastructure costs by identifying unused or underused resources and implementing optimization strategies to drive cost savings.

  • Monitoring & Performance Optimization: Monitor the health, performance, and costs of AI models and services using Azure Monitor, New Relic, and other tools. Identify performance bottlenecks and optimize for both operational efficiency and cost reduction.

  • Model Versioning & Governance: Assist in managing model version control, governance, and lifecycle processes with a focus on cost-effective operations.

  • Cross-functional Collaboration: Collaborate with data scientists, AI engineers, and software developers to support the efficient deployment and operationalization of AI models, while actively seeking ways to minimize costs.

  • Incident Management & Automation: Participate in incident resolution and automate tasks to reduce manual work, improve system reliability, and lower operational overhead.

  • Security & Compliance Assurance: Ensure AI/ML workloads comply with security and regulatory standards, implementing cost-efficient solutions to enhance security and data protection.

Qualifications

  • Experience: 2–5 years in MLOps, SRE, or similar roles, focusing on Azure and Windows environments.

  • Cloud Skills: Proficient in Azure services, managing infrastructure, and Windows workloads.

  • SRE Knowledge: Familiar with Site Reliability Engineering principles like monitoring and automation.

  • DevOps: Hands-on experience with CI/CD tools like Azure DevOps.

  • Scripting: Skilled in PowerShell and Python for automation.

  • Containers: Knowledge of Docker and Kubernetes for deploying AI/ML applications.

  • Windows Admin: Strong experience managing Windows Servers and related services.

  • AI/ML Knowledge: Understanding of AI/ML workflows and model deployment.

Nice-to-Have

  • Experience with Infrastructure-as-Code tools like Terraform.

  • Azure certifications (e.g., Azure AI Engineer, Azure DevOps Engineer).

  • Experience implementing cost-saving strategies in cloud environments.

Soft Skills

  • Strong problem-solving skills with the ability to troubleshoot complex issues.

  • Excellent communication skills and the ability to collaborate effectively with cross-functional teams.

  • A passion for innovation and continuous improvement in AI/ML systems.


