CodeRoad

Machine Learning Operations Engineer

Remote
Python, Bash, Git, Docker, Kubernetes, GCP, AWS, Vertex AI, Cloud Functions, Amazon SageMaker, AWS Lambda, CloudWatch, Terraform, Prometheus, Grafana, SQL, BigQuery, Redshift, S3, Cloud Storage, PyTorch, LangGraph, CrewAI, n8n, API, Machine Learning, AI
Description

Machine Learning Operations Engineer (MLOps)

Location: Latin America

Department: CodeRoad

The Team

At CodeRoad, we're more than just a software development company: we're your gateway to the global tech world. Whether you're looking to skill up or level up your career, we offer the challenges you’ve been searching for.

We provide end-to-end software development services and give you the opportunity to work on exciting, real-world projects in a supportive environment. Whether it's staff augmentation, dedicated IT teams, or general software engineering, we have opportunities for everyone to challenge themselves and take their career to the next level!

Position Location - Latam (Remote).
Time Zone Requirements - This team operates in East/West Coast time zones.

About the Role

We are seeking a skilled and innovative Machine Learning Operations (MLOps) Engineer with a focus on Agentic AI to design, deploy, and maintain scalable, robust, and ethical autonomous AI systems. The ideal candidate will combine deep expertise in modern MLOps practices with a solid understanding of agentic AI principles, enabling the seamless integration, monitoring, and optimization of AI models that exhibit autonomous decision-making and adaptability. You will be a key contributor in our cross-functional teams, ensuring our agentic AI solutions are reliable, efficient, and aligned with our business goals and ethical standards.

Key Responsibilities

  • Model Deployment & Integration: Design and implement scalable, secure, and production-grade pipelines for deploying agentic AI models. Focus on seamless integration with existing systems and enable real-time adaptability for autonomous decision-making.

  • Cloud Infrastructure Management: Build and maintain robust cloud infrastructure on Google Cloud Platform (GCP) or Amazon Web Services (AWS) for the entire AI lifecycle. Leverage services like GCP's Vertex AI and Cloud Functions, or their AWS equivalents such as Amazon SageMaker and AWS Lambda, to create efficient and resilient environments.

  • Automation & CI/CD: Develop and maintain automated workflows for continuous integration, continuous deployment (CI/CD), and continuous training (CT) of agentic AI models. Optimize for performance, scalability, and reliability using CI/CD platforms.

  • Monitoring & Performance Optimization: Implement and manage advanced monitoring systems to track the performance, health, and decision-making accuracy of agentic AI models in production. Use specialized tools like Langtrace, AgentOps, or Amazon CloudWatch to detect and resolve issues related to model drift, latency, and bias in real time.

  • Security & Compliance: Integrate security best practices throughout the MLOps lifecycle. Ensure agentic AI systems adhere to ethical guidelines and regulatory requirements, implementing safeguards for data privacy, bias mitigation, and transparency in autonomous operations.

  • Collaboration: Work closely with AI researchers, data scientists, software engineers, and product teams to align MLOps processes with project goals. Facilitate iterative development and deployment of agentic AI solutions.

  • Data & Model Governance: Establish and enforce robust data and model governance frameworks, ensuring data quality, security, and compliance with industry standards for all agentic AI systems.

Qualifications

  • Experience: 4+ years of experience in MLOps, DevOps, or a related field, with at least 1 year focused on deploying and managing AI/ML models in production. Experience with agentic or autonomous AI systems is highly preferred.

  • Cloud Expertise: (4 years) Deep hands-on experience with either Google Cloud Platform (GCP) or Amazon Web Services (AWS). Knowledge of relevant services such as GCP's Vertex AI, Cloud Storage, BigQuery, and Cloud Functions, or AWS equivalents like Amazon SageMaker, S3, Redshift, and Lambda.

  • Technical Stack: (1 year or less) Strong knowledge of MLOps tools and frameworks (PyTorch, LangGraph, CrewAI, n8n). Proficiency in containerization with Docker and orchestration with Kubernetes.

  • Programming & Scripting: Expertise in Python and familiarity with scripting and infrastructure-as-code tools for automation (e.g., Bash, Terraform). Strong experience with version control systems, particularly Git.

  • Monitoring & Analytics: Hands-on experience with modern monitoring tools like Langtrace, AgentOps, Prometheus, Grafana, or Amazon CloudWatch. Proven ability to track model performance, data drift, and system health in a production environment.

  • Security Mindset: A strong understanding of security principles related to cloud and MLOps, including Identity and Access Management (IAM), data encryption, and secure pipeline design.

  • Ethical AI Knowledge: Understanding of ethical AI principles, including bias detection, explainability, and compliance with regulations like GDPR or other relevant standards.

  • Collaboration & Communication: Strong interpersonal and communication skills, with the ability to work effectively in cross-functional teams and explain technical concepts clearly to diverse stakeholders.

Education: Bachelor’s degree in Computer Science, Engineering, Data Science, or a related field. Advanced degrees or certifications in MLOps, AI/ML, or cloud technologies are highly valued.

What you’ll love:
  • 100% Remote

  • Contractor position available for Latin American candidates

  • Holidays Off

  • Paid Time Off

  • Health insurance assistance program

  • Competitive Pay (USD)

  • Excellent teamwork and work environment

  • Training
