Upwork

Contract: Senior/Lead Kubernetes Engineer

Remote LATAM
Microservices GCP Azure Terraform Python Bash Kubernetes AWS
Description

Upwork ($UPWK) is the world’s work marketplace. We serve everyone from one-person startups to over 30% of the Fortune 100 with a powerful, trust-driven platform that enables companies and talent to work together in new ways that unlock their potential.  

Last year, more than $3.3 billion of work was done through Upwork by skilled professionals who are gaining more control by finding work they are passionate about and innovating their careers.  

This is an engagement through Upwork’s Hybrid Workforce Solutions (HWS) Team, a global group of professionals who support Upwork’s business. HWS team members are located all over the world.


Work/Project Scope:

The Platform Engineering Team is a core part of Central Engineering, managing a robust Service Mesh ecosystem that enables decentralized application architecture across Upwork's operations. Our infrastructure uses a data plane of application-level proxies and a control plane to manage communication between distributed components. At the core of our architecture are Kubernetes (on AWS EKS) as the orchestrator and Istio as the foundational mesh technology.

We are seeking an experienced Senior/Lead Platform Engineer to join our Platform Engineering Team. This role involves managing our Kubernetes-based platforms and contributing to a high-scale system that handles tens of thousands of requests per second. Our Service Mesh infrastructure is designed to enhance security, resilience, observability, and control across applications, moving beyond traditional client library models.

Key Responsibilities

  • Platform Management
    • Oversee production-grade Kubernetes clusters deployed on AWS EKS, ensuring high availability, scalability, and reliability of platform services
    • Implement and manage Istio mesh architecture to support microservices communication, enhancing service discovery, traffic management, and security
  • Collaboration and Support
    • Work closely with Engineering teams to guide scalability and stability improvements early in the development lifecycle
    • Provide frontline support during US West Coast business hours (9 AM to 5 PM Pacific Time), managing platform-related issues through tickets and Slack channels
  • Technical Leadership
    • Drive performance and reliability improvements through proactive issue identification and resolution
    • Leverage programming and scripting skills to instrument, automate, troubleshoot, deploy, and orchestrate application services
    • Participate in modernization efforts, including multi-cloud deployment strategies across AWS, Google Cloud, and Azure
  • Innovation and Growth
    • Contribute to operations of existing installations and new Kubernetes-based solutions
    • Bring deep expertise to guide technical improvements while working with our diverse technical user community

What We Offer

  • High-Scale Environment
    • Opportunity to work with large-scale systems that present unique challenges and learning experiences
    • Exposure to complex failure patterns and advanced mitigation strategies
  • High Availability and Reliability Focus
    • Working on systems with stringent uptime requirements (zero downtime)
    • Emphasis on building and maintaining highly reliable and available platforms
  • Cutting-Edge Technologies
    • Opportunity to work with Istio mesh and advanced Kubernetes features
    • Potential to engage in multi-cloud deployments (AWS, Google Cloud, Azure)
  • Strategic Projects
    • Involvement in modernization projects aimed at uplifting the platform to support AI teams
    • Exposure to future-looking technologies and architectural patterns
  • Growth Opportunities
    • Potential to develop leadership skills in a senior role
    • Collaboration with multiple teams across the organization
    • Contribution to critical infrastructure that supports the entire company's operations

Must-Have Qualifications

  • Kubernetes and Microservices Expertise
    • Extensive hands-on experience managing Kubernetes clusters in production environments, particularly with AWS Elastic Kubernetes Service (EKS)
    • Proven track record with large-scale clusters and deploying numerous services (ideally hundreds)
    • Deep understanding of microservices architectures, including service discovery, inter-service communication patterns, and high-availability deployments
  • Service Mesh and Container Orchestration Proficiency
    • Strong familiarity with Istio service mesh or similar technologies
    • Hands-on experience with container orchestration platforms in AWS, specifically EKS (mandatory) and optionally ECS
  • Cloud Infrastructure and DevOps Skills
    • Proficient with AWS services, including VPC, IAM, EC2, ELB, Route53, KMS, CloudWatch, and CloudTrail
    • Hands-on experience with infrastructure provisioning and continuous deployment using Terraform and ArgoCD
    • Strong working knowledge of DevOps/GitOps principles and best practices for modern DevOps operations and software development
  • Programming and Scripting Proficiency
    • Intermediate to advanced experience with at least one programming or scripting language, such as Python or Bash, in an enterprise environment
  • High-Traffic Environment Experience
    • Experience handling systems with significant traffic (preferably tens of thousands of requests per second)
    • Knowledge of failure patterns and mitigation strategies in high-availability systems
  • Security Expertise
    • Experience with security best practices related to infrastructure and platform design
  • Strong Communication and Leadership Skills
    • Ability to collaborate effectively with cross-functional teams
    • Ability to guide technical decisions, mentor team members, and drive architectural improvements while maintaining strong stakeholder relationships
  • Problem-Solving Abilities
    • Demonstrated expertise in identifying and resolving complex platform issues
    • Proactive approach to enhancing platform reliability and performance
  • Time Zone Availability 
    • Must be available to work 9 AM - 5 PM Pacific Time to support teams located on the US West Coast

Nice-to-Have Qualifications

  • Hands-on experience with GCP cloud infrastructure
  • Experience supporting ML workloads in Kubernetes cloud-deployed clusters

If you are passionate about working with high-scale, high-availability systems and have the expertise we're looking for, we'd love to hear from you! Please submit your resume, and if you’d like, include a cover letter that highlights your relevant experience and explains why you would be a great fit for this role.


Upwork is proudly committed to fostering a diverse and inclusive workforce. We never discriminate based on race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical condition), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics.   

Additionally, a criminal background check may be run on a candidate after a conditional offer to perform your services for Upwork is made. Qualified applicants with arrest or conviction records will be considered in accordance with applicable law, including the California Fair Chance Act and local Fair Chance ordinances.

To learn more about how Upwork processes and protects your personal information as part of the application process, please review our Global Job Applicant Privacy Notice.
