AI Infrastructure Engineering (Cloud, DevOps)
Department: Research & Engineering
Location: San Francisco
Compensation: $150K – $300K • Offers Equity • Offers Bonus
Employment Type: Full-time
Location: San Francisco, CA (Onsite | Remote)
About Virtue AI
Virtue AI sets the standard for advanced AI security platforms. Built on decades of foundational and award-winning research in AI security, its AI-native architecture unifies automated red-teaming, real-time multimodal guardrails, and systematic governance for enterprise apps and agents. Deploy in minutes—across any environment—to keep your AI protected and compliant. We are a well-funded, early-stage startup founded by industry veterans, and we're looking for passionate builders to join our core team.
What You’ll Do
As an AI Infrastructure Engineer, you will own the reliability, scaling, automation, and operational discipline of Virtue AI’s production AI systems, with a focus on deployment and model-serving performance.
You will:
Design and maintain deployment workflows for Virtue AI on major cloud providers (e.g., AWS and GCP).
Own IaC (Terraform / Pulumi) for repeatable, auditable customer deployments.
Package our services into secure, customer-ready deployment units (Docker, Helm, Marketplace images).
Design, build, and maintain product CI/CD pipelines using GitHub Actions.
Serve and optimize the LLM inference pipeline, including building the necessary inference APIs and routers and configuring auto-scaling.
Design production-grade system observability (metrics, logs, alerts, dashboards) using tools like Datadog, Grafana, and Prometheus.
Implement secure networking (VPCs, IAM, service accounts, private endpoints, firewalling).
Collaborate with product developers to align infrastructure and inference behavior with product requirements.
Required Qualifications
Bachelor’s degree or higher in CS, CE, EE, or a related field.
Strong experience deploying production systems on major cloud platforms, e.g., AWS and/or GCP.
Deep hands-on experience with Docker, containerized workloads, and Kubernetes (EKS, GKE, or equivalent).
Strong experience serving LLMs and embedding models in production.
Strong hands-on experience with CI/CD (GitHub Actions required) and repository management (monorepos, release branches, tagging, rollbacks).
Preferred Qualifications
Experience with SGLang, vLLM, or similar inference frameworks.
Strong understanding of GPU behavior (memory limits, batching, fragmentation, utilization) and experience with GPU-level optimization.
Experience with model-level inference optimization (quantization, KV-cache optimization, speculative decoding, or batching strategies) and inference kernels.
Startup experience: you move fast, take ownership, and fix things properly.
Why Join Virtue AI
Competitive salary + equity
High ownership – You define how production runs
Real impact – Your work directly affects customers and revenue
Hard problems – Distributed systems, GPUs, scale, security
Strong technical peers – Engineers who ship and debug, not just design
Location: San Francisco, CA (Onsite | Remote)
What You’ll Do
As a DevOps Engineer, you will own the reliability, automation, and operational discipline of Virtue AI’s production systems. When something breaks, you fix it. When it doesn’t scale, you redesign it.
You will:
Design, build, and maintain CI/CD pipelines using GitHub Actions
Own repo structure, branching strategy, release workflows, and versioning
Build and operate Kubernetes infrastructure on GKE
Package, deploy, and optimize services using Docker
Design production-grade system observability: metrics, logs, alerts, and dashboards (Datadog, Grafana, Prometheus)
Monitor and improve service reliability, latency, and uptime
Debug real production issues across infra, networking, containers, and code
Partner with backend, ML, and platform engineers to remove operational bottlenecks
What Makes You a Great Fit
You don’t just “set up pipelines.” You understand why systems fail, and you design so they don’t fail the same way twice.
Required Qualifications
Bachelor’s degree or equivalent practical experience
Strong hands-on experience with CI/CD (GitHub Actions required) and repository management (monorepos, release branches, tagging, rollbacks)
Deep experience with Kubernetes and Docker
Experience designing and operating observability systems (Datadog and/or Grafana in production)
Strong understanding of system design: availability, scalability, fault isolation
Proven ability to solve real production problems, not just configure tools
Comfortable working directly on production systems
Preferred Qualifications
Experience operating ML / LLM inference systems
Experience with GPU workloads and resource scheduling
Experience supporting enterprise customers with SLAs
Familiarity with infrastructure-as-code (Terraform / Pulumi)
Startup experience: you move fast, take ownership, and clean up after yourself
