Sieve

Reliability Engineer

San Francisco, CA
USD 150k - 300k
AI, Python, Go, Rust, C++, Terraform, GCP, AWS, Oracle Cloud, Cloudflare, Argo, Helm, Kustomize, Prometheus, OpenTelemetry, VictoriaMetrics, API, Svelte
Description

Department: Engineering

Location: San Francisco

Compensation: $150K – $300K • Offers Equity

Employment Type: Full-time

About Us

Sieve is the only AI research lab exclusively focused on video data. We combine exabyte-scale video infrastructure, novel video understanding techniques, and dozens of data sources to develop datasets that push the frontier of video modeling. Video makes up 80% of internet traffic and has become the enabling digital medium powering creativity, communication, gaming, AR/VR, and robotics. Sieve exists to solve the biggest bottleneck in the growth of these applications: high-quality training data.


We’ve partnered with top AI labs and did $XXM last quarter alone, as a team of just 12 people. We also raised our Series A earlier this year from Tier 1 firms such as Matrix Partners, Swift Ventures, Y Combinator, and AI Grant.


About the Role

We process petabytes of video across thousands of nodes and multiple cloud environments. As we scale, reliability, observability, and security become existential.


We’re hiring our first engineer fully dedicated to the infrastructure foundation of Sieve. This is a high-ownership role for someone who thinks deeply about:

  • throughput and system stability

  • monitoring and incident response

  • security and least-privilege design

  • reducing operational burden for the entire engineering team


You’ll work directly with our CTO and our founding engineers to build the core tooling that powers all of engineering.


This role is for someone who spends their time thinking deeply about reliability, throughput, observability, and security. You’re the kind of engineer who is always anticipating failure modes, eliminating operational risk, and designing systems that don’t break.


If something goes down, you take it personally, and you thrive in that level of responsibility.


What You’ll Do

  • Work with engineering to design and validate the infrastructure powering PB-scale workloads

  • Build and maintain Terraform-managed multi-cloud deployments

  • Improve cloud and data security (SSO, IAM, least privilege, auditability)

  • Own incident response and harden systems against failure

  • Develop CI/CD systems that minimize user error and maximize safety

  • Build monitoring + alerting platforms (Prometheus, OpenTelemetry, VictoriaMetrics)

  • Wrap internal reliability tooling with simple UIs for engineers


Requirements

  • 3+ years building internal infrastructure at scale

  • Experience on-call for Sev 0 / Sev 1 production incidents (L3 preferred)

  • Strong cloud experience (GCP, AWS, Oracle, Cloudflare, etc.)

  • Deep Infrastructure-as-Code experience (Terraform preferred)

  • Familiarity with Argo, Helm, Kustomize, or similar deployment tools

  • Experience operating observability systems (Prometheus, OTel, VictoriaMetrics)

  • Backend fundamentals in Python, Go, Rust, or C++

  • Strong networking + security intuition, including SSO implementation

  • High ownership mindset over critical systems


Bonus

  • Experience building lightweight internal tooling (APIs, dashboards, Svelte)

  • Familiarity with object storage systems (“buckets”)

  • Active GitHub or portfolio projects


Location

In-person at our SF HQ.
