Astera

Site Reliability Engineer

Emeryville, CA
USD 100k - 300k
Ansible Kubernetes Docker Python Grafana Prometheus
Description

Department: Neuro & AGI

Location: Emeryville HQ

Compensation: $100K – $300K

Employment Type: Full-Time

About Astera

Astera is a private foundation with a $2.5B endowment on a mission to steer science and technology toward an abundant future for all. Unlike traditional foundations, we operate like a high-velocity startup with unprecedented access to computational resources and complete freedom from funding pressures or profit motives. This allows us to focus on ambitious goals and attract incredibly creative scientists and engineers from leading academic institutions and from frontier AI labs.

Neuro-AI is our large-scale AI research program, pursuing a neuroscience-informed approach to engineering AGI. This is not yet another lab scaling LLMs in the hope of achieving general intelligence. We are integrating neuroscience, AI, and bioengineering to understand and digitally model the architecture of the human brain.

Position Summary

We are looking for a Site Reliability Engineer to own the digital infrastructure that powers our research.

This includes compute resources that we rent from third parties, container registries, and dashboards. The main objective is to make sharing these resources easy and efficient, ensuring the infrastructure is reliable and accessible to the right people.

This role spans a broad spectrum of activities:

  • Compute Access: Ensure easy and efficient access to compute resources for our researchers.

  • Resource Visibility: Provide clear visibility into resource utilization and cluster health.

  • Auto-Scaling: Enable automatic scaling of compute resources based on demand.

  • Access Management: Ensure the right people have access to the right resources.

  • Reproducibility: Drive towards deterministic deployments and reproducible research environments.

  • Process Automation: Automate operational processes where doing so increases efficiency.

Current stack: Ansible, Kubernetes, Docker, Tailscale, Python, Grafana, Prometheus, and Talos Linux. We're not religious about any of it.

Qualifications

  • Ownership: You are comfortable being the person accountable when the cluster is unhealthy or capacity is tight.

  • Systems Intuition: You understand how schedulers, containers, networking, storage, and hardware interact. You can reason about failure modes and design systems that degrade predictably.

  • Operational Rigor: You value observability, reproducibility, and clear operational boundaries. You leave systems in a state that other engineers can understand, operate, and debug without you.

  • Pragmatism: You can support experimental research workloads without forcing everything into a rigid "production" mold. You know when to stabilize and when to allow controlled chaos to speed up discovery.

Location & Visa

  • This role is in-person in Emeryville, CA.

  • Visa sponsorship may be available for qualified candidates.
