Site Reliability Engineer
Department: Neuro & AGI
Location: Emeryville HQ
Compensation: $100K – $300K
Employment Type: Full-time
About Astera
Astera is a private foundation with a $2.5B endowment on a mission to steer science and technology toward an abundant future for all. Unlike traditional foundations, we operate like a high-velocity startup with unprecedented access to computational resources and complete freedom from funding pressures or profit motives. This allows us to focus on ambitious goals and attract incredibly creative scientists and engineers from leading academic institutions and from frontier AI labs.
Neuro-AI is our large-scale AI research program, pursuing a neuroscience-informed approach to engineering AGI. This is not yet another lab scaling LLMs in the hope of achieving general intelligence. We are integrating neuroscience, AI, and bioengineering to understand and digitally model the architecture of the human brain.
Position Summary
We are looking for a Site Reliability Engineer to own the digital infrastructure that powers our research.
This includes compute resources that we rent from third parties, container registries, and dashboards. The main objective is to make sharing these resources easy and efficient, ensuring the infrastructure is reliable and accessible to the right people.
This role spans a broad spectrum of activities:
Compute Access: Ensure easy and efficient access to compute resources for our researchers.
Resource Visibility: Provide clear visibility into resource utilization and cluster health.
Auto-Scaling: Enable automatic scaling of compute resources based on demand.
Access Management: Ensure the right people have access to the right resources.
Reproducibility: Drive towards deterministic deployments and reproducible research environments.
Process Automation: Automate operational processes where doing so increases efficiency.
Current stack: Ansible, Kubernetes, Docker, Tailscale, Python, Grafana, Prometheus, and Talos Linux. We're not religious about any of it.
Qualifications
Ownership: You are comfortable being the person accountable when the cluster is unhealthy or capacity is tight.
Systems Intuition: You understand how schedulers, containers, networking, storage, and hardware interact. You can reason about failure modes and design systems that degrade predictably.
Operational Rigor: You value observability, reproducibility, and clear operational boundaries. You leave systems in a state that other engineers can understand, operate, and debug without you.
Pragmatism: You can support experimental research workloads without forcing everything into a rigid "production" mold. You know when to stabilize and when to allow controlled chaos to speed up discovery.
Location & Visa
This role is in-person in Emeryville, CA.
Visa sponsorship may be available for qualified candidates.
