Senior Site Reliability Engineer
Team: Engineering
Location: Remote; Grand Rapids, MI; Austin, TX
Commitment: Full-time
Workplace Type: Remote
What You’ll Do
- Reliability & Platform Operations: Own and improve the reliability, availability, and performance of production systems while defining and operationalizing SLIs/SLOs and error budgets.
- AI Agent Enablement: Design and implement autonomous and semi-autonomous AI agents for monitoring distributed systems and applications. Build agents capable of consuming multi-source observability data (metrics, logs, traces, etc.).
- Incident Response: Participate in and help lead an on-call rotation, serving as an escalation point for major incidents and facilitating blameless postmortems.
- Automation & Infrastructure: Build automated workflows to eliminate manual work and design/maintain Infrastructure-as-Code with Terraform.
- Observability: Improve metrics, logs, traces, and alerting using tools like Datadog or Prometheus to reduce noise and increase signal.
- Collaboration & Mentorship: Partner with application teams to implement reliability best practices and mentor junior engineers to foster a culture of knowledge sharing.
Who You Are
- Strategic Architect: You look beyond the "what" to understand the "why," providing insights that influence our go-to-market (GTM) and technical direction.
- Startup Veteran: You are comfortable moving fast and staying proactive in an environment where the playbook is still being written.
- Relatable & Adaptable: You can navigate different personalities across the organization, from high-energy sales teams to analytical engineering partners.
- Lifelong Learner: You have a thirst for learning, keeping up with emerging technologies and industry trends.
What We're Looking For
- Experience: 5+ years in SRE, DevOps, Platform Engineering, or Infrastructure Engineering.
- Cloud Expertise: Proven experience supporting production SaaS systems in Azure (preferred), AWS, or GCP.
- Technical Stack: Strong Linux, networking, and distributed systems troubleshooting skills.
- Containers: Strong experience with containers and orchestration (Kubernetes/EKS/AKS).
- IaC & Tooling: Expertise with Infrastructure-as-Code (Terraform strongly preferred).
- Programming: Strong scripting/programming skills in Python, Go, Bash, or C#/.NET.
- Observability: Hands-on experience with Datadog, Prometheus/Grafana, or OpenTelemetry.
What We Offer
- Flexible vacation
- 12 company-paid holidays
- 10 paid sick days
- No work on your birthday
- Health, dental, and vision insurance (including a $0 option)
- 401(k) with matching, and no waiting period
- Equity
- Life insurance
- Generous paid parental leave
- Wellness reimbursement of $300/year
- Remote worker reimbursement of $300/year
- Professional development reimbursement
- Competitive pay
- An award-winning culture