Site Reliability Engineer (Space Communications)
Department: Engineering
Location: Torrance, CA
Compensation: $108K – $140K • Offers Equity
Employment Type: Full-time
About Northwood:
Northwood is on a mission to transform connectivity between Earth and space and bring the benefits of space to the masses through innovations in space communications technologies. If you like building quickly and seeing your work deployed in locations around the globe with real impact, we want you at Northwood.
Role:
Northwood is looking for a Site Reliability Engineer to help build the monitoring and reliability systems that keep satellites connected to Earth. As we rapidly scale our ground station network across multiple continents, you'll build the observability infrastructure that ensures our space communications systems operate 24/7 for customers ranging from commercial satellite operators to national security missions.
This is a high-growth role where you'll evolve from building core monitoring systems to potentially leading infrastructure teams and architecting global-scale reliability platforms. You'll work directly with our founding engineering team to establish the monitoring, alerting, and deployment practices that will scale with us from startup to enterprise. If you're excited about space technology and want to build infrastructure that directly supports mission-critical satellite operations, this role offers that opportunity.
Responsibilities:
Build and maintain the observability stack (Grafana, Prometheus, Loki, Vector, VictoriaMetrics) that monitors ground stations, satellite communication systems, and cloud infrastructure across multiple AWS regions
Support CI/CD pipelines using GitLab and ArgoCD, partnering with development teams to ensure reliable deployments of mission-critical software
Develop and maintain AWS infrastructure using Terraform, with a focus on multi-region reliability and automated scaling for ground station operations
Deploy and manage Kubernetes applications with Helm, ensuring both developer productivity and system uptime for satellite communication services
Establish monitoring strategies, alerting frameworks, and incident response procedures for infrastructure supporting real-time satellite communications
Participate in the on-call rotation and lead post-incident reviews to continuously improve system reliability
Basic Qualifications:
2-5 years of production infrastructure and monitoring experience with measurable reliability improvements
Strong experience with Kubernetes, Docker, and container orchestration in production environments
Hands-on experience with CI/CD tools and infrastructure as code (Terraform preferred)
AWS experience with multi-service deployments, plus Python programming skills for automation
Self-directed work style with the ability to own projects from conception to production in fast-moving environments
Understanding of SRE principles, SLOs/SLIs, and systematic approaches to system reliability
Preferred Qualifications:
Experience with observability tools (Vector, Loki, Grafana, Prometheus) in production environments
Familiarity with HashiCorp Vault, Okta, or similar identity/secrets management systems
Previous experience scaling infrastructure at high-growth companies (startup to 100+ employees)
AWS certification or demonstrated expertise with advanced cloud networking and security
Linux system administration experience and networking fundamentals
Interest in aerospace, telecommunications, or mission-critical systems
Additional Information:
To conform to U.S. Government space technology export regulations, including the International Traffic in Arms Regulations (ITAR), you must be a U.S. citizen, lawful permanent resident of the U.S., protected individual as defined by 8 U.S.C. 1324b(a)(3), or eligible to obtain the required authorizations from the U.S. Department of State.
Northwood is an Equal Opportunity Employer; employment with Northwood is governed on the basis of merit, competence and qualifications and will not be influenced in any manner by race, color, religion, gender, national origin/ethnicity, veteran status, disability status, age, sexual orientation, gender identity, marital status, mental or physical disability or any other legally protected status.
