Atlas Labs

Senior Site Reliability Engineer

Remote US
USD 150k - 150k
Go Python Ruby Shell Bash Microservices GCP
This job is closed.
Description
 

About Atlas Health:

Atlas Health automates philanthropic aid to improve access, affordability, outcomes, and health equity for vulnerable populations. Through intelligent matching and patient-friendly digital enrollment to 20,000+ philanthropic aid programs, healthcare organizations can improve patient outcomes, advance health equity, reduce the total cost of care and improve the patient experience. Join us on our mission of saving and improving lives by helping patients access and afford healthcare. 

 

Are you looking for a challenging and exciting opportunity at a life changing startup? 

As a Sr. Site Reliability Engineer, you will work to ensure the secure, prompt, and reliable delivery of services to our customers. The current infrastructure and DevOps landscape requires a combination of software and systems engineering to solve operational problems, and a strong security background to design hardened environments that go beyond the traditional firewall rules of VPCs. Your systems thinking and holistic knowledge will allow you to see automation efficiencies in areas outside of infrastructure, help re-engineer processes when they need it, and clearly communicate the necessary changes to impacted stakeholders. You have the grace to stay calm when production services are being impacted by operational or security issues and the courage to ask for help from the right people as needed. You enjoy collaborating with people from other teams and disciplines and making plans a reality.

At Atlas, everyone can contribute, and the leadership team welcomes automation and efficiency contributions from all roles. The purpose of this position is to further improve the company's ability to deliver an available and secure platform to help patients in need. You will work with the management team and report to the company’s CIO, with a dotted-line report to the VP of Engineering.

 

Sr. Site Reliability Engineer Duties & Responsibilities: 

  • Design and build core infrastructure that enables Atlas to scale and facilitates building cloud-based product offerings and solutions in a DevOps ecosystem, while solving immediate problems to improve service delivery.
  • Develop solutions to increase service stability through automation and process re-engineering.
  • Contribute to designing a distributed architecture product platform utilizing modern design philosophies and cloud-native practices.
  • Manage and deploy applications and services using infrastructure-as-code and security best practices.
  • Implement and maintain CI/CD pipelines for automated build, test, and deployments.
  • Deliver system observability and monitoring capabilities to alert on problems and prevent outages, and get ahead of customer needs.
  • Coordinate the handling of incident response and ticket queue management (Atlas Issues). This ranges from responding to and handling monitoring alerts and reported issues, to simple IAM and DNS requests, to designing and deploying new scalable application infrastructure.
  • Gather and analyze operating system and application metrics to assist in performance tuning and fault finding.
  • Document every action so your findings turn into repeatable actions and then into automation.
  • Update job knowledge by studying state-of-the-art tools and techniques, participating in educational opportunities, reading professional publications, maintaining personal networks, and participating in professional organizations.
  • Provide mentorship to engineering teams to help build secure and scalable systems and platforms.
  • Collaborate with other teams to improve services and help with system design, platform management, and capacity planning.

Sr. Site Reliability Engineer Requirements: 

Qualifications: 

  • Sound fundamentals in Linux-based systems including proficiency with commands like SSH, grep, sed, awk, find, etc. 
  • Solid understanding of networking and core internet protocols (e.g., TCP/IP, DNS, TLS, HTTP) 
  • Programming skills in a modern language, such as Go, Python, or Ruby. 
  • Ability to script in a shell language (Bash or POSIX Shell). 
  • Proven experience working in container-orchestration systems and microservices architecture in production environments. 
  • Advanced knowledge of identity and access management in Google Cloud preferred 
  • Ability to remain calm and focused under the pressure of an incident. 
  • Comfort with collaboration, open communication and reaching across functional borders. 
  • 5+ years of experience working in a software engineering or development role 
  • Experience in Healthcare is a plus 
  • Applicants must reside in the United States

Education Requirements: 

  • Bachelor's (undergraduate) degree in a relevant field (Computer Science, Software Engineering, Security, or others) OR an equivalent combination of education, training, and experience. 

Base compensation starting at $150,000 annually.

Why Join Our Team:

Because you’re motivated by success, by working alongside incredible people, and by a passion for helping clients and patients. Atlas helps people access essential medical treatment and avoid financial ruin from medical debt. You care about being a part of the journey and wish to play a key role in our organization’s success.

Benefits:

We offer a comprehensive benefit plan for our U.S.-based employees, which includes:

  • Health, dental and vision insurance
  • 401(k)
  • Flexible time off 
  • Paid holidays

Atlas values diversity of all kinds, and we’re committed to building a diverse and inclusive workplace where we learn from each other. We are an equal opportunity employer and welcome people of all different backgrounds, experiences, abilities, and perspectives.
