Platform Science

Staff Site Reliability Engineer

Remote / San Diego, CA
USD 145k - 227k
GCP Kubernetes Node.js Go AWS Docker Python Bash Azure
Description

Who We Are

At Platform Science, we’re working to connect everything that moves.

Founded in 2015, we are an open IoT platform that partners with innovative fleets, application developers, vehicle manufacturers, and equipment providers in the transportation industry to deliver revolutionary solutions to supply chain professionals across the globe.

Our employees are an engaging, diverse group of people who believe in the power of great ideas. We hire people with different experiences and perspectives to build a company culture that fuels growth through innovation.

We value thoughtful actions and empathy for others. We approach challenges with resiliency and creativity while encouraging transparency because, no matter our backgrounds or responsibilities, we are one team.

About the Role

We are looking for a qualified Staff SRE to join our team in San Diego, CA (or remote). You will solve operational problems and support development teams running critical business applications in production. Our focus is to ensure the reliability of all production services and to enable development teams to measure their reliability so they can make effective decisions.

The SRE team has the unique opportunity to work with all aspects of our platform. We run entirely in the cloud: AWS, Azure, and GCP. Our applications and services are containerized and serverless. If you’re excited about learning and supporting new technologies and many different types of products (including mobile apps, hardware, websites, messaging queues, serverless pipelines, and more), and about working with an incredibly talented team, then this is the position for you!

As a Staff SRE, you have a software development or systems background with strong coding skills. Ideal candidates want to deeply understand how our systems work, from the infrastructure level and their dependencies on other systems to the customer experience, and how to mitigate risk. You are comfortable giving and taking technical direction. You are a great communicator and self-starter who strives to make the company and our technologies better.

Essential Responsibilities

  • Lead the development and enhancement of Continuous Integration/Continuous Deployment (CI/CD) pipelines, along with refining release management processes and associated toolsets
  • Architect and maintain Helm charts to streamline application deployment and management
  • Establish standardized observability solutions to empower development teams in efficiently managing their applications
  • Lead the effort in promoting and prioritizing reliability, driving achievement of uptime goals and mentoring colleagues in SRE best practices
  • Conduct comprehensive Production Readiness Reviews, working with teams to identify and establish Service Level Objectives (SLOs), and ensure high-quality and dependable services
  • Design and develop software solutions that address operational challenges and improve system stability and reliability
  • Fulfill on-call duties, providing expert support to development teams for mission-critical applications in production environments
  • Improve the resiliency of applications and systems using chaos engineering

Experience

  • Possess 9+ years of hands-on experience in SRE or Platform Engineering roles
  • Demonstrated expertise (4+ years) with automation technologies like Jenkins, ArgoCD, or similar
  • Extensive (3+ years) experience with Kubernetes, Helm, and Docker within production environments
  • Proficiency with current software development lifecycle (SDLC) concepts and best practices, CI/CD pipelines, and test-driven development
  • Experience with AWS, encompassing proficiency in EKS, IAM, autoscaling, networking, and load balancing/request routing in a production environment
  • Proficient in Python, Bash, Node.js, and/or Go
  • Proficient with distributed tracing methodologies and observability tools such as Prometheus, ELK, or Datadog
  • Strong emphasis on documentation and fostering knowledge-sharing practices within the team and organization
  • Track record of successfully training and mentoring engineers 
  • Proven expertise in optimizing performance and managing costs within cloud environments
  • Sound understanding of SLI/SLO concepts and adherence to SRE best practices

Platform Science Benefits Highlights

The company offers various benefits to regular, full-time employees, including:

  • Medical, dental, and vision insurance
  • Short-term and long-term disability insurance
  • AD&D and life insurance
  • 401k plan
  • Paid vacation, sick leave and holidays
  • Six weeks of paid parental leave

For more information, please see the Benefits Highlights brochure for regular, full-time employees: https://www.platformscience.com/benefits.

This is an exempt role. Our job titles for each posting may span more than one job level. The estimated base salary for this role is between $145,292 and $227,680. The range displayed on each job posting reflects the minimum and maximum target range for new hire base salaries across all US locations. Compensation packages are based on many factors unique to each candidate, including but not limited to skill set, work experience, relevant training and certifications, business needs, market demands, and specific geographical location. The base pay range is subject to change and may be modified in the future. This role may also be eligible for bonus, equity, and benefits.

Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits.

Platform Science collects your personal information to support its business operations, including for human resources, employment, benefits administration, health and safety, and other business-related purposes, as well as to comply with legal requirements. You can review further details of such collection and use in our Privacy Policy (link for browser: https://www.platformscience.com/privacy-notice).

At this time we only consider candidates in these states: AL, AR, AZ, CA, CO, FL, GA, ID, IL, KY, MA, MD, MI, MN, MO, NC, NH, NV, NY, OH, OK, OR, PA, SC, TN, TX, UT, VA, WA, and WI. In the future we plan to add more states.
