Guidepoint

Team Lead, Site Reliability Engineering

Toronto, Ontario Canada
Terraform Ansible Kubernetes Python Go SQL Azure
Description

Overview: 

Guidepoint’s Engineering team thrives on problem-solving and creating happier users. As Guidepoint works to achieve its mission of making individuals, businesses, and the world smarter through personalized knowledge-sharing solutions, the engineering team is taking on challenges to improve our internal application architecture and create new products to optimize the seamless delivery of our services. 

The Site Reliability Engineering Team Lead is responsible for ensuring the reliability, scalability, and performance of a SaaS product running on Azure. The role involves leading a team of SREs to proactively monitor, automate, and optimize system performance while fostering a culture of collaboration with development teams, innovation, and continuous improvement. As the SRE lead, this person will act as the bridge between development and operations, driving best practices in reliability engineering and proactive management of environments through observability. Key areas of focus include maintaining uptime, monitoring performance, resolving incidents, optimizing capacity, managing error budgets, and collaborating with development teams to build resilient and maintainable systems.


This is a hybrid position based in Toronto. 

What You’ll Do:

  • Guide, mentor, and upskill the SRE team, ensuring alignment with organizational priorities
  • Design and implement monitoring strategies to ensure uptime and minimize failures
  • Automate manual processes to improve efficiency and reduce human error
  • Define, manage, and maintain SLOs and SLIs to ensure high availability of systems
  • Manage error budgets and trigger breach actions as per established policies
  • Enhance Datadog automated monitoring and alerting, ensuring critical events are managed through the Status Page
  • Lead incident response alongside engineering leads, support RCA efforts, and drive auto-remediation initiatives
  • Collaborate with Product, Support, Engineering, and Cloud Operations teams to deliver scalable and reliable solutions
  • Actively participate in cost optimization initiatives with Cloud Operations and Engineering
  • Handle escalated customer issues and ensure satisfactory resolution
  • Conduct regular team meetings and training sessions
  • Identify areas for process improvement and implement best practices
  • Provide insights and recommendations to enhance reliability and customer satisfaction

What You Have:

  • 8+ years of experience in software development and Site Reliability Engineering or Production Engineering
  • 3+ years of experience leading an SRE team with expertise in Infrastructure as Code (IaC) using Terraform and Ansible, managing and operating Kubernetes clusters, and implementing monitoring and observability solutions with Datadog
  • Comprehensive understanding of web application security
  • Strong system engineering background with Linux/Windows
  • Proficient in development with Python or Golang
  • Strong understanding of Azure libraries (Client, Management, Asset)
  • In-depth knowledge of web application SaaS platforms and architecture
  • Proficient in SQL; familiarity with other database operations is a plus
  • Strong communication skills
  • Expertise in technical writing and documentation
  • Ability to rapidly analyze issues, anticipate consequences, make decisions, and take action
  • Ability to work independently and as part of a team
  • Experience in presenting monthly reports and metrics to managers and stakeholders

What We Offer:

  • Paid Time Off
  • Comprehensive benefits plan
  • Company RRSP Match
  • Development opportunities through the LinkedIn Learning platform

About Guidepoint: 

Guidepoint is a leading research enablement platform designed to advance understanding and empower our clients’ decision-making process. Powered by innovative technology, real-time data, and hard-to-source expertise, we help our clients to turn answers into action.

Backed by a network of nearly 1.5 million experts and Guidepoint’s 1,300 employees worldwide, we inform leading organizations’ research by delivering on-demand intelligence and research on request. With Guidepoint, companies and investors can better navigate the abundance of information available today, making it both more useful and more powerful.

At Guidepoint, our success relies on the diversity of our employees, advisors, and client base, which allows us to create connections that offer a wealth of perspectives. We are committed to upholding policies that contribute to an equitable and welcoming environment for our community, regardless of background, identity, or experience.
