Algolia is set to enable every company to create world-class Search and Discovery experiences with an API-first approach. Performance and scalability are at the heart of our mission: we power 1.5 trillion searches a year, for 10K+ customers all over the world.
If you're a problem solver, able to think outside the box and eager to nurture others and learn from them, then this is your challenge!
The Team
The Platform as a Service (PaaS) team is dedicated to empowering development teams by creating toolchains, guidelines, and standards. Our focus is on enabling seamless automation and CI/CD, comprehensive observability, and unwavering reliability in a secured cloud-native environment.
The Opportunity
The Site Reliability Engineer position within the Platform as a Service team provides a dynamic opportunity for a professional with foundational experience in maintaining and optimizing scalable infrastructures. This role concentrates on three key areas: CI/CD, observability, and application hosting.
As a member of the Platform as a Service team, you will play a key role in supporting the reliability and scalability of Algolia’s Search Products. Your responsibilities will include operating components or features, ensuring proper monitoring and alerting are in place, and assisting in the transition from legacy systems. You will plan and take accountability for your work over the coming quarter, solving problems independently with minimal reliance on managers and senior team members.
Your role will consist of:
- CI/CD Support and Optimization: Assist in the implementation and maintenance of a scalable CI/CD toolchain, contributing to the overall efficiency and reliability of development processes.
- Observability Implementation: Support the development and deployment of observability standards and solutions, providing teams with actionable insights to enhance system reliability.
- Kubernetes and Cloud Services Management: Help maintain and optimize Kubernetes-based architecture and cloud services, enhancing fault tolerance and resource utilization.
- Collaboration and Problem Solving: Work collaboratively with team members to identify and solve problems, reducing dependence on senior staff for guidance.
- Process Improvement: Contribute to establishing engineering processes and best practices to ensure high-quality, reliable, and scalable systems.
You might be a fit if you have:
- Programming Skills: Basic to intermediate knowledge of programming languages such as Golang or Python, with an understanding of software craftsmanship. Familiarity with Ruby is a plus.
- Experience with CI/CD and Kubernetes: Experience in setting up and managing CI/CD pipelines and Kubernetes-based architectures.
- Knowledge of Distributed Systems: Exposure to operating distributed systems and understanding their challenges at a basic level.
- Public Cloud Experience: Familiarity with public cloud providers such as Microsoft Azure, AWS, or GCP.
- Problem-Solving Skills: Ability to independently identify and solve problems, demonstrating initiative and minimal reliance on senior team members.
- Communication and Organization Skills: Strong communication and organizational skills to effectively collaborate with team members and stakeholders.
We’re looking for someone who can live our values:
- GRIT - Problem-solving and perseverance capability in an ever-changing and growing environment.
- TRUST - Willingness to trust our co-workers and to take ownership.
- CANDOR - Ability to receive and give constructive feedback.
- CARE - Genuine care about other team members, our clients and the decisions we make in the company.
- HUMILITY - Aptitude for learning from others, putting ego aside.