Workable makes software to help companies find and hire great people. We get recruiting and its role in building healthy workplaces, which is why we’re proud that more than 20,000 teams around the world use Workable to do exactly that.
And while we take recruiting seriously, we don’t take ourselves too seriously. At Workable, you’ll find smart people who have fun, learn and innovate, and help others do the same. We brainstorm, we laugh, and, occasionally, we party (there’s a lot to celebrate), but we also appreciate people’s need for quiet time and focused work. We respect everyone, we hire the best, and make sure every experience is special.
We’re growing fast, and we want to make sure that we scale from thousands to hundreds of thousands, so we’re looking for a Junior Site Reliability Engineer to join our SRE team.
Our product is built with a microservices architecture deployed on Kubernetes. Our SRE team is responsible for deploying, monitoring, optimizing and securing our cloud infrastructure and company software, both of which are rapidly expanding. Automation is at the core of what we do. If you love working with new technologies, open source software, and solving complex problems on highly distributed systems, then this is the job for you! You will be part of a talented team of engineers who demonstrate superb technical competency, delivering mission-critical infrastructure and ensuring the highest levels of availability, performance and security.
Responsibilities
As a Junior Site Reliability Engineer and member of the observability team, you will:
- Operate, deploy and monitor cloud services from development to production, with an emphasis on our observability tooling.
- Work in a highly cross-functional team with developers to design, release and troubleshoot production systems, and support them in properly instrumenting their applications and services.
- Develop tools and automations to make operations and deployments simpler and more robust.
- Be responsible for the availability, scalability and performance of our systems and continuously improve their monitoring to provide better insights.
- Troubleshoot issues, analyze system performance, and continuously enhance our alerting system to ensure proactive detection of issues.
You must have:
- BS/MS degree in Computer Science or Engineering (or a proven strong background)
- Excellent communication skills in English, particularly written communication.
- Strong understanding of observability principles and practices.
- Analytical, troubleshooting and sysadmin skills (experience with Linux systems)
- Passion for cutting-edge cloud technologies and automation
- Relevant work experience, including programming experience
- Familiarity with centralized logging, monitoring systems, and tooling frameworks such as Prometheus, Grafana, ELK Stack, and Datadog.
- Experience with a major cloud provider (GCP and AWS preferred)
- Experience with configuration management and orchestration tools (e.g., Ansible, Terraform)
- Familiarity with at least one programming language (preferably Go and/or Python)
Preferred qualifications
- Bonus: Familiarity with OpenTelemetry standards.
- Bonus: Familiarity with the Kubernetes platform and technology stack.
- Bonus: Familiarity with relational and/or NoSQL databases.