Senior Site Reliability Engineer
About the Company
Clarifai is a leading full-lifecycle deep-learning AI platform for computer vision, natural language processing, LLMs, and audio recognition. We help organizations transform unstructured image, video, text, and audio data into structured data significantly faster and more accurately than humans could on their own. Founded in 2013 by Matt Zeiler, Ph.D., Clarifai has been a market leader in AI since winning the top five places in image classification at the 2013 ImageNet Challenge. Clarifai continues to grow, with remote employees based throughout the United States, Canada, Argentina, India, and Estonia.
We have raised $100M in funding to date, with $60M coming from our most recent Series C, and are backed by industry leaders like Menlo Ventures, Union Square Ventures, Lux Capital, New Enterprise Associates, LDV Capital, Corazon Capital, Google Ventures, NVIDIA, Qualcomm and Osage.
Clarifai is proud to be an equal opportunity workplace dedicated to pursuing, hiring, and retaining a diverse workforce.
Your Impact
Clarifai’s platform is a Kubernetes-native distributed system that requires the orchestration of many components. Efficiently serving and training large neural networks presents unique design and infrastructure challenges.
You will be critical to solving these challenges both in the cloud and in on-premise environments. You will also be responsible for our broader cloud infrastructure, development tools, and environments.
The Opportunity
- Ensure the smooth operation and high availability of Clarifai's core services
- Monitor system performance, identify bottlenecks, and implement optimizations to enhance reliability and efficiency
- Develop Kubernetes resources and custom tooling for seamless cloud and on-premise deployments
- Design and implement scalable, secure, and cost-effective infrastructure solutions
- Partner with teams across the organization to identify and solve engineering challenges
Requirements
- BS/BA in Computer Science or related degree
- Good knowledge of cloud providers (AWS, GCP or similar)
- Expertise with Kubernetes (EKS, GKE, self-hosted) and Infrastructure as Code using Terraform and Helm
- Solid understanding of web and networking fundamentals (HTTP, TLS, DNS, certificates, etc.)
- Experience with CI/CD pipelines using tools such as GitHub Actions, ArgoCD, and Atlantis
- Strong interpersonal skills working with teams across different time zones and regions
Great to Have
- Knowledge of basic microservice architecture principles
- Familiarity with security best practices for cloud-based systems
- Experience with relational databases, message queues, and key-value stores
- Experience writing Python, Go, or any other popular programming language
- Familiarity with any RPC framework
- Experience developing and building custom Kubernetes operators