C3.ai, Inc. (NYSE:AI) is a leading Enterprise AI software provider for accelerating digital transformation. The proven C3 AI Platform provides comprehensive services to build enterprise-scale AI applications more efficiently and cost-effectively than alternative approaches. The C3 AI Platform supports the value chain in any industry with prebuilt, configurable, high-value AI applications for reliability, fraud detection, sensor network health, supply network optimization, energy management, anti-money laundering, and customer engagement. Learn more at: C3 AI
We are seeking a highly skilled Site Reliability Engineer (SRE) to join our team to manage, monitor, and optimize our C3 clusters on Kubernetes. The ideal candidate will have a deep understanding of Kubernetes, Cloud Infrastructure, and Infrastructure as Code (IaC) practices. You will be responsible for ensuring the reliability and scalability of our Kubernetes clusters and Cloud Infrastructure.
Responsibilities:
- Monitor and Manage Kubernetes Clusters: Ensure the stability, health, and scalability of Kubernetes clusters while deploying applications and services on them.
- Kubernetes Management: Deploy, monitor, and scale applications on Kubernetes clusters. Maintain Helm charts, manage services, and ensure resource allocation for optimal cluster performance.
- Cloud Infrastructure Management: Work with leading Cloud Platforms (AWS, GCP, Azure) to set up, configure, and manage infrastructure resources using Infrastructure as Code (Terraform, CloudFormation, etc.).
- Monitoring & Incident Response: Set up monitoring solutions, define alerts, and manage the incident response process for any issues related to Jenkins, C3, or Kubernetes clusters.
- Automate Infrastructure Processes: Build automation tools for scaling, monitoring, and maintaining infrastructure using modern tools like Terraform, Ansible, or equivalent.
- Collaborate Across Teams: Work closely with development, services, and operations teams to ensure seamless integration between application development and infrastructure.
- Security & Compliance: Ensure all systems follow security best practices and comply with relevant regulations. This includes role-based access, encryption, and automated vulnerability scanning.
Qualifications:
- 3+ years of experience as an SRE, DevOps Engineer, or related role.
- Hands-on experience with Kubernetes in production environments (managing clusters, deployments, services, and pods).
- Proficiency in cloud platforms like AWS, GCP, or Azure, including managing infrastructure via IaC tools like Terraform, CloudFormation, or equivalent.
- Familiarity with monitoring tools like Prometheus, Grafana, or equivalent.
- Experience with Helm and managing Kubernetes applications via Helm charts.
- Strong scripting and automation skills in languages like Bash, Python, or Groovy.
- Experience with CI/CD tools, GitOps, and best practices for continuous integration and delivery pipelines.
- Understanding of networking concepts and security best practices in a cloud-native environment.
- Incident management experience, including setting up on-call rotations, managing runbooks, and post-incident reviews.
C3 AI provides excellent benefits, a competitive compensation package, and a generous equity plan.
C3 AI is proud to be an Equal Opportunity and Affirmative Action Employer. We do not discriminate on the basis of any legally protected characteristics, including disabled and veteran status.