Qode

Site Reliability Engineer

Pune, India
AWS Azure Kubernetes Git GitLab PowerShell Python Ansible Terraform Bash Linux AD LDAP DNS EC2 Route 53 ServiceNow
Description

Location: Pune, Maharashtra, India
Workplace Type: Onsite
Employment Type: Full-time
Shift: US Shift

About the Role

We are seeking an experienced Site Reliability Engineer to join our dynamic team in Pune. In this role, you will be instrumental in managing our multi-cloud infrastructure, focusing on AWS and Azure. You will be responsible for setting up and maintaining the infrastructure to support our cloud migration and future division expansion. This position offers a unique opportunity to work in a global environment, collaborate with Automotive and corporate IT teams, learn new skills, and shape the future direction of our infrastructure. The ideal candidate will have a strong background in cloud computing, infrastructure as code, and automation, with a proactive approach to problem-solving and performance optimization. You will be part of the Tech Ops / SRE Team, which operates in a sharing and learning culture to maintain continuous access to our products.

Key Responsibilities

  • Gather and analyze metrics from operating systems and applications to assist in performance tuning and fault finding.
  • Partner with development teams to improve services through rigorous testing and release procedures.
  • Participate in system design consulting, platform management, and capacity planning.
  • Create sustainable systems and services through automation.
  • Balance feature development speed and reliability with well-defined service-level objectives.
  • Manage day-to-day operations of AWS/Azure infrastructure.
  • Build and document automation processes for Infrastructure as a Service and Infrastructure as Code.
  • Manage backup and patch management processes.
  • Provide adequate support in architecture planning, migration, and installation for new projects.
  • Lead the structural/architectural design of platforms, middleware, databases, and backups according to system requirements.
  • Conduct technology capacity planning by reviewing current and future requirements.
  • Plan and implement disaster recovery, including backup and recovery plans.
  • Manage day-to-day operations by troubleshooting issues, conducting root cause analysis (RCA), and developing fixes.
  • Plan for and manage upgrades, migrations, maintenance, backups, installations, and configurations.
  • Review technical performance and deploy ways to improve efficiency and fine-tune performance.
  • Develop shift rosters to ensure no disruption in the tower.
  • Create and update SOPs, Data Responsibility Matrices, operations manuals, and daily test plans.
  • Provide weekly status reports to client leadership and internal stakeholders.
  • Leverage technology to develop Service Improvement Plans (SIP) through automation.

Required Skills & Qualifications

  • Bachelor’s degree (or equivalent) in computer science or a related discipline with at least 7 years of experience.
  • Strong understanding and hands-on experience with EKS, including configuring, deploying, maintaining, troubleshooting, upgrading, and monitoring EKS on AWS.
  • Hands-on experience with CI/CD pipelines and DevOps tooling, including Git-based version control (GitLab preferred), pipeline design and maintenance, automated builds, testing, and deployments for cloud-native and containerized workloads.
  • Hands-on experience with Linux servers, AD, LDAP, DNS, network storage, and AWS services such as EC2, FSx, Managed AD, and Route 53.
  • Ability to script and automate with tools and languages such as PowerShell, Python, Ansible, Terraform, and Bash.
  • Familiarity with ITSM processes such as Incident, Problem, and Change Management (ServiceNow preferred).
  • Proactive approach to identifying problems, performance bottlenecks, and areas for improvement.
  • Strong interpersonal skills, analytical and problem-solving ability, along with strong written and verbal communication.
  • Ability to communicate ideas in both technical and non-technical ways.
  • A strong capacity for teamwork and a sense of ownership, with the ability to work independently and be self-driven.
  • Experience in cloud infrastructure consulting.
