Service Reliability Engineer (Production Engineering), AVP
Location: Bangalore, Velankani Tech Park
Time Type: Full time
Job Description
Location: Bangalore, India
Corporate Title: AVP
Role Description
We are seeking a highly motivated and experienced AVP Site Reliability Engineer to join our Engineering team, focusing on Investment Banking Settlements. This is a hands-on developer and engineer role, dedicated to proactively improving the reliability, scalability, and performance of our core settlement systems, which encompass both modern Fabric-based solutions and integrated legacy components. As an SRE embedded within Production Engineering, you will be instrumental in architecting, developing, and implementing solutions that enhance our CI/CD practices, drive robust cloud adoption, and embed SRE best practices directly into our system architectures.
What we’ll offer you
As part of our flexible scheme, here are just some of the benefits that you’ll enjoy:
- Best in class leave policy
- Gender-neutral parental leave
- 100% reimbursement under childcare assistance benefit (gender neutral)
- Sponsorship for industry-relevant certifications and education
- Employee Assistance Program for you and your family members
- Comprehensive Hospitalization Insurance for you and your dependents
- Accident and Term life Insurance
- Complimentary health screening for employees aged 35 and above
Your key responsibilities
- Reliability Engineering & Development:
  - Proactively develop and implement engineering solutions to enhance the reliability, availability, and observability of Fixed Income Securities Settlement systems, aligning with SRE best practices and the "you build it, you run it" paradigm.
  - Define and instrument Service Level Objectives (SLOs) and Service Level Indicators (SLIs), leveraging them to drive development priorities and measure customer satisfaction.
  - Design, develop, and integrate reliability and resilience patterns such as auto-scaling, circuit breakers, bulkheads, rate limiters, and retry mechanisms directly into application and infrastructure code.
  - Actively contribute to the reduction of toil by developing automation and tooling for service request fulfillment, incident management, and problem management, thereby improving Mean Time To Resolution (MTTR).
  - Lead root-cause analysis, performance optimization initiatives, and incident response, translating insights directly into permanent code-based solutions and system improvements.
  - Champion and contribute to state-of-the-art SRE best practices, including GitOps, Distributed Tracing, OpenTelemetry, and Chaos Engineering, within the development lifecycle.
- CI/CD Engineering & Automation Development:
  - Design, develop, and maintain advanced CI/CD pipelines using tools such as GitHub Actions and Ansible to automate build, test, and deployment workflows for both GCP and Fabric-based solutions.
  - Develop robust scripts, utilities, and automation frameworks to streamline deployment processes, environment provisioning, and monitoring, reducing manual effort and improving deployment velocity.
- Cloud Engineering & Platform Development:
  - Architect, develop, and implement scalable, secure, and resilient systems on Google Cloud Platform (GCP is mandatory; Azure/AWS experience is a plus).
  - Develop and maintain infrastructure-as-code solutions using Terraform for managing cloud resources, ensuring consistency and repeatability across environments.
  - Contribute to and implement deployment architectures, with a focus on optimizing multi-platform and distributed systems that integrate seamlessly with legacy infrastructure.
- Containerization & Orchestration Development:
  - Implement and manage Docker-based solutions, including creating and maintaining Dockerfiles, Helm charts, and Kubernetes workload configurations, specifically within Google Kubernetes Engine (GKE).
  - Develop and optimize microservices deployments, container clustering, and container lifecycle automation.
- Collaboration & Engineering Excellence:
  - Collaborate closely with Developers, QA Engineers, other SREs, and Platform teams to ensure seamless and efficient delivery processes, embedding reliability from inception.
  - Consult with Business Functional Analysts and Solution Architects to proactively embed resilience into solution design early in the development lifecycle.
  - Utilize development ecosystem tools such as Bitbucket, Artifactory, Confluence, and Jira for effective collaboration and project management.
  - Apply a strong understanding of SDLC models, Agile methodologies, and engineering best practices, fostering a culture of continuous improvement and shared ownership.
  - Mentor junior engineers, contributing to overall team capability uplift.
Your skills and experience
- Technical Expertise (Mandatory):
  - Bachelor’s or Master’s degree in Computer Science, Computer Engineering, Electrical Engineering, or a related field.
  - Strong hands-on programming/scripting proficiency in at least one of Java, NodeJS/TypeScript, Python, or shell scripting, with a proven track record of developing production-grade solutions.
  - Hands-on experience with automation tools, specifically GitHub Actions.
  - Ability to leverage AI-powered tools to enhance incident detection, automate responses, and optimize reliability engineering workflows.
  - Expert knowledge of containerization (Docker), orchestration (Kubernetes, specifically GKE), and packaging (Helm).
  - Extensive hands-on experience with Google Cloud Platform (GCP) is mandatory.
  - Hands-on experience with Terraform-based Infrastructure-as-Code (IaC) for cloud infrastructure management.
  - Proven experience in setting up and developing observability, monitoring, and self-healing solutions (e.g., New Relic, Splunk, Google Cloud Operations, Ansible).
  - Strong familiarity with application development, distributed systems, and multi-platform architectures, including robust integration with legacy systems.
  - Deep understanding of SDLC processes, DevOps models, and cloud-native engineering practices.
- Technical Expertise (Beneficial):
  - Experience with Azure/AWS.
  - Experience in the financial domain.
- Soft Skills:
  - Strong analytical and troubleshooting skills, with a creative and solution-oriented approach to complex technical problems.
  - Excellent communication skills (written and verbal English) and proven cross-team collaboration abilities, capable of driving technical discussions and consensus.
  - Proactive mindset with strong ownership and initiative in solving complex engineering problems, maintaining a calm and detail-oriented approach even in stressful situations.
  - Ability to work independently in agile, fast-paced environments, actively seeking opportunities for system improvement.
  - A collaborative team-player mindset, with self-confidence and a passion for continuous learning and mentoring.
How we’ll support you
- Training and development to help you excel in your career
- Coaching and support from experts in your team
- A culture of continuous learning to aid progression
- A range of flexible benefits that you can tailor to suit your needs
About us and our teams
Please visit our company website for further information:
https://www.db.com/company/company.html
We strive for a culture in which we are empowered to excel together every day. This includes acting responsibly, thinking commercially, taking initiative and working collaboratively.
Together we share and celebrate the successes of our people. Together we are Deutsche Bank Group.
We welcome applications from all people and promote a positive, fair and inclusive work environment.
