Associate Director Application Support Engineering
Location: Hyderabad, India
Do you want to work on innovative projects, collaborate with a dynamic and supportive team, and receive investment in your professional development? At DTCC, we are at the forefront of innovation in the financial markets. We are committed to helping our employees grow and succeed. We believe that you have the skills and drive to make a real impact. We foster a thriving internal community and are committed to creating a workplace that looks like the world that we serve.
The Information Technology group delivers secure, reliable technology solutions that enable DTCC to be the trusted infrastructure of the global capital markets. The team delivers high-quality information through activities that include developing and building essential infrastructure capabilities to meet client needs and implementing data standards and governance.
Pay and Benefits:
- Competitive compensation, including base pay and annual incentive
- Comprehensive health and life insurance and well-being benefits, based on location
- Pension / Retirement benefits
- Paid Time Off and Personal/Family Care, and other leaves of absence when needed to support your physical, financial, and emotional well-being.
- DTCC offers a flexible/hybrid model of 3 days onsite and 2 days remote (onsite Tuesdays, Wednesdays and a third day unique to each team or employee).
The Impact you will have in this role:
The Enterprise Application Support (EAS) team is responsible for providing technical application support for the ITP and ECS lines of business. Within EAS, the Associate Director Application Support Engineering / SRE Lead (Site Reliability Engineer Lead) is a senior technical role responsible for driving the overall reliability, scalability, and performance of critical systems. The role does this by implementing standard methodologies, participating in incident response, automating processes, and collaborating with development teams to ensure system stability and uptime across the organization, often acting as a technical partner in promoting a strong SRE culture within the company. Key responsibilities include designing monitoring systems, capacity planning, and actively identifying and mitigating potential issues before they impact users. The SRE team works closely with development teams, infrastructure and network partners, security partners, Scrum Masters, and internal/external clients to improve observability, operational supportability, resiliency, and mean time to restore service by driving improvements to support capabilities.
Your Primary Responsibilities:
- Scrum Participation: Join project collaborators' planning and design sessions, sprint zero, and stand-ups for all new deliveries to champion NFRs (non-functional requirements) that reflect strong observability and resiliency traits.
- System Reliability Architecture: Drive the design of, and help implement, reliable, resilient, and scalable systems, considering redundancy, fault tolerance, and disaster recovery strategies. Make design recommendations that allow the application to recover without cleanup activities, or create a recovery runbook for the application support team to follow to improve application recovery times.
- Monitoring and Alerting: Develop comprehensive monitoring systems to identify potential issues proactively, define actionable alerts, and establish SLIs (Service Level Indicators) and SLOs (Service Level Objectives).
- Incident Management: Lead incident response during critical system outages, facilitate timely problem diagnosis and resolution, and conduct post-mortem analysis to identify root causes and prevent future occurrences.
- Automation and Tooling: Develop and maintain automation scripts to streamline operational tasks, including self-healing, application deployments, scaling, and infrastructure management.
- Collaboration with Development Teams: Work closely with development teams to integrate SRE practices into the software development lifecycle, promoting code quality, reliability, and observability.
- Security Integration: Collaborate with security teams to ensure system resilience against cyber threats, implementing security best practices and monitoring for vulnerabilities.
- Technical Expertise: Stay updated on emerging technologies and industry trends related to cloud computing, distributed systems, and reliability engineering.
- Operational Readiness: Attend each project management meeting and present operational readiness with application support (EAS L2), raising any operational risks and concerns. Test NFRs in UAT environments to validate the effectiveness and completeness of operational capabilities.
- Risk Management: Partner with IT Embedded Risk Managers to identify strategic solutions for risk incidents.
- Metrics and Reporting: Demonstrate operational improvements through defined KPIs.
- Capacity Planning: Proactively assess system capacity needs, plan for future growth, and implement scaling strategies to ensure optimal performance under high load.
- Performance Optimization: Analyze system performance metrics to identify bottlenecks and implement optimization strategies to improve system responsiveness and efficiency.
Qualifications:
- Minimum of 8 years of related experience
- Bachelor's degree preferred or equivalent experience
Talents Needed for Success:
- Strong Programming Skills: Proficiency in one or more programming languages like Python, Java, Go, etc., for automation and development of monitoring tools.
- System Administration: Expertise in Linux/Unix operating systems, network administration, and cloud platforms (AWS, Azure, GCP). Mainframe experience is a plus.
- Monitoring and Observability: Deep understanding of monitoring tools (Splunk, Dynatrace, ITSI, etc.) and experience in designing robust monitoring systems.
- Incident Management: Proven track record of participating in incident response teams under pressure and effectively resolving complex issues.
Actual salary is determined based on the role, location, individual experience, skills, and other considerations. We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, sex, gender, gender expression, sexual orientation, age, marital status, veteran status, or disability status. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.
About Us
DTCC proudly supports Flexible Work Arrangements, favors openness, and gives people the freedom to do their jobs well by encouraging diverse opinions and emphasizing teamwork. When you join our team, you’ll have an opportunity to make meaningful contributions at a company that is recognized as a thought leader in both the financial services and technology industries. A DTCC career is more than a good way to earn a living. It’s the chance to make a difference at a company that’s truly one of a kind.
About the Organization
To maintain strong alignment between IT and the business, we are bringing together all Solutions-focused teams under a unified technology organization, IT Solutions. The newly formed IT Solutions department combines the Application Development and Enterprise Application Support functions, allowing us to leverage synergies to support the Solutions business lines.
