Morgan Stanley

Site Reliability Engineer

Remote Alpharetta, GA
Spring API JQuery JavaScript Azure AWS Swift Java
Description

Company profile

Morgan Stanley is a leading global financial services firm providing a wide range of investment banking, securities, investment management and wealth management services. The Firm's employees serve clients worldwide including corporations, governments, and individuals from more than 1,200 offices in 43 countries.


As a market leader, the talent and passion of our people are critical to our success. Together, we share a common set of values rooted in integrity, excellence, and a strong team ethic. Morgan Stanley can provide a superior foundation for building a professional career: a place for people to learn, achieve, and grow. A philosophy that balances personal lifestyles, perspectives, and needs is an important part of our culture.
Strong interpersonal skills and team spirit are required, in addition to the ability to communicate proficiently, both verbally and in writing. This team works extensively with other IT members in all MSWM locations. The job demands a strong work ethic and requires extensive hands-on, active participation.

Department Profile

Morgan Stanley Wealth Management (MSWM) Technology is the global technology department responsible for the design, development, delivery and support of the technical solutions behind the products and services used by the Morgan Stanley Wealth Management (MSWM) business. The department is comprised of 10 organizations: Sales, Banking & Corporate-Client Technology, Investment Products & Markets Technology, Client Reporting, Core Processing, Private and International Wealth Management Technology, Technology Integration Office, Enterprise Infrastructure & Production Management, Capital Markets Application & Data Services, Deployment Planning & Release Management, and the Chief Operating Office.

The Reliability Operations (RO) team within WMT is responsible for providing swift, courteous, and knowledgeable customer service to end users of the production systems. This position is focused on user and systems support, answering hotline calls, monitoring system alerts, and taking corrective action. Technical understanding is important, as is the ability to speak to users and understand their problems. In addition to direct user support tasks, the team performs infrastructure-related tasks including process configuration, hardware capacity planning, event management, release work, and support tool development, so that repetitive tasks are packaged to remove any element of risk.

Position Overview

o   The Wealth Management Production Management Site Reliability Engineer position is a highly visible, critical role on a team of technical SMEs managing the stability and optimization of the Wealth Management systems. Scope includes, but is not limited to, day-to-day support of the organization’s technology-related outages and collaboration on technology projects focused on stability, optimization, business impact analysis, and associated risk-related methodologies. This role is responsible for the overall stability of the Wealth Management Investment Management application platforms, participation in key optimization initiatives, and collaboration with multiple technical teams within Morgan Stanley. Additionally, the role partners with WM business units and various levels of management and staff to collect and analyze information and make recommendations on optimizing the platform.

o   As a team member with expertise in deep analytical triage, you will provide subject matter expertise in debugging, issue analysis, and troubleshooting, working with business and technical colleagues to provide reviews and recommendations that help avoid future application issues. You will produce guidance documentation, standards and procedures, product assessments, and training material, and work with the various application and infrastructure support teams to ensure that they document every troubleshooting step in the Morgan Stanley knowledge base system so that issues can be resolved faster. You will serve as a fully seasoned, proficient technical resource, providing technical knowledge in outage management and proactive solutions to improve the user experience.

o   This position will mainly perform a DevOps/SRE role in Digital.

The right candidate:

  • Proactively detecting, troubleshooting, and resolving all issues affecting production applications. This involves coordination with and escalation to development and external teams where necessary. The team owns every issue escalated to it until the issue is resolved or a workaround is provided so that end users can continue working.
  • Responsible for maintaining clear, concise, and timely communications with affected parties during the investigation and resolution of any individual or system-wide outage.
  • Responsible for the stability of the Production environment.
  • Develop and continually revise (in partnership with other teams where necessary) suitable policies and procedures to ensure appropriate application development standards are available to guide development for systems deployed to Production.
  • As the gatekeepers of the Production environment, responsible for ensuring the Change Implementation Management guidelines/policies are adhered to for all systems deployed to Production.
  • Responsible for servicing all requests for data or other activities that require access to Production systems.
  • Work with development teams at the appropriate stages in application development to ensure any new systems or projects meet the Production standard.
  • Responsible for maintaining and growing a body of knowledge that is accessible to all team members. Ensure information regarding any support-related activities or issues is available and easily accessible. The goal is to improve self-reliance and reduce dependency on the availability of development or external team resources for the initial troubleshooting and resolution of problems.

Primary skills

o   Minimum of 3 years’ experience in developing and/or supporting Enterprise Applications

o   3+ years of experience in Java server technologies like J2EE and servlets.

o   Extensive hands-on experience in Core Java, Spring Framework, Spring Boot, Spring Integration

o   Experience in Tomcat, Maven, and other tools used for Java development

o   Experience with Service Oriented Architecture (SOA)

o   Strong knowledge of object-oriented programming design patterns and methodologies

o   Good understanding of Web Services protocols such as REST, SOAP and API design for extensibility and portability

o   Experience in client-side technologies like AJAX, jQuery, JavaScript (preferred).

o   Experience with Azure or AWS is a plus.

o   Should be a fast learner of technologies in a fast-paced environment.

o   Have strong organizational skills and the ability to manage multiple tasks and high-pressure situations for outage handling, management, or resolution.

o   Is driven to learn about new technologies, techniques and what it takes to be an integral member of this team.

o   Hands-on experience administering large-scale, high-availability systems and the tools to monitor performance and availability.

o   Experience creating technical architecture documentation.

o   Excellent communication and writing skills specific to technical discussions across the management layers.

o   BS/MS or equivalent, preferably in a quantitative discipline (Computer Science, Computer Engineering, EE, Math, Physics).

o   Experience with on-call incident response and the ability to respond to emergencies on a 24/7 basis.

o   Experience working in the financial services sector is a plus.

o   Assisting in the investigation and troubleshooting of production issues and playing an active role in mentoring, coaching, training, and development of team members.

Secondary Skills

o   Experienced, technically hands-on professional who understands both code and infrastructure.

o   Solid track record in an operational/support role, understands incident/problem/change management and how to drive stability across organizations.

o   Able to manage an outage incident, coordinating user communications and other teams to help resolve it.

o   Strong and keen focus on metrics and trend analysis

o   Strong problem-solving skills with ability to analyze and understand data

o   Candidate must have the ability to forge strong relationships and coordinate effectively with multiple parties during outages and actively communicate updates to APG and BU partners.

o   Must be comfortable with on-call rotation including weekend work

o   End user support - able to talk to users to discuss their problem and work through to a resolution

o   Self-motivated with exceptional oral and written communication skills, ability to communicate clearly and concisely

o   Strong ownership mentality with a focus on customer satisfaction

o   Detail oriented and organized with strong analytical skills

o   Experience working in a virtual or global team

o   Self-starter with the ability to multitask and a can-do attitude

o   Familiarity with ITIL terms around incident and problem management
