Microsoft

Senior Site Reliability Engineer

Redmond, WA; San Francisco, CA
USD 117k - 250k
Azure, GCP, Chef, Python, Docker, Kubernetes, AWS, Terraform, Ansible, Puppet, Bash
This job is closed.
Description

Microsoft is looking for a Senior Site Reliability Engineer to support and expand Viva Engage. Viva Engage (formerly Yammer) is the industry-defining social network for the enterprise. We provide a platform for millions of employees, including those from 85% of Fortune 500 companies, to build community and culture, share knowledge, and connect with their leaders and each other.  

 

The user base for Viva Engage is growing quickly. The Site Reliability team is responsible for keeping the services reliable as we scale and modernize our tech stack. We are seeking a Senior Site Reliability Engineer who knows how to balance the competing priorities of keeping things running today and making sure we have the architecture we need for the future.

 

Acquired by Microsoft in 2012, Viva Engage combines the benefits of a startup - rapid innovation, cutting-edge technology, outsized individual impact - with the advantages of working for one of the most successful software companies in the world. We believe in mission-driven work, and in this post-COVID world our platform has become more indispensable than ever, fostering connection and a sense of belonging among remote teams.

 

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond. 

 

In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day.

Required Qualifications:

  • 6+ years technical experience in software engineering, network engineering, or systems administration
    • OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 3+ years technical experience in software engineering, network engineering, or systems administration
    • OR Master's Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration.

Other Requirements:

The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:

  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
  • Citizenship Verification:
    • This position requires verification of citizenship due to citizenship-based legal restrictions. Specifically, this position supports United States federal, state, and/or local government agency customers and is subject to certain citizenship-based restrictions where required or permitted by applicable law. To meet this legal requirement, and as a condition of employment, the successful candidate’s citizenship will be verified with a valid passport.
  • This position requires passing a background check conducted through the CJIS criminal justice information system by authorized local, state, and/or federal agencies and across multiple states. This role requires candidates to maintain CJIS screening eligibility. This position is required to work in GCC-M, GCC-H, and DoD environments. 

Preferred Qualifications:

  • Experience applying SRE principles in a large production environment. 
  • Demonstrated proficiency in cloud computing platforms (e.g., AWS, Azure, GCP) and related services (e.g., EC2, S3, VPC, IAM, Lambda). 
  • Expertise in automation tools and frameworks (e.g., Terraform, Ansible, Chef, Puppet) and scripting languages (e.g., Python, Bash). 
  • Deep understanding of containerization and orchestration technologies (e.g., Docker, Kubernetes). 
  • Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack) and incident response processes. 
  • Demonstrated problem-solving skills and the ability to troubleshoot complex issues in distributed systems. 
  • Strong communication and collaboration skills, with the ability to work effectively in a cross-functional team environment.

Site Reliability Engineering IC4 - The typical base pay range for this role across the U.S. is USD $117,200 - $229,200 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $153,600 - $250,200 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay

Microsoft will accept applications for the role until June 5, 2024. 

 

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances.  We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form.

 

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.

Responsibilities:

  • Participate in on-call rotations and incident response throughout product development and operations cycles. On-call duty requires responding to support requests outside normal business hours, including weekends and/or holidays, from a designated Microsoft office.
  • Monitor system performance and proactively identify and resolve issues to ensure high availability and performance.  
  • Develop and maintain automation tools and processes for deployment, monitoring, and configuration management. 
  • Apply troubleshooting skills and debugging tools, and examine logs, telemetry, and other signals, to verify assumptions and assess customer impact. Address findings with customers and/or service engineering, both proactively and reactively, through efficient written and verbal communication.
  • Lead blameless postmortems to identify root causes and improve production resiliency.
  • Consult with developers to design services that scale in Azure.  
  • Mentor less experienced team members and contribute to the overall growth and development of the team. 
  • Stay current with industry trends and emerging technologies in site reliability engineering and cloud computing.

     
