Microsoft

Senior Software Engineer

Bengaluru, India
Deep Learning, Kubernetes, Machine Learning, C#, C++, Rust, API, Azure
This job is closed.
Description

The Azure ML Infrastructure team is looking for passionate engineers to build the largest deep-learning infrastructure service at Microsoft. In this role, you will build new components that bring the latest innovations in AI infrastructure onto the Azure ML platform. You will partner with top engineering talent within Azure ML Infrastructure and across Azure to work on cluster orchestration, job scheduling, storage, networking, containerization, and operating system integration. Your work will enable various AI languages and runtimes on Azure ML Infrastructure to bring distributed deep learning training and inferencing to life. In addition, you will build the infrastructure components required to build, deploy, monitor, and service the highly available and scalable Microsoft Service Fabric and Kubernetes clusters under your care. You will lead development and customer support from the front line and establish architecture, service excellence guidelines, and a high quality bar.

Candidates must have a track record of delivering engineering and service excellence on a mid-to-large-scale service.

Who Are We?

We are engineers on Azure ML Infrastructure. We believe that building a planet-scale AI Supercomputer from the ground up, one that addresses the fundamental pain points of data scientists and AI practitioners and takes AI to unprecedented scale, is the opportunity of a lifetime. If you share this dream, come join us!

What Is Azure ML Infrastructure?

High-scale AI workloads are always testing the limits of the infrastructure stack. Large-scale model training and inference, with huge volumes of training data on hundreds to thousands of GPUs, is a true engineering challenge. Azure ML Infrastructure is a globally distributed, multi-tenant service that provides robust, cost-effective, and competitive AI infrastructure (compute, networking, and storage) for AI training and inferencing. By abstracting workloads from the underlying infrastructure, Azure ML Infrastructure creates a shared pool of resources that can be dynamically provisioned for full utilization of expensive GPU compute, enabling data scientists to productively build, scale, experiment with, and iterate on their models on top of a robust, performant, scalable, and cost-effective distributed infrastructure built for AI. In Azure ML Infrastructure, we constantly seek to apply the best ideas from AI, machine learning, distributed systems, distributed databases, information retrieval, networking, and security.

Qualifications

  • 6+ years of coding experience in one of C#, C, C++, Rust, or Go
  • Experience working with the Linux operating system and Kubernetes cluster orchestration
  • Experience with improving service operations or engineering fundamentals
  • Excellent collaboration skills
  • A master’s or bachelor’s degree in computer science or a related field
  • At least 5 years of experience building and shipping production software or services

#IDCAIPlatformHiring

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request via the Accommodation request form.

Benefits/perks may vary depending on the nature of your employment with Microsoft and the country where you work.

Responsibilities

  • Deliver a robust container orchestration platform for Singularity
  • Design and build the scheduling sub-system that is responsible for delivering on the SLAs for AI training and inferencing workloads
  • Design and build storage and caching system for efficient DNN training and inferencing
  • Design and build control plane APIs for creation and management of training jobs and inference model metadata
  • Deliver node management, fault detection and node repair as a service to improve job/model reliability
  • Deliver world-class monitoring systems and telemetry pipelines to enhance service and job observability for both end-users and operators
  • Codify security and compliance requirements by building and strengthening system defenses against malicious attacks and exploits
  • Leverage performance and profiling tools to identify hot spots and bottlenecks across hardware and software boundaries (CPU, GPU, microcode, OS, and networking code), and drive end-to-end job performance