NVIDIA is on a journey to build the best cloud offering for AI workloads and to bring its latest GPU technology to our clients as a set of managed services under the DGX Cloud umbrella. We want to innovate on behalf of our clients and provide an easy, no-hassle way of using the latest and greatest NVIDIA products through scalable, managed, self-service APIs. We are looking for a Cloud Platform Engineer to drive the technical design and build the foundational elements of our high-performing cloud services for Artificial Intelligence and high-performance computing. This is a unique opportunity to be a founding member of a team building at the intersection of highly scalable, fault-tolerant cloud services and AI.
If you are passionate about IaC and can argue why declarative infra is the way to go, can explain Kubernetes PDB to your family in under 5 minutes, or have always felt that Kubernetes is great but not the ultimate goal, and have always wanted to extend it and turn it into the distributed operating system for AI, you are a perfect fit to join our team!
What you'll be doing:
As part of the service team, design and build platforms for DGX Cloud services
Figure out how to take the best from HPC and Kubernetes and help us build a unified platform
Work within a team of software engineers and product people, as well as with engineering teams across all of NVIDIA, on DGX Cloud AI Compute services
Write IaC code, work on Kubernetes, and help the team to design and implement release pipelines
Collaborate to understand how to make the best use of GitOps and Pipelines
What we need to see:
BS in Computer Science, Information Systems, Computer Engineering or equivalent experience
Solid technical foundation in distributed computing and storage, including substantial experience with all of the following: server systems, storage, I/O, networking, and system software
12+ years of platform engineering experience on large-scale production systems
Hands-on engineering expertise in Kubernetes and IaC
Ability to understand and communicate complex designs, distributed infrastructure, and requirements to peers, customers, and vendors
General knowledge of shared storage systems such as NFS, LustreFS, GlusterFS, etc.
Familiarity with system-level architecture, such as interconnects, memory hierarchy, interrupts, and memory-mapped IO.
Ways to stand out from the crowd:
Proven experience in high performance computing, Deep Learning, and/or GPU accelerated computing domains
Experience with large-scale distributed systems, HPC, and ML training using Slurm and Kubernetes
Deep knowledge of both software and hardware in HPC and ML infrastructure
NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions, from artificial intelligence to autonomous cars. NVIDIA is looking for great people like you to help us accelerate the next wave of artificial intelligence.
The base salary range is 224,000 USD - 356,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.