NVIDIA

Systems Software Engineer - NIM Factory Platforms

Austin, TX; Santa Clara, CA
USD 184k - 356k
Microservices Deep Learning Docker Kubernetes
Description

NVIDIA is the platform upon which every new AI-powered application is built. We are seeking a senior engineer to design and build factory automation for NVIDIA Inference Microservices (NIMs). The right person for this role brings technical drive and creativity to change the way NVIDIA optimizes and serves performant inferencing for every AI model. Our NIM offerings are easy to use, highly performant and tested in all deployment scenarios: in the cloud, on customers' self-hosted infrastructure and locally on all NVIDIA GPUs. You will apply your deep technical expertise to design an efficient, scalable and reliable factory automation pipeline that turns AI models into NIMs validated for best-in-class performance and accuracy.

NVIDIA is building a new category of products by combining our prowess in deep learning and computing with industry-leading technologies. You will harness groundbreaking technologies and build a highly efficient factory to power how NVIDIA builds and validates NIMs for inferencing, all the way through deployment in heterogeneous hardware and software environments. You will influence and drive technical advances in NVIDIA's workflows and build the infrastructure that strives to accelerate the delivery of every AI model on NVIDIA's GPUs anywhere. We are looking for technical talent to design and build our factory capabilities, including the underlying infrastructure, pipelines, backends, Docker build, test harness, metrics, performance engineering, log ingestion, and more.

What you'll be doing: 

  • Develop, analyze and optimize factory infrastructure that takes an AI model in and produces a deployable service validated across Cloud, On-prem and Kubernetes environments. With the team, define and deliver rapid iterations on the group's technical strategies and roadmaps to deliver and improve the NIM factory. You will develop test harnesses, automate hardware acceptance, analyze benchmarks, and gather data for statistical analysis of system health and performance analysis of NIMs.

  • Work with technical leaders to design and develop scalable and reliable factory acceptance and performance tuning of hardware platforms. You will collaborate with multiple AI model teams to understand their requirements and build an efficient infrastructure that improves every team's productivity.

  • You will define metrics and drive improvements based on user feedback. You will mentor and collaborate throughout the team and with other teams to grow your colleagues and yourself. You will have a history of learning and growing your own skills and those of the people around you.

What we need to see: 

  • A history of using your advanced programming skills to build tooling and automation for hardware system characterization and benchmarking.

  • Proven experience debugging and analyzing performance of compute applications and systems

  • Deep technical expertise working with system software and platform layers, including kernel, device drivers, memory, storage, networking and PCIe devices

  • Passion for building platform engineering components and automation of system benchmarking and characterization.

  • Excellent interpersonal skills and the ability to lead multi-functional efforts

  • Experience working with hardware clusters, distributed systems, networking, GPU interconnects (PCIe, NVLink), and node and cluster interconnects (InfiniBand)

  • BS or MS in Computer Science, Computer Engineering or related field (or equivalent experience)

  • 6+ years of demonstrated experience developing performant microservices, cloud software and/or tooling

Ways to stand out from the crowd: 

  • Experience delivering optimized system engineering environments for inference applications on data center and consumer-grade hardware platforms.

  • A history of building and deploying automated benchmarking solutions in Cloud and On-prem environments, and their associated CI/CD pipelines

  • Prior experience working with large-scale compute infrastructure solutions

We are widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and creative people in the world working for us. If you're creative and autonomous with a real passion for technology, we want to hear from you.

#LI-Hybrid

The base salary range is 184,000 USD - 356,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
