We are looking for a highly motivated expert in AI infrastructure automation and tools development to join us. As a seasoned professional with a strong passion for designing and implementing cutting-edge infrastructure solutions, you will play a key role in architecting and driving advancements in our large-scale cloud and on-premise computing clusters. We are a small, fast-moving team, and we own the production excellence of everything we develop, at every layer from the OS up to the services. Please apply if you are passionate about operational reliability, building AWS infrastructure automation and deployment tools, and working on new technologies and Cloud Native applications. The solutions you propose and build will directly impact the efficiency of the NVIDIA Autonomous Vehicles development team!
What you'll be doing:
Apply strong programming skills and a deep understanding of distributed systems design to craft and build production-grade software.
Design and implement Continuous Deployment (CD) pipelines to ensure flawless and efficient software delivery.
Own the big picture of how our systems relate to each other, using a breadth of tools and approaches to tackle a broad spectrum of problems.
What we need to see:
BS or MS in CS, CE, or EE, or equivalent experience
4+ years developing tooling and APIs for k8s-based computing platforms
At least 4 years building automation software for the cloud with Terraform, Python, Go
Strong AWS fundamentals: IAM, VPC, RDS, S3, CDN, EC2
Expert knowledge of DevOps principles, tools, and methodologies
Working experience with Continuous Deployment (CD) pipelines
Good understanding of traffic engineering solutions: load balancing, Layer 7 proxies
In-depth understanding of all layers of the Internet protocols
Operational expertise with observability, the Prometheus ecosystem, and log ingestion at scale
Proficiency with the Linux environment
Excellent written and verbal interpersonal skills
You'll be a fun and motivated teammate who enjoys a challenge and celebrates success
Ways to stand out from the crowd:
Previous experience building sophisticated tooling and SRE automation on large GPU/CPU clusters
Working experience with agentic AI tools for computing infrastructure management
Artifactory management at scale
Good understanding of cloud and datacenter security concepts; AWS is preferred
Solid understanding of large-scale k8s observability platforms
NVIDIA is the leader in AI, machine learning and datacenter acceleration! NVIDIA is a “learning machine” that constantly evolves by adapting to new opportunities that are hard to solve, that only we can tackle, and that matter to the world. This is our life’s work, to amplify human imagination and intelligence. Make the choice and join our diverse team today!
The base salary range is 184,000 USD - 356,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.