NVIDIA

Senior Software Engineer, CUDA Core Libraries

Location: Remote, Germany; Munich, Germany

Time Type: Full time

Job Description

NVIDIA’s accelerated computing platform is the foundation of modern HPC and AI. At the core of this platform are the CUDA Core Libraries: C++ and Python libraries that enable developers to write fast, reliable, and scalable GPU-accelerated software. We are hiring a full-time Software Engineer to work on the CUDA Core Libraries that power GPU computing for both C++ and Python developers. This includes projects such as CCCL (Thrust, CUB, libcudacxx), cuda-python, and numba-cuda. You will join the team building the foundational libraries, algorithms, and language/runtime infrastructure that make CUDA a speed-of-light experience for developers across deep learning, scientific computing, and data analytics.

What you’ll be doing:

  • Develop and implement CUDA Core Libraries in C++ and/or Python, including parallel algorithms and idiomatic language bindings for core CUDA functionality.

  • Compose, optimize, and evolve GPU algorithms and APIs, from high-level interfaces down to low-level performance tuning involving memory, parallelism, and synchronization.

  • Own features end-to-end: design, implementation, testing, benchmarking, documentation, and long-term maintenance.

  • Improve developer experience across the stack: CI, tests, benchmarks, packaging, examples, and docs.

  • Collaborate with senior CUDA engineers in design reviews, code reviews, and open-source-style workflows.

  • Engage with real users through issues, performance investigations, and API feedback.

What we need to see:

  • BS, MS, or PhD in Computer Science, Computer Engineering, or a related field, or equivalent experience.

  • 8+ years of related development experience.

  • Strong programming skills in C++, Python, or both, with proven interest in systems-level software (performance, memory, concurrency, API design).

  • Solid understanding of modern C++ (templates, generics, standard library) and/or Python library development and packaging.

  • Practical experience with parallel or heterogeneous programming (CUDA, OpenMP, GPU-accelerated Python, or similar).

  • Experience contributing to production software or open-source libraries, including testing, profiling, and code review.

  • Ability to work independently, scope problems, and drive projects to completion.

  • Clear written communication for technical design and documentation.

  • Comfort navigating large, multi-language codebases (C++, Python, CMake, Pixi, CI systems).

Ways to stand out from the crowd:

  • Strong understanding of CPU/GPU architecture and how hardware details affect performance.

  • Hands-on experience with CUDA C++, CUDA Python, PyTorch, JAX, Numba, CuPy, or similar GPU-accelerated stacks.

  • Familiarity with Thrust, CUB, libcudacxx, or other modern C++/GPU libraries.

  • Experience with compiler infrastructure or tooling (LLVM, Clang tooling, MLIR).

  • Demonstrated interest in developer tools, library design, and making other developers faster.

If you care deeply about performance, enjoy working at the C++/Python boundary, and want to shape the core CUDA libraries relied on by thousands of developers, this role is a direct fit.
