NVIDIA

Principal Data Scientist, Accelerated Apache Spark

Santa Clara, CA US
USD 272k - 419k
Python Pandas TensorFlow Java Spark Machine Learning NumPy PyTorch Scala
Description

NVIDIA is looking for a Principal Data Scientist to join the GPU accelerated Apache Spark team. Data scientists spend a considerable amount of time exploring data and iterating on machine learning (ML) experiments. Apache Spark is the most popular data processing engine in data centers for data science. It is used for interactive data science, from data preparation, to running ML experiments, all the way to the deployment of ML applications. You will work with the open source community to accelerate Apache Spark with GPUs. You will apply the latest ML/AI methods to empower enterprises to migrate Spark workloads onto GPUs at scale. Come join NVIDIA to apply data science to help us grow the adoption of GPU accelerated Spark.

What you’ll be doing:

  • Develop ML models to predict the performance of GPU accelerated Apache Spark on existing workloads.

  • Develop ML models to tune GPU accelerated Apache Spark configurations to optimize performance on specific workloads.

  • Work on systems that continuously adapt and improve the aforementioned ML models.

  • Work on ML/AI agents that can help fix and optimize GPU accelerated Apache Spark applications.

  • Work on new functionality for GPU accelerated Apache Spark to facilitate large scale ML model training and inference.

  • Create examples showcasing how to best use GPU accelerated Apache Spark and Spark MLlib to carry out large scale ML and DL training and inference.

  • Work with NVIDIA partners and customers on deploying GPU accelerated Spark ML algorithms in the cloud or on premises.

  • Keep up with published advances in ML systems and algorithms.

  • Provide technical mentorship in data science and ML to a team of engineers.

What we need to see:

  • BS, MS, or PhD in Data Science, Statistics, Computer Science, Computer Engineering, or closely related field (or equivalent experience).

  • 12+ years of work or research experience in ML model development, including 5+ years as a technical lead.

  • 2+ years of hands-on experience with Apache Spark.

  • Proven technical skills in crafting, implementing, and productionizing high-quality ML solutions.

  • Proven ability to use modern techniques and tools for all aspects of ML model development, deployment, and maintenance.

  • Excellent programming skills in Python and its data science libraries, such as NumPy, pandas, scikit-learn, SciPy, PyTorch, and TensorFlow.

  • Experience developing solutions based on boosted tree models, using libraries like XGBoost.

  • Background in developing LLM/GenAI based solutions.

  • Experience in feature engineering and feature importance assessment.

  • Familiarity with agile software development practices.

Ways to stand out from the crowd:

  • Knowledge of architecture of Apache Spark is a strong plus.

  • Familiarity with NVIDIA GPUs and CUDA is a strong plus.

  • Experience coding in Scala, Java, and/or C++ is a strong plus.

  • Ability to work well with multi-functional teams across organizational boundaries and geographies.

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most skilled and dedicated people in the world working for us. If you are passionate about what you do, creative and driven, we want to hear from you!

The base salary range is 272,000 USD - 419,750 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
