Description

We are seeking highly skilled engineers with expertise in machine learning, distributed systems, and high-performance computing to join our Research team. In this role, you will collaborate closely with researchers to build and optimize platforms that train next-generation foundation models on massive GPU clusters. Your work will play a critical role in advancing the efficiency and scalability of cutting-edge generative AI technologies.

Key Responsibilities

- Scale and optimize systems for training large-scale models across multi-thousand GPU clusters.
- Profile and enhance the performance of training codebases to achieve best-in-class hardware efficiency.
- Develop systems to distribute workloads efficiently across massive GPU clusters.
- Design and implement robust solutions to enable model training in the presence of hardware failures.
- Build tools to diagnose issues, visualize processes, and evaluate datasets at scale.
- Optimize and deploy inference workloads for throughput and latency across the entire stack, including data processing, model inference, and parallel processing.
- Implement and improve high-performance CUDA, Triton, and PyTorch code to address efficiency bottlenecks in memory, speed, and utilization.
- Collaborate with researchers to ensure systems are designed with optimal efficiency from the ground up.
- Prototype cutting-edge applications using multimodal generative AI.

Qualifications

Experience:
- 3+ years of professional experience in ML pipelines, distributed systems, or high-performance computing.
- Hands-on experience training large models using Python and PyTorch, with familiarity across the full pipeline: data processing, loading, training, and inference.
- Proven expertise in optimizing and deploying inference workloads, with experience in profiling GPU/CPU code (e.g., NVIDIA Nsight).
- Deep understanding of distributed systems and distributed training techniques, such as DDP, FSDP, and tensor parallelism.
- Strong experience writing high-performance parallel C++ and custom PyTorch kernels, with knowledge of CUDA and Triton optimization techniques.
- Bonus: Experience with generative models (e.g., Transformers, Diffusion Models, GANs) and prototype development (e.g., Gradio, Docker).

Technical Skills:
- Proficiency in Python, with significant experience using PyTorch.
- Advanced skills in CUDA/Triton programming, including custom kernel development and Tensor Core optimization.
- Strong generalist software engineering skills and familiarity with distributed and parallel computing systems.

Note: This position is not intended for recent graduates.

Compensation

The salary range for this role in California is $175,000–$250,000 per year. Actual compensation will depend on job-related knowledge, skills, experience, and candidate location. We also offer competitive equity packages in the form of stock options and a comprehensive benefits plan.

Pika

Artificial Intelligence (AI), Generative AI, Graphic Design, Video

Senior Distributed Systems Engineer