We are seeking highly skilled engineers with expertise in machine learning, distributed systems, and high-performance computing to join our Research team. In this role, you will collaborate closely with researchers to build and optimize platforms that train next-generation foundation models on massive GPU clusters. Your work will play a critical role in advancing the efficiency and scalability of cutting-edge generative AI technologies.
Key Responsibilities
- Scale and optimize systems for training large-scale models across multi-thousand GPU clusters.
- Profile and enhance the performance of training codebases to achieve best-in-class hardware efficiency.
- Develop systems to distribute workloads efficiently across massive GPU clusters.
- Design and implement robust solutions to enable model training in the presence of hardware failures.
- Build tools to diagnose issues, visualize processes, and evaluate datasets at scale.
- Optimize and deploy inference workloads for throughput and latency across the entire stack, including data processing, model inference, and parallel processing.
- Implement and improve high-performance CUDA, Triton, and PyTorch code to address efficiency bottlenecks in memory, speed, and utilization.
- Collaborate with researchers to ensure systems are designed with optimal efficiency from the ground up.
- Prototype cutting-edge applications using multimodal generative AI.
Qualifications
- Experience:
  - 3+ years of professional experience in ML pipelines, distributed systems, or high-performance computing.
  - Hands-on experience training large models using Python and PyTorch, with familiarity across the full pipeline: data processing, loading, training, and inference.
  - Proven expertise in optimizing and deploying inference workloads, with experience in profiling GPU/CPU code (e.g., NVIDIA Nsight).
  - Deep understanding of distributed systems and distributed training techniques such as DDP, FSDP, and tensor parallelism.
  - Strong experience writing high-performance parallel C++ and custom PyTorch kernels, with knowledge of CUDA and Triton optimization techniques.
  - Bonus: Experience with generative models (e.g., Transformers, Diffusion Models, GANs) and prototype development (e.g., Gradio, Docker).
- Technical Skills:
  - Proficiency in Python, with significant experience using PyTorch.
  - Advanced skills in CUDA/Triton programming, including custom kernel development and tensor core optimization.
  - Strong generalist software engineering skills and familiarity with distributed and parallel computing systems.
Note: This position is not intended for recent graduates.
Compensation
The salary range for this role in California is $175,000–$250,000 per year. Actual compensation will depend on job-related knowledge, skills, experience, and candidate location. We also offer competitive equity packages in the form of stock options and a comprehensive benefits plan.
Other Jobs from Pika
Frontend Engineer
Senior Research Engineer (Data)
Research Engineer (Foundation Model)
Research Engineer (Applied Research)
Full-Stack/Backend Engineer