Pinely

Machine Learning Performance Engineer

Amsterdam, North Holland
Python, PyTorch, JAX, CUDA, Triton, Deep Learning, Machine Learning, Transformer, LSTM
ML Performance Engineer

Location: Amsterdam, North Holland, Netherlands

Department: Infrastructure

Workplace: On-site

Employment Type: Full-time

Description

We’re looking for a performance-focused ML Engineer to help speed up large-scale model training by optimizing our internal stack and compute infrastructure. You’ll work across the full training pipeline — from GPU kernels to system-level throughput — applying profiling, CUDA-level tuning, and distributed systems techniques. The goal is to reduce training time, boost iteration speed, and use compute more efficiently.

This is a key role in a growing team building deep technical expertise in ML training systems.

Responsibilities

  • Optimize our model training pipeline to improve both speed and reliability, enabling faster and more efficient experimentation;
  • Apply GPU-level optimization techniques, using tools such as JAX, Triton, and low-level CUDA, to improve training performance and efficiency at scale;
  • Identify and resolve performance bottlenecks across the entire ML pipeline — from data loading and preprocessing to CUDA kernels;
  • Build tools and extend internal infrastructure to support scalable, reproducible, and high-performance training workflows;
  • Mentor and support engineers and researchers in adopting performance best practices across the team;
  • Help grow the team’s GPU and systems-level capabilities, and contribute to a culture of engineering excellence and rapid experimentation.

Requirements

  • Demonstrated experience optimizing neural network training in production or large-scale research settings, e.g. reducing training time, improving hardware utilization, or accelerating feedback cycles for ML researchers;
  • Extensive practical experience with ML frameworks such as PyTorch or JAX;
  • Hands-on experience with training and optimizing deep learning architectures such as LSTM and Transformer-based models, including different attention mechanisms;
  • Experience working with CUDA, Triton, or other low-level GPU technologies for performance tuning;
  • Proficiency in profiling and debugging training pipelines, using tools such as Nsight, cProfile, CUDA tools, gdb, and the PyTorch profiler;
  • Understanding of distributed training concepts (e.g. data/model/tensor/sequence/pipeline/context parallelism, memory and compute tradeoffs);
  • A collaborative and proactive mindset, with strong communication skills and the ability to mentor teammates and partner effectively within the team;
  • Strong proficiency in Python for building infrastructure-level tooling, debugging training systems, and integrating with ML frameworks and profiling tools.


What we offer

  • High base salary and social benefits;
  • Generous bonus structure. We are very flexible in discussing salary and conditions of employment;
  • Cutting-edge hardware and software in production, backed by strong in-house technical expertise, which makes it possible to implement bold ideas and achieve great results. Ownership over initiatives that directly solve business problems;
  • Ability to trade on dozens of international exchanges;
  • Flexible workflow and working schedule (no formalism or bureaucracy, no pressure or over-management);
  • Tuition reimbursement, conference and training sponsorship.