Aleph Alpha

AI Inference Engineer - Large Language Models (f/m/d)

Berlin, Germany
Deep Learning Python PyTorch Rust API
Description

Overview:

You will join our product team in a role at the intersection of artificial intelligence research and real-world solutions. We foster a highly collaborative culture in which you work closely with your teammates and communicate frequently across teams, using practices such as pair or mob programming.

Your responsibilities:

  • Model Inference: Focus on inference optimization to ensure rapid response times and efficient resource utilization during real-time model interactions.  

  • Hardware Optimization: Run models on various hardware platforms, from high-performance GPUs to edge devices, ensuring optimal compatibility and performance.  

  • Experimentation and Testing: Regularly run experiments, analyze outcomes, and refine the strategies to achieve peak performance in varying deployment scenarios.  

  • Stay up to date with the current literature on MLSys.

Your profile:

  • You care about making something people want. You want to ship something that will bring value to our users. You want to deliver AI solutions end-to-end rather than stop at a prototype.

  • Bachelor's degree or higher in computer science or a related field. 

  • You understand how multimodal transformers work. 

  • You understand the characteristics of LLM inference (KV caching, flash attention, and model parallelization). 

  • You have experience in system design and optimization, particularly within AI or deep learning contexts. 

  • You are proficient in Python and have a deep understanding of deep learning frameworks such as PyTorch.

  • You have a deep understanding of the challenges associated with scaling AI models for large user bases.

Nice if you have:  

  • Previous experience in a high-growth tech environment or a role focused on scaling AI solutions. 

  • Hands-on experience with large language models or other complex AI architectures. 

  • Expertise with CUDA and Triton programming and GPU optimization for neural network inference. 

  • Experience with Rust.  

  • Experience in adapting AI models to suit a range of hardware, including different accelerators. 

  • Experience in model quantization, pruning, and other neural network optimization methodologies. 

  • A track record of contributions to open-source projects (please provide links). 

  • Some Twitter presence discussing MLSys topics.

What you can expect from us:

  • Become part of an AI revolution!

  • 30 days of paid vacation

  • Public transport subsidy

  • Fitness and wellness offerings (Wellhub)

  • Mental health platform (nilo.health)

  • Flexible working hours and hybrid working model

  • Virtual Stock Option Plan
