Why do you charge job seekers to use EchoJobs?

Unlike bigger job sites, we treat job seekers as our customers. We charge a small fee and, in return, provide curated access to the best companies and up-to-date jobs. This focus allows us to deliver a more personalized and effective job search experience.

How many software engineering jobs are on EchoJobs?

We've got over 200,000 jobs from 15,000+ vetted companies. No fake or sleazy jobs here!

So, where do the jobs come from?

We aggregate jobs from 15,000+ companies' career pages, so you can be sure that you're getting the most up-to-date and relevant jobs.

What makes EchoJobs different?

We're the only job board *for* software engineers, *by* software engineers… in case you needed a reminder! We add thousands of new jobs daily and offer powerful search filters just for you. 🛠️

How often are new jobs added?

Every single hour! We add 2,000-3,000 new jobs daily, so you'll always have fresh opportunities. 🚀

How fast can I find a job?

Typically, job searches take 3-6 months. EchoJobs helps you spend more time applying and less time hunting. 🎯

How often should I check EchoJobs?

Check daily! We're always updating with new jobs. Set up job alerts for even quicker access. 📅

Description

Member of Technical Staff -- Inference

Location: Palo Alto

Department: Engineering

About the Role

RadixArk is seeking a Member of Technical Staff — Inference to push the limits of large-scale AI inference.

You will work on the core systems that serve frontier models at scale, optimizing performance, latency, throughput, and cost across thousands of GPUs. This role sits at the intersection of systems engineering, ML infrastructure, and performance optimization.

Your work will directly shape how state-of-the-art models are deployed and experienced by users worldwide.

This is a deeply technical, high-impact role for engineers who enjoy working close to the hardware–software boundary and solving performance-critical problems at scale.

Requirements

5+ years of experience in systems engineering, ML infrastructure, or performance-critical backend systems
Strong expertise in large-scale inference systems for LLMs or generative models
Deep understanding of GPU architecture and performance characteristics
Experience optimizing latency- and throughput-critical production systems
Strong knowledge of distributed systems and networking fundamentals
Proficiency in C++, Rust, Go, or Python for production systems
Experience profiling and optimizing compute-intensive workloads
Strong debugging skills across system layers (model, runtime, kernel, network)

Strong Plus

Experience with LLM serving stacks (vLLM, TensorRT-LLM, SGLang, etc.)
Familiarity with CUDA, Triton, or custom kernel optimization
Experience with batching, KV-cache management, and scheduling strategies
Experience running inference at scale (1000+ GPUs)
Background in HPC or high-performance systems
Open-source contributions in ML or systems infrastructure

Responsibilities

Design and build large-scale inference systems for frontier AI models
Optimize latency, throughput, and GPU utilization in production inference
Develop and improve model serving architectures and runtimes
Work on batching, scheduling, and memory management strategies
Collaborate with kernel, compiler, and systems teams on performance optimization
Debug performance bottlenecks across the stack
Drive reliability and scalability of inference infrastructure
Build tooling for observability, profiling, and performance analysis
Contribute to long-term inference architecture and strategy

About RadixArk

RadixArk is an infrastructure-first company built by engineers who've shipped production AI systems, created SGLang (20K+ GitHub stars, the fastest open LLM serving engine), and developed Miles (our large-scale RL framework).

We're on a mission to democratize frontier-level AI infrastructure by building world-class open systems for inference and training.

Our team has optimized kernels serving billions of tokens daily and designed distributed systems coordinating 10,000+ GPUs across training and serving.

We're backed by leading infrastructure investors and collaborate with frontier AI labs and cloud providers.

Join us in building the infrastructure layer that powers the next generation of AI.

Compensation

We offer competitive compensation with meaningful equity, comprehensive benefits, and flexible work arrangements. Compensation depends on location, experience, and level.

Equal Opportunity

RadixArk is an Equal Opportunity Employer and welcomes candidates from all backgrounds.

There are more than 50,000 engineering jobs:

Subscribe to membership and unlock all jobs

Engineering Jobs

60,000+ jobs from 4,500+ well-funded companies

Updated Daily

New jobs are added every day as companies post them

Refined Search

Use filters such as skill and location to narrow results

Become a member

🥳🥳🥳 452 happy customers and counting...

More than 80% of customers chose to renew their subscriptions after the initial sign-up.

To try it out

For active job seekers

For those who are passively looking

Cancel anytime

What Fellow Engineers Say