Member of Technical Staff - Foundations
Department: Machine Learning team
Location: San Francisco / Tel Aviv / Zurich
Employment Type: Full-time
Tzafon is a foundation model lab building scalable compute systems and advancing machine intelligence, with offices in San Francisco, Zurich, and Tel Aviv. We’ve raised over $12M in funding to advance our mission of expanding the frontiers of machine intelligence.
We're a team of engineers and scientists with deep backgrounds in ML infrastructure and research. Founded by IOI and IMO medalists, PhDs, and alumni of leading tech companies such as Google DeepMind, Character, and NVIDIA, we train models and build infrastructure for swarms of agents to automate work across real-world environments.
You'll work between our product and post-training teams to ship Large Action Models that actually work. Build evals, benchmarks, and fine-tuning pipelines. Define what good model behavior means and make it happen at scale.
What you'll do
Design and execute large-scale training runs on our clusters
Build and optimize distributed training infrastructure across massive multi-node systems
Implement post-training pipelines at scale
Develop data pipelines that process and filter trillions of tokens for pre-training
Research and implement architectural improvements, scaling laws, and training optimizations
Debug training instabilities, loss spikes, and convergence issues in long-running jobs
Build tooling for cluster utilization, fault tolerance, and checkpoint management
Write custom CUDA/Triton kernels to optimize critical training operations (attention, normalization, activations)
Collaborate on research that advances the state of the art in foundation model training
We're looking for
Deep experience pre-training or post-training foundation models on large clusters
Expert-level proficiency in Python and ML frameworks (PyTorch, JAX, Torchtitan)
Strong systems skills: distributed training, FSDP/ZeRO, tensor parallelism, pipeline parallelism
Experience writing performant CUDA or Triton kernels for ML workloads
Track record of running stable multi-week training jobs and debugging distributed training failures
Understanding of cluster scheduling, networking bottlenecks, and GPU/TPU performance optimization
Preferred Experience
Trained foundation models at major AI labs (OpenAI, Anthropic, Google DeepMind, Meta, xAI, etc.)
Worked on large-scale RL runs
Optimized critical training kernels (FlashAttention, fused optimizers, custom kernels)
Published research at top ML conferences (NeurIPS, ICML, ICLR)
Contributed to open-source ML infrastructure (PyTorch, JAX, vLLM, etc.)
Experience with training data pipelines, data quality research, or synthetic data generation
Life at Tzafon
Full medical, dental, and vision coverage, plus 401(k) in the US
Offices in SF, Zurich, and Tel Aviv
Early-stage equity in a future-defining company
Visa sponsorship: We do sponsor visas, though we can't guarantee sponsorship for every role and every candidate. If we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.
Starting compensation is $200k-$500k plus an equity package, depending on experience and location.
We also offer a $5k referral bonus for successful hires (send referrals to [email protected]).
