Senior Infrastructure Engineer (Backend/Data Performance)
Team: AI
Location: Remote
Commitment: Full-time
Workplace Type: remote
In this role you will
- Design and optimize high-performance data pipelines for distributed training and storage (using tools like Arrow, DuckDB, LanceDB, BigQuery, vector databases).
- Focus on low-level optimizations (latency, throughput, reliability, GPU usage).
- Build monitoring and visualization tools for tracking data quality, pipeline performance, and experiments.
- Optimize distributed AI workloads for reliability, latency, and efficiency.
- Scope and supervise projects so that interns, PhD students, and post-docs can contribute and collaborate effectively.
- Support recruiting efforts and help shape the growth of the infrastructure team.
Your background looks something like
- 5+ years of backend or infrastructure engineering experience
- Strong Python programming skills (bonus points for lower-level languages)
- Experience with distributed systems and cloud platforms (AWS, GCP, Azure)
- Hands-on experience with containerization (Docker, Kubernetes) and infrastructure as code (Terraform)
- Experience building or supporting ML/AI infrastructure in production
- Experience with high-performance data tools (DuckDB, Apache Spark, Delta Lake)
- GPU orchestration and large-scale model training experience
- Familiarity with ML platforms (SageMaker, Vertex AI) and frameworks (PyTorch, JAX)
- Experience mentoring junior engineers, interns, or researchers and breaking down complex projects into manageable tasks
- Experience participating in technical hiring processes and evaluating candidates
It would be even better if you
- Have deep knowledge of training architectures, CUDA programming, or TPU optimization
- Have full-stack development experience with frameworks like React for building web applications
- Have experience managing HPC infrastructure with tools like Slurm or Kubernetes clusters
- Have a background in monitoring stacks (Prometheus, Grafana) for ML pipeline observability
About the hiring process
- Initial interview: 30-minute discussion to align on experience and expectations
- Technical screening: Two interviews and a take-home exercise covering coding and system design
- Panel interview: Assess team alignment
- Final interview: Conversation with our Chief Scientist
