Abaka AI

AI Data Infrastructure Engineer

Palo Alto, CA
USD 110k - 160k
Python API LLM
Description

Location: Palo Alto, CA

Department: Business Development

About Abaka AI
Abaka AI is built on one mission: to be the world’s most trusted data partner for AI companies. More than 1,000 industry leaders across Generative AI, Embodied AI, and Automotive AI rely on us to power their data pipelines. With our headquarters in Silicon Valley—and teams in Paris, Singapore, and Tokyo—we support global partners with fast, reliable, and scalable data solutions.
Our offerings include a diverse catalog of off-the-shelf datasets (image, video, multimodal, reasoning, 3D, and beyond) as well as comprehensive data collection and annotation services. Whether teams need raw data, curated datasets, or full-cycle data engineering, Abaka AI provides the foundation for building high-performance AI systems.
 
About the Role
We’re hiring an AI Data Infrastructure Engineer to build the systems that power how large-scale datasets for LLMs and multimodal models are discovered, evaluated, and scaled. This is a builder-first engineering role focused on designing LLM-powered agents, automation systems, and data pipelines. You’ll work on problems like:
  • Automatically discovering new data sources across the internet
  • Using LLMs and agents to evaluate and filter data sources at scale
  • Building systems that significantly increase data throughput without increasing headcount
This role sits at the intersection of data engineering, LLM systems, and applied AI infrastructure, and is ideal for someone who enjoys building from scratch and shipping fast.
 
Responsibilities
  • Build LLM-powered agents and automation systems for data discovery and evaluation
  • Design and implement data pipelines for ingesting, filtering, and transforming large-scale datasets
  • Develop internal tools for data quality scoring, ranking, and selection
  • Experiment with scraping, APIs, and programmatic data collection at scale
  • Rapidly prototype and iterate on systems that improve data acquisition speed and quality
  • Collaborate closely with Data Engineering and Research teams to align data systems with model needs
  • Build scalable systems that increase data throughput and efficiency
 
Qualifications
  • Strong technical foundation (engineering, scripting, systems, or data-focused background)
  • Experience building tools, automation, or pipelines from 0→1
  • Comfortable with Python, APIs, scraping, or backend workflows
  • Interest in LLMs, agents, or applied AI systems
  • Strong problem-solving ability and a builder mindset
  • Ability to operate independently in fast-paced, ambiguous environments
 
Nice to have:
  • Experience with LLM frameworks or agent systems
  • Experience with large-scale data processing or distributed systems
  • Familiarity with automation tools, workflow builders, or AI-assisted development (e.g., Cursor)
  • Startup or high-growth environment experience
 
Compensation & Benefits
The base salary range for this position is $110,000 - $160,000 USD annually.
Compensation may vary outside of this range depending on a number of factors, including a candidate’s qualifications, skills, competencies, and experience. Base pay is one part of the total package provided to compensate and recognize employees for their work at Abaka AI. This role is eligible for equity, as well as a comprehensive benefits package including health, dental, vision, PTO, and a flexible work schedule.