Abaka AI

Data Engineer

Palo Alto, CA
Description

Location: Palo Alto, CA

Department: Engineering

About Abaka AI
 
Abaka AI is built on one mission: to be the world’s most trusted data partner for AI companies. More than 1,000 industry leaders across Generative AI, Embodied AI, and Automotive AI rely on us to power their data pipelines. With our headquarters in Silicon Valley—and teams in Paris, Singapore, and Tokyo—we support global partners with fast, reliable, and scalable data solutions.
Our offerings include a diverse catalog of off-the-shelf datasets (image, video, multimodal, reasoning, 3D, and beyond) as well as comprehensive data collection and annotation services. Whether teams need raw data, curated datasets, or full-cycle data engineering, Abaka AI provides the foundation for building high-performance AI systems.
 
About the Role
 
We’re hiring our first Data Engineer in the United States, a foundational role that will shape Abaka AI’s data engineering standards, systems, and culture from day one. This is an opportunity to take full ownership of how multimodal data is sourced, processed, cleaned, annotated, and delivered to some of the world’s most advanced AI teams.
You won’t just be building pipelines—you’ll be developing the infrastructure that powers frontier AI models. You’ll partner directly with foundation model teams to understand their data needs, translate them into scalable workflows, and deliver high-quality multimodal datasets that meaningfully impact model performance.
As an early member of our engineering team, you’ll influence everything from our long-term roadmap to our internal tooling ecosystem. If you thrive in high-ownership environments and want to shape the machine learning foundation of a fast-moving AI company, this role offers an opportunity to make an immediate and lasting impact.

Responsibilities

  • Work closely with foundation model clients to understand their data requirements, and coordinate internal teams on tailored delivery plans that meet client expectations for timeline, quality, format, precision, and volume.
  • Lead the development of mid- to long-term plans for the data engineering function. Build scalable, end-to-end pipelines for multimodal data (text, image, audio, video, 3D point cloud, etc.) across data sourcing, cleaning, annotation, QA, storage, and iterative optimization for training, fine-tuning, and evaluation.
  • Develop solutions to core technical challenges in multimodal data processing, including cross-modal alignment (for example, image-text semantic matching), large-scale data cleaning (deduplication, denoising, format normalization), annotation efficiency, and data encryption and security.
  • Partner with algorithm, product, and business teams by providing feedback on data bottlenecks, refining internal tooling and services, and supporting client-facing teams with technical documentation and pre-sales materials.
  • Evaluate and optimize the cost structure of data processing operations, including headcount, infrastructure, and tooling, to balance quality, efficiency, and scalability.
 

Qualifications

  • Strong background in computer science, data engineering, artificial intelligence, or related fields, with hands-on experience building or operating large-scale data systems.
  • 1+ years of experience in data engineering or data operations. Leadership experience is highly valued, and experience with LLM or multimodal dataset preparation is a strong plus.
  • Deep understanding of end-to-end multimodal data workflows, with hands-on experience in at least two modalities (text, images, audio, or video).
  • Proficiency in designing technical architectures for large-scale data pipelines, including distributed processing and automation frameworks, along with familiarity with data privacy and security best practices such as access control and data anonymization.
  • Strong execution and team management capabilities, with the ability to translate high-level objectives into actionable plans and drive team results.
  • Excellent communication and cross-functional collaboration skills, with the ability to clearly articulate technical and operational requirements, resolve conflicts, and manage stakeholder expectations.
  • High sense of ownership and resilience, with comfort working in a fast-paced, rapidly evolving AI environment and the ability to manage urgent delivery timelines.
 

Compensation & Benefits

The base salary range for this position is $150,000 - $225,000 USD annually.
Compensation may fall outside this range depending on a number of factors, including a candidate’s qualifications, skills, competencies, and experience. Base pay is one part of the total package provided to compensate and recognize employees for their work at Abaka AI. This role is eligible for equity, as well as a comprehensive benefits package (health, dental, vision, PTO, flexible work schedule).