Guidehouse

Data Pipeline & Metadata Lead

Remote McLean, VA
USD 149k - 248k
Palantir Foundry Databricks Spark AWS Python SQL OCR
Description

Palantir Data Pipeline & Metadata Lead

Location: US - VA, McLean, US - Remote (Any location)

Time Type: Full time

Job Description

Job Family: Data Engineering & Architecture Consulting

Travel Required: Up to 25%

Clearance Required: Ability to Obtain Public Trust

What You Will Do:
 

This role serves as a senior technical leader responsible for establishing and scaling the data pipeline and metadata foundation that enables enterprise analytics, AI-driven capabilities, and system integration. The ideal candidate combines deep, demonstrable Palantir (including AIP) expertise, strong data architecture and governance experience, and the ability to lead teams and grow capabilities in complex, regulated environments.

Roles & Responsibilities

  • Lead the design, development, and governance of enterprise data pipelines and metadata frameworks within Palantir Foundry and integrated data platforms.
  • Serve as the technical and functional lead for data ingestion, transformation, and metadata management across structured and unstructured data sources.
  • Define and enforce metadata standards, data models, ontologies, and data dictionaries to enable scalable search, analytics, and cross-system integration.
  • Oversee implementation of end-to-end data pipelines, including ingestion, validation, transformation, and delivery into downstream platforms (e.g., Palantir, Databricks).
  • Establish and govern data quality, validation, and exception handling processes, ensuring completeness, accuracy, and traceability of data assets.
  • Ensure alignment of pipelines and metadata with enterprise architecture, system-of-record requirements, and integration patterns.
  • Ensure pipelines effectively support document-based ingestion workflows, including integration with OCR/ICR outputs and downstream metadata extraction processes.
  • Enable AI-ready data foundations, supporting downstream capabilities such as semantic search, entity resolution, and advanced analytics.
  • Partner with data science and engineering teams to ensure data pipelines and metadata support AI/ML use cases and analytical workflows.
  • Drive data governance and lifecycle management, including schema versioning, lineage tracking, auditability, and compliance with security and privacy requirements.
  • Oversee integration across platforms (e.g., AWS, Databricks, Palantir), ensuring scalable, secure, and reliable data exchange.
  • Lead and mentor teams in a matrixed, cross-functional environment, providing technical direction and quality oversight.
  • Engage with senior stakeholders to define data strategy, prioritize initiatives, and translate business needs into technical solutions.
  • Operate within an Agile delivery model, overseeing backlog prioritization, technical design reviews, and iterative delivery across workstreams.


What You Will Need:

  • Bachelor’s degree
  • A minimum of eight (8) years of experience in data engineering, data architecture, or platform integration, with increasing leadership responsibility.
  • U.S. citizenship required, along with the ability to obtain and maintain a Public Trust clearance.
  • Demonstrated, hands-on expertise in Palantir Foundry (required), including:
    • Designing and implementing production-grade data pipelines
    • Developing ontologies, data models, and relationship mappings
    • Integrating data across multiple enterprise systems
  • Demonstrated experience with Palantir AIP (required), including enabling AI-driven workflows (e.g., search, analytics, or decision-support use cases).
  • Proven experience leading enterprise-scale Palantir implementations, including architecture, delivery, and governance.
  • Strong experience designing and managing large-scale data pipelines in cloud environments (AWS preferred).
  • Experience integrating Palantir with Databricks and/or Spark-based platforms for advanced data processing and analytics.
  • Expertise in metadata management and data governance, including:
    • Data dictionaries and controlled vocabularies
    • Data lineage and traceability
    • Schema versioning and change management
  • Experience implementing data quality frameworks, including validation rules, exception handling, and reconciliation processes.
  • Strong understanding of data lake/lakehouse architectures (e.g., AWS S3, Databricks).
  • Proficiency in Python, SQL, and/or other relevant languages for data pipeline development.
  • Experience designing systems that support AI/ML and analytics use cases, including unstructured data and document processing pipelines (e.g., OCR/ICR outputs).
  • Strong leadership and communication skills, with experience managing cross-functional and matrixed teams.
  • Experience delivering complex solutions in an Agile environment.


What Would Be Nice To Have:

  • Experience growing or leading Palantir capabilities and engagements within the federal sector, particularly in health or human services domains.
  • Bachelor’s degree in Computer Science, Engineering, Information Systems, or related field (Master’s preferred).
  • Experience designing AI-ready data architectures, including support for semantic search, entity resolution, and advanced analytics.
  • Experience with enterprise ML and data platforms (e.g., AWS SageMaker, Databricks Machine Learning) and integration with data pipelines.
  • Familiarity with OCR/ICR pipelines and document intelligence workflows, including integration of extracted text and metadata into downstream data platforms.
  • Experience implementing data governance frameworks in regulated environments, including auditability and compliance controls.
  • Prior experience working with federal agencies or highly regulated environments.
  • Experience in a consulting environment, including leading client engagements and supporting business development.
  • Experience supporting training, user enablement, and scaling Palantir/data capabilities across organizations.
  • Familiarity with graph-based data models, ontology-driven applications, and relationship analytics.
  • Relevant certifications (e.g., Palantir Foundry, AWS, Databricks) or equivalent demonstrated expertise.

The annual salary range for this position is $149,000.00-$248,000.00. Compensation decisions depend on a wide range of factors, including but not limited to skill sets, experience and training, security clearances, licensure and certifications, and other business and organizational needs.


What We Offer:

Guidehouse offers a comprehensive, total rewards package that includes competitive compensation and a flexible benefits package that reflects our commitment to creating a diverse and supportive workplace.

Benefits include:

  • Medical, Rx, Dental & Vision Insurance
  • Personal and Family Sick Time & Company Paid Holidays
  • Position may be eligible for a discretionary variable incentive bonus
  • Parental Leave and Adoption Assistance
  • 401(k) Retirement Plan
  • Basic Life & Supplemental Life
  • Health Savings Account, Dental/Vision & Dependent Care Flexible Spending Accounts
  • Short-Term & Long-Term Disability
  • Student Loan PayDown
  • Tuition Reimbursement, Personal Development & Learning Opportunities
  • Skills Development & Certifications
  • Employee Referral Program
  • Corporate Sponsored Events & Community Outreach
  • Emergency Back-Up Childcare Program
  • Mobility Stipend

About Guidehouse

Guidehouse is an Equal Opportunity Employer–Protected Veterans, Individuals with Disabilities or any other basis protected by law, ordinance, or regulation.

Guidehouse will consider for employment qualified applicants with criminal histories in a manner consistent with the requirements of applicable law or ordinance including the Fair Chance Ordinance of Los Angeles and San Francisco.

If you have visited our website for information about employment opportunities, or to apply for a position, and you require an accommodation, please contact Guidehouse Recruiting at 1-571-633-1711 or via email at [email protected]. All information you provide will be kept confidential and will be used only to the extent required to provide needed reasonable accommodation.

All communication regarding recruitment for a Guidehouse position will be sent from Guidehouse email domains including @guidehouse.com or [email protected]. Correspondence received by an applicant from any other domain should be considered unauthorized and will not be honored by Guidehouse. Note that Guidehouse will never charge a fee or require a money transfer at any stage of the recruitment process and does not collect fees from educational institutions for participation in a recruitment event. Never provide your banking information to a third party purporting to need that information to proceed in the hiring process.

If any person or organization demands money related to a job opportunity with Guidehouse, please report the matter to Guidehouse’s Ethics Hotline. If you want to check the validity of correspondence you have received, please contact [email protected]. Guidehouse is not responsible for losses incurred (monetary or otherwise) from an applicant’s dealings with unauthorized third parties.

Guidehouse does not accept unsolicited resumes through or from search firms or staffing agencies. All unsolicited resumes will be considered the property of Guidehouse and Guidehouse will not be obligated to pay a placement fee.
