Elloe AI

LLM Red Team Intern, Evaluation Systems (Remote)

Remote | Austin, TX
Machine Learning | AI | GPT-4 | Claude | Gemini

LLM Red Team Intern (Evaluation Systems)

Location: Austin, Texas

Department: Engineering

Location Type: REMOTE

Employment Type: INTERN

Internship | Remote | LLM Evaluation | Reports to CTO or Safety Lead

About Elloe
Elloe is the immune system for AI.
We don’t train models — we protect their outputs. We trace every hallucination, enforce every policy boundary, and create an audit trail for every critical LLM interaction.

Our modules (TruthChecker™, AutoRAG™, Autopsy™) are embedded in hospitals, banks, and regulatory sandboxes. Our job is to make sure these systems are safe before anything hits production.
This role will help us break, stress-test, and harden the models used by governments and enterprises alike.

About the Role
You’ll red team real-world LLM deployments, design eval harnesses, and help scale Elloe’s output-level safety layer. This isn’t just prompt tuning — it’s forensic risk mapping.
You’ll work directly with product and safety leads to uncover failure patterns and codify guardrails for GenAI systems under real-world scrutiny.

What You’ll Own
1. Red Teaming & Risk Testing
  • Create prompts to trigger hallucinations, policy violations, or failure scenarios
  • Stress test Elloe-protected deployments using open and proprietary models
  • Document behavioral exploits across use cases (healthcare, compliance, government)
2. Evaluation Design
  • Build truthsets and scoring rubrics tied to factuality, policy, or ethical standards
  • Benchmark Elloe’s modules across model types (Claude, GPT-4, Gemini, open models)
  • Collaborate with product to refine and expand our eval harnesses
3. Safety Intelligence
  • Identify blind spots in current detection logic
  • Recommend scoring methods or red-flag thresholds for deployment
  • Support internal model comparison reports or customer safety audits

Who You Are
  • ML/AI researcher or engineer (undergrad, grad, or early career)
  • Experience working with LLMs, eval sets, and prompt design
  • Strong attention to detail, grounded in safety and adversarial thinking
  • Bonus: exposure to evaluation benchmarks like TruthfulQA or MMLU, or to red teaming tools

Why This Matters
This is real-world alignment, not research theater.
You’ll be helping define how AI gets deployed responsibly — with traceability, transparency, and real-time protection.

You’ll leave this role with:
  • Exposure to high-stakes LLM safety deployments
  • Published frameworks or scoring methods used by enterprises
  • Mentorship from technical founders operating at the bleeding edge of AI safety

Logistics & Application
  • Start Date: Rolling
  • Duration: 12–16 weeks
  • Compensation: Research stipend
  • Location: Remote-first; flexible for global candidates
  • To Apply: Share a jailbreak or eval idea you’d love to run against GPT-4.