AI Test Architect
Team: Data, AI, & Interoperability Platform - QA
Location: Bogotá or Medellín, Colombia
Commitment: Full Time - Permanent
Workplace Type: Remote
What You’ll Be Doing
- 1. AI-Driven Quality Strategy & Architecture
- Architect a comprehensive "Quality Intelligence" platform using generative AI to predict defect hotspots, intelligently optimize regression suites, auto-generate tests, and enable self-healing automation.
- Define an enterprise-wide, AI-first testing strategy, including non-deterministic evaluation paradigms, continuous monitoring for drift and hallucination, and integration across the full SDLC.
- Establish governance for ethical AI testing, aligned with emerging standards.
- 2. LLM & Agent Evaluation Frameworks
- Design and implement advanced benchmarks, red teaming protocols, and adversarial testing for internal AI agents and generative features—focusing on hallucination rates, bias/fairness, prompt injection, jailbreaks, and goal misalignment.
- Build evaluation pipelines with statistical rigor (e.g., multi-trial runs, LLM-as-judge, human-in-the-loop) using tools like Langfuse, LangSmith, DeepEval, RAGAS, or Arize Phoenix for metrics such as faithfulness, context precision, and safety compliance.
- Architect harnesses for agentic workflows, tool-calling, planning, multi-agent simulations, and post-deployment observability.
- 3. Infrastructure & Automation Architecture
- Partner with DevOps to embed AI-based testing into GitHub-based CI/CD pipelines (e.g., AI-generated tests, predictive flakiness detection, automated gating with quality signals).
- Lead design of self-healing test frameworks (integrating AI plugins with Playwright/Cypress or similar) that adapt to UI/model changes with minimal maintenance.
- Architect synthetic data generation, maintain golden datasets, and build AI-powered data masking solutions to enable privacy-compliant, high-fidelity testing at scale.
- 4. Cross-Functional Leadership & Evangelism
- Collaborate with product, data science, ML engineering, and security teams to influence AI feature design with quality guardrails from day one.
- Evangelize and mentor: Upskill traditional QA engineers into AI-augmented testers through workshops, playbooks, and communities of practice.
- Drive adoption of AI quality best practices organization-wide, including metrics dashboards for DORA + AI-specific indicators (e.g., hallucination rate, red team success rate, self-healing coverage).
- 5. Observability, Metrics & Continuous Evolution
- Define and implement AI-specific quality telemetry (e.g., drift detection, faithfulness scoring, compliance incidents) integrated with tools like Langfuse.
- Establish feedback loops for model iteration, A/B testing guardrails, and proactive risk mitigation in production.
Challenges You'll Architect Solutions For
- Building reliable evaluation for non-deterministic, agentic AI in a fast-moving SaaS landscape.
- Scaling self-healing and generative test automation without introducing new flakiness or security debt.
- Balancing innovation speed with rigorous red teaming and ethical safeguards for customer-facing AI.
Success in the First 6-12 Months
- Launch the "Quality Intelligence" platform foundation with AI-augmented pipelines covering more than 70% of critical paths.
- Establish red-teaming and red-teaming-as-code processes that reduce high-severity AI risks by more than 40%.
- Upskill more than 50% of QA/engineering teams on AI testing fundamentals and deliver measurable velocity and safety gains.
- Accuracy Baseline: Establish a baseline 90%+ Faithfulness score for all RAG-powered features.
What You Will Bring
- 8+ years in Quality Engineering/Test Architecture within cloud-native SaaS environments, with 2+ years focused on AI/ML/LLM testing and validation.
- Deep expertise in AWS (serverless, microservices, IaC with Terraform/CloudFormation) and GitHub CI/CD ecosystems.
- Proficiency architecting LLM-based applications and testing frameworks (LangChain/LangGraph/LangSmith strongly preferred; equivalents acceptable).
- Mastery of modern automation (Playwright, Cypress) with hands-on experience integrating self-healing AI plugins or generative test tools.
- Strong programming skills in JavaScript/TypeScript and/or Python; solid understanding of foundational AI concepts (transformers, embeddings, RAG, evaluation trade-offs).
- Experience with LLM evaluation tools such as Amazon Bedrock (Evaluations, Prompt Management, Guardrails), DeepEval, RAGAS, Arize Phoenix, and Langfuse.
- Experience with red teaming frameworks/tools (Cobalt Strike, Sliver, Nmap) and knowledge of adversarial testing methodologies are a bonus.
- Proven leadership: Mentoring teams, defining standards, and driving cross-functional change in ambiguous, high-growth settings.
- Bachelor's/Master's in Computer Science, AI/ML, or equivalent; relevant certifications a strong plus.
- Strong English-language communication and collaboration skills.
Perks & Benefits
- Indefinite-term employment contract (contrato a término indefinido) with all legal benefits
- Prepaid medical plan (medicina prepagada)
- Life insurance and funeral assistance
- Internet allowance
- Home office stipend
- Competitive compensation — above the market average
- 100% remote work environment and an excellent work-life balance
- Opportunity to work for a growing global SaaS leader
- A culture that promotes independence, innovation, trust, and accountability
- Open space to be creative, innovative, and strategize for the future
- Mentorship by a highly experienced professional
- Budget for training: we want you to grow
- 5 personal time off days per year
- Sick leave top-up to 100% of salary, paid by the employer from day 3 to day 90
- Recognition award: additional paid time off based on years of service
- Upgraded vacation allowance starting at 5 years of service
