Director - Machine Learning Engineering
Our mission at Capital One is to create trustworthy, reliable and human-in-the-loop AI systems, changing banking for good. For years, Capital One has been leading the industry in using machine learning to create real-time, intelligent, automated customer experiences. From informing customers about unusual charges to answering their questions in real time, our applications of AI & ML are bringing humanity and simplicity to banking. Because of our investments in public cloud infrastructure and machine learning platforms, we are now uniquely positioned to harness the power of AI.
We are committed to building world-class applied science and engineering teams and to continuing our industry-leading capabilities with breakthrough product experiences and scalable, high-performance AI infrastructure. At Capital One, you will help bring the transformative power of emerging AI capabilities to reimagine how we serve our customers and businesses who have come to love the products and services we build.
We are looking for an experienced Director, Machine Learning Engineering in MLX Platform to help us build the Model Governance and Observability systems. In this role you will build robust SDKs and platform components that collect metadata, traces and parameters of models running at scale, and you will work on cutting-edge Gen AI frameworks and their instrumentation. You will also lead teams that analyze and optimize model performance, latency, and resource utilization to maintain high standards of efficiency, reliability and compliance. You will build and lead a highly talented software engineering team to unlock innovation, speed to market and real-time processing. This leader must be a deep technical expert and thought leader who helps accelerate adoption of engineering practices and keeps up with industry innovations, trends and practices in Software Engineering and Machine Learning. Success in the role requires an innovative mind and a proven track record of delivering highly available, scalable and resilient governance and observability platforms.
What You’ll Do
Lead, manage and grow multiple teams of product-focused software engineers and managers to build and scale Machine Learning Model Governance and AI Observability platforms & SDKs
Mentor and guide professional and technical development of engineers on your team
Work with product leaders to define the strategy, roadmap and destination architecture
Bring a passion to stay on top of tech trends, experiment with and learn new technologies, participate in internal & external technology communities, and mentor other members of the engineering community
Encourage innovation, implementation of state-of-the-art (SOTA) research technologies, inclusion, outside-of-the-box thinking, teamwork, self-organization, and diversity
Work on cutting-edge Gen AI frameworks/LLMs and provide observability using OpenTelemetry
Analyze and optimize model performance, latency, and resource utilization to maintain high standards of efficiency, reliability and compliance
Collaborate as part of a cross-functional Agile team to create and enhance software that enables state-of-the-art, next-generation big data and machine learning applications
Basic Qualifications:
Bachelor's degree in Computer Science, Computer Engineering or a technical field
At least 15 years of experience programming with Python, Go, Scala, or C/C++
At least 5 years of experience designing, building, and deploying enterprise AI or ML applications or platforms
At least 3 years of experience implementing full lifecycle ML automation using MLOps (scalable development to deployment of complex data science workflows)
At least 4 years of experience leading teams that develop and scale Machine Learning solutions
At least 10 years of people management experience and experience in managing managers.
Preferred Qualifications:
Master’s degree or PhD in Engineering, Computer Science, a related technical field, or equivalent practical experience with a focus on modern AI techniques.
Strong problem-solving and analytical skills, with the ability to work independently with ownership and as part of a team with a strong sense of responsibility.
Experience designing large-scale distributed platforms and/or systems in cloud environments such as AWS, Azure, or GCP.
Experience architecting cloud systems for security, availability, performance, scalability, and cost.
Experience delivering very large models through the MLOps lifecycle, from exploration to serving
Ability to move fast in an environment that is at times ambiguous, with competing priorities and deadlines. Experience at tech and product-driven companies/startups preferred.
Ability to iterate rapidly with researchers and engineers to improve a product experience while building the core platform components for Observability and Model Governance
Experience with one or more areas of the AI technology stack, including prompt engineering, guardrails, vector databases/knowledge bases, LLM hosting, advanced RAG and fine-tuning
If you have visited our website in search of information on employment opportunities or to apply for a position, and you require an accommodation, please contact Capital One Recruiting at 1-800-304-9102 or via email at RecruitingAccommodation@capitalone.com. All information you provide will be kept confidential and will be used only to the extent required to provide needed reasonable accommodations.
For technical support or questions about Capital One's recruiting process, please send an email to Careers@capitalone.com
Capital One does not provide, endorse nor guarantee and is not liable for third-party products, services, educational tools or other information available through this site.
Capital One Financial is made up of several different entities. Please note that any position posted in Canada is for Capital One Canada, any position posted in the United Kingdom is for Capital One Europe and any position posted in the Philippines is for Capital One Philippines Service Corp. (COPSSC).