Senior AI Platform Engineer, Atlas AI
Location: USA (Phoenix)
Department: Engineering
Cognite operates at the forefront of industrial digitalization, building AI and data solutions that solve some of the world’s hardest, highest-impact problems. With unmatched industrial heritage and a comprehensive suite of AI capabilities, including low-code AI agents, Cognite accelerates the digital transformation to drive operational improvements.
Our moonshot is bold: unlock $100B in customer value by 2035 and redefine how global industry works.
What Cognite is Relentless to achieve
We thrive on challenges. We challenge assumptions. We execute with speed and ownership. If you view obstacles as signals to step forward, not step back, you’ll feel at home here. Join us in this venture where AI and data meet ingenuity, and together we forge the path to a smarter, more connected industrial future.
How you’ll demonstrate Ownership
We are seeking an AI Platform Engineer to join the Cognite Atlas AI Product team in Phoenix, AZ, to engineer, build, and operate the production-grade, multi-cloud platform that enables our internal and partner teams to build, deploy, and manage industrial AI agents. You will be responsible for creating the core services, frameworks, and infrastructure for our "agent builder workbench" and agent runtime, focusing on scalability, reliability, cost-efficiency, and security. Your work will directly impact industrial efficiency and sustainability, which is critical to our mission of powering a high-tech, sustainable, and profitable industrial future.
The Impact you bring to Cognite
- Design, build, and maintain the core Python SDKs and services for the Atlas AI platform. Create clean abstractions that empower Solution Engineers to easily define and test agents and workflows.
- Build the core agentic runtime, ensuring it is scalable, meets its SLOs, and can reliably manage the state, orchestration, and execution of industrial agents.
- Develop a robust, governed, and secure framework for AI agent tool-use. Engineer the platform components that allow solution engineers to safely add new tools (e.g., API calls, database queries) and that manage the secure execution, monitoring, and access control for those tools.
- Manage the LLM serving layer, including deploying and optimizing models for low-latency, high-throughput inference. Build and maintain model routing logic to select the most appropriate model for a given task (e.g., trading off performance against cost).
- Implement evaluation and observability for all AI services. Create standardized frameworks for systematically evaluating the performance, accuracy, cost, and safety of LLMs and agentic workflows. Drive the implementation of robust, automated testing strategies for LLM-based systems.
- Own the full development lifecycle for services in a production SaaS environment. This includes establishing automated code coverage goals, conducting rigorous code reviews, defining SLOs, participating in on-call rotations, and ensuring a fast and effective incident response process.
- Work closely with the Lead Architect to translate the technical vision into implemented, production-grade services. Act as a key partner for the Solution Engineers (your internal customers) to understand their needs and abstract common patterns into reusable, robust platform components.
- Stay up to date on the latest developments in the field, and mentor junior developers.
Required Qualifications
- Bachelor’s or Master’s degree in Computer Science or a related field, or equivalent practical experience.
- 8+ years of professional experience in backend software engineering, platform engineering, or MLOps, with a proven track record of architecting and operating complex systems at scale.
- 2+ years of hands-on experience building applications or platforms on top of AI/ML models or LLMs.
- Expert-level proficiency in Python and a strong background in software architecture, robust API design, and building maintainable, well-documented SDKs for other developers.
- Hands-on experience with Kubernetes (K8s) and building services on managed PaaS in a multi-cloud environment (AWS, Azure, GCP). Strong understanding of Infrastructure as Code (e.g., Terraform).
- Proven experience building and operating production-grade SaaS software. Understanding of the full development life cycle, including CI/CD, monitoring, telemetry, and on-call incident response.
- Practical experience with LLM orchestration frameworks (Bedrock, Vertex, Semantic Kernel, LangChain).
- Strong verbal and written communication skills, with the ability to articulate complex technical designs and decisions clearly.
Preferred Experience
- Hands-on experience deploying and managing LLMs in production using high-performance serving frameworks.
- Experience with MLOps/LLMOps tools for tracing, monitoring, and evaluating LLM applications (LangSmith, Arize, Phoenix, or equivalent).
- Experience with RAG infrastructure, embedding generation pipelines, vector database integrations, and high-performance vector similarity search APIs.