Software Architect
Location: Bengaluru, India
Department: Engineering
Experience: 10 - 15 Years
Key Responsibilities
- Design and evolve the end-to-end infrastructure supporting ASR/TTS, LLM orchestration, Agentic RAG, and self-learning workflows.
- Architect low-latency pipelines for real-time conversational AI, ensuring sub-second response times across voice and chat.
- Build multi-cloud, distributed systems (AWS, GCP, Azure) with elastic scaling to handle spiky workloads.
- Define and enforce SLAs around latency, uptime, and throughput for AI services.
- Drive observability, monitoring, and resilience strategies to handle failures gracefully.
- Optimize GPU/TPU utilization for cost-effective training and inference.
- Partner with InfoSec to embed security-by-design across all AI/ML workloads.
- Implement controls to protect sensitive enterprise data while meeting global compliance standards (SOC2, ISO 27001, GDPR, DPDP).
- Work closely with the Head of AI to translate cutting-edge research into production-grade platforms.
- Provide technical mentorship to engineering teams, ensuring best practices in distributed systems and infra design.
- Evaluate and adopt emerging technologies (e.g., SSMs, inference optimizers like Triton, Riva, vLLM) to stay ahead of the curve.
Required Qualifications & Skills
- 10 - 15 years of experience in large-scale systems architecture, with at least 5 years in principal architect-level roles.
- Proven expertise in distributed systems, cloud-native architectures, and real-time pipelines.
- Hands-on experience with containerization, orchestration (Kubernetes), and microservices.
- Strong background in scalable ML infrastructure, including model serving, GPU/accelerator utilization, and CI/CD for ML.
- Demonstrated ability to architect systems with low latency (<300ms), high throughput, and enterprise reliability.
- Experience in conversational AI, speech systems, or real-time inference workloads.
- Deep knowledge of MLOps platforms (Kubeflow, MLflow, Vertex AI, SageMaker).
- Familiarity with state-of-the-art inference optimization frameworks (e.g., Triton, NVIDIA Riva, vLLM, SGLang).
- Open-source contributions or patents in distributed systems, infra, or ML tooling.
