Description
About Us
Observe.AI is the fastest way to boost contact center performance with live conversation intelligence. Built on the most accurate AI engine in the industry, Observe.AI uncovers insights from 100% of customer interactions and maximizes frontline team performance through coaching and end-to-end workflow automation. With Observe.AI, companies can act faster with real-time insights and guidance to improve performance, from more sales to higher retention.
Observe.AI is trusted by hundreds of customers and partners, including Pearson, Accolade, Group 1 Automotive, Southeast Trans, and Public Storage. Our recent $125 million Series C, led by SoftBank Vision Fund 2 with participation from Zoom Video Communications, Inc., brings our total funding to date to $213M, with investments from Menlo Ventures, Next47, NGP Capital, Emergent Ventures, Scale Ventures, Nexus Ventures, and Y Combinator. For more information, visit www.observe.ai.
The Opportunity
We are looking for a seasoned Lead Site Reliability Engineer to join our team and ensure the functionality and uptime of our production services across multiple clusters. This role offers the unique challenge of managing systems that rely on bidirectional communication with clients, primarily built over WebSockets, which adds complexity to system design, uptime, and observability. As a key member of our SRE team, you will bring your expertise in maintaining high SLAs, issue resolution, problem-solving, and cloud infrastructure to maintain and enhance our system architecture.
About the Team
Our team is a dynamic mix of engineers, ranging from recent college graduates to seasoned principal engineers. Several team members have been with the company for over five years, witnessing and contributing to the significant evolution of our systems and the company's growth. Our engineers share a passion for a fast-paced work environment, emphasizing quality and fostering healthy competition.
In this role, you'll have the unique opportunity to work directly with principal engineers, the Director of Product, and other executives. This position offers high visibility within the organization, as it is central to a new product line that the company is heavily invested in. Your contributions will not only influence the immediate project but also the broader strategic direction of the company.
What you'll do day to day as a Lead Site Reliability Engineer
- System Optimization: Implement and maintain strategies to enhance the reliability and observability of our complex systems. Ensure robust documentation and troubleshooting procedures are in place to support operational excellence.
- Team Collaboration and Leadership: Work closely with a team of skilled engineers who will look to you for guidance and standards with regard to reliability, resilience, and scalability. You will lead and collaborate on projects, sometimes working independently to deliver critical system improvements.
- Cross-Geographical Collaboration: Collaborate with engineering teams in India, ensuring smooth coordination and communication.
- Cluster Synchronization: Develop and execute strategies to ensure multiple production clusters are synchronized in terms of features, uptime, functionality, and SLA compliance. You will be responsible for creating repeatable processes that maintain consistency across different environments, enhancing the reliability and performance of our systems.
- Comprehensive Testing Strategies: Design and document comprehensive test plans that encompass various aspects of system reliability, including integration tests and periodic production tests. These plans will aim to proactively identify and mitigate risks, ensuring continuous system integrity and performance.
- Technical Troubleshooting and Problem-Solving: Engage in detailed analysis and troubleshooting of system issues. Develop and refine observability tools and practices to proactively monitor and address system performance.
- Scripting and Automation for Efficiency: Identify common resolution, debugging, and release patterns, and take steps to streamline and automate them.
What you'll bring to the role
- Experience: The ideal candidate has around 7 years of experience in site reliability engineering or related fields, with a strong background in managing high-availability systems.
- Technical Expertise: Proficiency in Python or shell scripting, extensive experience with AWS cloud environments and infrastructure management, and knowledge of the system design complexities of real-time, bidirectional communication.
- Leadership and Mentorship: Demonstrated ability to lead projects and mentor junior engineers, fostering a collaborative and productive environment.
- Communication Skills: Excellent communication skills to effectively manage team interactions and articulate technical challenges and solutions to stakeholders.
- Adaptability: Comfort working predetermined but flexible hours to interact with teams across different time zones, ensuring project alignment and timely delivery.
Compensation, Benefits and Perks
- Competitive compensation, including equity
- Excellent medical, dental, and vision insurance options
- Flexible time off
- Generous holiday and parental leave policies
- 401(k) plan
- Learning & Development fund to support your continuing education and professional development
- Fun events that support our culture of Connect, Collaborate, Celebrate
Our Commitment to Inclusion and Belonging
Observe.AI is an Equal Employment Opportunity employer that proudly pursues and hires a diverse workforce. Observe.AI does not make hiring or employment decisions on the basis of race, color, religion or religious belief, ethnic or national origin, nationality, sex, gender, gender identity, sexual orientation, disability, age, military or veteran status, or any other basis protected by applicable local, state, or federal laws or prohibited by Company policy. Observe.AI also strives for a healthy and safe workplace and strictly prohibits harassment of any kind.
We welcome all people. We celebrate diversity of all kinds and are committed to creating an inclusive culture built on a foundation of respect for all individuals. We seek to hire, develop, and retain talented people from all backgrounds. Individuals from non-traditional backgrounds, historically marginalized or underrepresented groups are strongly encouraged to apply.
If you are ambitious, make an impact wherever you go, and you're ready to shape the future of Observe.AI, we encourage you to apply. For more information, visit www.observe.ai.