NICE

Senior Specialist Site Reliability Engineer

Pune, India
DynamoDB, Terraform, Kubernetes, Puppet, Docker, SQL, Python, Go, Chef, GCP, Ansible, Java, C#, Bash, PowerShell, AWS, Microservices
Description

At NICE, we don’t limit our challenges. We challenge our limits. Constantly. We’re relentless. We’re ambitious. And we make an impact. Our NICErs bring their A game and spend each day turning it into an A+. And if you’re like us, we can offer you the kind of challenge that will light a fire within you.

Senior Specialist Site Reliability Engineer

NICE is looking for a Senior Site Reliability Engineer. Candidates will support large, complex enterprise software clients, covering applications, servers, SQL, and networking, and must have excellent problem-solving skills. As we expand our customer deployments, we are seeking an experienced SRE to deliver insights from massive-scale data in real time. Specifically, we are searching for someone who brings fresh ideas, demonstrates a unique and informed viewpoint, and enjoys collaborating with a cross-functional team to develop real-world solutions and positive user experiences at every interaction.

Brief Description

The Senior Site Reliability Engineer works as a software developer focused on reliability for a specific software application or suite of applications and the accompanying infrastructure. This includes implementing new systems, providing mid-level and escalation support for other groups, and working with development, operations, and architecture resources to resolve production issues.

Objectives of this Role:

  • Run the production environment by monitoring availability and taking a holistic view of system health
  • Build software and systems to manage platform infrastructure and applications
  • Improve reliability, quality, and time-to-market of our suite of software solutions
  • Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve
  • Provide primary operational support and engineering for multiple large distributed software applications

Daily and Monthly Responsibilities:

  • Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
  • Partner with development teams to improve services through rigorous testing and release procedures
  • Participate in system design consulting, platform management, and capacity planning
  • Create sustainable systems and services through automation and uplifts
  • Balance feature development speed and reliability with well-defined service level objectives

Required Skills and Qualifications:

  • Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
  • 8-12 years of working experience in a similar role, with a focus on systems engineering, automation, and reliability.
  • Proficiency in at least one programming language (e.g., Python, Go, Java, C#) and experience with scripting languages (e.g., Bash, PowerShell).
  • Deep understanding of cloud computing platforms (e.g., AWS), including the workings and reliability constraints of prominent services (e.g., EC2, ECS, Lambda, DynamoDB).
  • Experience with infrastructure-as-code tools such as CloudFormation or Terraform.
  • Deep understanding of CI/CD concepts and experience with CI/CD tools such as Jenkins, GitLab CI/CD, or CircleCI.
  • Strong knowledge of containerization technologies (e.g., Docker, Kubernetes) and microservices architecture.
  • Experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack, CloudWatch).
  • Excellent problem-solving skills and the ability to troubleshoot complex issues in distributed systems.
  • Experience with incident management and blameless postmortems, including driving incident response during outages and other critical incidents, through resolution and communication, in a cross-functional team setup.

Good to have skills:

  • Hands-on experience working with large Kubernetes clusters. Certification is a plus.
  • Working experience with the Grafana observability suite (Loki, Mimir, Tempo).
  • Administration and/or development experience with standard monitoring and automation tools such as Splunk, Datadog, PagerDuty, and Rundeck.
  • Familiarity with configuration management tools like Ansible, Puppet, or Chef.
  • Certifications such as AWS Certified DevOps Engineer, Google Cloud Professional DevOps Engineer, or equivalent.

Personal attributes:

  • Strong communication skills and the ability to collaborate effectively with cross-functional teams.
  • Team player - ability to work well in a close team environment.
  • Fast learner with the ability to self-educate on relevant technologies
  • Ability to multitask and prioritize work
  • Ability to remain focused and calm under pressure

 

About NICE

NICE Ltd. (NASDAQ: NICE) software products are used by 25,000+ global businesses, including 85 of the Fortune 100 corporations, to deliver extraordinary customer experiences, fight financial crime and ensure public safety. Every day, NICE software manages more than 120 million customer interactions and monitors 3+ billion financial transactions.

Known as an innovation powerhouse that excels in AI, cloud and digital, NICE is consistently recognized as the market leader in its domains, with over 8,500 employees across 30+ countries.

NICE is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, age, sex, marital status, ancestry, neurotype, physical or mental disability, veteran status, gender identity, sexual orientation or any other category protected by law.

 
