Arista Networks

Site Reliability Engineer

Remote Pune, India
Docker Kubernetes MongoDB PostgreSQL Bash Python Chef AWS GCP Puppet
Description

Company Description

Arista Networks pioneered software-driven, cognitive cloud networking for large-scale
datacenter and campus environments. Arista's award-winning platforms, ranging in
Ethernet speeds from 10 to 400 gigabits per second, redefine scalability, agility and
resilience. Arista has shipped more than 20 million cloud networking ports worldwide
with CloudVision and EOS, an advanced network operating system. Committed to open
standards, Arista is a founding member of the 25/50G consortium. Arista Networks
products are available worldwide directly and through partners.

Site Reliability Engineers at Arista are critical team members who have a breadth of knowledge encompassing all aspects of service delivery. They develop software solutions to enhance, harden, and support our service delivery processes. This can include building and managing CI/CD deployment pipelines, automated testing, capacity planning, performance analysis, monitoring, alerting, chaos engineering, and auto-remediation.

The SRE should have an “automate everything” mindset, helping us bring value to our customers by deploying services with incredible speed, consistency, and availability.

The SRE constantly evaluates products and services before and after production releases to prevent, identify, and fix problems that affect service availability across deployment, configuration, release, monitoring, recovery, and scaling.

This position is initially a 1-year contract, with conversion to full time based on performance.

Job Description

Responsibilities:

  • Ensure the scalability, performance, and resilience of our suite of products
  • Work with the development and product team to establish the right monitoring and alerting strategy
  • Develop build, test, and deployment automation that seamlessly targets multiple cloud regions
  • Define and implement standards and best practices related to system architecture, service delivery, metrics, and the automation of operational tasks
  • Optimize the telemetry platform to identify customer-impacting events while providing relevant data to drive debugging
  • Partner with the engineering team to optimize the performance of services for cloud architecture
  • Debug live-site events and conduct follow-up post-mortems and root cause analysis (RCA)

Qualifications

  • B.E/B.Tech in Computer Science or equivalent
  • 7+ years of relevant experience
  • Proficiency in scripting languages such as Bash and Python
  • Operational experience managing applications in AWS/GCP
  • Experienced in automating software build, deployment, and server configuration management using tools such as Puppet, Chef, and Jenkins
  • Hands-on experience with Linux/Unix Administration
  • Good understanding of containerization concepts and tools: Docker, ECS, EKS, Kubernetes
  • Experience with build tools such as Jenkins
  • Working experience with databases such as MongoDB (NoSQL) and PostgreSQL (relational)
  • Understanding of basic networking concepts

Additional Information

All your information will be kept confidential according to EEO guidelines.
