The Site Reliability Engineer (SRE) role at Coupang is mission-critical, combining software and systems engineering to build, run, and scale our complex, large-scale e-commerce systems. As part of the Site Reliability Engineering team, you will be responsible for ensuring that all our customer-facing services are healthy, monitored, automated, and designed to scale. As an SRE organization, we take pride in treating operations as an engineering problem and in taking an automation-first approach. You will use your background to build best-in-class infrastructure automation for areas such as Observability, Incident Management, Disaster Recovery, Load Testing, Capacity Engineering, and many more. In this role, you will work very closely with our product development teams, from the early design stage all the way through to helping resolve production incidents, maintaining the SLI/SLA bar for production services, and influencing those teams with SRE principles and best practices. If you take pride in complete ownership, have a passion for solving complex technical challenges in large-scale distributed systems, and have the demeanor to work and communicate effectively across team boundaries, this is the role for you!
Key Responsibilities:
Serve as the primary point of responsibility for the reliability, health, and performance of all Coupang customer-facing services.
Gain deep knowledge of Coupang's application workflows and dependencies.
Spearhead and conceptualize revolutionary designs in critical service architecture.
Conduct comprehensive architecture reviews and lead re-architecting initiatives to set industry-leading benchmarks in performance, reliability, and availability.
Lead and drive large-scale technical initiatives across multiple engineering teams.
Drive collaboration effectively across organizational boundaries and build strong stakeholder relationships to achieve broad organizational objectives.
Identify and implement scalable solutions for complex technical problems. Be the change driver.
Be self-motivated and able to navigate the ambiguity of large initiatives and find solutions that accomplish the goal.
Be the SRE champion/lead, working with technical leaders across Coupang to define and drive the engineering roadmap.
Contribute towards hiring and building a world-class team. Mentor and coach junior engineers on the team.
Communicate effectively with people at all levels of the organization.
Essential Qualifications:
10+ years of industry experience building and operating large-scale distributed systems.
Deep UNIX/Linux systems knowledge and administration background.
Strong programming skills in one or more of Python, Java, Golang, or C++.
Strong problem-solving and analytical skills spanning systems, networks (TCP/IP), and code, with a focus on data-driven decision-making.
Proficient with cloud-based infrastructure, including AWS, Azure, or Google Cloud Platform.
Strong understanding of DevOps and SRE practices, including continuous integration, continuous delivery, and infrastructure as code (IaC).
Proficient with containerization and orchestration technologies, such as Docker and Kubernetes.
Knowledge of the observability ecosystem, including metrics, logging, and tracing, and of tools such as Prometheus, Grafana, Elastic Stack, Datadog, or New Relic.
Excellent communication and collaboration skills, with the ability to work with teams across distinct functions and technical domains.
Preferred Qualifications:
Master’s degree in Computer Science, Engineering, or a related technical field.
Prior experience working with large-scale web-based Java architectures and JVM configuration.
Professional certifications in cloud platforms, monitoring tools, or related technologies.
Previous experience working on a large-scale e-commerce platform.