ClickUp

Sr or Staff Site Reliability Engineer, Poland

Poland
Terraform Docker PostgreSQL Node.js Ansible Redis
Description

ClickUp is the world's only all-in-one productivity platform that flexes to the way people want to work. It replaces all individual workplace productivity tools with a single, unified platform including project management, document collaboration, spreadsheets, chat, goals, and more. On a mission to make the world more productive, ClickUp is headquartered in San Diego and scaling remotely and internationally. As one of the fastest-growing SaaS companies in the world, ClickUp helps millions of users to be more productive and save at least one day every week. 🦄

 

Remote Position - Live Anywhere in Poland

 

ClickUp is looking for a driven and innovative Site Reliability Engineer to help us make ClickUp the “one app to rule them all”. As an Infrastructure Engineer at ClickUp, your primary role will be managing the stability of the globally distributed, cloud-based infrastructure that powers our app for thousands of users daily. You will also work closely with the other engineering teams to deploy, manage, and maintain a highly secure and fault-tolerant environment. It is critical that our infrastructure can support an inevitable and drastic increase in load while maintaining high standards of uptime for all our users around the globe, and that we develop repeatable, scalable, and predictable methods to release our latest features at an unmatched rate.

Collaboration and teamwork are vital to how ClickUp operates. A significant portion of your responsibilities will include creating tools to improve processes across the different engineering teams. You will also be in charge of detecting inefficiencies introduced in different parts of the app and proposing solutions to improve them. A highly developed set of communication skills is also important for successfully fulfilling these responsibilities and many others. We’re scaling quickly, so we’re recruiting teammates who share our core values, know how to get sh*t done, and would add a lot to our extremely driven culture.

 

The Role:

  • Participate in designing and building systems for maximum performance, reliability, and scalability.
  • Work with the engineering teams on product design, decisions, and troubleshooting.
  • Increase general stability, observability, and metrics surrounding both uptime and stability.
  • Champion our monitoring infrastructure.
  • Implement and improve our general site reliability posture (error and downtime budgets, MTTD and MTTR improvements, improving alerting and notifications, minimizing customer impact from incidents, etc.)
  • Respond to and troubleshoot downtime events while actively developing safeguards to prevent them.
  • Participate in brainstorming sessions with the engineering team and contribute ideas to our technology and algorithms.
  • Build a deep understanding of how ClickUp's systems behave, scale, interact and fail, and use that insight to identify risks and opportunities for remediation
  • Own, drive and improve the incident management process across the engineering org and participate in the team's follow-the-sun model
  • Define SLOs and SLIs for all of our services and introduce error budgeting
  • Own and improve our observability on all of our services
  • Build software solutions to enable reliability and operability of large scale distributed systems handling petabytes of data and serving
  • Build tools and automation to eliminate toil and reduce operational overhead. Create frameworks, processes and best practices to be used across ClickUp Engineering
  • Automate critical portions of ClickUp engineering processes, to minimize risk and maximize the speed of innovation
  • Manage capacity and performance to help scale our infrastructure both on public and private clouds around the world

Qualifications:

  • 6+ years of knowledge of the Amazon Web Services ecosystem (Elastic Beanstalk, CDK, EC2, ECS, VPC, Redis, RDS, ALB, etc.).
  • 6+ years of experience in managing production-critical infrastructures and DevOps environments.
  • 6+ years of experience in implementing SRE best practices and procedures.
  • Experience with IaC (CDK, Terraform) and CI/CD (GitHub Actions, TravisCI, CircleCI).
  • Familiarity with containerisation (Docker).
  • Knowledgeable in network, firewall, and security best practices.
  • Experience with self-healing automation and monitoring tools (DataDog, CloudWatch, Grafana)
  • Knowledge of relational databases (preferably PostgreSQL).
  • A strong self-starter, operationally-focused; a problem-solver.
  • Excellent interpersonal, written, and oral communication skills.
  • Experience with application security testing is a plus (not mandatory)
  • Familiarity or experience with Node.js is a plus (not mandatory).
  • Familiarity with configuration management tooling (e.g. Ansible) is a plus.
  • Experience with management of Linux-based EC2 instances.


 

ClickUp was founded on a culture of hard work, consistent growth, and a desire to break norms. We’re a values-driven company and hire based on ambition, merit, and a willingness to do what it takes to succeed. We don’t care where you’re from, what you look like, or who you’re in a relationship with—we hire the best people for the job, and create an environment that supports employees on their journey to do the most exciting work of their lives! ClickUp is an Equal Opportunity Employer, and qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, or national origin.
