Coralogix is a modern, full-stack observability platform transforming how businesses process and understand their data. Our unique architecture powers in-stream analytics without reliance on expensive indexing or hot storage. We specialize in comprehensive monitoring of logs, metrics, traces, and security events with features such as APM, RUM, SIEM, Kubernetes monitoring and more, all enhancing operational efficiency and reducing observability spend by up to 70%.
We are seeking a skilled Site Reliability Engineer (SRE) with a strong background in Elasticsearch/OpenSearch to join our team. The ideal candidate will manage and optimize large-scale Elasticsearch/OpenSearch clusters, ensuring the infrastructure's stability, performance, and scalability. You'll work closely with development and operations teams to build robust and efficient systems.
Key Responsibilities:
- Manage & Monitor: Oversee the performance, reliability, and availability of large-scale Elasticsearch/OpenSearch clusters.
- Optimize & Scale: Implement best practices for scaling, indexing, and querying to ensure optimal performance.
- Automate & Streamline: Develop and maintain automated performance testing or benchmarking, monitoring, and alerting for Elasticsearch/OpenSearch clusters.
- Troubleshoot & Resolve: Quickly identify and resolve issues related to cluster health, data integrity, performance bottlenecks, and search accuracy.
- Collaborate: Work closely with development, DevOps, and other teams to design and implement enhancements to cluster architecture, stability, performance, and data management flows.
Requirements:
- Experience: Proven experience as an SRE or in a similar role, with specific expertise in managing Elasticsearch or OpenSearch clusters.
- Technical Skills:
  - Strong knowledge of Elasticsearch/OpenSearch architecture, including index management, sharding, and replication.
  - Experience with performance tuning, scaling, and cluster optimization.
  - Understanding of JVM concepts and ability to code in Java, Scala, Python, or Go.
  - Familiarity with monitoring tools (e.g., Prometheus, Grafana).
  - Experience with configuration management and automation tools (e.g., Ansible, Terraform, Kubernetes).
- Problem Solving: Ability to diagnose and troubleshoot complex performance and stability issues in large-scale distributed systems.
- Communication: Strong verbal and written communication skills to collaborate across teams and document processes clearly.
Preferred Skills:
- Familiarity with other distributed systems (e.g., Apache Solr, Kafka).
- Knowledge of CI/CD pipelines and experience with DevOps practices.
- Experience with cloud providers (AWS, Azure, GCP).