Principal Big Data Site Reliability Developer (US Citizenship Required) US REMOTE
Location: United States
This role requires U.S. Citizenship and eligibility for a Federal Security Clearance
Our Team
Building off our Cloud momentum, Oracle has formed a new organization: Oracle Health Data, Analytics Platform. This team will focus on product development and product strategy for Oracle Health, while building out a complete platform supporting modernized, automated healthcare. This is a net new line of business, constructed with an entrepreneurial spirit that promotes an energetic and creative environment. We are unencumbered and will need your contribution to make it a world-class engineering center with a focus on excellence.
Oracle Health Data, Analytics Platform has a rare opportunity to play a critical role in how Oracle Health products impact and disrupt the healthcare industry by transforming how healthcare and technology intersect.
You will have the opportunity to:
- Reach billions of people with our products & services
- Create technology that truly impacts the world
- Have an immediate impact on developing technology
- Enjoy unlimited growth potential with inspiring work
- Work with the best minds in the industry
- Enjoy working in an open, diverse, and productive environment
About The Job
This role provides technical leadership for the core data platforms behind Oracle Health’s Data & Analytics Platform. As a Principal Site Reliability Engineer (SRE), you will own shared, mission-critical systems used by multiple products and teams.
You will lead the design and operation of large-scale, stateful distributed platforms, including Hadoop ecosystem components (HDFS, YARN, HBase) deployed on Oracle Big Data Service (BDS), Kafka, and Storm. These multi-tenant platforms are deployed and operated through Ansible- and Terraform-based automation and require strong architectural ownership to manage scale, change, and broad blast radius.
What You'll Do
Platform Ownership & Technical Leadership
- Own the end-to-end reliability, scalability, and operability of shared data platforms
- Define platform standards, architectural direction, and operational guardrails
- Influence cross-team technical decisions and long-term platform strategy
- Drive long-term platform evolution and influence reliability strategy across the data ecosystem
Architecture & Design
- Lead platform architecture and design reviews
- Clearly articulate system behavior, dependencies, and failure modes
- Make principled trade-offs between reliability, performance, cost, and complexity
- Provide guidance and guardrails that enable downstream teams to use platforms safely and effectively
Operations Engineering
- Establish capacity models, scaling strategies, and operational best practices
- Design platforms that behave predictably under load, failure, and change
- Own platform lifecycle events: upgrades, expansions, decommissioning, and recovery
Distributed Systems Expertise
- Operate and evolve stateful distributed systems where data placement, replication, and recovery are critical
- Reason about failure modes such as backpressure, rebalancing, region movement, replication lag, and rolling upgrades
Security
- Operate and maintain Kerberized platforms, including authentication, authorization, and secure service-to-service communication
- Treat security as a first-class architectural concern
Automation
- Design and evolve an Ansible- and Terraform-driven automation framework
- Treat automation as production software: versioned, reviewed, tested, and improved
- Eliminate operational toil by encoding reliability and safety into the platform
Incident Leadership & Prevention
- Serve as the ultimate escalation point for complex or ambiguous incidents
- Focus on eliminating entire classes of failure, not just resolving individual issues
Representation
- Represent SRE and platform engineering in high-visibility and sensitive forums
- Communicate clearly with engineering leadership and partner teams
About Us
Only Oracle brings together the data, infrastructure, applications, and expertise to power everything from industry innovations to life-saving care. And with AI embedded across our products and services, we help customers turn that promise into a better future for all. Discover your potential at a company leading the way in AI and cloud solutions that impact billions of lives.
True innovation starts when everyone is empowered to contribute. That’s why we’re committed to growing a workforce that promotes opportunities for all with competitive benefits that support our people with flexible medical, life insurance, and retirement options. We also encourage employees to give back to their communities through our volunteer programs.
We’re committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, let us know by emailing [email protected] or by calling 1-888-404-2494 in the United States.
Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans’ status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.
This is a full-time remote role in the US, and candidates must be US citizens with the ability to obtain a Public Trust clearance if hired.
