Job Title: Platform Reliability Engineer
Career Level - E
Introduction to role:
Join us as a Platform Reliability Engineer in our Commercial IT – SSD, Data, Analytics and AI Platform Success Team. Your primary focus will be to ensure the stability, performance, and reliability of our Data, Analytics, and AI systems. You will bridge the gap between development and operations by generating insights into sub-optimal processes and optimization opportunities. This role offers an exciting opportunity to integrate Agile, Lean and SAFe practices within monitoring and observability initiatives and to continuously improve delivery cycle times.
Accountabilities:
As a Platform Reliability Engineer, you will be responsible for the evaluation, selection, and deployment of monitoring & observability technologies. You will manage and maintain monitoring infrastructure, ensuring it aligns with industry best practices. You will collaborate with DevOps, CriticalOps and IT leadership teams to understand system requirements and design effective monitoring strategies. You will also develop and implement monitoring solutions for infrastructure, applications, and services.
AstraZeneca is a global, innovation-driven biopharmaceutical business with a primary focus on the discovery, development and commercialization of prescription medicines. Our purpose as a company is to push the boundaries of science to deliver life-changing medicines and greater efficiency & innovation in healthcare.
As science moves forward, technology needs to keep pace. AstraZeneca has created a world-class IT organization by radically reinventing the current IT operating model and organization design; supplier ecosystem optimization and insourcing; establishment of a network of global Technology Centers; significant Infrastructure and Technology transformation; cultural change and risk management.
As an individual contributor within the Commercial IT – SSD, Data, Analytics and AI Platform Success Team, the Platform Reliability Engineer's responsibilities include the following:
· Ensuring the stability, performance and reliability of Data, Analytics and AI systems by implementing and maintaining robust monitoring and observability solutions
· Designing, deploying, and managing monitoring tools and practices that provide insights into the health and performance of our data infrastructure and analytics processes
· Helping bridge the gap between development and operations by generating insights into sub-optimal processes and optimization opportunities
· Maintaining working knowledge of platform architecture and business acumen
· Integrating Agile, Lean and SAFe practices within monitoring and observability initiatives to continuously improve delivery cycle times
· Exploring and implementing new ways to automate systems: designing and testing automation processes, identifying quality issues and supporting IT platform teams to eliminate defects and errors in product and platform development
· Leveraging AIOps capabilities to uplift existing production operations
Technology/Tool Management
- Responsible for the evaluation, selection, and deployment of monitoring & observability technologies (internal or market available) suitable for the organization’s needs – this includes creation of effective business case(s) to influence investment and innovation
- Manage and maintain monitoring infrastructure, ensuring it aligns with industry best practices
Monitoring & Observability Practice Management
- Collaborate with DevOps, CriticalOps and IT leadership teams to understand system requirements and design effective monitoring strategies that align with organizational goals and objectives
- Establish key metrics and KPIs that provide the insights and analytics needed to drive a data-driven continuous improvement backlog
- Provide training and support to other teams on using monitoring tools effectively
- Create and maintain documentation for monitoring and observability practices, including standard operating procedures and best practices
- Stay abreast of industry trends, emerging technologies, and best practices related to monitoring and observability platforms
Monitoring & Observability Implementation & Operations
- Develop and implement monitoring solutions for infrastructure, applications, and services
- Design and configure alerting mechanisms to detect and respond to potential issues proactively
- Use monitoring tools to identify and troubleshoot issues in real-time
- Collaborate with other teams to resolve incidents promptly and prevent recurrence
- Analyze monitoring data to identify performance bottlenecks and areas for improvement
- Work with development and operations teams to optimize system performance based on monitoring insights
- Implement automation scripts and workflows to streamline monitoring processes
- Integrate monitoring solutions with existing frameworks for seamless operation
- Identify and evaluate “self-healing” opportunities based on production issue trend analysis to inform AIOps roadmap
Essential
- Degree level education in computer science, information technology, or a related field
- Proven experience as a monitoring and observability engineer or a similar role
- Proficient in developing monitoring capabilities and configuring integration with tools such as Prometheus, Grafana, Splunk, Sumo Logic, Datadog, Dynatrace, etc.
- Strong scripting skills (e.g., Python) for automation in data environments
- Familiarity with logging, tracing, and APM (Application Performance Monitoring) solutions
Desirable
- Customer engagement experience
- Knowledge of data processing frameworks (e.g. Apache Spark) and data storage solutions (e.g. data lakes, warehouses)
- Experience with data orchestration tools (e.g. Apache Airflow)
- Understanding of data lineage and metadata management
When we put unexpected teams in the same room, we unleash bold thinking with the power to inspire life-changing medicines. In-person working gives us the platform we need to connect, work at pace and challenge perceptions. That’s why we work, on average, a minimum of three days per week from the office. But that doesn't mean we’re not flexible. We balance the expectation of being in the office while respecting individual flexibility. Join us in our unique and ambitious world.
Why AstraZeneca?
Join us at a crucial stage of our journey in becoming a digital and data-led enterprise. Make the impossible possible by building partnerships and ecosystems, creating new ways of working and driving scale and speed to deliver exponential growth. Focused and committed, and backed with the investment to succeed, we're driving cross-company change to disrupt the entire industry. Our work unlocks the potential of science. We optimise and revolutionise AstraZeneca by maximising efficiencies and finding new ways to drive productivity, from automation to data simplification.
Ready to make a difference? Apply today and be part of a team that has the backing to innovate, disrupt an industry and change lives.
Date Posted
05-Nov-2024
Closing Date
07-Nov-2024
AstraZeneca embraces diversity and equality of opportunity. We are committed to building an inclusive and diverse team representing all backgrounds, with as wide a range of perspectives as possible, and harnessing industry-leading skills. We believe that the more inclusive we are, the better our work will be. We welcome and consider applications to join our team from all qualified candidates, regardless of their characteristics. We comply with all applicable laws and regulations on non-discrimination in employment (and recruitment), as well as work authorization and employment eligibility verification requirements.