Job Description:
Business Overview
The AI & Data Division (AIDD) creates powerful, customer-focused search, recommendation, data science, advertising, marketing, price and inventory optimization solutions for a variety of businesses in commerce industries. We design, develop, and deploy high-performance, fault-tolerant distributed systems used by millions of Rakuten customers every day. We strive to deliver the most innovative solutions that are helpful to people and societies around the world.
Department Overview
The Merchandising and Advertisement Department (MAD) is a dynamic, cross-functional team comprised of talented individuals from around the globe. Our mission is to enhance the Rakuten experience for millions of consumers, merchants, advertisers, and partners worldwide. We specialize in cutting-edge solutions, including personalized recommendations, search advertising, retargeting, and incentive optimization. By harnessing our scale, data, and advanced machine learning algorithms, we strive to create innovative, optimized experiences that benefit society both online and offline.
Position:
Position Details
We are looking for a talented Platform Engineer with a strong passion for DevOps, DevSecOps, and MLOps engineering. The ideal candidate will have hands-on experience with technologies such as Kubernetes (k8s), Docker, Terraform, Prometheus, CephFS, JupyterHub, and Python. In this role, you will leverage your expertise to configure and maintain our Kubernetes-based platform, manage CephFS-based distributed storage systems, and enhance in-house frameworks. If you enjoy working in a collaborative environment and are excited by the prospect of solving complex challenges, we'd love to have you join our team!
Key Responsibilities:
Contribute to Platform: Design, code, test, release, and maintain various components of the platform.
Collaboration and Delivery: Collaborate with software engineers and data scientists to address their requests and ensure timely delivery of software solutions.
Continuous Improvement: Proactively propose and implement system and process improvements, such as refactoring, adopting new technologies, and enhancing system architecture.
Cloud Infrastructure Management: Design, deploy, and manage cloud infrastructure on AWS, Azure, or Google Cloud. Optimize resource usage and control costs in the cloud environment.
Containerization & Orchestration: Implement and manage containerized applications using Docker and Kubernetes. Automate deployments and manage container orchestration to ensure scalability and availability.
Infrastructure as Code (IaC): Develop and maintain infrastructure using IaC tools like Terraform to ensure consistency in deployments and version control.
CI/CD Pipelines: Design and implement continuous integration and continuous deployment pipelines. Automate testing and deployment processes to streamline software delivery.
Automation & Scripting: Write scripts in Python or Bash to automate repetitive tasks, improving operational efficiency. Create tools to simplify the management and monitoring of applications and infrastructure.
Monitoring & Logging: Implement monitoring solutions such as Prometheus and Grafana to track system performance and health. Set up logging frameworks like the ELK stack for log analysis and issue diagnosis.
Documentation & Support: Document processes, configurations, and best practices for future reference and onboarding. Provide support and guidance to development teams on deployment and operational concerns.
Mandatory Qualifications:
Bachelor's Degree (BS) in Computer Science or a related field, or equivalent education and experience.
Knowledge of Linux administration (Red Hat/Ubuntu), including encryption (LUKS/X.509), scripting, monitoring, security, logging, networking, and SSH.
Over three years of proven experience as a Platform Developer.
Familiarity with Kubernetes, with hands-on experience in developing and deploying operators (Airflow, Flink, Spark) on it.
Experience working with database systems such as Couchbase, Redis, or Cassandra.
Familiarity with CI/CD tools such as Jenkins or Laminar.
Able to work independently and as part of a team.
Strong communication and collaboration skills.
Desired Qualifications:
Familiarity with distributed filesystems is a plus, such as CephFS, HDFS, Lustre, or BeeGFS.
Experience with Infrastructure as Code tools is a plus, including Chef, Ansible, Salt, Puppet, or Terraform.
Knowledge of cloud platforms like AWS, Azure, or Google Cloud Platform is a plus.
Certifications, such as Certified Kubernetes Administrator or Certified Kubernetes Application Developer, are a plus.
Knowledge of big data processing frameworks like Hadoop, Spark, Kafka, Flink, or Druid is a plus.
#ai #engineer #devops #mlops #platform #kubernetes #ceph #cloud
Languages:
English (Overall - 4 - Fluent)