Cloudera

Senior Data Scientist

Remote
Python R SQL Pandas NumPy scikit-learn Generative AI LLM Streamlit Gradio FastAPI GitHub
Description

Location: Costa Rica-Remote

Remote Type: Remote

Time Type: Full time

Job Description

Business Area: IT

Seniority Level: Mid-Senior level

At Cloudera, we empower people to transform complex data into clear and actionable insights. With as much data under management as the hyperscalers, we're the preferred data partner for the top companies in almost every industry. Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world’s largest enterprises.

At Cloudera, we believe that data can make what is impossible today, possible tomorrow. We empower organizations to transform complex data into clear and actionable outcomes. Join us in our mission to harness the power of data.

We are seeking a talented and curious Senior Data Scientist to join our fast-paced, data-driven organization. In this role, you will design and deliver AI-powered systems and applications that accelerate decision-making and enhance operational excellence.

You will combine strong statistical foundations, advanced programming expertise, and modern Generative AI techniques to build scalable, production-ready solutions. This is a builder-focused role. You will move beyond analysis to develop internal copilots, AI-enabled workflows, and reusable platform components that embed intelligence directly into business processes.

Our work empowers leadership and operational teams by creating measurable, AI-enabled capabilities. We seek a thoughtful and pragmatic innovator who is enthusiastic about GenAI, disciplined experimentation, and building durable internal AI infrastructure.

To succeed in this role, you will demonstrate technical depth, intellectual curiosity, and a strong builder mindset:

  • Data Science & Machine Learning Expertise: Proficiency in Python (or R) for data preparation, feature engineering, statistical modeling, and machine learning. Experience with core data science libraries (e.g., Pandas, NumPy, scikit-learn) and a solid understanding of supervised and unsupervised learning methods.

  • SQL & Data Fluency: Strong understanding of relational databases and the ability to quickly learn new schemas and data environments. Comfortable writing efficient, production-grade SQL to support modeling, experimentation, and AI-enabled applications.

  • Generative AI & LLM Engineering: Hands-on experience working with large language models (LLMs) and modern AI tooling. This includes prompt design, structured output generation, retrieval-augmented generation (RAG), evaluation strategies, and workflow automation. Ability to translate GenAI capabilities into reliable, enterprise-ready solutions that integrate with existing systems and data sources.

  • AI Application Development: Experience rapidly prototyping and iterating on internal applications, copilots, or AI-enabled workflow tools. Comfortable evolving prototypes into maintainable, production-grade solutions. Familiarity with modern development frameworks (e.g., Streamlit, Gradio, FastAPI, or similar) is beneficial.

  • Platform-Oriented Thinking: Demonstrated ability to design reusable components such as shared prompt libraries, retrieval pipelines, evaluation frameworks, and standardized integration patterns that enable scalable AI adoption.

  • Strong Mathematical and Statistical Foundation: Deep understanding of probability, statistical inference, experimentation, and quantitative reasoning to ensure model robustness and reliability.

  • Collaborative Development Experience: Experience working in collaborative environments such as Cloudera Data Science Workbench, Jupyter, Zeppelin, or similar platforms.

  • GitHub Proficiency: Experience using version control to support collaboration, code review, documentation, and long-term maintainability.

  • Exceptional Communication Skills: Ability to translate complex business challenges into technical solutions and clearly communicate findings, trade-offs, and recommendations to both technical and non-technical stakeholders.

As a Senior Data Scientist, you will apply rigorous analytical thinking and modern AI capabilities to design, build, and scale high-impact solutions. You will:

  • Design, develop, and deploy GenAI-powered internal applications, copilots, and workflow accelerators.

  • Build reusable AI components, including retrieval pipelines, structured prompting patterns, orchestration workflows, and evaluation harnesses.

  • Develop and maintain statistical and machine learning models to support automation, optimization, forecasting, and classification use cases.

  • Design retrieval strategies that connect LLMs to trusted internal knowledge sources, ensuring grounded and reliable outputs.

  • Implement evaluation and validation frameworks to measure quality, accuracy, and consistency of AI-driven systems.

  • Partner cross-functionally to identify high-value opportunities for AI enablement across the organization.

  • Create reusable datasets, feature pipelines, and experimentation frameworks to support iterative development.

  • Document methodologies, assumptions, and implementation details to ensure transparency and reproducibility.

  • Uphold high standards for quality, reliability, and responsible AI practices.

  • Contribute to peer review processes to ensure technical rigor and maintainability.

We are excited if you have (Required Experience):

  • 5+ years of relevant experience in Data Science, Machine Learning, or AI-focused roles.

  • Demonstrated experience applying machine learning techniques in production or enterprise environments.

  • Hands-on experience building applications or workflows powered by large language models (LLMs).

  • Evidence of a builder mindset through shipped AI tools, internal platforms, or automation solutions.

  • Strong curiosity for emerging AI technologies and the ability to evaluate and adopt them responsibly.

  • Academic background in a quantitative discipline such as Statistics, Mathematics, Computer Science, Engineering, Economics, or a related field.

You may also have (Preferred Qualifications):

  • Experience designing internal AI platforms or shared enablement frameworks.

  • Familiarity with API-driven architectures and integrating AI capabilities into enterprise systems.

  • Experience with vector databases, embedding models, or semantic retrieval systems.

  • Exposure to responsible AI practices, governance frameworks, or model lifecycle management.

This role is not eligible for immigration sponsorship.

What you can expect from us:

  • Generous PTO Policy 

  • Support for work-life balance with Unplugged Days

  • Flexible WFH Policy 

  • Mental & Physical Wellness programs 

  • Phone and Internet Reimbursement program 

  • Access to Continued Career Development 

  • Comprehensive Benefits and Competitive Packages 

  • Paid Volunteer Time

  • Employee Resource Groups

EEO/VEVRAA
