PepsiCo

Data Science Associate Analyst

Gurugram, India
Machine Learning AI Spark Databricks Azure Git Jenkins Docker SQL Python PySpark Hive Pig AWS GCP GitHub MLflow Kubeflow Azure Pipelines Azure Data Factory Time Series Reinforcement Learning Bayesian Methods Causal Inference NLP CI/CD EDA MLOps DevOps
Description

Associate Manager - Data Science

Location: Gurugram, Haryana, India

Overview

The Data Science team develops Machine Learning (ML) and Artificial Intelligence (AI) projects. The specific scope of this role is to develop ML solutions in support of ML/AI projects using big analytics toolsets in a CI/CD environment. Analytics toolsets may include DS tools, Spark, Databricks, and other technologies offered by Microsoft Azure or open-source toolsets. This role will also help automate the end-to-end cycle with Azure Pipelines. You will be part of a collaborative, interdisciplinary team around data, where you will be responsible for the continuous delivery of statistical/ML models. You will work closely with process owners, product owners, and final business users. This will give you visibility into, and an understanding of, the criticality of your developments.

Responsibilities

  • Deliver key Advanced Analytics/Data Science projects within time and budget, particularly around DevOps/MLOps and the Machine Learning models in scope.
  • Contribute actively to code and development in projects and services.
  • Partner with data engineers to ensure data access for discovery and that proper data is prepared for model consumption.
  • Partner with ML engineers working on industrialization.
  • Communicate with business stakeholders in the process of service design, training, and knowledge transfer.
  • Support large-scale experimentation and build data-driven models.
  • Refine requirements into modelling problems.
  • Influence product teams through data-based recommendations.
  • Research state-of-the-art methodologies.
  • Create documentation for learnings and knowledge transfer.
  • Create reusable packages or libraries.
  • Ensure on-time and on-budget delivery that satisfies project requirements while adhering to enterprise architecture standards.
  • Leverage big data technologies to help process data and build scaled data pipelines (batch to real time).
  • Implement the end-to-end ML lifecycle with Azure Databricks and Azure Pipelines.
  • Automate ML model deployments.

Qualifications

  • BE/B.Tech in Computer Science, Maths, or related technical fields.
  • 2-4 years of overall experience working as a Data Scientist.
  • 2+ years’ experience building solutions in the commercial or supply chain space.
  • 2+ years working in a team to deliver production-level analytic solutions.
  • Fluent in Git (version control). Understanding of Jenkins and Docker is a plus.
  • Fluent in SQL syntax.
  • 2+ years’ experience in statistical/ML techniques to solve supervised (regression, classification) and unsupervised problems.
  • 2+ years’ experience developing statistical/ML models for business problems with industry tools, with a primary focus on Python or PySpark development.
  • Data Science: hands-on experience and strong knowledge of building machine learning models, both supervised and unsupervised. Knowledge of time series/demand forecast models is a plus.
  • Programming Skills: hands-on experience in statistical programming languages like Python and PySpark, and database query languages like SQL.
  • Statistics: good applied statistical skills, including knowledge of statistical tests, distributions, regression, and maximum likelihood estimators.
  • Cloud (Azure): experience in Databricks and Azure Data Factory (ADF) is desirable.
  • Familiarity with Spark, Hive, and Pig is an added advantage.
  • Business storytelling and communicating data insights in a business-consumable format. Fluent in one visualization tool.
  • Strong communication and organizational skills, with the ability to deal with ambiguity while juggling multiple priorities.
  • Experience with Agile methodology for teamwork and analytics ‘product’ creation.
  • Experience in Reinforcement Learning is a plus.
  • Experience in Simulation and Optimization problems in any space is a plus.
  • Experience with Bayesian methods is a plus.
  • Experience with Causal Inference is a plus.
  • Experience with NLP is a plus.
  • Experience with Responsible AI is a plus.
  • Experience with distributed machine learning is a plus.
  • Experience in DevOps, with hands-on experience with one or more cloud service providers: AWS, GCP, Azure (preferred).
  • Model deployment experience is a plus.
  • Experience with version control systems like GitHub and with CI/CD tools.
  • Experience in Exploratory Data Analysis (EDA).
  • Knowledge of MLOps/DevOps and deploying ML models is preferred.
  • Experience using MLflow, Kubeflow, etc. is preferred.
  • Experience executing and contributing to MLOps automation infrastructure is good to have.
  • Exceptional analytical and problem-solving skills.
  • Stakeholder engagement with business units (BU) and vendors.
  • Experience building statistical models in the retail or supply chain space is a plus.