Instacart, an online grocery store with already strong sales, wants to learn more about its sales patterns and about the variety of its customers and their purchasing behaviours.
This analysis aims to optimize marketing and sales strategies by understanding customer behavior. It requires identifying peak order times to schedule ads effectively, pinpointing high-spending periods to target product promotions, and simplifying price range groupings. Furthermore, it involves analyzing product popularity, segmenting customer types based on ordering behaviors and loyalty, and exploring regional and demographic influences like age, family status, and income on purchasing habits. Ultimately, the goal is to tailor marketing efforts to distinct customer profiles for increased efficiency and sales.
Data analyst
Marketing & sales departments
Excel
Python (pandas, NumPy, Matplotlib, Seaborn, Plotly)
Most files come from “The Instacart Online Grocery Shopping Dataset 2017”, accessed from www.instacart.com/datasets/grocery-shopping-2017 via Kaggle and downloaded on May 20th, 2024.
NOTE: The customer data was fabricated by CareerFoundry for the purposes of the course and is not representative of real Instacart customers.
Cleaning: To make sense of the five datasets, the quality of the data was checked first by addressing data types, missing values, duplicates and variable names.
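The four checks named above can be sketched with pandas as follows. This is a minimal illustration on toy data, not the project's actual cleaning script: the column names, the rename mapping and the target dtype are assumptions made for the example.

```python
import pandas as pd


def quality_report(df: pd.DataFrame) -> dict:
    """Summarise the issues the cleaning step looks for."""
    return {
        "missing": df.isna().sum().to_dict(),        # missing values per column
        "duplicates": int(df.duplicated().sum()),    # fully duplicated rows
        "dtypes": df.dtypes.astype(str).to_dict(),   # current data types
    }


def clean(df: pd.DataFrame, renames: dict, dtypes: dict) -> pd.DataFrame:
    """Rename columns, cast data types and drop duplicated rows."""
    out = df.rename(columns=renames)
    out = out.astype(dtypes)
    out = out.drop_duplicates()
    return out.reset_index(drop=True)


# Toy stand-in for one of the five datasets (hypothetical values).
orders = pd.DataFrame({
    "order_id": [1, 2, 2, 3],
    "user_id": [10, 11, 11, 12],
    "order_dow": [0, 3, 3, 6],
    "days_since_prior_order": [None, 7.0, 7.0, 30.0],
})

before = quality_report(orders)
cleaned = clean(
    orders,
    renames={"order_dow": "order_day_of_week"},  # assumed clearer name
    dtypes={"order_id": "int32"},                # assumed target type
)
after = quality_report(cleaned)
```

Here missing values are only reported, not dropped: whether to remove or impute them depends on what the gap means in each column, so that decision is left out of the sketch.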
Merging: Then, to get a global view while making sure the most important file, orders_products_prior.csv, was entirely preserved, only inner merges starting from this dataset were performed. The missing values resulting from these merges are dealt with before analysing the final dataset.
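A possible way to perform such a merge and confirm that no row of the main table was lost is sketched below. An inner merge keeps a row only when its key has a match in the other table, so comparing row counts before and after is a simple check. The two small frames and their column names are assumptions standing in for orders_products_prior.csv and a product lookup table.

```python
import pandas as pd

# Toy stand-in for orders_products_prior.csv (hypothetical values).
order_products = pd.DataFrame({
    "order_id": [1, 1, 2, 3],
    "product_id": [100, 101, 100, 102],
})

# Toy stand-in for a product lookup table (hypothetical values).
products = pd.DataFrame({
    "product_id": [100, 101, 102],
    "product_name": ["Bananas", "Milk", "Bread"],
    "prices": [0.5, 1.2, 2.0],
})

# validate="many_to_one" raises an error if product_id is not unique
# in the lookup table, which would otherwise duplicate rows silently.
merged = order_products.merge(
    products, on="product_id", how="inner", validate="many_to_one"
)

# Rows of the main table that found no match and were therefore dropped.
rows_lost = len(order_products) - len(merged)

# Missing values carried into the merged table, per column.
missing_per_column = merged.isna().sum()
```

If rows_lost is greater than zero, some keys in the main table have no counterpart in the lookup table and the file was not fully preserved.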
Exclusion rule: The sales department was only interested in loyal customers, so it decided to exclude all data from customers with a purchase history of fewer than five orders. These excluded profiles could form the basis of a later study on how to retain them and encourage more purchases.
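The exclusion rule could be applied roughly as follows. This sketch assumes each row carries a user_id and an order_number that counts that customer's orders, so the highest order_number per customer gives the length of the purchase history; both column names and the max_order helper column are assumptions for the example.

```python
import pandas as pd

MIN_ORDERS = 5  # threshold taken from the exclusion rule

# Toy data: customer 1 has six orders, customer 2 has three.
df = pd.DataFrame({
    "user_id": [1] * 6 + [2] * 3,
    "order_number": [1, 2, 3, 4, 5, 6, 1, 2, 3],
})

# Attach each customer's total number of orders to every one of their rows.
df["max_order"] = df.groupby("user_id")["order_number"].transform("max")

# Customers kept for the analysis, and those set aside for a later study.
kept = df[df["max_order"] >= MIN_ORDERS]
excluded = df[df["max_order"] < MIN_ORDERS]
```

Keeping the excluded rows in their own frame rather than discarding them makes the follow-up retention study mentioned above straightforward to start.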
Main insights: