# Calendar Views on Consumption

In this project, we visualize consumption patterns in 729 Portuguese supermarkets from May 2012 to April 2014. To represent the data, the Calendar View uses a small-multiples approach to show deviations from typical consumption. Small multiples use the display space efficiently, maximizing data density while minimizing the use of ink. We analyzed consumption by Department (the highest level of the product hierarchy), by Business Unit, and finally by Product (the bottom level).

### Calendar: Drinks

Looking at drinks consumption, noticeable positive deviations can be found at Christmas, on New Year’s Eve, in summer, and on other Portuguese holidays such as São João, on June 23-24.

### Calendar: Cod Fish

Having cod fish on Christmas Eve is an important Portuguese tradition, which is consistent with the positive deviations observed in December 2013. High deviations can also be found on July 27-28, coinciding with discounts on cod fish during that period.

### Calendar: Condoms

Portuguese people also enjoy fishing in other waters. Condom sales increase on Valentine’s Day, on New Year’s Eve, throughout the summer, and on May 1st (Labor Day).

### Data

The dataset is 278 GB in size and contains 2.86 billion transactions from 729 supermarkets, from May 2012 to April 2014. Each transaction records the product acquired, the time of purchase, and the anonymized customer ID.
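As a rough illustration of the record structure described above, the following Python sketch models a transaction and aggregates transactions into daily totals. The field names, the `daily_consumption` helper, and the choice to measure consumption as a transaction count per day are all assumptions made for this example; the text does not specify how consumption was actually measured or stored.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class Transaction:
    # Field names are hypothetical; the source only lists the three attributes.
    product: str          # product acquired
    timestamp: datetime   # time of purchase
    customer_id: str      # anonymized customer id


def daily_consumption(transactions):
    """Count transactions per calendar day (assumed measure of consumption)."""
    counts = Counter()
    for t in transactions:
        counts[t.timestamp.date()] += 1
    return dict(counts)


# Made-up sample records, for illustration only.
sample = [
    Transaction("cod fish", datetime(2013, 12, 23, 10, 15), "c1"),
    Transaction("cod fish", datetime(2013, 12, 23, 18, 40), "c2"),
    Transaction("cod fish", datetime(2013, 12, 24, 9, 5), "c1"),
]
daily = daily_consumption(sample)
```

A daily series like `daily` is the kind of input that a week-based analysis could then split into seven-day patterns.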

### Finding Deviations: Baselines

After an initial analysis, we found that most weeks repeat the same weekly behavior. Given this periodicity, we created a mechanism to emphasize atypical days by computing a week-based baseline. The baselines are computed by clustering similar normalized patterns. Two patterns are considered similar if the Euclidean distance between them is less than a certain threshold. Our clustering approach is a centroid-based algorithm that assigns points to a cluster according to their distance to the cluster’s centroid.
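The clustering step described above can be sketched in Python as follows. This is a minimal illustration, not the project's implementation: the normalization (dividing each week by its total), the greedy single-pass assignment, the running-mean centroid update, and the choice of the most populated cluster's centroid as the baseline are all assumptions, and `cluster_weeks` and `baseline` are hypothetical names.

```python
import math


def normalize(week):
    """Scale a week of daily values so they sum to 1 (assumed normalization)."""
    total = sum(week)
    return [v / total for v in week] if total else [0.0] * len(week)


def euclidean(a, b):
    """Euclidean distance between two equally long patterns."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_weeks(weeks, threshold):
    """Greedy centroid-based clustering of normalized weekly patterns.

    A week joins the nearest cluster if its distance to that centroid is
    below the threshold; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster: {"centroid": [...], "members": [[...], ...]}
    for week in weeks:
        pattern = normalize(week)
        nearest = min(
            clusters,
            key=lambda c: euclidean(pattern, c["centroid"]),
            default=None,
        )
        if nearest is not None and euclidean(pattern, nearest["centroid"]) < threshold:
            nearest["members"].append(pattern)
            n = len(nearest["members"])
            # Recompute the centroid as the mean of all member patterns.
            nearest["centroid"] = [sum(col) / n for col in zip(*nearest["members"])]
        else:
            clusters.append({"centroid": pattern, "members": [pattern]})
    return clusters


def baseline(clusters):
    """Assumption: the baseline is the centroid of the most populated cluster."""
    return max(clusters, key=lambda c: len(c["members"]))["centroid"]


# Made-up weeks (Monday to Sunday): three typical weeks and one atypical week.
weeks = [
    [10, 10, 10, 10, 10, 20, 30],
    [20, 20, 20, 20, 20, 40, 60],
    [11, 9, 10, 10, 10, 20, 30],
    [10, 10, 60, 10, 10, 0, 0],
]
clusters = cluster_weeks(weeks, threshold=0.05)
base = baseline(clusters)
# Per-day deviation of the atypical week from the baseline pattern.
deviations = [v - b for v, b in zip(normalize(weeks[3]), base)]
```

With these sample values, the three typical weeks fall into one cluster and the atypical week forms its own, so its Wednesday spike appears as a positive deviation from the baseline. The threshold of 0.05 is arbitrary and would need tuning on real data.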

### Visualization: Calendar View

We developed a Calendar View to give an overview of the deviations from the baselines and to enable comparison between them. Each day is represented by a rectangle whose top and bottom edges represent the maximum and minimum consumption values, respectively. The baseline is a black horizontal line drawn over the rectangle. From each baseline, we draw a second rectangle whose height corresponds to the deviation in consumption for that day: red if the deviation is positive and Persian green if it is negative.
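To make the geometry of a single day concrete, the sketch below maps a day's value and its baseline to coordinates inside the day's rectangle. It assumes a linear scale between the minimum and maximum consumption values and a screen coordinate system in which y grows downward; `DayGlyph`, `day_glyph`, and the exact shade of red are hypothetical choices for this example.

```python
from dataclasses import dataclass

RED = "#FF0000"            # assumed shade; the text only says "red"
PERSIAN_GREEN = "#00A693"  # commonly cited hex value for Persian green


@dataclass
class DayGlyph:
    baseline_y: float  # y position of the black baseline line
    bar_y: float       # top edge of the deviation rectangle
    bar_height: float  # height of the deviation rectangle
    color: str         # fill color of the deviation rectangle


def day_glyph(value, baseline, vmin, vmax, cell_top, cell_height):
    """Map a day's value and baseline to coordinates inside its cell.

    vmax maps to the top edge of the cell and vmin to the bottom edge.
    Assumes vmax > vmin and that y grows downward.
    """
    def to_y(v):
        return cell_top + (vmax - v) / (vmax - vmin) * cell_height

    y_base, y_val = to_y(baseline), to_y(value)
    # A zero deviation yields a zero-height bar (colored red here by convention).
    color = RED if value >= baseline else PERSIAN_GREEN
    return DayGlyph(y_base, min(y_base, y_val), abs(y_val - y_base), color)


# A day above its baseline and a day below it, in a 40-unit-tall cell.
above = day_glyph(value=150, baseline=100, vmin=0, vmax=200, cell_top=0, cell_height=40)
below = day_glyph(value=50, baseline=100, vmin=0, vmax=200, cell_top=0, cell_height=40)
```

In this sketch a positive deviation extends upward from the baseline and a negative one extends downward, so both the direction and the color encode the sign of the deviation.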

This visualization gives us two levels of information: (i) a general overview of all the days with higher deviations, and (ii) a local view that makes it possible to quantify a specific deviation. The Calendar View highlights deviations over time, removing periodic repetitions and emphasizing singular moments.

### Interactive Application

We have created an application to ease the selection and exploration of calendars for any product category.

### Publication

• C. Maçãs, P. Cruz, H. Amaro, E. Polisciuc, T. Carvalho, F. Santos, and P. Machado, “Time-series Application on Big Data – Visualization of Consumption in Supermarkets,” in IVAPP 2015 – Proceedings of the 6th International Conference on Information Visualization Theory and Applications, Berlin, Germany, 11-14 March 2015, pp. 239-246.

### Authors

Catarina Maçãs

Pedro Miguel Cruz

Evgheni Polisciuc

Hugo Amaro