Transitions of Customers Among Supermarkets

Representing large amounts of flow data requires conveying directionality while reducing visual clutter.
This project applies a flow representation technique to the visualization of transitions of customers among supermarkets over time.
This approach uses a swarm-based system in order to reduce visual clutter, bundling edges in an organic fashion and improving clarity.
 
 

A Customer’s Individual Story

 
Every time a supermarket customer makes a purchase, he leaves a digital footprint. A customer’s buying history can reveal a change of supermarket. In the following example, a customer spends mainly in two places, Braga and Covilhã. By visualizing the buying history we can observe the customer’s routine and the deviations from his usual purchase locations (e.g. Barcelos, Reguengos de Monsaraz and Lisbon).
 

 
 

Summer Story

 
The pattern of an individual customer may tell interesting stories about that customer. However, the transitions of numerous customers can tell even more about all of us. By visualizing flows of customers we were able to reveal seasonal routines. In particular, at the beginning of summer there is a huge flow towards the south of the country. In the middle of summer the flow between big cities and the southern coast is more balanced, with similar intensities in both directions. By the end of summer more and more customers return to their usual routines. As expected, this pattern repeats every year.
 

 
 

Christmas Routine

 
Another seasonal pattern revealed by the visualization occurs at Christmas time. The color orange indicates that there are far more transitions than expected. These unexpected deviations start to appear one week before Christmas and grow as we get closer to Christmas Eve. Surprisingly, during this period customers tend to make purchases far from their local supermarkets. Additionally, we observed substantial flows towards supermarkets located in shopping malls.
 

 
 

Store Inaugurations

 
While exploring the data we encountered what seemed to be random events with high peaks of positive deviation. At particular times there was a substantial positive flow towards a single supermarket, which then slowly stabilized. These events were inaugurations of new supermarkets. Indeed, the data show that many people who live nearby are interested in visiting a new store, and the highest peak is observed on the first day, regardless of the day of the week.
 

Figure 1

Inauguration of a supermarket in Santa Maria da Feira on 28 November 2012


 

Data

 
Our dataset consists of 278 GB of information about customer purchases in 729 supermarkets in Portugal over a time span of 24 months (from May 2012 until April 2014), including the geolocation of 682 supermarkets, as well as the regions of the country they belong to. The dataset comprises approximately 2.86 billion transactions, where each transaction has the following attributes: customer card id, amount spent, product designation, quantity of the purchased products, and the date and time of the transaction. It is important to note that several individuals (e.g. members of a family) may share the same customer card, and therefore the same unique client id. The dataset has a total of 6.6 million unique card ids.
 
The extraction of transitions among supermarkets was done as follows: first the data is aggregated by day (24 hours); then for each client the sequence of transitions is computed by excluding subsequences of repeated places.
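
The two steps above can be sketched in a few lines of Python. This is only one possible reading of the procedure, not the project's actual code: the record layout `(card_id, timestamp, store)`, the ISO 8601 timestamp strings, and the function name `extract_transitions` are assumptions made for illustration. Here "aggregated by day" is taken to mean that a store counts at most once per client per day, and "excluding subsequences of repeated places" is taken to mean collapsing consecutive visits to the same store.

```python
from collections import defaultdict
from itertools import groupby


def extract_transitions(purchases):
    """Return {card_id: [(origin, destination), ...]}.

    `purchases` is an iterable of (card_id, timestamp, store) tuples, where
    timestamp is an ISO 8601 string such as "2012-05-03T18:20".
    This record layout is hypothetical; ISO strings sort chronologically.
    """
    by_card = defaultdict(list)
    for card_id, timestamp, store in purchases:
        by_card[card_id].append((timestamp, store))

    transitions = {}
    for card_id, visits in by_card.items():
        visits.sort()  # chronological order per client

        # Step 1: aggregate by day, keeping each store once per day
        # in order of first visit.
        daily, seen = [], set()
        for timestamp, store in visits:
            key = (timestamp[:10], store)  # (YYYY-MM-DD, store)
            if key not in seen:
                seen.add(key)
                daily.append(store)

        # Step 2: collapse runs of repeated places.
        places = [store for store, _ in groupby(daily)]

        # Each pair of consecutive distinct places is one transition.
        transitions[card_id] = list(zip(places, places[1:]))
    return transitions
```

For a client who shops in Braga on three consecutive days, then in Covilhã, then in Braga again, this sketch yields two transitions: Braga to Covilhã and Covilhã to Braga.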
  

Figure 2

Extraction of transitions diagram



 

Visualization: swarm-based edge bundling

 
In order to reflect the flowing nature of the information we resort to a swarm system, which is composed of artificial agents (boids) that react to the presence and characteristics of neighboring boids. When the system runs, each boid tries to draw a specific transition, adapting its path in order to approximate similar paths of other boids. In this way visual patterns emerge, representing packed flows of transitions.
 
Each boid in the system is characterized by direction, speed, radius of vision, the number of transitions that it represents, a set of behavioral rules, and its unique origin-destination points. During the simulation each boid leaves a persistent trail with information on the speed at each point.
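
The attributes listed above map naturally onto a small data structure. The sketch below is illustrative only: the class names `Boid` and `TrailPoint`, the default values, and the straight-line `step` method are assumptions, and the behavioral rules are deliberately left out, so the boid here simply heads towards its destination while recording its trail.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class TrailPoint:
    """A point left behind by a boid, with the speed it had there."""
    x: float
    y: float
    speed: float


@dataclass
class Boid:
    origin: Point
    destination: Point
    transitions: int              # number of transitions this boid represents
    speed: float = 1.0            # hypothetical default
    vision_radius: float = 10.0   # hypothetical default
    position: Optional[Point] = None
    direction: Point = (0.0, 0.0)
    trail: List[TrailPoint] = field(default_factory=list)

    def __post_init__(self):
        if self.position is None:
            self.position = self.origin
        self.direction = self._heading()

    def _heading(self) -> Point:
        """Unit vector from the current position to the destination."""
        dx = self.destination[0] - self.position[0]
        dy = self.destination[1] - self.position[1]
        dist = math.hypot(dx, dy)
        return (dx / dist, dy / dist) if dist else (0.0, 0.0)

    def step(self) -> None:
        """Leave a trail point, then advance one step towards the destination."""
        x, y = self.position
        self.trail.append(TrailPoint(x, y, self.speed))
        self.direction = self._heading()
        self.position = (x + self.direction[0] * self.speed,
                         y + self.direction[1] * self.speed)
```

In a fuller simulation the heading computed in `step` would be combined with the attraction and repulsion forces produced by the behavioral rules before the position is updated.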
 

 
The behavioral rules of each boid are determined through interaction with the trails of other boids. A pairwise comparison between a boid and each nearby trail point establishes the relationship between them and the resulting behavior. If the two advance in similar directions, they are considered friendly. If they advance in opposite directions, they are considered unfriendly. Otherwise, they ignore each other. The degree of similarity determines the force of attraction or repulsion between agents and trail points. As a result, friendly agents advance together as a group and unfriendly agents repel each other, avoiding collisions.
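
A minimal sketch of this pairwise comparison follows. The text does not specify how similarity is measured, so the sketch assumes the cosine of the angle between the boid's direction and the direction stored at the trail point, with a hypothetical `threshold` separating friendly, unfriendly and ignored relations. The function names and the simple summed force are likewise assumptions, not the project's implementation.

```python
import math


def classify(boid_dir, trail_dir, threshold=0.5):
    """Return (relation, strength) for two 2D direction vectors.

    Similarity is the cosine of the angle between the directions
    (an assumed measure); `threshold` is a hypothetical cut-off.
    """
    norm = math.hypot(*boid_dir) * math.hypot(*trail_dir)
    if norm == 0:
        return "ignored", 0.0
    cos = (boid_dir[0] * trail_dir[0] + boid_dir[1] * trail_dir[1]) / norm
    if cos >= threshold:
        return "friendly", cos       # similar directions: attract
    if cos <= -threshold:
        return "unfriendly", -cos    # opposite directions: repel
    return "ignored", 0.0


def steering_force(boid_pos, boid_dir, trail_points, radius, threshold=0.5):
    """Sum attraction and repulsion from trail points within the vision radius.

    `trail_points` is a list of ((x, y), (dir_x, dir_y)) pairs.
    """
    fx = fy = 0.0
    for (px, py), trail_dir in trail_points:
        dx, dy = px - boid_pos[0], py - boid_pos[1]
        dist = math.hypot(dx, dy)
        if dist == 0 or dist > radius:
            continue  # outside the radius of vision, or same location
        relation, strength = classify(boid_dir, trail_dir, threshold)
        if relation == "ignored":
            continue
        sign = 1.0 if relation == "friendly" else -1.0
        # Unit vector towards the trail point, scaled by similarity.
        fx += sign * strength * dx / dist
        fy += sign * strength * dy / dist
    return fx, fy
```

With these assumptions, a trail point heading the same way as the boid pulls the boid towards it, one heading the opposite way pushes it away, and a perpendicular one contributes nothing.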
 

Figure 3

Pairwise comparison between one boid and neighboring trail points. The black dot and the arrow in the center are the current boid and its direction. The gray dots are trail points left by other boids. The dashed circle is the radius of vision and the dashed lines are the relations with the trails where green and red lines represent friendly and unfriendly relationships, respectively. A gray dashed line connects a trail point that is ignored.


Results

 
In the swarm system each boid attempts to avoid boids with opposite directions, making paths with opposite directions distinguishable.
 

Figure 4

General flows


Since the boids in the system attempt to avoid static points, the supermarket nodes, encoded as white circles, remain clearly visible and do not visually interfere with the lines drawn by the swarming algorithm.
 

Figure 5

Flows zoomed in, displaying the metropolitan area of Lisbon.

Interactive Application

 
An interactive application allows the user to explore the data from different perspectives. In terms of functionality, the user can navigate on a timeline, and zoom and pan the map while filtering the data and switching between two modes of representation (static and animated).
 

Figure 6

Screen capture of the interactive application


 

Authors

Evgheni Polisciuc

Pedro Miguel Cruz

Catarina Maçãs

Hugo Amaro

Penousal Machado


Acknowledgements

This project was developed within a partnership with Sonae: Sonae Viz — Big Data Visualization for retail. Special thanks to Tiago Carvalho, Frederico Santos and João Amaral.