CroP – Coordinated Panels Visualization

 

CroP is a data visualization tool that uses multiple coordinated views to portray complex networks and multivariate data, in particular time-series. Through various dynamic visualization models, CroP aims to highlight and analyze patterns in data, such as relationships between groups of data points and significant shifts in values and behaviors over time.

 

While CroP was developed primarily for the analysis of biological datasets, such as gene expression time-series over protein-protein interaction networks, it can also load generic networks, temporal data and multivariate data.

 

Figure 1

Screenshot of CroP representing a clustered protein-protein interaction network embedded with a 24-hour gene expression profile of the HIV-1 virus.

 

Data visualizations are contained within panels that can be switched, resized and moved in order to adapt the workspace to the data being analyzed or to specific queries from the user. Multiple datasets can be uploaded and visualized simultaneously through different panels, while their data can be analyzed using various functionalities, including clustering and filters, and at different levels of detail through four visualization models:

 
 

Data Table  — The data table is the visualization that shows every data point at its lowest level, listing them in a sortable table where rows can be selected to access the properties of each individual element, or the aggregated properties of clusters of data points.

 

Figure 2

Different groups of data tables showcasing a selected gene and its profile (left), a list of genes within a cluster (middle) and the Gene Ontology properties of the genes within that cluster (right).

 
 

Network  — The network panel provides multiple layouts to position nodes based on either their direct relationships or their similarity in values, either between multiple variables or over time. When data is clustered, the network reacts dynamically and sorts nodes into visual groups while preserving their relationships. The mouse can also be used as a lens that aggregates the properties of every node that is brushed, making it easy to identify general properties and trends within the network.

 

Figure 3

Three network panels with different layouts. The Yifan Hu layout sorts nodes based on relationships (left), the t-SNE layout sorts nodes based on values (middle), and the clustering layout sorts nodes into groups (right).

 

Figure 4

Brushing nodes with the mouse lens on the network panel to analyze Gene Ontology properties, while the same nodes are selected in the data tables, which show an aggregated profile.

 
 

Time Curve  — The time curve panel provides visualizations and tools meant for the analysis of time-series datasets. A time curve consists of a timeline that is bent so that its time points are placed relative to each other according to their similarity, creating visualizations capable of portraying behaviors over time, such as cycles, periods of stagnation and instances of significant change. Temporal glyphs showcase the state of the data at each time step, and a supporting timeline graph portrays how the data changes over time. Moreover, the mouse can also be used as a lens to help identify the data points responsible for these behaviors.
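The idea of bending a timeline by similarity can be illustrated with a minimal sketch. It is not CroP's implementation: the function names, the Euclidean distance measure and the simple spring model below are all assumptions made for illustration. Each time point starts on a straight line and is then repeatedly nudged so that its on-screen distance to every other time point approaches the distance between their data snapshots.

```python
import math


def pairwise_distances(snapshots):
    """Euclidean distance between the data snapshots of every pair of time points."""
    n = len(snapshots)
    return [[math.dist(snapshots[i], snapshots[j]) for j in range(n)]
            for i in range(n)]


def stress(positions, target):
    """Sum of squared differences between layout distances and target distances."""
    n = len(positions)
    return sum((math.dist(positions[i], positions[j]) - target[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))


def time_curve_layout(snapshots, iterations=300, step=0.2):
    """Hypothetical force-directed layout: bend a linear timeline by similarity."""
    n = len(snapshots)
    target = pairwise_distances(snapshots)
    # Start from a straight timeline; the tiny alternating y offset lets
    # points leave the line (an assumption of this sketch).
    positions = [[float(i), 0.01 * (-1) ** i] for i in range(n)]
    for _ in range(iterations):
        for i in range(n):
            fx = fy = 0.0
            for j in range(n):
                if i == j:
                    continue
                dx = positions[j][0] - positions[i][0]
                dy = positions[j][1] - positions[i][1]
                dist = math.hypot(dx, dy) or 1e-9
                # Positive error: the points are too far apart, so pull together;
                # negative error: they are too close, so push apart.
                error = dist - target[i][j]
                fx += error * dx / dist
                fy += error * dy / dist
            # Move along the mean force so the step size does not grow with n.
            positions[i][0] += step * fx / max(n - 1, 1)
            positions[i][1] += step * fy / max(n - 1, 1)
    return positions


# Made-up cyclic data: the last time point returns to the first state.
snapshots = [[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]
for x, y in time_curve_layout(snapshots):
    print(f"{x:.2f}, {y:.2f}")
```

In a layout of this kind, time points with similar states are drawn toward each other, which is what lets cycles and periods of stagnation become visible as loops and dense clusters.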

 

Figure 5

Time curve panels showcasing three different layouts. Initially, time points are displayed linearly (left); a force-directed layout then bends this timeline based on their similarity, creating a Time Curve (middle). This layout can then be smoothed to create a more fluid visualization and remove visual noise (right).

 

Figure 6

Various visualizations of two datasets created through Time Paths, the smoothing layout.

 
 

Variables View  — The variables view panel portrays all the loaded variables as points in two-dimensional space, providing layouts that may help reveal patterns between their values within large and complex datasets. The functionalities of this panel are similar to those of the time curve panel, but for data that does not have an explicit order.

 

Figure 7

The initial layout of the variables view panel displays each variable as a node in a grid (left), which can then be sorted by similarity of values with the t-SNE layout (middle). The mouse lens is also available in this panel, showing differences in variables between various clusters (right).

 
 

Interactions within each panel are coordinated across panels, highlighting selected elements or groups in all the relevant visualizations in order to facilitate their analysis.
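One common way to achieve this kind of coordination is a shared selection model that every panel subscribes to. The sketch below is only an illustration of that pattern under assumed names (`SelectionModel`, `Panel`, `brush`); it does not describe CroP's internal architecture.

```python
class SelectionModel:
    """Hypothetical shared state: the set of currently selected element ids."""

    def __init__(self):
        self._selected = frozenset()
        self._listeners = []

    def subscribe(self, listener):
        """Register a callback that receives the selection whenever it changes."""
        self._listeners.append(listener)

    def select(self, ids):
        """Replace the selection and notify every subscribed panel."""
        self._selected = frozenset(ids)
        for listener in self._listeners:
            listener(self._selected)


class Panel:
    """Hypothetical panel that highlights whichever selected elements it displays."""

    def __init__(self, name, element_ids, model):
        self.name = name
        self.element_ids = set(element_ids)
        self.highlighted = set()
        self.model = model
        model.subscribe(self.on_selection)

    def on_selection(self, selected):
        # Highlight only the selected elements this panel actually shows.
        self.highlighted = self.element_ids & selected

    def brush(self, ids):
        # A user interaction in this panel updates the shared selection.
        self.model.select(ids)


# Made-up element ids for illustration.
model = SelectionModel()
table = Panel("data table", {"g1", "g2", "g3"}, model)
network = Panel("network", {"g2", "g3", "g4"}, model)
network.brush({"g3", "g4"})
print(sorted(table.highlighted), sorted(network.highlighted))
```

Because panels only talk to the shared model, a new panel type can join the coordination without the existing panels knowing about it.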

 

A save/load function is also available, allowing users to save all of their data, panels and layout parameters (including clustering) to a single file that can be loaded at any time to restore the workspace to the state in which it was saved, with minimal loading times.

 

Figure 8

Visualization of a gene expression dataset of Plasmodium falciparum, the agent responsible for human malaria.

 

Figure 9

Visualization of a dataset containing transcription profiles across the yeast cell cycle, depicting two cycles.

 
 

CroP is freely available for download below as an executable JAR, along with sample datasets. The supported data formats and available functionalities are detailed in the user manual.

 

Download Links

 

 

Sample Dataset Setup
  • Select “Load Network” and then “Network File (.csv)” to load the Human PPI network file;
  • Select “Load Time-Series” then “Time-Series File (.csv)” to load the HIV-1 time-series file;
  • Choose “Filter” to exclude genes that are not present in the network;
  • Visualizations can be dragged with the left mouse button and zoomed with the scroll wheel;
  • Cluster by variance and switch the number of clusters with the slider.
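
The filtering and clustering steps above can be pictured with a small sketch. It runs on made-up in-memory data rather than the sample files, and both the data layout and the variance-ranking scheme are assumptions: the actual file formats are described in the user manual, and CroP's own clustering by variance may work differently.

```python
from statistics import pvariance


def filter_to_network(series, edges):
    """Keep only the genes that appear as an endpoint of some network edge."""
    nodes = {gene for edge in edges for gene in edge}
    return {gene: values for gene, values in series.items() if gene in nodes}


def cluster_by_variance(series, n_clusters):
    """Rank genes by the variance of their profile and split the ranking
    into n_clusters groups of near-equal size (an assumed scheme)."""
    ranked = sorted(series, key=lambda gene: pvariance(series[gene]))
    clusters = [[] for _ in range(n_clusters)]
    for rank, gene in enumerate(ranked):
        clusters[rank * n_clusters // len(ranked)].append(gene)
    return clusters


# Hypothetical network edges and time-series profiles.
edges = [("A", "B"), ("B", "C")]
series = {
    "A": [1.0, 1.0, 1.0],
    "B": [0.0, 5.0, 10.0],
    "C": [1.0, 2.0, 3.0],
    "D": [9.0, 9.0, 9.0],  # not in the network, so "Filter" drops it
}

filtered = filter_to_network(series, edges)
print(cluster_by_variance(filtered, n_clusters=2))
```

Changing `n_clusters` plays the role of the slider: the same variance ranking is simply cut into more or fewer groups.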

 

Publication

  • A. Cruz, P. Machado, and J. P. Arrais, “CroP-Coordinated Panel visualization for biological networks analysis,” Bioinformatics, 2019.

  • A. Cruz, J. P. Arrais, and P. Machado, “Exploring Time-Series Through Force-Directed Timelines,” in 2020 24th International Conference Information Visualisation (IV), 2020.

Authors

António Cruz

Joel P. Arrais

Penousal Machado