MixMash, a Visualisation System for Musical Mashup Creation

MixMash is an interactive tool that assists users in creating music mashups, based on cross-modal associations between musical content analysis and information visualisation. Our point of departure is a harmonic mixing method for musical mashups by Bernardes et al. [1]. To overcome design limitations identified in the previous method, we propose a new interactive visualisation of multidimensional musical attributes (hierarchical harmonic compatibility, onset density, spectral region, and timbral similarity) extracted from a large collection of audio tracks. All tracks are represented as nodes in a force-directed graph, where the distances and edge connections between nodes indicate their harmonic compatibility. In addition, we provide a visual language that aims to enhance the tool's usability and foster creative endeavour in the search for meaningful music mixes.



The system behind MixMash uses a hierarchical harmonic mixing method that relies on indicators from the perceptually motivated Tonal Interval Space [2], which represents musical tracks as points in a 12-dimensional space. Two metrics for harmonic compatibility, small-scale (local) and large-scale (global), provide harmonic alignments between musical tracks. Small-scale harmonic compatibility indicates perceptual relatedness and consonance: the smaller the perceptual relatedness distance, the greater the affinity between two given tracks. Large-scale harmonic compatibility indicates the degree of association of a given sample with the musical keys [1][2]. Moreover, the hierarchical harmonic mixing method considers three additional dimensions that can help users define compositional goals in terms of rhythmic (onset density), spectral (region) and timbral qualities [1].


Visualisation Model


Figure 1

Screenshot of the visualisation interface. On the left, the interaction panel; at the centre, the graph.


To improve on the interface presented by Bernardes et al. [1], we implement a new visualisation and its respective interactions (Fig. 1). We aim to visualise a collection of tracks according to the user's preferences for harmonic, rhythmic, spectral and timbral attributes. Moreover, we allow the user to manipulate the forces of attraction and repulsion between nodes, which makes the more cluttered zones (groups of tracks with strong harmonic compatibility) easier to read. To avoid undesired clutter, we group compatible nodes into clusters. Each cluster is represented in the graph as a group of nodes that can be expanded or collapsed through interaction. In addition, we adopt a user-defined threshold value to constrain the number of connections between nodes, so that edges are drawn only between tracks with strong harmonic compatibility.
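The thresholding step can be illustrated with a minimal sketch. It assumes that compatibility is available as a symmetric distance matrix in which smaller values mean stronger compatibility; the function name `strong_edges`, the sample values and the threshold are hypothetical and are not taken from the MixMash implementation.

```python
from itertools import combinations


def strong_edges(distances, threshold):
    """Return (i, j, distance) for every node pair at or below the threshold.

    Assumption: `distances` is a symmetric matrix where a smaller value
    means a stronger harmonic compatibility between two tracks.
    """
    n = len(distances)
    return [
        (i, j, distances[i][j])
        for i, j in combinations(range(n), 2)
        if distances[i][j] <= threshold
    ]


# Hypothetical distances among four tracks (illustrative values only).
distances = [
    [0.0, 0.2, 0.9, 0.5],
    [0.2, 0.0, 0.4, 0.8],
    [0.9, 0.4, 0.0, 0.3],
    [0.5, 0.8, 0.3, 0.0],
]

# A user-defined threshold keeps only the strongly compatible pairs.
edges = strong_edges(distances, threshold=0.4)
print(edges)
```

Lowering the threshold removes edges and thins out the graph; raising it shows more, but weaker, connections.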


Graphic Variables


Figure 2

Audiovisual mappings.


Regarding the audiovisual mappings, each track's onset density, spectral region, and timbral attributes are mapped to a corresponding visual variable. The spectral region and onset density are each subdivided into three levels of magnitude. Onset density is represented by the fill of a shape: low density corresponds to an empty shape (only the contour is visible), medium density to a half-filled shape, and high density to a completely filled shape (Fig. 2b). Spectral region is represented by shape and colour, ranging from low frequencies (circle and orange) to high frequencies (triangle and blue) (see Fig. 2a and 2c).
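As a rough illustration of these mappings, the sketch below assigns a fill, shape and colour to a node. It assumes both attributes are normalised to the range 0 to 1 and split at hypothetical cut points of 1/3 and 2/3; the text does not state the actual cut points, nor the shape and colour of the middle spectral level, which is therefore left as "unspecified".

```python
# Fill per onset-density level (low, medium, high), as described in the text.
FILL_BY_ONSET_LEVEL = ("empty", "half", "full")

# Shape and colour per spectral level. Only the low and high ends are given
# in the text; the middle entry is deliberately left unspecified.
GLYPH_BY_SPECTRAL_LEVEL = (
    ("circle", "orange"),
    ("unspecified", "unspecified"),
    ("triangle", "blue"),
)


def magnitude_level(value, low_cut=1 / 3, high_cut=2 / 3):
    """Bin a normalised value into level 0, 1 or 2 (cut points are assumed)."""
    if value < low_cut:
        return 0
    if value < high_cut:
        return 1
    return 2


def node_glyph(onset_density, spectral_value):
    """Return the visual variables for one track node."""
    fill = FILL_BY_ONSET_LEVEL[magnitude_level(onset_density)]
    shape, colour = GLYPH_BY_SPECTRAL_LEVEL[magnitude_level(spectral_value)]
    return {"fill": fill, "shape": shape, "colour": colour}


# A sparse, bright track and a dense, low-register track.
print(node_glyph(onset_density=0.1, spectral_value=0.9))
print(node_glyph(onset_density=0.9, spectral_value=0.1))
```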


Figure 3

Timbre representations: a) timbre representation of a track; b) representation of two tracks with similar timbres; c) representation of a track (bottom) with more than one timbre; d) track with no similar timbres.


Timbre is represented by a coloured circle on top of the nodes (see Fig. 3) and is only visualised when a node is selected. All tracks with a similar timbre also gain a circle in the same colour as that of the selected node. Lastly, to differentiate the nodes representing keys from those representing audio tracks, we colour the outline of the key nodes in red and represent the key's tonic pitch with typography (see Fig. 4).

Figure 4

Two clusters of different sizes.


Force-directed Graph

The force-directed graph is based on the work of Jacomy et al. [3] and represents the small-scale and large-scale harmonic compatibility distance metrics. In this layout, nodes represent both the musical audio tracks and each of the 24 major and minor musical keys. To distinguish the two types of nodes, we use different visual representations (discussed above). The computed harmonic compatibilities form a matrix of distances among tracks and keys. From this matrix, we define the weight of each edge, and therefore the force of attraction between nodes. The more similar two nodes are, the higher the force of attraction between them, and consequently the closer they are placed. Additionally, by representing the musical keys as nodes, and consequently their relation to the tracks, we also show the most probable key for each music track.
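The step from distances to attraction forces can be sketched as follows. The linear inversion used here (weight = 1 - distance / maximum distance) is an assumption chosen for illustration, since the text does not state the actual mapping; `attraction_weights`, `most_probable_key` and all sample values are likewise hypothetical.

```python
def attraction_weights(distances):
    """Map distances to weights in [0, 1]: the smaller the distance,
    the larger the weight (assumed linear inversion)."""
    d_max = max(max(row) for row in distances)
    if d_max == 0:
        # All nodes coincide, so every pair attracts maximally.
        return [[1.0 for _ in row] for row in distances]
    return [[1.0 - d / d_max for d in row] for row in distances]


def most_probable_key(track_to_key_distances, key_names):
    """Return the name of the key node closest to a track."""
    nearest = min(range(len(key_names)), key=lambda k: track_to_key_distances[k])
    return key_names[nearest]


# Hypothetical distances among three nodes.
distances = [
    [0.0, 2.0, 4.0],
    [2.0, 0.0, 1.0],
    [4.0, 1.0, 0.0],
]
weights = attraction_weights(distances)
print(weights[1][2])  # strongest attraction among the distinct pairs

# Hypothetical distances from one track to three of the 24 key nodes.
key_names = ["C major", "A minor", "G major"]
print(most_probable_key([0.6, 0.3, 0.5], key_names))
```

In a layout such as ForceAtlas2 [3], weights like these would scale the attraction along each edge, so a track settles closest to the key node it is most strongly associated with.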

We use an agglomerative clustering algorithm to aggregate the nodes by their compatibility values. We also connect these clusters to adjacent nodes, which are retrieved from the adjacency lists of the nodes contained in each cluster. The attraction force between a cluster and an adjacent node is equal to the average of all forces between the cluster's "inner" nodes and that "outer" node.
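A minimal sketch of both steps is given below. The text does not specify the linkage criterion or the stopping rule, so average linkage with a hypothetical `max_distance` cut-off is assumed; the sketch also assumes that the average is taken only over the inner nodes that are actually connected to the outer node. All names and values are illustrative.

```python
def agglomerate(distances, max_distance):
    """Merge the closest clusters (average linkage, assumed) until no pair
    of clusters is within `max_distance` of each other."""
    clusters = [[i] for i in range(len(distances))]

    def linkage(a, b):
        return sum(distances[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best, x, y = min(
            (linkage(a, b), x, y)
            for x, a in enumerate(clusters)
            for y, b in enumerate(clusters)
            if x < y
        )
        if best > max_distance:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


def cluster_forces(cluster, adjacency, force):
    """Average force between a cluster's inner nodes and each adjacent
    outer node, taken over the existing inner-to-outer edges."""
    inner = set(cluster)
    outer = {n for i in cluster for n in adjacency[i]} - inner
    averaged = {}
    for o in sorted(outer):
        linked = [force[i][o] for i in cluster if o in adjacency[i]]
        averaged[o] = sum(linked) / len(linked)
    return averaged


# Hypothetical distances, adjacency lists and pairwise forces for four nodes.
distances = [
    [0.0, 0.2, 0.9, 0.5],
    [0.2, 0.0, 0.4, 0.8],
    [0.9, 0.4, 0.0, 0.3],
    [0.5, 0.8, 0.3, 0.0],
]
adjacency = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
force = [
    [0.0, 1.0, 0.5, 0.0],
    [1.0, 0.0, 1.0, 0.25],
    [0.5, 1.0, 0.0, 0.0],
    [0.0, 0.25, 0.0, 0.0],
]

clusters = agglomerate(distances, max_distance=0.35)
print(clusters)
print(cluster_forces([0, 1], adjacency, force))
```

Averaging keeps the pull of a collapsed cluster comparable to that of a single node, rather than letting it grow with the number of nodes the cluster contains.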



[1] G. Bernardes, M. E. P. Davies, and C. Guedes, “A hierarchical harmonic mixing method”, Lecture Notes in Computer Science: Post-proceedings of the CMMR’17 Conference, in press.
[2] E. L. Mudge, “The common synaesthesia of music”, Journal of Applied Psychology, vol. 4, no. 4, pp. 342–345, 1920.
[3] M. Jacomy, T. Venturini, S. Heymann, and M. Bastian, “ForceAtlas2, a continuous graph layout algorithm for handy network visualization designed for the Gephi software”, PLoS ONE, vol. 9, no. 6, 2014.


To appear in

Catarina Maçãs, Ana Rodrigues, Gilberto Bernardes, and Penousal Machado. MixMash: A visualisation system for musical mashup creation. In 22nd International Conference Information Visualisation, 2018.