Tuesday 24 November 2015

Interactive DataViz: Rock albums by the genre since 1960


Interactive DataViz here: http://wiki-rock.azurewebsites.net/top10-album-genres.html
Last week I presented a talk at #BuildStuffLT titled “From Power Chords to the Power of Models”, which was a study of Rock Music by way of Data Mining, Mathematical Modelling and Machine Learning. It is such a fun subject to explore, especially for me, since Rock Music has been one of my passions since I was a kid.

The slides from the talk are available and the videos will be available soon (although my performance during the talk was suboptimal due to lack of sleep, a problem which seems to have been shared by many at the event). BuildStuffLT is a great event, highly recommended if you have never been. It is a software conference with well-known speakers such as Michael Feathers, Randy Shoup, Venkat Subramaniam and Pieter Hintjens, and this year it hosted Melvin Conway (yeah, the visionary who came up with Conway’s law in 1968) with really mind-stimulating talks. You also get a variety of other speakers with very interesting talks.

I will be presenting my talk at CodeMash 2016 so I cannot share all of the material yet, but I think this interactive DataViz alone is worth many, many slides in a single representation. I can see myself spending hours just looking at the trends and artist names and their album covers - yeah, this is how much I love Rock Music and its history - but even for others this could be fun and could also help you discover some new music to listen to.

DataViz

This is an interactive percentage-based stacked area chart of the top 10 genres in each year since 1960, when Rock Music as we know it started to appear. That is a mouthful, but basically for every year the top 10 genres are selected, so the dataset contains only those Rock (or related) genres that at some point were among the top 10 genres. You can access it here or simply clone the GitHub repo (see below) and host your own.
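
To make the "top 10 per year, as percentages" idea concrete, here is a minimal sketch in Python. It is not the code behind the chart: the function name, the (year, genres) input shape and the tie-breaking are my assumptions for illustration only. It counts albums per genre per year, keeps the top N genres of each year, and expresses each as a share of that year's top-N total. A genre that made the top N in some other year simply gets 0% here.

```python
from collections import Counter, defaultdict


def top_genre_shares(albums, top_n=10):
    """Return {year: {genre: percentage}} for the top_n genres of each year.

    `albums` is an iterable of (year, genres) pairs. This input shape is a
    hypothetical one chosen for the sketch, not the published data format.
    """
    counts = defaultdict(Counter)
    for year, genres in albums:
        # An album counts once for each distinct genre it lists.
        counts[year].update(set(genres))

    # Ties at the cut-off are broken by Counter.most_common's ordering.
    top_by_year = {year: dict(c.most_common(top_n)) for year, c in counts.items()}

    # Genres that made the top_n in at least one year become the chart series.
    series = sorted({g for top in top_by_year.values() for g in top})

    shares = {}
    for year, top in top_by_year.items():
        total = sum(top.values()) or 1  # guard against a year with no genres
        shares[year] = {g: 100.0 * top.get(g, 0) / total for g in series}
    return shares


# Tiny made-up sample, with top_n=1 to keep the result easy to read.
sample = [
    (1969, ["Blues rock", "Hard rock"]),
    (1969, ["Blues rock"]),
    (1977, ["Punk rock"]),
]
shares = top_genre_shares(sample, top_n=1)
```

With this sample, "Hard rock" never makes the cut and so is not a series at all, while "Punk rock" is a series everywhere but sits at 0% in 1969.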


The data was collected from Wikipedia by capturing Rock albums and then processing their genres, finding the top 10 in every year and presenting them in a chart - I am using Highcharts, which is really powerful and simple to use and has a non-commercial license too. I have shared the data itself so you can build your own DataViz if you want to. The license for the data is of course Wikipedia’s, which covers these purposes.



I highly recommend you start with the visualisation with “All Unselected” (Figure 2) and then select a genre and visualise its rise and fall through history.


Then you can click on a point (year/genre) to list all albums of that genre for that year (Figure 3). Please note that even when the chart shows 0%, there could be some albums for that genre - they come from a year in which that genre was not among the top 10 genres.

Looking at the data in a different way

Here are 50 years of Rock (starting from 1965) with the selected albums:



Things to bear in mind

  • The data was gathered by traversing from the list of rock genres to the artist pages and capturing all albums for all links found in those documents. As far as I know, the list includes all albums by the major (and minor) rock artists - according to Wikipedia. If you find a missing album (or artist), please let me know.
  • Every album contributes all of its genres to the list. This means that if it has the genres “Blues Rock” and “Rock”, it will be counted once for each of its genres and you can find it whether you look at the Rock or the Blues Rock genre.
  • The data has some oddities: sometimes an album occurs more than once, mainly due to nuances of the data in Wikipedia (there are multiple entries (URLs) for the same document, etc.). The data has already been cleansed through many processes and these oddities do not materially change the results. In the future, however, there are things that can be done to remove these remaining oddities.
  • Again, it is highly recommended that you click the “Unselect All” button, then click on the genres that you are interested in one by one and explore the names of the albums.
  • Clicking “Select All” or “Unselect All” takes a bit too much time. I am sure it has an easy solution (turn rendering off when changing the state) but I have not been able to find it. I am expecting your PRs!
  • There are some genres in the list which are not really Rock genres. These genres would have been mentioned alongside a rock genre on the album cover, or come from a not-so-much-rock album by an otherwise Rock artist.
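
Two of the points above, duplicate albums and one album showing up under several genres, can be illustrated with a small Python sketch. This is only one possible approach under my own assumptions, not the cleansing that was actually run on the data: the record fields (artist, title, year, genres, url) and the choice of (artist, title, year) as the duplicate key are hypothetical.

```python
from collections import defaultdict


def dedupe_albums(albums):
    """Keep the first record per (artist, title, year), ignoring case and spacing.

    Assumption: two records with the same normalised key are the same album
    reached through different URLs.
    """
    seen = set()
    unique = []
    for album in albums:
        key = (
            album["artist"].strip().lower(),
            album["title"].strip().lower(),
            album["year"],
        )
        if key not in seen:
            seen.add(key)
            unique.append(album)
    return unique


def index_by_year_and_genre(albums):
    """Map (year, genre) to album titles; an album is listed under every genre it has."""
    index = defaultdict(list)
    for album in albums:
        for genre in set(album["genres"]):
            index[(album["year"], genre)].append(album["title"])
    return index


# Placeholder records (not real data): the same album reached via two URLs.
records = [
    {"artist": "Example Band", "title": "Example Album", "year": 1969,
     "genres": ["Blues Rock", "Rock"], "url": "/wiki/Example_Album"},
    {"artist": "example band ", "title": "Example album", "year": 1969,
     "genres": ["Blues Rock", "Rock"], "url": "/wiki/Example_Album_(album)"},
]
unique = dedupe_albums(records)
index = index_by_year_and_genre(unique)
```

After de-duplication the album appears once, but it is still listed under both (1969, "Blues Rock") and (1969, "Rock"), which matches what you see when clicking a point on the chart.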

Code and Data

All code and data are published on GitHub. The code uses Highcharts, Knockout.js and the Foundation UI framework. Have fun!