Sean Miller recently shared the raw data he has used for visualisations on his blog, and I spent much of my day today working with it while waiting for engineers to replace a defunct boiler which died on the coldest day of the year. It’s all good – the new boiler has a cool gadgety digital thermostat that I can control with an App, that’s worth the cost (isn’t it?!)

One of the data sources tracked the attributes of various Metallica songs. My final visualisation was pretty cool, but it wasn’t the first thing I had in my mind. What I wanted to initially create was some sort of gantt navigation to filter the attributes associated with the tracks from each album, but I couldn’t figure it out.

The reason for this is that most gantt charts that I have seen created in Tableau are based on nice, neat datasets where you can use existing fields to calculate the size of the gap between data points. What if your dataset doesn’t contain such handy fields?

A perk of our corporate partnership with The Information Lab is that I can raise tickets online with their expert consultants and get rapid replies. I reached out today and Emily Chen nailed the solution really quickly, so I wanted to write about it to absorb the information.

Here’s the data:

screen-shot-2016-12-28-at-21-34-06

One release date per album, and (obviously) multiple tracks per album. I just wanted to plot a basic gantt showing the duration between album releases. I started with this:

screen-shot-2016-12-28-at-21-36-50

And that is where I stopped! I knew that I needed to find a way to identify the “gap” between album releases, but the long/tall data structure and my lack of experience had me stumped. In stepped Emily with a two stage solution. The first thing to do is work out when the next album is released, assuming that your albums are sorted in ascending order by release date:

screen-shot-2016-12-28-at-21-39-32

I’m getting to grips with LOOKUP() now, but let’s just cement that tentative learning with a reminder:

screen-shot-2016-12-28-at-21-40-21

OK – find something offset by a set number of rows. Nested within the calc is DATEPART, and I’m not overly familiar with that:

Screen Shot 2016-12-28 at 21.41.55.png

OK. So the calc converts the Year component of the album Release Date to an integer. Within the context of the LOOKUP(), I’m expecting the calculation to pull through the Year component of the next released album. Emily even created a table in her shared solution to show this:

Screen Shot 2016-12-28 at 21.45.13.png

Where the [Gap] calculation is set to Compute Using Year of Release Date and Album, the result is shown in the [Gap] column. Just to labour a point, lets see what happens if I offset the rows by 2, rather than 1:

Screen Shot 2016-12-28 at 21.49.26.png

So the enticingly named ‘Kill ‘Em All’ jumps down two rows to ‘Master of Puppets’ to return 1986 in [Gap], rather than the single row leap to 1984 for ‘Ride The Lightning’ where the offset is 1. That’s how LOOKUP() works. With the calculation reverted back to the correct offset of one, the second stage of the process can be done: another calculation:

screen-shot-2016-12-28-at-21-51-34

[Gap] has determined our end point, so all that remains is to calculate the difference between that end point and the Year of the current album. This calculation delivers that, and can then be dropped on the Size card as a Continuous Measure (it has to be Continuous in order to draw the full “duration” of the elapsed time between albums):

Screen Shot 2016-12-28 at 21.53.49.png

Again, Compute Using refers to Year of Release Date and Album. That’s how Emily fixed my problem, and it’s hopefully how I’ll remember this use case for LOOKUP() in future.

Advertisements