#TakeapartTuesday posts have been exceedingly rare of late, so I’m making a vaguely committed statement henceforth:
It was a unique take on a dataset which inspired great creativity across the community, so let’s see what sorcery Sarah employed to create those lines.
First of all, we need to understand what Sarah has plotted: the x-axis shows the duration of paid maternity leave in weeks, whilst the y-axis depicts the average payment rate (i.e. what percentage of “full” pay was honoured during the maternity leave period). This explains why we have a mix of steady horizontal lines, disrupted by reductions in the payment rate as the duration of maternity leave extends over time.
With that in mind, it soon becomes clear that to create this visualisation, Sarah has had to do some data preparation to get the source data into the right shape. The original source is arranged like this:
Yes – that is Excel. I’m sorry.
However – what that image highlights is that the source data is aggregated to one row per Country, and that isn’t too helpful if you’re trying to plot a continuous line depicting payment rates over time.
Data prep is a weakness of mine (I’m lucky to work alongside colleagues eminently more competent at “technical stuff” than I am), so to recreate Sarah’s viz, I’m sticking with Excel. The aim is to create a row of data per paid week of maternity leave, per Country. If I ensure that I include the correct [Paid maternity leave avg payment rate (%)] for the week, then we should be good to go.
Let’s use Bulgaria as an example. We can see from the image of Sarah’s viz that Bulgaria offers 59 weeks of paid leave, with the first 46 weeks at 100% before reducing to 78% thereafter. Can we validate that in the source data?
It’s clear that 46 weeks are at full pay, but is the 78.4% the overall average for the 59 weeks, or just for the period between weeks 47 and 59? The data dictionary on data.world didn’t help, nor did a general Google search. In order for the average pay rate for 59 weeks to be 78.4%, the pay rate after week 46 would need to be about 3%, which seems pointless. In the absence of certainty, I’ll mimic Sarah’s approach of reverting to the average payment rate after the period of full payment. So – Bulgaria looks like this:
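For anyone who would rather script this step than type it into Excel, here is a minimal Python sketch of the reasoning above. It is an illustration, not Sarah's method: it uses the rounded Bulgaria figures quoted in this post (59 weeks, 46 at full pay, 78.4%), and the `Week` field name and both function names are my own assumptions. The first function checks what the later rate would have to be if 78.4% were the overall average; with these rounded inputs it comes out at a low single-digit percentage, and the exact figure depends on how the inputs are rounded. The second builds one row per paid week.

```python
def implied_later_rate(total_weeks, full_pay_weeks, overall_avg_rate):
    """Rate (%) the remaining weeks would need for the overall average to hold."""
    remaining_weeks = total_weeks - full_pay_weeks
    total_points = total_weeks * overall_avg_rate   # weeks x % over the whole period
    full_pay_points = full_pay_weeks * 100.0        # contribution of the full-pay weeks
    return (total_points - full_pay_points) / remaining_weeks


def weekly_rows(country, total_weeks, full_pay_weeks, later_rate):
    """One dict per paid week: full pay first, then the lower rate."""
    rows = []
    for week in range(1, total_weeks + 1):
        rate = 100.0 if week <= full_pay_weeks else later_rate
        rows.append({"Country": country, "Week": week, "Rate": rate})
    return rows


# Rounded Bulgaria figures as quoted in the post.
check = implied_later_rate(total_weeks=59, full_pay_weeks=46, overall_avg_rate=78.4)
bulgaria = weekly_rows("Bulgaria", total_weeks=59, full_pay_weeks=46, later_rate=78.4)

print(f"Implied rate after week 46 if 78.4% were the overall average: {check:.1f}%")
for row in bulgaria[44:48]:  # weeks 45 to 48, either side of the drop
    print(row)
```

Writing `bulgaria` out to a CSV would give the same one-row-per-week shape as the manual Excel version.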
With this as a test sample, I can now demonstrate how this all hangs together in Tableau – it’s nice and simple:
In the final visualisation, Sarah leverages the LAST() table calc to draw a dot at the end of each line, to make it clear where each country’s paid maternity periods end – useful for when multiple countries start out with a similar payment rate. To achieve this, it’s a basic table calculation:
If the value is the last one across the table, then return the Sum of the [Rate] measure; otherwise, return nothing.
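To make the logic concrete, here is a rough Python equivalent of that rule, purely as an illustration. In Tableau, LAST() returns the number of rows between the current row and the last row in the partition, so the final row is the one where it equals 0. The sketch below mimics that by finding each country's final week; the `Dot` and `Week` names, the function name and the sample rows are all hypothetical.

```python
def add_dot_column(rows):
    """Copy rows, adding 'Dot' = Rate on each country's final week, else None."""
    last_week = {}
    for row in rows:
        country = row["Country"]
        last_week[country] = max(row["Week"], last_week.get(country, 0))
    return [
        {**row, "Dot": row["Rate"] if row["Week"] == last_week[row["Country"]] else None}
        for row in rows
    ]


# Hypothetical three-week sample, only to show the shape of the result.
sample = [
    {"Country": "A", "Week": 1, "Rate": 100.0},
    {"Country": "A", "Week": 2, "Rate": 100.0},
    {"Country": "A", "Week": 3, "Rate": 60.0},
]
dotted = add_dot_column(sample)
```

Only the final week carries a value, which is why a single dot appears at the end of each line. In Tableau itself none of this is needed; the table calculation does the work.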
Once that is dropped on Rows, converted to a Dual Axis and synchronised, this is the end result:
Minor tweaks to the size of the dot and the colour of the two chart elements make things a lot tidier. With that achieved, you have the confidence to extend the data prep to every Country, which allows you to create this worksheet:
Well on track now, and just a bit of formatting to jazz things up. As part of her data preparation stage, Sarah took the time to assign each Country to a Region, and that Region is used to colour the lines. In addition to this, the “Dot” is used to host a label showing the Country name:
Note [Region] on Colour, and [Country] on Label. Simple.
Also note the bespoke data prep Sarah applied to make the United States appear as it does. Ordinarily the US wouldn’t feature in the viz, as there are zero weeks of paid maternity leave in America. To emphasise how inconsistent this is with the other countries in the dataset, Sarah deliberately padded the American data to force Tableau to plot 59 weeks (matching the Bulgarian paid maternity leave duration) at 0%.
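That padding can be sketched in a few lines of Python. This is a guess at the shape of the rows, not Sarah's actual prep; the `Week` field and the function name are my own assumptions.

```python
def pad_zero_leave(country, pad_weeks):
    """One row per week at a 0% rate, so the country still draws a flat line."""
    return [
        {"Country": country, "Week": week, "Rate": 0.0}
        for week in range(1, pad_weeks + 1)
    ]


# 59 weeks matches Bulgaria's paid leave duration, as described in the post.
us_rows = pad_zero_leave("United States", pad_weeks=59)
```

Without these rows there would be nothing for Tableau to draw, since a country with zero paid weeks produces zero rows in the one-row-per-week layout.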
Next is the layer of detail provided by annotating marks. No great science behind this, it’s just a question of identifying key points to draw the eye of the audience to:
There are no fancy Tableau shenanigans involved with annotating data – it’s just a question of taking the time to explore your data and selecting messages to emphasise to your audience.
One final neat touch in the viz is the colour legend, which can be created via a separate worksheet:
Whack it all together in a dashboard with a title and subtitle to reinforce the intended messaging behind the analysis, and you have a strong visualisation. Final flourishes were a highlight action driven by the colour legend, plus the general [Country] Highlighter tool.
In this #TakeapartTuesday it became clear to me that there is great value in stepping back from the supplied data and thinking of an end product that you want to create. You don’t have to be constrained by the shape of the original data. Sarah’s viz was inspired by some training she’d had in the previous week, so she thought about how to reshape her data to support the delivery of that viz.
The Tableau end of things was pretty straightforward, and I’m sure there are better ways to shape the data than my manual Excel approach (Alteryx? Tableau Prep?), but I have no idea how that can be done!