This year for #MakeoverMonday, we’ve been treated to weekly recaps of the best submissions from the prior week, plus a series of important reminders about best practice. These best practice comments range from design tips about overuse of colour or imagery, to fundamental mathematical principles, such as not using an average of an average. Whoops!
Submission number one committed the cardinal sin. Part of me isn’t too ashamed about this. In fact, if I had made it clear that the original viz was based on unweighted averages, I don’t think things would have been too bad. But I didn’t, so they were.
In fact, without a gentle nudge from Eva Murray to reconsider my viz, I might never have given it a second thought – it would mentally have been filed in my “done” category. It didn’t tell a story, but not all visualisations need to. It made a point (albeit an incorrect one), and I like my clean and basic presentation style, so I was generally satisfied.
With Eva’s keyboard strokes ringing in my ears, I decided to remedy this oversight. I supplemented my data again so I could pull in population (I’d already created an Excel file so I could aggregate Countries to Regions) and therefore derive a weighted average.
Plan A was just to reissue the viz based on the new weighted average and to write it off as one of those days. However, I can remember Andy Kriebel writing this post back in January, so I wanted to reissue something which showed a range of metrics:
- Unweighted average – just the average of the default [Internet users per 100 people] field
- Weighted average – [SUM([Internet Users])/SUM([Population])*100]
- Median – [MEDIAN([Internet users per 100 people])]
- And I then added in a pane-level global median reference line for overall context
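To make the difference between the three measures concrete, here is a minimal Python sketch of the same calculations outside Tableau. The country names and figures are made up purely for illustration (they are not from the #MakeoverMonday data set), and the `summarise` and `users_per_100` helpers are hypothetical names of my own. It assumes each row carries a raw internet user count and a population, mirroring the [Internet Users] and [Population] fields used above.

```python
from statistics import mean, median

# Hypothetical figures for illustration only, NOT the real data set:
# (country, internet users, population)
region = [
    ("Country A", 9_000_000, 10_000_000),  # large country, high usage
    ("Country B", 50_000, 100_000),        # small country, medium usage
    ("Country C", 20_000, 100_000),        # small country, low usage
]


def users_per_100(users, population):
    """Internet users per 100 people for a single country."""
    return users / population * 100


def summarise(rows):
    """Return the three region-level measures discussed in the post."""
    rates = [users_per_100(users, pop) for _, users, pop in rows]
    total_users = sum(users for _, users, _ in rows)
    total_population = sum(pop for _, _, pop in rows)
    return {
        # Average of the per-country rates: every country counts equally
        "unweighted_average": mean(rates),
        # SUM(users) / SUM(population) * 100: every person counts equally
        "weighted_average": total_users / total_population * 100,
        # Middle per-country rate: less sensitive to outlying countries
        "median": median(rates),
    }


summary = summarise(region)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

With these invented numbers the one large country dominates the weighted average, pulling it well above both the unweighted average and the median, which is exactly the kind of gap that an average of averages hides.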
The end result is a visualisation which is a little busy with multiple marks and reference lines, but it reasonably conveys the variability of the metrics bulleted above. You can see that sorting in descending order by the median 2015 value resulted in a rearrangement of regions, compared to the first viz.
By plotting the various measures, you can see the spread of the results in Australia and Oceania. In this instance, it is down to the inherent variability of the data, in what is a comparatively small region with some clear outliers:
The final submission can be downloaded here, or an image is below: