Another week, another challenge. The week 39 data source can be located on the new #MakeoverMonday site here.
The lesson this week was to slow things down a bit and check out the data before diving in at the deep end. As usual, I dived in...
This week's dataset related to global peach production, a subject matter close to many people’s hearts. At first glance, the data revealed an astonishing rate of growth in Chinese peach production. I had a story, and that was to be the central part of my viz!
All good stories need a beginning, a middle, and an end, so I consciously divided my visualisation up into distinct parts. First up was the contextual “beginning” – how is global peach production developing over time?
A typically simple start. When I think of trends over time, I think of line charts. I lack the creative flair to diversify wildly from those gut instincts, so that’s what I create! However, there is one point of note: Why have I got a dual-axis of Year on Columns? Simply to put the axis labels at the top of the chart, rather than at the bottom. If I made Year discrete, then I would get this:
So we have some label rotation issues, placement problems, plus I lose the ability (that I know of) to control the number of units between marks and the tick origin. I can easily use the old Analysis > Table Layout > Advanced option to move the labels to the top of the chart, but that doesn’t resolve the issue of tick mark control, hence the decision to go Continuous and dual-axis.
Why go to all that trouble? One thing that resonated with me in all of my reading to date was the tip that, in the Western world at least, we tend to read in a “Z shape” – i.e. we start at the top left, scan left to right along the line of text, and then go to the leftmost position on the next line. Why is that important? Because if that is how we instinctively read, it makes sense to put key reference points like titles, labels and legends where the eye will reach them first: at the top and to the left.
Job one done. Barring that minor dip in 2012, peach production is on the up. Onto the “middle” of the story. I knew that China was looking like a major producer of peaches, so I decided to plot the year on year growth in production, just to see if that identified a specific point from which an upwards trend was apparent. To do this, more dual-axis shenanigans, but principally it’s just a bar chart with a basic table calculation:
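For anyone following along outside Tableau, here is a minimal Python sketch of what a year-on-year table calculation does. The production figures are invented for illustration, `year_on_year` is a hypothetical helper name, and since I haven't said above whether the calc was a plain difference or a percent difference, the sketch returns both.

```python
# Invented production figures, purely for illustration (not the real data).
production = {2008: 100, 2009: 110, 2010: 121, 2011: 115}

def year_on_year(series):
    """Return {year: (difference, percent_difference)} versus the previous year.

    The first year has no predecessor, so it gets no entry, which mirrors
    the empty first mark a difference table calculation produces.
    """
    years = sorted(series)
    result = {}
    for prev, curr in zip(years, years[1:]):
        diff = series[curr] - series[prev]
        result[curr] = (diff, diff * 100 / series[prev])
    return result

yoy = year_on_year(production)
for year, (diff, pct) in yoy.items():
    print(year, diff, round(pct, 1))
```

A negative difference (2011 in the invented data) is what would show up as a bar below the zero line.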
So at a high level at least, it achieves the objective, as it is clear that year on year growth has been pretty consistent since the 1980s, with a marked upturn from the 1990s. One notable addition to the worksheet is the inclusion of the “China” calculation on colour, which is simply:
If Country in the data is equal to ‘China, mainland’, the boolean returns TRUE, or 1, else it’s FALSE, or 0. Whacking that on colour allows me to isolate China from everything else. I deliberately coloured China red throughout the dashboard, as that’s the main colour on the Chinese flag. You’ll also note the specific reference to “China, mainland”, rather than China itself. Why?
The data included dimension members of “China”, “China, mainland” and “China, Taiwan Province of”. “China” was the sum of the other two members, so including all three was effectively double-counting peach production in China.
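As a rough sketch of those two ideas in plain Python rather than Tableau: the colour flag, and the double-counting that the aggregate member causes. The tonnages are invented, “Other country” is a placeholder, and `is_china` is a hypothetical name rather than the calculation from the workbook.

```python
# Invented tonnages for a single year; only the China member names come from the data.
rows = [
    ("China", 105),                      # aggregate of the two rows below
    ("China, mainland", 100),
    ("China, Taiwan Province of", 5),
    ("Other country", 20),               # placeholder member
]

def is_china(country):
    """Boolean flag for colour: True only for the mainland member."""
    return country == "China, mainland"

flags = {country: is_china(country) for country, _ in rows}

# The aggregate member equals the sum of its two component members...
by_country = dict(rows)
parts = by_country["China, mainland"] + by_country["China, Taiwan Province of"]
aggregate_matches_parts = by_country["China"] == parts

# ...so summing every row counts that production twice.
naive_total = sum(tonnes for _, tonnes in rows)

print(flags["China, mainland"], flags["China"])  # True False
print(aggregate_matches_parts)                   # True
print(naive_total)                               # 230
```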
My initial submission failed to take that into account, which is annoying because I can remember reading up on Taiwan to see whether it should be grouped with China or not. When Googling “Is Taiwan part of China”, you get this:
Unclear. The short answer is de facto, separate, but in terms of formal recognition by other governments, part of China. Both the ROC and PRC governments claim that they govern all of China (including both the mainland and Taiwan), but the ROC only controls Taiwan and the PRC only controls mainland China.
I remedied this oversight by just setting a data source filter to exclude that aggregated total of the two component parts:
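The effect of that filter can be sketched in a few lines of Python. This is only an illustration with invented tonnages and a placeholder “Other country”; the real fix was a Tableau data source filter, not code.

```python
# The member that is already the sum of mainland and Taiwan.
AGGREGATE = "China"

# Invented tonnages for a single year.
rows = [
    ("China", 105),
    ("China, mainland", 100),
    ("China, Taiwan Province of", 5),
    ("Other country", 20),  # placeholder member
]

def exclude_aggregate(rows, aggregate=AGGREGATE):
    """Drop the aggregate row so each tonne is counted exactly once."""
    return [(country, tonnes) for country, tonnes in rows if country != aggregate]

filtered = exclude_aggregate(rows)
world_total = sum(tonnes for _, tonnes in filtered)

print(world_total)  # 125
```

The comparison is an exact match on purpose: a "starts with China" test would also throw away the two component members.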
Back to the story. Peaches up, growth in China. So what? How do the trends of China (mainland!) versus the rest of the world develop over time? Let’s bring line charts back to the fold:
Nothing complicated here. We’ve already discussed my fixation with positioning of reference points. The aim here was to explore whether the YoY growth of the previous chart was having a material impact on global peach production. It clearly was, with 2010 marking the first year in which Chinese peaches outnumbered their continental chums.
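The comparison behind that chart can be sketched as follows: sum production per year for mainland China and for everyone else, then look for the first year where China is ahead. All figures and the “Country A”/“Country B” members are invented, picked only so that the crossover lands in 2010 as described above; the function names are hypothetical.

```python
# Invented figures, chosen so the crossover falls in 2010 (not the real data).
rows = [
    (2008, "China, mainland", 80), (2008, "Country A", 50), (2008, "Country B", 45),
    (2009, "China, mainland", 90), (2009, "Country A", 50), (2009, "Country B", 46),
    (2010, "China, mainland", 100), (2010, "Country A", 51), (2010, "Country B", 46),
]

def split_by_china(rows):
    """Return {year: [rest_of_world, china]} summed from the country rows."""
    totals = {}
    for year, country, tonnes in rows:
        bucket = totals.setdefault(year, [0, 0])
        # The boolean flag doubles as an index: False -> 0, True -> 1.
        bucket[country == "China, mainland"] += tonnes
    return totals

def first_crossover(totals):
    """Return the first year in which China exceeds the rest, or None."""
    for year in sorted(totals):
        rest, china = totals[year]
        if china > rest:
            return year
    return None

totals = split_by_china(rows)
print(first_crossover(totals))  # 2010
```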
Now for the “end” of the tale. Lots of peaches, growth in China resulting in more Chinese peaches. So what does that ultimately mean? What is the final context – what proportion of peaches produced globally in 2012 were Chinese?
It makes sense to use a stacked bar or area chart when looking at developing proportions of total, and I opted for the area chart as it results in a smoother overall view of this trend. The only vague complexity here is the addition of a table calc, but it’s just an out-of-the-box one:
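For reference, a percent-of-total calculation computed within each year boils down to the sketch below. The tonnages are invented and `percent_of_total` is a hypothetical name; the point is only that each group is divided by the total for its own year, so the stacked areas always sum to 100%.

```python
# Invented tonnages for illustration only.
rows = [
    (2011, "China, mainland", 60),
    (2011, "Rest of world", 40),
    (2012, "China, mainland", 75),
    (2012, "Rest of world", 25),
]

def percent_of_total(rows):
    """Return {(year, group): share of that year's total, in percent}."""
    totals = {}
    for year, _, tonnes in rows:
        totals[year] = totals.get(year, 0) + tonnes
    return {
        (year, group): tonnes * 100 / totals[year]
        for year, group, tonnes in rows
    }

shares = percent_of_total(rows)
print(shares[(2012, "China, mainland")])  # 75.0
```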
And that is that! To pull everything together, I just needed to set a Fixed size for the dashboard, with vertical containers to hold all the charts, separated by blank boxes. At the foot is a horizontal container allowing me to stick the data source and #MakeoverMonday text boxes side by side.