Preparing for Thanksgiving Conversations: Tricky Parts with Common Election Data Visualizations
By: Kristin Hunter-Thomson
**This post discusses common misconceptions in interpreting different election data visualizations from a data perspective. It does not take any political positions.
It seems that, just as turkey, overeating, and flag football are part of many people’s Thanksgiving traditions, talking politics is part of the day for some as well. While our Thanksgiving get-togethers may look different in 2020, this has been a long month, spent preparing for and then living through the aftermath of the 2020 election cycle. One thing that can help is making sure you are talking about the same data in the same way.
Each election cycle there seems to be a growing number of ways that we communicate predictions, projections, results, and reflections about election data and the election process. Below is a list of things to look out for when helping our students (and ourselves) make sense of common data visualizations that show up this time of year.
*Note, regardless of what political affiliation you have or what outcome you want, being able to read and make sense of election-based data visualizations is an excellent example of why data literacy is critical to being an active member of a democracy.
Presidential election results map from 1992 US presidential election. Source: https://en.wikipedia.org/wiki/1992_United_States_presidential_election
Choropleth Maps
…aka maps that color a geographic area to indicate the numeric or categorical value for an attribute of that location.
Ok, let’s unpack that a bit. In the example here we are looking at the presidential election “results” of 1992. Each state is colored by which party won that state’s electoral votes: blue for Democrat (Bill Clinton) and red for Republican (George H.W. Bush). The numbers superimposed on the states indicate how many Electoral College votes each state has. This means the color in this instance is based on a categorical attribute/variable (i.e., party affiliation) and not a numeric attribute/variable (e.g., number of Electoral College votes, population size, popular vote).
Tricky part of this = The amount of color that we see does not indicate the amount of something in the data. Instead, it reflects the size of the geographic region being colored on the map. However, our eyes often treat the amount of color as meaningful, and we often presume that more color means more important. For example, compare Connecticut, South Carolina, Kentucky, Oklahoma, Colorado, and Arizona. They all have 8 electoral votes, but the amount of color varies greatly across these states.
Consequence of this tricky part = It can be tempting to equate the amount of color with the number of people who supported each candidate. In reality, population varies widely across the states, as do the ways that electoral votes are decided.
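If you or your students want to check this with a quick calculation, here is a minimal Python sketch. The three regions, their areas, and their winners are entirely hypothetical (made up for illustration, not real states or real results). It simply compares each color's share of the map area with its share of the electoral votes.

```python
# Hypothetical regions: names, areas, and winners are made up for illustration.
regions = [
    {"name": "Region A", "area_sq_mi": 100_000, "electoral_votes": 8, "winner": "red"},
    {"name": "Region B", "area_sq_mi": 5_000, "electoral_votes": 8, "winner": "blue"},
    {"name": "Region C", "area_sq_mi": 45_000, "electoral_votes": 8, "winner": "blue"},
]


def share_by_winner(regions, key):
    """Return each winner's share (0 to 1) of the total for a numeric key."""
    total = sum(r[key] for r in regions)
    shares = {}
    for r in regions:
        shares[r["winner"]] = shares.get(r["winner"], 0) + r[key] / total
    return shares


# Share of the colored map area versus share of the electoral votes.
area_share = share_by_winner(regions, "area_sq_mi")
vote_share = share_by_winner(regions, "electoral_votes")

for color in sorted(area_share):
    print(f"{color}: {area_share[color]:.0%} of map area, "
          f"{vote_share[color]:.0%} of electoral votes")
```

In this made-up example, red fills about two thirds of the map but holds only one third of the electoral votes, which is exactly the kind of mismatch a choropleth map can hide.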
Comparison of the popular vote totals since 1900. Red is Republican, blue is Democrat, and grey is all other candidates together. Source:
Bar Charts
…aka our students’ favorite graphs to use / a way to display numeric values across categories.
Ok, let’s unpack that a bit. In the example here we are looking at the popular vote (in millions) by party in each presidential election since 1900. The height of each bar is the total popular vote cast within an election year for one party, or for all parties other than the Democratic and Republican parties combined. The bars are colored by which party they represent: red for Republican, blue for Democrat, and grey for all other parties. This means the color in this instance is based on a categorical attribute/variable (i.e., party affiliation) and the height of the bar is based on the sum of a numeric attribute/variable within a given election cycle (i.e., total number of votes cast for each party’s candidate).
Tricky part of this = The height of a bar and/or amount of color that we see does not indicate the actual winner of the election (based on the US election process for president). However, our eyes often treat the height of a bar and/or amount of color as meaningful, and we often presume that a higher bar or more color means more important. Also, there is a general increasing trend in these data over time, but these data are absolute totals and thus are not normalized for population size.
Consequence of this tricky part = It can be tempting to equate the height of a bar and/or amount of color with who won the presidency. Instead, the Electoral College process determines the winner of the presidential election. It can also be tempting to conclude that more people are engaged and voting in US presidential elections over time. However, the US population, and thus the number of eligible voters, has increased greatly over this time period.
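To see why normalizing matters, here is a minimal Python sketch. All of the numbers are hypothetical (they are not real vote totals or population counts), and the `turnout_rate` helper is just an illustrative name. It divides votes cast by the number of eligible voters so that elections with different population sizes can be compared.

```python
# Hypothetical election years: all numbers are made up for illustration.
elections = [
    {"year": "Year 1", "votes_cast_millions": 50.0, "eligible_millions": 100.0},
    {"year": "Year 2", "votes_cast_millions": 75.0, "eligible_millions": 150.0},
    {"year": "Year 3", "votes_cast_millions": 100.0, "eligible_millions": 200.0},
]


def turnout_rate(votes_cast, eligible):
    """Votes cast as a fraction of the eligible population."""
    if eligible <= 0:
        raise ValueError("eligible population must be positive")
    return votes_cast / eligible


# Normalize each absolute total by the size of the eligible population.
rates = [
    turnout_rate(e["votes_cast_millions"], e["eligible_millions"])
    for e in elections
]

for e, rate in zip(elections, rates):
    print(f'{e["year"]}: {e["votes_cast_millions"]:.0f}M votes, '
          f'{rate:.0%} of eligible voters')
```

In this made-up example the bars would double in height from Year 1 to Year 3, yet the share of eligible voters who voted stays at 50% throughout. The rising bars alone cannot tell us whether engagement changed.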
Source: https://www.statista.com/chart/23235/importance-of-each-presidential-election/
Line Charts/Graphs
…aka our favorite way to show changes over time / a way to display numeric values for categories across an ordered numeric attribute.
Ok, let’s unpack that a bit. In the example here we are looking at the percentage of voters who indicated that the upcoming presidential election was more important than previous presidential elections. The percentage values come from voters surveyed by Gallup ahead of the 1996-2020 presidential elections. The voter responses are grouped into three categories (lines): blue for self-described Democrats, red for self-described Republicans, and grey for all respondents who were registered voters, regardless of self-described party affiliation. This means the color in this instance is based on two different categorical attributes/variables (i.e., party affiliation OR voter registration) and the height of the line is based on a numeric attribute/variable within those categories (i.e., the percentage of respondents who indicated that yes, the upcoming presidential election was more important than the previous).
Tricky part of this = The largest challenge in this particular graph is the mixture of categories represented by the lines. Grey does not indicate a different category of party affiliation, like independents or all other party affiliations combined. Instead, it indicates the percentage of all registered voters asked in the survey. Additionally, although the values represent percentages of respondents, we often presume that a line indicates an absolute quantity, that the highest line is the most important, and that it is meaningful when lines cross. However, this is survey data from different people (aka different samples) in each of these years. Also, we often attribute meaning to the slope between two points, and/or from the start to the end of the dataset.
Consequence of this tricky part = It can be tempting to conclude that huge percentages of the US voting public think that each election is becoming more and more important than the previous one. Instead, the decision about each election is driven by what has happened within the country over the past 4 years, and thus a response about 2012 has more to do with 2008-2012 than with an opinion leading into 2008.