library(tidyverse)
26 Visualizing your data for publication
Doing data visualization well, and at professional level, takes time, skill and practice to perfect. Understanding it and doing it at a complex level is an entire class on it’s own. It uses some of the same skills here – grouping, filtering, calculating – but then takes that data and turns it into data pictures.
But simple stuff – and even some slightly complicated stuff – can be done with tools made for people who aren’t data viz pros.
The tool we’re going to use is called Datawrapper.
First, let’s get some data and work with it. Let’s use a simple CSV of total votes cast for four Maryland Republicans who ran for statewide office in the 2022 general election. Let’s look at it.
<- read_csv("data/md_gop_cands.csv") md_gop_cands
head(md_gop_cands)
# A tibble: 4 × 5
Candidate `Early Voting` `Election Day` Mail Votes
<chr> <dbl> <dbl> <dbl> <dbl>
1 Michael Peroutka 131551 451685 24669 606557
2 Dan Cox 123287 424837 22112 569450
3 Barry Glassman 142726 484892 28656 655379
4 Chris Chaffee 130458 447330 23561 600543
26.1 Datawrapper
Making charts in Datawrapper is preposterously simple, which is the point. There are dozens of chart types, and dozens of options. To get from a csv to a chart to publication is very, very easy.
First, go to datawrapper.de and sign up for an account. It’s free.
Once logged in, you’ll click on New Chart.
The first thing we’ll do is upload our CSV. Click on XLS/CSV and upload the file.
Next up is to check and see what Datawrappper did with our data when we uploaded it. As you can see from the text on the left, if it’s blue, it’s a number. If it’s green, it’s a date. If it’s black, it’s text. Red means there’s a problem. This data is very clean, so it imports cleanly. Click on the “Proceed” button.
Now we make a chart. Bar chart comes up by default, which is good, because with totals, that’s what we have.
Click on Refine. The first option we want to change is column we’re using for the bars, which defaults to “Early Voting”. Let’s make it “Votes. Let’s also choose to sort the bars so that the largest value (and bar appears first). We do that by clicking on the”Sort bars” button.
Now we need to annotate our charts. Every chart needs a title, a source line and a credit line. Most need chatter (called description here). Click on the “Annotate” tab to get started.
Really think about the title and description: the title is like a headline and the description is provides some additional context. Another way to think about it: the title is the most important lesson from the graphic, and the description could be the next most important lesson or could provide more context to the title.
To publish, we click the “Publish & Embed” tab. Some publication systems allow for the embedding of HTML into a post or a story. Some don’t. The only way to know is to ask someone at your publication. Every publication system on the planet, though, can publish an image. So there’s always a way to export your chart as a PNG file, which you can upload like any photo.
26.1.1 Making a Map
Let’s create a choropleth map - one that shows variations between the percentage of votes received by Wes Moore across Maryland counties. We’ll read that in from the data folder.
<- read_csv("data/md_gov_county.csv") md_gov_county
In order to make a map, we need to be able to tell Datawrapper that a certain column contains geographic information (besides the name of the county). The easiest way to do that for U.S. maps is to use something called a FIPS Code. You should read about them so you understand what they are, and think of them as a unique identifier for some geographical entity like a state or county. Our md_gov_county dataframe has a FIPS code for each county, but if you ever need one for a county, this is a solved problem thanks to the Tigris library that we used in pre_lab 9.
We’ll need to write code to add columns showing the total number of votes for each county and the percentage of votes received by Wes Moore in each county, then replace the CSV file in the data folder with it
<- md_gov_county |>
md_gov_county mutate(total = cox + moore + lashar + wallace + harding + write_ins) |>
mutate(pct_moore = moore/total * 100)
write_csv(md_gov_county, "data/md_gov_county_with_percent.csv")
Go back to Datawrapper and click on “New Map”. Click on “Choropleth map” and then choose “USA >> Counties (2022)” for the map base and click the Proceed button.
Now we can upload the md_gov_county_with_percent.csv
file we just saved using the Upload File button. It should look like the following image:
We’ll need to make sure that Datawrapper understands what the data is and where the FIPS code is. Click on the “Match” tab and make sure that yours looks like the image below:
Click the “Proceed” button (you should have to click it twice, since the first time it will tell you that there’s no data for 3,197 counties - the rest of the U.S.). That will take you to the Visualize tab.
You’ll see that the map currently is of the whole nation, and we only have Maryland data. Let’s fix that.
Look for “Hide regions without data” under Appearance, and click the slider icon to enable that feature. You should see a map zoomed into Maryland with some counties in various colors.
But it’s a little rough visually, so let’s clean that up.
Look for the “Show color legend” label and add a caption for the legend, which is the horizontal bar under the title. Then click on the “Annotate” tab to add a title, description, data source and byline. The title should represent the headline, while the description should be a longer phrase that tells people what they are looking at.
That’s better, but check out the tooltip by hovering over a county. It’s not super helpful. Let’s change the tooltip behavior to show the county name and a better-formatted number.
Click the “Customize tooltips” button so it expands down. Change {{ fips_code }} to {{ county }} and {{ pctmoore }} to {{ FORMAT(pctmoore, “00.0%”)}}
Experiment with the “Show labels” options to see if you can add county labels to your map.
Ok, that looks better. Let’s publish!
Click the “Proceed” button until you get to the “Publish & Embed” tab, then click “Publish Now”.