library(tidyverse)
library(gt)
26 Tables
But not a table. A table with features.
Sometimes, the best way to show your data is with a table: simple rows and columns. It allows a reader to compare whatever they want to compare a little more easily than a graph where you’ve chosen what to highlight. The folks that made RStudio and the tidyverse have a neat package called gt.
For this assignment, we’ll need gt, so go over to the console and run:
install.packages("gt")
So what do all of these libraries do? Let’s load a few and use data from every men’s college basketball game from 2015 through 2024.
For this walkthrough, load the libraries and then the data.
logs <- read_csv("data/cbblogs1524.csv")
Rows: 98161 Columns: 51
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (9): Season, TeamFull, Opponent, HomeAway, W_L, URL, Conference, Team,...
dbl (39): Game, TeamScore, OpponentScore, TeamFG, TeamFGA, TeamFGPCT, Team3...
lgl (2): Blank, season
date (1): Date
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Let’s ask this question: which college basketball team saw the greatest decrease in three-point attempts from the 2022-23 season to the 2023-24 season? The simplest way to calculate that is by percent change.
We’ve got a little work to do, putting together ideas we’ve used before. What we need to end up with is some data that looks like this:
Team | 2022-2023 season threes | 2023-2024 season threes | pct change
To get that, we’ll need to do some filtering to get the right seasons, some grouping and summarizing to get the right numbers, and some pivoting to get the data organized so we can mutate the percent change.
threechange <- logs |>
  # keep only the two seasons we want to compare
  filter(Season == "2022-2023" | Season == "2023-2024") |>
  # total three-point attempts per team per season
  group_by(Team, Season) |>
  summarise(Total3PA = sum(Team3PA)) |>
  # one row per team, one column per season
  pivot_wider(names_from = Season, values_from = Total3PA) |>
  filter(!is.na(`2023-2024`)) |>
  mutate(PercentChange = (`2023-2024` - `2022-2023`) / `2022-2023`) |>
  arrange(PercentChange) |>
  ungroup() |>
  slice_head(n = 10) # just want a top 10 list, but can't use top_n!
`summarise()` has grouped output by 'Team'. You can override using the
`.groups` argument.
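As a quick sanity check on the pivot-and-mutate step, here is a minimal sketch on a toy data frame. The names toy and toy_change and the two-team subset are assumptions for illustration. The season totals are the ones shown for these two teams in the table output in this chapter.

```r
library(dplyr)
library(tidyr)

# Toy data (illustrative): season totals for two teams, already summarized
toy <- tibble(
  Team = c("Houston Christian", "Houston Christian", "Toledo", "Toledo"),
  Season = c("2022-2023", "2023-2024", "2022-2023", "2023-2024"),
  Total3PA = c(726, 488, 764, 550)
)

toy_change <- toy |>
  # one row per team, one column per season
  pivot_wider(names_from = Season, values_from = Total3PA) |>
  # percent change = (new - old) / old
  mutate(PercentChange = (`2023-2024` - `2022-2023`) / `2022-2023`) |>
  # most negative change first
  arrange(PercentChange)

toy_change
```

For Houston Christian, that works out to (488 - 726) / 726, or roughly a 33 percent drop.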
We’ve output tables to the screen a thousand times in this class with head, but gt makes them look decent with very little code.
threechange |> gt()
Team | 2022-2023 | 2023-2024 | PercentChange |
---|---|---|---|
Houston Christian | 726 | 488 | -0.3278237 |
Jacksonville State | 747 | 507 | -0.3212851 |
Coppin State | 793 | 540 | -0.3190416 |
Toledo | 764 | 550 | -0.2801047 |
Southeast Missouri State | 875 | 631 | -0.2788571 |
Eastern Kentucky | 927 | 678 | -0.2686084 |
Southern Utah | 887 | 649 | -0.2683202 |
Tennessee Tech | 829 | 608 | -0.2665862 |
Utah Valley | 728 | 540 | -0.2582418 |
Louisiana-Monroe | 690 | 513 | -0.2565217 |
So there you have it. Houston Christian changed its team so much that it took about 33 percent fewer threes in 2023-24 than the season before. Where did Maryland come out? We ranked pretty low in college basketball in terms of taking fewer threes than the season before, because the Terps actually took 61 more.
gt has a mountain of customization options. The good news is that it works in a very familiar pattern. We’ll start with fixing headers. What we have isn’t bad, but PercentChange isn’t good either. Let’s fix that.
threechange |>
  gt() |>
  cols_label(
    PercentChange = "Percent Change"
  )
Team | 2022-2023 | 2023-2024 | Percent Change |
---|---|---|---|
Houston Christian | 726 | 488 | -0.3278237 |
Jacksonville State | 747 | 507 | -0.3212851 |
Coppin State | 793 | 540 | -0.3190416 |
Toledo | 764 | 550 | -0.2801047 |
Southeast Missouri State | 875 | 631 | -0.2788571 |
Eastern Kentucky | 927 | 678 | -0.2686084 |
Southern Utah | 887 | 649 | -0.2683202 |
Tennessee Tech | 829 | 608 | -0.2665862 |
Utah Valley | 728 | 540 | -0.2582418 |
Louisiana-Monroe | 690 | 513 | -0.2565217 |
Better. Note the pattern: Actual header name = “What we want to see”. So if we wanted to change Team to School, we’d do this: Team = "School" inside the cols_label bits.
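To make that pattern concrete, here is a small sketch that relabels two columns in one cols_label call. The data frame toy and the object name relabeled are made up for illustration and are not part of the walkthrough data.

```r
library(gt)

# Toy data (illustrative), standing in for threechange
toy <- data.frame(
  Team = c("Houston Christian", "Toledo"),
  PercentChange = c(-0.3278237, -0.2801047)
)

relabeled <- toy |>
  gt() |>
  cols_label(
    Team = "School",                  # actual column name = label to display
    PercentChange = "Percent Change"
  )

relabeled
```

Only the displayed labels change. The underlying column names stay Team and PercentChange, so later steps still refer to them by those names.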
Now we can start working with styling. The truth is most of your code in tables is going to be dedicated to styling specific things. The first thing we need: A headline and some chatter. They’re required parts of a graphic, so they’re a good place to start. We do that with tab_header:
threechange |>
  gt() |>
  cols_label(
    PercentChange = "Percent Change"
  ) |>
  tab_header(
    title = "Did Maryland Shoot Fewer Threes in 2023-24?",
    subtitle = "No, the Terps shot more. But these 10 teams completely changed their offenses."
  )
Did Maryland Shoot Fewer Threes in 2023-24? | |||
---|---|---|---|
No, the Terps shot more. But these 10 teams completely changed their offenses. | |||
Team | 2022-2023 | 2023-2024 | Percent Change |
Houston Christian | 726 | 488 | -0.3278237 |
Jacksonville State | 747 | 507 | -0.3212851 |
Coppin State | 793 | 540 | -0.3190416 |
Toledo | 764 | 550 | -0.2801047 |
Southeast Missouri State | 875 | 631 | -0.2788571 |
Eastern Kentucky | 927 | 678 | -0.2686084 |
Southern Utah | 887 | 649 | -0.2683202 |
Tennessee Tech | 829 | 608 | -0.2665862 |
Utah Valley | 728 | 540 | -0.2582418 |
Louisiana-Monroe | 690 | 513 | -0.2565217 |
We have a headline and some chatter, but … gross. Centered? The extra lines? No real difference in font weight? We can do better. We can style individual elements using tab_style. First, let’s make the main headline (the title) bold and left aligned.
threechange |>
  gt() |>
  cols_label(
    PercentChange = "Percent Change"
  ) |>
  tab_header(
    title = "Did Maryland Shoot Fewer Threes in 2023-24?",
    subtitle = "No, the Terps shot more. But these 10 teams completely changed their offenses."
  ) |>
  tab_style(
    style = cell_text(color = "black", weight = "bold", align = "left"),
    locations = cells_title("title")
  )
Did Maryland Shoot Fewer Threes in 2023-24? | |||
---|---|---|---|
No, the Terps shot more. But these 10 teams completely changed their offenses. | |||
Team | 2022-2023 | 2023-2024 | Percent Change |
Houston Christian | 726 | 488 | -0.3278237 |
Jacksonville State | 747 | 507 | -0.3212851 |
Coppin State | 793 | 540 | -0.3190416 |
Toledo | 764 | 550 | -0.2801047 |
Southeast Missouri State | 875 | 631 | -0.2788571 |
Eastern Kentucky | 927 | 678 | -0.2686084 |
Southern Utah | 887 | 649 | -0.2683202 |
Tennessee Tech | 829 | 608 | -0.2665862 |
Utah Valley | 728 | 540 | -0.2582418 |
Louisiana-Monroe | 690 | 513 | -0.2565217 |
It’s hard to see here, but the chatter below is also centered (it doesn’t look like it because it fills the space). We can left align that too, but leave it normal weight (i.e. not bold).
threechange |>
  gt() |>
  cols_label(
    PercentChange = "Percent Change"
  ) |>
  tab_header(
    title = "Did Maryland Shoot Fewer Threes in 2023-24?",
    subtitle = "No, the Terps shot more. But these 10 teams completely changed their offenses."
  ) |>
  tab_style(
    style = cell_text(color = "black", weight = "bold", align = "left"),
    locations = cells_title("title")
  ) |>
  tab_style(
    style = cell_text(color = "black", align = "left"),
    locations = cells_title("subtitle")
  )
Did Maryland Shoot Fewer Threes in 2023-24? | |||
---|---|---|---|
No, the Terps shot more. But these 10 teams completely changed their offenses. | |||
Team | 2022-2023 | 2023-2024 | Percent Change |
Houston Christian | 726 | 488 | -0.3278237 |
Jacksonville State | 747 | 507 | -0.3212851 |
Coppin State | 793 | 540 | -0.3190416 |
Toledo | 764 | 550 | -0.2801047 |
Southeast Missouri State | 875 | 631 | -0.2788571 |
Eastern Kentucky | 927 | 678 | -0.2686084 |
Southern Utah | 887 | 649 | -0.2683202 |
Tennessee Tech | 829 | 608 | -0.2665862 |
Utah Valley | 728 | 540 | -0.2582418 |
Louisiana-Monroe | 690 | 513 | -0.2565217 |
The next item on the required elements list: Source and credit lines. In gt, those are added with tab_source_note, like this:
threechange |>
  gt() |>
  cols_label(
    PercentChange = "Percent Change"
  ) |>
  tab_header(
    title = "Did Maryland Shoot Fewer Threes in 2023-24?",
    subtitle = "No, the Terps shot more. But these 10 teams completely changed their offenses."
  ) |>
  tab_style(
    style = cell_text(color = "black", weight = "bold", align = "left"),
    locations = cells_title("title")
  ) |>
  tab_style(
    style = cell_text(color = "black", align = "left"),
    locations = cells_title("subtitle")
  ) |>
  tab_source_note(
    source_note = md("**By:** Derek Willis | **Source:** [Sports Reference](https://www.sports-reference.com/cbb/seasons/)")
  )
Did Maryland Shoot Fewer Threes in 2023-24? | |||
---|---|---|---|
No, the Terps shot more. But these 10 teams completely changed their offenses. | |||
Team | 2022-2023 | 2023-2024 | Percent Change |
Houston Christian | 726 | 488 | -0.3278237 |
Jacksonville State | 747 | 507 | -0.3212851 |
Coppin State | 793 | 540 | -0.3190416 |
Toledo | 764 | 550 | -0.2801047 |
Southeast Missouri State | 875 | 631 | -0.2788571 |
Eastern Kentucky | 927 | 678 | -0.2686084 |
Southern Utah | 887 | 649 | -0.2683202 |
Tennessee Tech | 829 | 608 | -0.2665862 |
Utah Valley | 728 | 540 | -0.2582418 |
Louisiana-Monroe | 690 | 513 | -0.2565217 |
By: Derek Willis | Source: Sports Reference |
We can do a lot with tab_style. For instance, we can make the headers bold and reduce the size a bit to reduce font congestion in the area.
threechange |>
  gt() |>
  cols_label(
    PercentChange = "Percent Change"
  ) |>
  tab_header(
    title = "Did Maryland Shoot Fewer Threes in 2023-24?",
    subtitle = "No, the Terps shot more. But these 10 teams completely changed their offenses."
  ) |>
  tab_style(
    style = cell_text(color = "black", weight = "bold", align = "left"),
    locations = cells_title("title")
  ) |>
  tab_style(
    style = cell_text(color = "black", align = "left"),
    locations = cells_title("subtitle")
  ) |>
  tab_source_note(
    source_note = md("**By:** Derek Willis | **Source:** [Sports Reference](https://www.sports-reference.com/cbb/seasons/)")
  ) |>
  tab_style(
    locations = cells_column_labels(columns = everything()),
    style = list(
      cell_borders(sides = "bottom", weight = px(3)),
      cell_text(weight = "bold", size = 12)
    )
  )
Did Maryland Shoot Fewer Threes in 2023-24? | |||
---|---|---|---|
No, the Terps shot more. But these 10 teams completely changed their offenses. | |||
Team | 2022-2023 | 2023-2024 | Percent Change |
Houston Christian | 726 | 488 | -0.3278237 |
Jacksonville State | 747 | 507 | -0.3212851 |
Coppin State | 793 | 540 | -0.3190416 |
Toledo | 764 | 550 | -0.2801047 |
Southeast Missouri State | 875 | 631 | -0.2788571 |
Eastern Kentucky | 927 | 678 | -0.2686084 |
Southern Utah | 887 | 649 | -0.2683202 |
Tennessee Tech | 829 | 608 | -0.2665862 |
Utah Valley | 728 | 540 | -0.2582418 |
Louisiana-Monroe | 690 | 513 | -0.2565217 |
By: Derek Willis | Source: Sports Reference |
Next up: There are a lot of lines in this that don’t need to be there. gt has some tools to get rid of them easily and add in some other readability improvements.
threechange |>
  gt() |>
  cols_label(
    PercentChange = "Percent Change"
  ) |>
  tab_header(
    title = "Did Maryland Shoot Fewer Threes in 2023-24?",
    subtitle = "No, the Terps shot more. But these 10 teams completely changed their offenses."
  ) |>
  tab_source_note(
    source_note = md("**By:** Derek Willis | **Source:** [Sports Reference](https://www.sports-reference.com/cbb/seasons/)")
  ) |>
  tab_style(
    style = cell_text(color = "black", weight = "bold", align = "left"),
    locations = cells_title("title")
  ) |>
  tab_style(
    style = cell_text(color = "black", align = "left"),
    locations = cells_title("subtitle")
  ) |>
  tab_style(
    locations = cells_column_labels(columns = everything()),
    style = list(
      cell_borders(sides = "bottom", weight = px(3)),
      cell_text(weight = "bold", size = 12)
    )
  ) |>
  opt_row_striping() |>
  opt_table_lines("none")
Did Maryland Shoot Fewer Threes in 2023-24? | |||
---|---|---|---|
No, the Terps shot more. But these 10 teams completely changed their offenses. | |||
Team | 2022-2023 | 2023-2024 | Percent Change |
Houston Christian | 726 | 488 | -0.3278237 |
Jacksonville State | 747 | 507 | -0.3212851 |
Coppin State | 793 | 540 | -0.3190416 |
Toledo | 764 | 550 | -0.2801047 |
Southeast Missouri State | 875 | 631 | -0.2788571 |
Eastern Kentucky | 927 | 678 | -0.2686084 |
Southern Utah | 887 | 649 | -0.2683202 |
Tennessee Tech | 829 | 608 | -0.2665862 |
Utah Valley | 728 | 540 | -0.2582418 |
Louisiana-Monroe | 690 | 513 | -0.2565217 |
By: Derek Willis | Source: Sports Reference |
We’re in pretty good shape here, but look closer. What else makes this table sub-par? How about the formatting of the percent change? We can fix that with a formatter.
threechange |>
  gt() |>
  cols_label(
    PercentChange = "Percent Change"
  ) |>
  tab_header(
    title = "Did Maryland Shoot Fewer Threes in 2023-24?",
    subtitle = "No, the Terps shot more. But these 10 teams completely changed their offenses."
  ) |>
  tab_source_note(
    source_note = md("**By:** Derek Willis | **Source:** [Sports Reference](https://www.sports-reference.com/cbb/seasons/)")
  ) |>
  tab_style(
    style = cell_text(color = "black", weight = "bold", align = "left"),
    locations = cells_title("title")
  ) |>
  tab_style(
    style = cell_text(color = "black", align = "left"),
    locations = cells_title("subtitle")
  ) |>
  tab_style(
    locations = cells_column_labels(columns = everything()),
    style = list(
      cell_borders(sides = "bottom", weight = px(3)),
      cell_text(weight = "bold", size = 12)
    )
  ) |>
  opt_row_striping() |>
  opt_table_lines("none") |>
  fmt_percent(
    columns = c(PercentChange),
    decimals = 1
  )
Did Maryland Shoot Fewer Threes in 2023-24? | |||
---|---|---|---|
No, the Terps shot more. But these 10 teams completely changed their offenses. | |||
Team | 2022-2023 | 2023-2024 | Percent Change |
Houston Christian | 726 | 488 | −32.8% |
Jacksonville State | 747 | 507 | −32.1% |
Coppin State | 793 | 540 | −31.9% |
Toledo | 764 | 550 | −28.0% |
Southeast Missouri State | 875 | 631 | −27.9% |
Eastern Kentucky | 927 | 678 | −26.9% |
Southern Utah | 887 | 649 | −26.8% |
Tennessee Tech | 829 | 608 | −26.7% |
Utah Valley | 728 | 540 | −25.8% |
Louisiana-Monroe | 690 | 513 | −25.7% |
By: Derek Willis | Source: Sports Reference |
Throughout the semester, we’ve been using color and other signals to highlight things. Let’s pretend we’re doing a project on Coppin State. With a little tab_style magic, we can change individual rows and add color. The last tab_style block here will first pass off the styles we want to use (we’re going to make the rows blue and the text gold) and then for locations we specify where with a simple filter. What that means is that any rows we can address with logic, such as all rows with a value greater than X, can have their styling changed.
threechange |>
  gt() |>
  cols_label(
    PercentChange = "Percent Change"
  ) |>
  tab_header(
    title = "Did Maryland Shoot Fewer Threes in 2023-24?",
    subtitle = "No, the Terps shot more. But these 10 teams completely changed their offenses."
  ) |>
  tab_source_note(
    source_note = md("**By:** Derek Willis | **Source:** [Sports Reference](https://www.sports-reference.com/cbb/seasons/)")
  ) |>
  tab_style(
    style = cell_text(color = "black", weight = "bold", align = "left"),
    locations = cells_title("title")
  ) |>
  tab_style(
    style = cell_text(color = "black", align = "left"),
    locations = cells_title("subtitle")
  ) |>
  tab_style(
    locations = cells_column_labels(columns = everything()),
    style = list(
      cell_borders(sides = "bottom", weight = px(3)),
      cell_text(weight = "bold", size = 12)
    )
  ) |>
  opt_row_striping() |>
  opt_table_lines("none") |>
  fmt_percent(
    columns = c(PercentChange),
    decimals = 1
  ) |>
  tab_style(
    style = list(
      cell_fill(color = "blue"),
      cell_text(color = "gold")
    ),
    locations = cells_body(
      rows = Team == "Coppin State"
    )
  )
Did Maryland Shoot Fewer Threes in 2023-24? | |||
---|---|---|---|
No, the Terps shot more. But these 10 teams completely changed their offenses. | |||
Team | 2022-2023 | 2023-2024 | Percent Change |
Houston Christian | 726 | 488 | −32.8% |
Jacksonville State | 747 | 507 | −32.1% |
Coppin State | 793 | 540 | −31.9% |
Toledo | 764 | 550 | −28.0% |
Southeast Missouri State | 875 | 631 | −27.9% |
Eastern Kentucky | 927 | 678 | −26.9% |
Southern Utah | 887 | 649 | −26.8% |
Tennessee Tech | 829 | 608 | −26.7% |
Utah Valley | 728 | 540 | −25.8% |
Louisiana-Monroe | 690 | 513 | −25.7% |
By: Derek Willis | Source: Sports Reference |
Two things here:
- Not the blue and gold that I prefer, but it stands out.
- We’ve arrived where we want to be: We’ve created a clear table that allows a reader to compare schools at will while also using color to draw attention to the thing we want to draw attention to. We’ve kept it simple so the color has impact.
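The same row-targeting idea works with a numeric condition instead of a team name. The sketch below is illustrative only: the toy data frame, the -0.30 cutoff, the "lightyellow" fill and the object name flagged are all assumptions, chosen to show a logical test inside cells_body.

```r
library(gt)

# Toy data (illustrative): three rows with values taken from the table above
toy <- data.frame(
  Team = c("Houston Christian", "Coppin State", "Toledo"),
  PercentChange = c(-0.3278237, -0.3190416, -0.2801047)
)

flagged <- toy |>
  gt() |>
  fmt_percent(columns = PercentChange, decimals = 1) |>
  tab_style(
    style = cell_fill(color = "lightyellow"),
    # style every row whose drop is steeper than 30 percent (assumed cutoff)
    locations = cells_body(rows = PercentChange < -0.30)
  )

flagged
```

The condition is evaluated against the underlying numbers, not the formatted percentages, so the comparison uses -0.30 rather than -30.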