26  Tables

But not a table. A table with features.

Sometimes, the best way to show your data is with a table – simple rows and columns. It allows a reader to compare whatever they want to compare a little easier than a graph where you’ve chosen what to highlight. The folks that made R Studio and the tidyverse have a neat package called gt.

For this assignment, we’ll need gt so go over to the console and run:

install.packages("gt")

So what does all of these libraries do? Let’s gather a few and use data of every men’s basketball game between 2015-2024.

For this walkthrough:

Load libraries.

library(tidyverse)
library(gt)

And the data.

logs <- read_csv("data/cbblogs1524.csv")
Rows: 98161 Columns: 51
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr   (9): Season, TeamFull, Opponent, HomeAway, W_L, URL, Conference, Team,...
dbl  (39): Game, TeamScore, OpponentScore, TeamFG, TeamFGA, TeamFGPCT, Team3...
lgl   (2): Blank, season
date  (1): Date

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Let’s ask this question: which college basketball team saw the greatest decrease in three point attempts per game between last season as a percentage of shots? The simplest way to calculate that is by percent change.

We’ve got a little work to do, putting together ideas we’ve used before. What we need to end up with is some data that looks like this:

Team | 2022-2023 season threes | 2023-2024 season threes | pct change

To get that, we’ll need to do some filtering to get the right seasons, some grouping and summarizing to get the right number, some pivoting to get it organized correctly so we can mutate the percent change.

threechange <- logs |>
  filter(Season == "2022-2023" | Season == "2023-2024") |>
  group_by(Team, Season) |>
  summarise(Total3PA = sum(Team3PA)) |>
  pivot_wider(names_from=Season, values_from = Total3PA) |>
  filter(!is.na(`2023-2024`)) |> 
  mutate(PercentChange = (`2023-2024`-`2022-2023`)/`2022-2023`) |>
  arrange(PercentChange) |> 
  ungroup() |>
  slice_head(n=10) # just want a top 10 list, but can't use top_n!
`summarise()` has grouped output by 'Team'. You can override using the
`.groups` argument.

We’ve output tables to the screen a thousand times in this class with head, but gt makes them look decent with very little code.

threechange |> gt()
Team 2022-2023 2023-2024 PercentChange
Houston Christian 726 488 -0.3278237
Jacksonville State 747 507 -0.3212851
Coppin State 793 540 -0.3190416
Toledo 764 550 -0.2801047
Southeast Missouri State 875 631 -0.2788571
Eastern Kentucky 927 678 -0.2686084
Southern Utah 887 649 -0.2683202
Tennessee Tech 829 608 -0.2665862
Utah Valley 728 540 -0.2582418
Louisiana-Monroe 690 513 -0.2565217

So there you have it. Long Island changed their team so much they took 44 percent fewer threes in 2022-23 from the season before. Where did Maryland come out? We ranked pretty low in college basketball in terms of fewer threes from the season before, because the Terps actually took 61 more.

gt has a mountain of customization options. The good news is that it works in a very familiar pattern. We’ll start with fixing headers. What we have isn’t bad, but PercentChange isn’t good either. Let’s fix that.

threechange |> 
  gt() |> 
  cols_label(
    PercentChange = "Percent Change"
  )
Team 2022-2023 2023-2024 Percent Change
Houston Christian 726 488 -0.3278237
Jacksonville State 747 507 -0.3212851
Coppin State 793 540 -0.3190416
Toledo 764 550 -0.2801047
Southeast Missouri State 875 631 -0.2788571
Eastern Kentucky 927 678 -0.2686084
Southern Utah 887 649 -0.2683202
Tennessee Tech 829 608 -0.2665862
Utah Valley 728 540 -0.2582418
Louisiana-Monroe 690 513 -0.2565217

Better. Note the pattern: Actual header name = “What we want to see”. So if we wanted to change Team to School, we’d do this: Team = "School" inside the cols_label bits.

Now we can start working with styling. The truth is most of your code in tables is going to be dedicated to styling specific things. The first thing we need: A headline and some chatter. They’re required parts of a graphic, so they’re a good place to start. We do that with tab_header

threechange |> 
  gt() |> 
  cols_label(
    PercentChange = "Percent Change"
  ) |>
  tab_header(
    title = "Did Maryland Shoot Fewer Threes in 2021-22?",
    subtitle = "No, the Terps shot more. But these 10 teams completely changed their offenses."
  )
Did Maryland Shoot Fewer Threes in 2021-22?
No, the Terps shot more. But these 10 teams completely changed their offenses.
Team 2022-2023 2023-2024 Percent Change
Houston Christian 726 488 -0.3278237
Jacksonville State 747 507 -0.3212851
Coppin State 793 540 -0.3190416
Toledo 764 550 -0.2801047
Southeast Missouri State 875 631 -0.2788571
Eastern Kentucky 927 678 -0.2686084
Southern Utah 887 649 -0.2683202
Tennessee Tech 829 608 -0.2665862
Utah Valley 728 540 -0.2582418
Louisiana-Monroe 690 513 -0.2565217

We have a headline and some chatter, but … gross. Centered? The extra lines? No real difference in font weight? We can do better. We can style individual elements using tab_style. First, let’s make the main headline – the title – bold and left aligned.

threechange |> 
  gt() |> 
  cols_label(
    PercentChange = "Percent Change"
  ) |>
  tab_header(
    title = "Did Maryland Shoot Fewer Threes in 2021-22?",
    subtitle = "No, the Terps shot more. But these 10 teams completely changed their offenses."
  ) |> tab_style(
    style = cell_text(color = "black", weight = "bold", align = "left"),
    locations = cells_title("title")
  )
Did Maryland Shoot Fewer Threes in 2021-22?
No, the Terps shot more. But these 10 teams completely changed their offenses.
Team 2022-2023 2023-2024 Percent Change
Houston Christian 726 488 -0.3278237
Jacksonville State 747 507 -0.3212851
Coppin State 793 540 -0.3190416
Toledo 764 550 -0.2801047
Southeast Missouri State 875 631 -0.2788571
Eastern Kentucky 927 678 -0.2686084
Southern Utah 887 649 -0.2683202
Tennessee Tech 829 608 -0.2665862
Utah Valley 728 540 -0.2582418
Louisiana-Monroe 690 513 -0.2565217

It’s hard to see here, but the chatter below is also centered (it doesn’t look like it because it fills the space). We can left align that too, but leave it normal weight (i.e. not bold).

threechange |> 
  gt() |> 
  cols_label(
    PercentChange = "Percent Change"
  ) |>
  tab_header(
    title = "Did Maryland Shoot Fewer Threes in 2021-22?",
    subtitle = "No, the Terps shot more. But these 10 teams completely changed their offenses."
  ) |> tab_style(
    style = cell_text(color = "black", weight = "bold", align = "left"),
    locations = cells_title("title")
  ) |> tab_style(
    style = cell_text(color = "black", align = "left"),
    locations = cells_title("subtitle")
  )
Did Maryland Shoot Fewer Threes in 2021-22?
No, the Terps shot more. But these 10 teams completely changed their offenses.
Team 2022-2023 2023-2024 Percent Change
Houston Christian 726 488 -0.3278237
Jacksonville State 747 507 -0.3212851
Coppin State 793 540 -0.3190416
Toledo 764 550 -0.2801047
Southeast Missouri State 875 631 -0.2788571
Eastern Kentucky 927 678 -0.2686084
Southern Utah 887 649 -0.2683202
Tennessee Tech 829 608 -0.2665862
Utah Valley 728 540 -0.2582418
Louisiana-Monroe 690 513 -0.2565217

The next item on the required elements list: Source and credit lines. In gt, those are called tab_source_notes and we can add them like this:

threechange |> 
  gt() |> 
  cols_label(
    PercentChange = "Percent Change"
  ) |>
  tab_header(
    title = "Did Maryland Shoot Fewer Threes in 2021-22?",
    subtitle = "No, the Terps shot more. But these 10 teams completely changed their offenses."
  ) |> tab_style(
    style = cell_text(color = "black", weight = "bold", align = "left"),
    locations = cells_title("title")
  ) |> tab_style(
    style = cell_text(color = "black", align = "left"),
    locations = cells_title("subtitle")
  ) |>
  tab_source_note(
    source_note = md("**By:** Derek Willis  |  **Source:** [Sports Reference](https://www.sports-reference.com/cbb/seasons/)")
  )
Did Maryland Shoot Fewer Threes in 2021-22?
No, the Terps shot more. But these 10 teams completely changed their offenses.
Team 2022-2023 2023-2024 Percent Change
Houston Christian 726 488 -0.3278237
Jacksonville State 747 507 -0.3212851
Coppin State 793 540 -0.3190416
Toledo 764 550 -0.2801047
Southeast Missouri State 875 631 -0.2788571
Eastern Kentucky 927 678 -0.2686084
Southern Utah 887 649 -0.2683202
Tennessee Tech 829 608 -0.2665862
Utah Valley 728 540 -0.2582418
Louisiana-Monroe 690 513 -0.2565217
By: Derek Willis | Source: Sports Reference

We can do a lot with tab_style. For instance, we can make the headers bold and reduce the size a bit to reduce font congestion in the area.

threechange |> 
  gt() |> 
  cols_label(
    PercentChange = "Percent Change"
  ) |>
  tab_header(
    title = "Did Maryland Shoot Fewer Threes in 2021-22?",
    subtitle = "No, the Terps shot more. But these 10 teams completely changed their offenses."
  ) |> 
  tab_style(
    style = cell_text(color = "black", weight = "bold", align = "left"),
    locations = cells_title("title")
  ) |> 
  tab_style(
    style = cell_text(color = "black", align = "left"),
    locations = cells_title("subtitle")
  ) |>  
  tab_source_note(
    source_note = md("**By:** Derek Willis  |  **Source:** [Sports Reference](https://www.sports-reference.com/cbb/seasons/)")
  ) |>
  tab_style(
     locations = cells_column_labels(columns = everything()),
     style = list(
       cell_borders(sides = "bottom", weight = px(3)),
       cell_text(weight = "bold", size=12)
     )
   ) 
Did Maryland Shoot Fewer Threes in 2021-22?
No, the Terps shot more. But these 10 teams completely changed their offenses.
Team 2022-2023 2023-2024 Percent Change
Houston Christian 726 488 -0.3278237
Jacksonville State 747 507 -0.3212851
Coppin State 793 540 -0.3190416
Toledo 764 550 -0.2801047
Southeast Missouri State 875 631 -0.2788571
Eastern Kentucky 927 678 -0.2686084
Southern Utah 887 649 -0.2683202
Tennessee Tech 829 608 -0.2665862
Utah Valley 728 540 -0.2582418
Louisiana-Monroe 690 513 -0.2565217
By: Derek Willis | Source: Sports Reference

Next up: There’s a lot of lines in this that don’t need to be there. gt has some tools to get rid of them easily and add in some other readability improvements.

threechange |> 
  gt() |> 
  cols_label(
    PercentChange = "Percent Change"
  ) |>
  tab_header(
    title = "Did Maryland Shoot Fewer Threes in 2021-22?",
    subtitle = "No, the Terps shot more. But these 10 teams completely changed their offenses."
  ) |>  
  tab_source_note(
    source_note = md("**By:** Derek Willis  |  **Source:** [Sports Reference](https://www.sports-reference.com/cbb/seasons/)")
  ) |> 
  tab_style(
    style = cell_text(color = "black", weight = "bold", align = "left"),
    locations = cells_title("title")
  ) |> 
  tab_style(
    style = cell_text(color = "black", align = "left"),
    locations = cells_title("subtitle")
  ) |>
  tab_style(
     locations = cells_column_labels(columns = everything()),
     style = list(
       cell_borders(sides = "bottom", weight = px(3)),
       cell_text(weight = "bold", size=12)
     )
   ) |>
  opt_row_striping() |> 
  opt_table_lines("none")
Did Maryland Shoot Fewer Threes in 2021-22?
No, the Terps shot more. But these 10 teams completely changed their offenses.
Team 2022-2023 2023-2024 Percent Change
Houston Christian 726 488 -0.3278237
Jacksonville State 747 507 -0.3212851
Coppin State 793 540 -0.3190416
Toledo 764 550 -0.2801047
Southeast Missouri State 875 631 -0.2788571
Eastern Kentucky 927 678 -0.2686084
Southern Utah 887 649 -0.2683202
Tennessee Tech 829 608 -0.2665862
Utah Valley 728 540 -0.2582418
Louisiana-Monroe 690 513 -0.2565217
By: Derek Willis | Source: Sports Reference

We’re in pretty good shape here, but look closer. What else makes this table sub-par? How about the formatting of the percent change? We can fix that with a formatter.

threechange |> 
  gt() |> 
  cols_label(
    PercentChange = "Percent Change"
  ) |>
  tab_header(
    title = "Did Maryland Shoot Fewer Threes in 2021-22?",
    subtitle = "No, the Terps shot more. But these 10 teams completely changed their offenses."
  ) |>  
  tab_source_note(
    source_note = md("**By:** Derek Willis  |  **Source:** [Sports Reference](https://www.sports-reference.com/cbb/seasons/)")
  ) |> 
  tab_style(
    style = cell_text(color = "black", weight = "bold", align = "left"),
    locations = cells_title("title")
  ) |> 
  tab_style(
    style = cell_text(color = "black", align = "left"),
    locations = cells_title("subtitle")
  ) |>
  tab_style(
     locations = cells_column_labels(columns = everything()),
     style = list(
       cell_borders(sides = "bottom", weight = px(3)),
       cell_text(weight = "bold", size=12)
     )
   ) |>
  opt_row_striping() |> 
  opt_table_lines("none") |>
    fmt_percent(
    columns = c(PercentChange),
    decimals = 1
  )
Did Maryland Shoot Fewer Threes in 2021-22?
No, the Terps shot more. But these 10 teams completely changed their offenses.
Team 2022-2023 2023-2024 Percent Change
Houston Christian 726 488 −32.8%
Jacksonville State 747 507 −32.1%
Coppin State 793 540 −31.9%
Toledo 764 550 −28.0%
Southeast Missouri State 875 631 −27.9%
Eastern Kentucky 927 678 −26.9%
Southern Utah 887 649 −26.8%
Tennessee Tech 829 608 −26.7%
Utah Valley 728 540 −25.8%
Louisiana-Monroe 690 513 −25.7%
By: Derek Willis | Source: Sports Reference

Throughout the semester, we’ve been using color and other signals to highlight things. Let’s pretend we’re doing a project on Syracuse. With a little tab_style magic, we can change individual rows and add color. The last tab_style block here will first pass off the styles we want to use – we’re going to make the rows maroon and the text gold – and then for locations we specify where with a simple filter. What that means is that any rows we can address with logic – all rows with a value greater than X, for example – we can change the styling.

threechange |> 
  gt() |> 
  cols_label(
    PercentChange = "Percent Change"
  ) |>
  tab_header(
    title = "Did Maryland Shoot Fewer Threes in 2021-22?",
    subtitle = "No, the Terps shot more. But these 10 teams completely changed their offenses."
  ) |>  
  tab_source_note(
    source_note = md("**By:** Derek Willis  |  **Source:** [Sports Reference](https://www.sports-reference.com/cbb/seasons/)")
  ) |> 
  tab_style(
    style = cell_text(color = "black", weight = "bold", align = "left"),
    locations = cells_title("title")
  ) |> 
  tab_style(
    style = cell_text(color = "black", align = "left"),
    locations = cells_title("subtitle")
  ) |>
  tab_style(
     locations = cells_column_labels(columns = everything()),
     style = list(
       cell_borders(sides = "bottom", weight = px(3)),
       cell_text(weight = "bold", size=12)
     )
   ) |>
  opt_row_striping() |> 
  opt_table_lines("none") |>
    fmt_percent(
    columns = c(PercentChange),
    decimals = 1
  ) |>
  tab_style(
    style = list(
      cell_fill(color = "blue"),
      cell_text(color = "orange")
      ),
    locations = cells_body(
      rows = Team == "Syracuse")
  )
Did Maryland Shoot Fewer Threes in 2021-22?
No, the Terps shot more. But these 10 teams completely changed their offenses.
Team 2022-2023 2023-2024 Percent Change
Houston Christian 726 488 −32.8%
Jacksonville State 747 507 −32.1%
Coppin State 793 540 −31.9%
Toledo 764 550 −28.0%
Southeast Missouri State 875 631 −27.9%
Eastern Kentucky 927 678 −26.9%
Southern Utah 887 649 −26.8%
Tennessee Tech 829 608 −26.7%
Utah Valley 728 540 −25.8%
Louisiana-Monroe 690 513 −25.7%
By: Derek Willis | Source: Sports Reference

Two things here:

  1. Dear God that color scheme is awful, which is fitting for a school that has a non-rhyming mascot.
  2. We’ve arrived where we want to be: We’ve created a clear table that allows a reader to compare schools at will while also using color to draw attention to the thing we want to draw attention to. We’ve kept it simple so the color has impact.