boB Rudis
5 years ago
22 changed files with 454 additions and 121 deletions
@ -0,0 +1,15 @@ |
|||
#' @md |
|||
#' @title June 2019 U.S. Democratic Debate Candidate/Topic Times |
|||
#' @description The New York Times and other media outlets kept track of the time each |
|||
#' candidate spent talking including the timestamp of the start of the blathering |
|||
#' and the topic up for debate. This dataset only includes candidates and |
|||
#' topic times. The complete datasets (See References) also include moderator |
|||
#' metadata and opening/closing statement records. |
|||
#' @format data frame with columns: `elapsed` (dbl), `timestamp` (drtn), `speaker` (chr), `topic` (chr) |
|||
#' @docType data |
|||
#' @keywords datasets |
|||
#' @name debates2019 |
|||
#' @references <https://www.nytimes.com/interactive/2019/admin/100000006581096.embedded.html> |
|||
#' @references <https://www.nytimes.com/interactive/2019/admin/100000006584572.embedded.html> |
|||
#' @usage data("debates2019") |
|||
NULL |
@ -0,0 +1,4 @@ |
|||
`%l0%` <- function(x, y) if (length(x) == 0) y else x |
|||
`%||%` <- function(x, y) if (is.null(x)) y else x |
|||
`%@%` <- function(x, name) attr(x, name, exact = TRUE) |
|||
`%nin%` <- function(x, table) match(x, table, nomatch = 0) == 0 |
@ -0,0 +1,14 @@ |
|||
## code to prepare `debates2019` dataset goes here |
|||
library(readr) # provides read_csv(), cols(), and the col_*() helpers |
|||
read_csv( |
|||
file = "https://rud.is/data/2019-dem-debates.csv.gz", |
|||
col_types = cols( |
|||
elapsed = col_double(), |
|||
timestamp = col_time(format = ""), |
|||
speaker = col_character(), |
|||
topic = col_character() |
|||
) |
|||
) -> debates2019 |
|||
|
|||
|
|||
usethis::use_data(debates2019, overwrite = TRUE) |
@ -0,0 +1,23 @@ |
|||
% Generated by roxygen2: do not edit by hand |
|||
% Please edit documentation in R/datasets.R |
|||
\docType{data} |
|||
\name{debates2019} |
|||
\alias{debates2019} |
|||
\title{June 2019 U.S. Democratic Debate Candidate/Topic Times} |
|||
\format{data frame with columns: \code{elapsed} (dbl), \code{timestamp} (drtn), \code{speaker} (chr), \code{topic} (chr)} |
|||
\usage{ |
|||
data("debates2019") |
|||
} |
|||
\description{ |
|||
The New York Times and other media outlets kept track of the time each |
|||
candidate spent talking including the timestamp of the start of the blathering |
|||
and the topic up for debate. This dataset only includes candidates and |
|||
topic times. The complete datasets (See References) also include moderator |
|||
metadata and opening/closing statement records. |
|||
} |
|||
\references{ |
|||
\url{https://www.nytimes.com/interactive/2019/admin/100000006581096.embedded.html} |
|||
|
|||
\url{https://www.nytimes.com/interactive/2019/admin/100000006584572.embedded.html} |
|||
} |
|||
\keyword{datasets} |
@ -0,0 +1,2 @@ |
|||
*.html |
|||
*.R |
@ -0,0 +1,133 @@ |
|||
--- |
|||
title: "Using {ggchicklet}" |
|||
output: |
|||
rmarkdown::html_vignette: |
|||
df_print: kable |
|||
vignette: > |
|||
%\VignetteIndexEntry{Using {ggchicklet}} |
|||
%\VignetteEncoding{UTF-8} |
|||
%\VignetteEngine{knitr::rmarkdown} |
|||
editor_options: |
|||
chunk_output_type: console |
|||
--- |
|||
|
|||
```{r, include = FALSE} |
|||
knitr::opts_chunk$set( |
|||
message = FALSE, |
|||
warning = FALSE, |
|||
collapse = TRUE, |
|||
comment = "## " |
|||
) |
|||
``` |
|||
|
|||
The New York Times reporters [kept track of the candidate speaking time spent per topic](https://www.nytimes.com/interactive/2019/admin/100000006581096.embedded.html) for the initial U.S. Democratic debates in June 2019. They used a segmented, rounded-corner bar chart, ordered by timestamp, that I've dubbed a "chicklet" chart since the segments look like the fairly well-known gum/candy. This is the image from one of them: |
|||
|
|||
![](nytimes.png) |
|||
|
|||
The rounded-corner aesthetic looked great, and that feature prompted the creation of {ggchicklet}. |
|||
|
|||
Let's load up the packages we'll need: |
|||
|
|||
```{r setup} |
|||
library(hrbrthemes) # my preferred theme |
|||
library(ggchicklet) # this package! |
|||
library(dplyr) # we need to do a bit of data wrangling |
|||
library(forcats) # so we include {dplyr} and {forcats} |
|||
library(ggplot2) # duh! |
|||
``` |
|||
|
|||
If you peek at the source code for the New York Times JavaScript-created charts you'll see |
|||
that all the data is right there. Rather than make you figure out how to wrangle it, most |
|||
of it has been included in the package and can be accessed via: |
|||
|
|||
```{r data} |
|||
data("debates2019") |
|||
|
|||
head(debates2019, 10) |
|||
``` |
|||
|
|||
The `elapsed` column contains how long the candidate spoke and `timestamp` is the time they started speaking. We'll use both to control the look and feel of the {ggchicklet} chart. |
|||
|
|||
The `speaker` column lists the candidates: |
|||
|
|||
```{r data-ex-01} |
|||
distinct(debates2019, speaker) %>% |
|||
arrange(speaker) %>% |
|||
print(n=nrow(.)) |
|||
``` |
|||
|
|||
and the `topic` column lists the topics debated: |
|||
|
|||
```{r data-ex-02} |
|||
distinct(debates2019, topic) %>% |
|||
arrange(topic) %>% |
|||
print(n=nrow(.)) |
|||
``` |
|||
|
|||
First, we'll use `forcats::fct_reorder()` to reorder the `speaker`s by total speaking time to |
|||
make it easier to compare the differences in total time spoken between candidates. |
|||
|
|||
Then, we'll use `forcats::fct_other()` to limit the number of `topic`s to only those highlighted |
|||
by the New York Times (and to show how to do that). |
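
As a minimal sketch of those two steps on a made-up data frame (the `toy` object and its values are illustrative assumptions, not rows from `debates2019`):

```r
library(forcats)

# Hypothetical toy data: three speakers, four answers
toy <- data.frame(
  speaker = c("A", "B", "A", "C"),
  elapsed = c(1, 5, 1, 3),
  topic   = c("Economy", "Trade", "Healthcare", "Economy"),
  stringsAsFactors = FALSE
)

# Order speaker levels by total elapsed time, ascending (A = 2, C = 3, B = 5)
speaker_f <- fct_reorder(toy$speaker, toy$elapsed, sum, .desc = FALSE)
levels(speaker_f)

# Keep only the highlighted topics; every other topic collapses into "Other"
topic_f <- fct_other(toy$topic, keep = c("Economy", "Healthcare"))
levels(topic_f)
```

Because the chart is flipped with `coord_flip()`, the last factor level ends up at the top, so ascending order puts the longest-speaking candidate first visually.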
|||
|
|||
We need to use `group = timestamp` to ensure the segments are ordered by time (vs category/topic) |
|||
and `fill = topic` to color them appropriately. Note that just using `fill = topic` would group |
|||
the segments by topic. |
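
The difference between the two groupings can be sketched with plain `ggplot2::geom_col()` standing in for `geom_chicklet()`. The `toy` data frame is hypothetical, and its `timestamp` is plain seconds rather than the time column in `debates2019`:

```r
library(ggplot2)

# Hypothetical toy data: one speaker, four answers alternating between topics
toy <- data.frame(
  speaker   = "A",
  elapsed   = c(1, 2, 1.5, 0.5),
  timestamp = c(10, 20, 30, 40),
  topic     = c("Economy", "Healthcare", "Economy", "Healthcare")
)

# fill only: ggplot2 derives the groups from the discrete variables, so the
# two "Economy" answers are stacked next to each other
by_topic <- ggplot(toy, aes(speaker, elapsed, fill = topic)) +
  geom_col()

# explicit group: every answer is its own group, ordered by timestamp
by_time <- ggplot(toy, aes(speaker, elapsed, group = timestamp, fill = topic)) +
  geom_col()

# Count the groups each plot ends up with
length(unique(layer_data(by_topic)$group))
length(unique(layer_data(by_time)$group))
```

Printing `by_topic` and `by_time` side by side shows the same four segments, merged into topic blocks in the first plot and interleaved in speaking order in the second.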
|||
|
|||
```{r chicklet, fig.width=600/72, fig.height=600/72} |
|||
debates2019 %>% |
|||
mutate(speaker = fct_reorder(speaker, elapsed, sum, .desc=FALSE)) %>% |
|||
mutate(topic = fct_other( |
|||
topic, |
|||
c("Immigration", "Economy", "Climate Change", "Gun Control", "Healthcare", "Foreign Policy")) |
|||
) %>% |
|||
ggplot(aes(speaker, elapsed, group = timestamp, fill = topic)) + |
|||
geom_chicklet(width = 0.75) + |
|||
scale_y_continuous( |
|||
expand = c(0, 0.0625), |
|||
position = "right", |
|||
breaks = seq(0, 14, 2), |
|||
labels = c(0, sprintf("%d min.", seq(2, 14, 2))) |
|||
) + |
|||
scale_fill_manual( |
|||
name = NULL, |
|||
values = c( # NYTimes colors |
|||
"Immigration" = "#ae4544", |
|||
"Economy" = "#d8cb98", |
|||
"Climate Change" = "#a4ad6f", |
|||
"Gun Control" = "#cc7c3a", |
|||
"Healthcare" = "#436f82", |
|||
"Foreign Policy" = "#7c5981", |
|||
"Other" = "#cccccc" |
|||
), |
|||
breaks = setdiff(unique(debates2019$topic), "Other") |
|||
) + |
|||
guides( |
|||
fill = guide_legend(nrow = 1) |
|||
) + |
|||
coord_flip() + |
|||
labs( |
|||
x = NULL, y = NULL, fill = NULL, |
|||
title = "How Long Each Candidate Spoke", |
|||
subtitle = "Nights 1 & 2 of the June 2019 Democratic Debates", |
|||
caption = "Each bar segment represents the length of a candidate’s response to a question.\n\nOriginals <https://www.nytimes.com/interactive/2019/admin/100000006581096.embedded.html?>\n<https://www.nytimes.com/interactive/2019/admin/100000006584572.embedded.html?>\nby @nytimes Weiyi Cai, Jason Kao, Jasmine C. Lee, Alicia Parlapiano and Jugal K. Patel\n\n#rstats reproduction by @hrbrmstr" |
|||
) + |
|||
theme_ipsum_rc(grid="X") + |
|||
theme(axis.text.x = element_text(color = "gray60", size = 10)) + |
|||
theme(legend.position = "top") |
|||
``` |
|||
|
|||
You can use `ggplot2::geom_col()` to create a similar chart without the rounded rectangles but `geom_chicklet()` sets some useful defaults: |
|||
|
|||
- "`white`" stroke for the chicklet/segment (`geom_col()` has `NA` for the stroke) |
|||
- automatic reversing of the `group` order (`geom_col()` uses the standard sort order) |
|||
- radius setting of `unit(3, "px")` |
|||
- chicklet legend geom |
|||
|
|||
You will need to modify `colour`/`color` to use something besides "`white`" if you are using |
|||
a non-white background and do not want a white stroke. Larger width chicklet segments may |
|||
look better with a larger radius. |
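
A hedged sketch of both adjustments on a dark background follows. It assumes {ggchicklet} is installed and that `geom_chicklet()` takes `radius` as a {grid} unit and accepts `colour` like other geoms. The toy data, the `"gray15"` background, and the `unit(6, "pt")` radius are illustrative choices, not package recommendations. The `requireNamespace()` guard only keeps the sketch from failing where the package is missing.

```r
library(ggplot2)

# Hypothetical toy data: two speakers, three answers each
toy <- data.frame(
  speaker   = rep(c("A", "B"), each = 3),
  elapsed   = c(2, 1, 3, 1, 2, 2),
  timestamp = rep(1:3, times = 2),
  topic     = rep(c("Economy", "Healthcare", "Other"), times = 2)
)

if (requireNamespace("ggchicklet", quietly = TRUE)) {
  p <- ggplot(toy, aes(speaker, elapsed, group = timestamp, fill = topic)) +
    ggchicklet::geom_chicklet(
      width  = 0.9,                  # wider segments ...
      radius = grid::unit(6, "pt"),  # ... paired with a larger corner radius
      colour = "gray15"              # stroke matches the dark background
    ) +
    coord_flip() +
    theme(
      plot.background  = element_rect(fill = "gray15", colour = NA),
      panel.background = element_rect(fill = "gray15", colour = NA)
    )
}
```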
|||
|
|||
Note also that the `group`ing column does not need to be a time-like object; any type of |
|||
ordered column will work to set the display order. |
|||
|