### :mask: cdcfluview - Retrieve U.S. Flu Season Data from the CDC FluView Portal

[![CRAN\_Status\_Badge](http://www.r-pkg.org/badges/version/cdcfluview)](https://cran.r-project.org/package=cdcfluview) [![Travis-CI Build Status](https://travis-ci.org/hrbrmstr/cdcfluview.svg?branch=master)](https://travis-ci.org/hrbrmstr/cdcfluview)

**NOTE** If there's a particular FluView data set that you want and that isn't in the package, please file it as an issue and be as specific as you can (screenshot if possible).

------------------------------------------------------------------------

The U.S. Centers for Disease Control and Prevention (CDC) maintains a [portal](https://gis.cdc.gov/grasp/fluview/fluportaldashboard.html) for accessing state, regional and national influenza statistics. The portal's Flash interface makes it difficult and time-consuming to select and retrieve influenza data. This package provides functions to access the data provided by the portal's underlying API.

The following functions are implemented:

- `get_flu_data`: Retrieves state, regional or national influenza statistics from the CDC
- `get_state_data`: Retrieves state/territory-level influenza statistics from the CDC
- `get_weekly_flu_report`: Retrieves the (high-level) weekly influenza surveillance report from the CDC
- `get_mortality_surveillance_data`: Retrieves pneumonia and influenza mortality surveillance data (fairly self-explanatory, but also pretty new to the package)

The following data sets are included:

- `hhs_regions`: HHS Region Table (a data frame with 59 rows and 4 variables)
- `census_regions`: Census Region Table (a data frame with 51 rows and 2 variables)

### News

- See NEWS
- Version 0.4.0 - [CRAN release](http://cran.r-project.org/web/packages/cdcfluview)
- Version 0.4.0.999 released: another fix for the CDC API (for the region parameter); added data files for HHS/Census region lookups; added weekly high-level flu report retrieval
- Version 0.3 released: fix for the CDC API (it changed how year & region params are encoded in the request)
- Version 0.2.1 released: bumped up `httr` version \# requirement in `DESCRIPTION` (via Issue [1](https://github.com/hrbrmstr/cdcfluview/issues/1))
- Version 0.2 released: added state-level data retrieval
- Version 0.1 released

### Installation

``` r
install.packages("cdcfluview")
# **OR**
devtools::install_github("hrbrmstr/cdcfluview")
```

### Usage

``` r
library(cdcfluview)
library(ggplot2)
library(dplyr)
library(statebins)

# current version
packageVersion("cdcfluview")
#> [1] '0.5.1'

# ILINet data for all ten HHS regions, 2014 season
flu <- get_flu_data("hhs", sub_region=1:10, "ilinet", years=2014)

glimpse(flu)
#> Observations: 530
#> Variables: 15
#> $ REGION TYPE       "HHS Regions", "HHS Regions", "HHS Regions", "HHS Regions", "HHS Regions", "HHS Regions",...
#> $ REGION            "Region 1", "Region 2", "Region 3", "Region 4", "Region 5", "Region 6", "Region 7", "Regi...
#> $ YEAR              2014, 2014, 2014, 2014, 2014, 2014, 2014, 2014, 2014, 2014, 2014, 2014, 2014, 2014, 2014,...
#> $ WEEK              40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 42, 42, 4...
#> $ % WEIGHTED ILI    0.830610, 1.795830, 1.162260, 0.828920, 0.744546, 1.604740, 0.697022, 0.635856, 1.793140,...
#> $ %UNWEIGHTED ILI   0.681009, 1.649790, 1.321020, 0.911243, 1.013950, 1.647270, 0.437619, 0.813397, 1.501530,...
#> $ AGE 0-4           101, 869, 395, 331, 358, 446, 50, 76, 310, 22, 109, 837, 403, 355, 338, 540, 57, 57, 335,...
#> $ AGE 25-49         44, 363, 455, 187, 181, 410, 43, 49, 220, 7, 37, 349, 465, 244, 182, 451, 56, 87, 225, 20...
#> $ AGE 25-64         NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
#> $ AGE 5-24          185, 757, 627, 530, 400, 636, 98, 154, 577, 30, 199, 676, 669, 772, 443, 741, 124, 148, 5...
#> $ AGE 50-64         13, 157, 126, 80, 80, 111, 15, 19, 110, 1, 24, 151, 130, 74, 105, 168, 18, 23, 118, 10, 2...
#> $ AGE 65            9, 107, 89, 46, 64, 77, 14, 8, 112, 1, 17, 115, 75, 64, 48, 97, 14, 12, 103, 3, 9, 114, 8...
#> $ ILITOTAL          352, 2253, 1692, 1174, 1083, 1680, 220, 306, 1329, 61, 386, 2128, 1742, 1509, 1116, 1997,...
#> $ NUM. OF PROVIDERS 146, 276, 234, 299, 261, 226, 84, 119, 237, 55, 150, 265, 232, 306, 271, 235, 84, 115, 24...
#> $ TOTAL PATIENTS    51688, 136563, 128083, 128835, 106810, 101987, 50272, 37620, 88510, 11172, 51169, 131884,...

# state-level activity data for the 2015 season
state_flu <- get_state_data(years=2015)

glimpse(state_flu)
#> Observations: 2,807
#> Variables: 8
#> $ statename            "Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado", "Connecticut", "...
#> $ url                  "http://adph.org/influenza/", "http://dhss.alaska.gov/dph/Epi/id/Pages/influenza/influ...
#> $ website              "Influenza Surveillance", "Influenza Surveillance Report", "Influenza & RSV Surveillan...
#> $ activity_level       1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,...
#> $ activity_level_label "Minimal", "Minimal", "Minimal", "Minimal", "Minimal", "Minimal", "Minimal", "Minimal"...
#> $ weekend              "Oct-10-2015", "Oct-10-2015", "Oct-10-2015", "Oct-10-2015", "Oct-10-2015", "Oct-10-201...
#> $ season               "2015-16", "2015-16", "2015-16", "2015-16", "2015-16", "2015-16", "2015-16", "2015-16"...
#> $ weeknumber           40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40...

# weighted ILI percentage by week, one panel per HHS region
gg <- ggplot(flu, aes(x=WEEK, y=`% WEIGHTED ILI`, group=REGION))
gg <- gg + geom_line()
gg <- gg + facet_wrap(~REGION, ncol=2)
gg <- gg + theme_bw()
gg
```

``` r
msd <- get_mortality_surveillance_data()

# convert Year/Week to a date and express the death percentage as a proportion
mutate(msd$by_state, ym=as.Date(sprintf("%04d-%02d-1", Year, Week), "%Y-%U-%u")) %>%
  select(state, wk=ym, death_pct=`Percent of Deaths Due to Pneumonia and Influenza`) %>%
  mutate(death_pct=death_pct/100) -> df

gg <- ggplot() + geom_smooth(data=df, aes(wk, death_pct, group=state),
                             se=FALSE, color="#2b2b2b", size=0.25)
gb <- ggplot_build(gg)

# find the smoothed line with the highest value at the most recent date
gb$data[[1]] %>%
  arrange(desc(x)) %>%
  group_by(group) %>%
  slice(1) %>%
  ungroup() %>%
  arrange(desc(y)) %>%
  head(1) -> top

top_state <- sort(unique(msd$by_state$state))[top$group]

# label that line with its state name
gg <- gg + geom_text(data=top, aes(as.Date(x, origin="1970-01-01"), y, label=top_state),
                     hjust=1, family="Arial Narrow", size=3, nudge_x=-5, nudge_y=-0.001)
gg <- gg + scale_x_date(expand=c(0,0))
gg <- gg + scale_y_continuous(label=scales::percent)
gg <- gg + labs(x=NULL, y=NULL,
                title="Percent of In-State Deaths Due to Pneumonia and Influenza (2010-Present)")
gg <- gg + theme_bw(base_family="Arial Narrow")
gg <- gg + theme(axis.text.x=element_text(margin=margin(0,0,0,0)))
gg <- gg + theme(axis.text.y=element_text(margin=margin(0,0,0,0)))
gg <- gg + theme(axis.ticks=element_blank())
gg <- gg + theme(plot.title=element_text(face="bold", size=16))
gg
```

``` r
# state-level ILI activity for the week ending 2016-01-02
gg_s <- state_flu %>%
  filter(weekend=="Jan-02-2016") %>%
  select(state=statename, value=activity_level) %>%
  filter(!(state %in% c("Puerto Rico", "New York City"))) %>% # need to add PR to statebins
  mutate(value=as.numeric(gsub("Level ", "", value))) %>%
  statebins(brewer_pal="RdPu", breaks=4,
            labels=c("Minimal", "Low", "Moderate", "High"),
            legend_position="bottom", legend_title="ILI Activity Level") +
  ggtitle("CDC State FluView (2016-01-02)")
gg_s
```

### Test Results

``` r
library(cdcfluview)
library(testthat)

date()
#> [1] "Mon Dec  5 14:45:12 2016"

test_dir("tests/")
#> testthat results ========================================================================================================
#> OK: 2 SKIPPED: 0 FAILED: 0
#>
#> DONE ===================================================================================================================
```
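
### Notes

The mortality example under Usage turns a `Year`/`Week` pair into a date with `as.Date(..., "%Y-%U-%u")`. The sketch below isolates that step in base R so it can be tried without a network connection. `week_to_date` is a hypothetical helper name, not a function exported by this package. It also assumes that `%U` week numbering (weeks start on Sunday, week 1 begins at the first Sunday of the year) is an acceptable approximation of the CDC's MMWR week; the two schemes can differ by one week in some years, so treat the result as approximate.

``` r
# Hypothetical helper (not part of cdcfluview): convert a year and a
# %U-style week number into the date of the Monday in that week.
week_to_date <- function(year, week) {
  # Build strings such as "2016-05-1": year, week of year, weekday 1 (Monday)
  stamp <- sprintf("%04d-%02d-1", as.integer(year), as.integer(week))
  as.Date(stamp, format = "%Y-%U-%u")
}

# Illustrative inputs only; these are not values retrieved from the CDC
wk <- week_to_date(c(2015, 2016), c(40, 5))
wk

# Weekday number of each result (Monday is "1")
format(wk, "%u")
```

Anchoring every week to its Monday keeps the x-axis values evenly spaced for plotting; if exact MMWR week boundaries matter, a dedicated MMWR conversion would probably be a better choice than `%U`.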