new API coverage & CRAN checks

tags/v0.7.0
boB Rudis 6 years ago
parent
commit
c0ce51942c
No known key found for this signature in database GPG Key ID: 2A514A4997464560
  1. 13
      .Rbuildignore
  2. 25
      .travis.yml
  3. 25
      CONDUCT.md
  4. 19
      DESCRIPTION
  5. 11
      NAMESPACE
  6. 30
      NEWS.md
  7. 13
      R/aaa.R
  8. 3
      R/agd-ipt.r
  9. 14
      R/cdcfluview-package.R
  10. 28
      R/coverage-map.r
  11. 42
      R/datasets.r
  12. 38
      R/geographic-spread.R
  13. 53
      R/hospital.r
  14. 80
      R/ili-weekly-state.r
  15. 135
      R/pi-mortality.r
  16. 9
      R/utils.r
  17. 3
      R/who-nrvess.r
  18. 11
      R/zzz.r
  19. 43
      README.Rmd
  20. 101
      README.md
  21. 1
      codecov.yml
  22. 37
      crunch/mkdata.r
  23. BIN
      data/census_regions.rda
  24. BIN
      data/hhs_regions.rda
  25. 5
      man/agd_ipt.Rd
  26. 32
      man/cdc_coverage_map.Rd
  27. 9
      man/cdcfluview.Rd
  28. 27
      man/census_regions.Rd
  29. 16
      man/geographic_spread.Rd
  30. 30
      man/hhs_regions.Rd
  31. 16
      man/hospitalizations.Rd
  32. 39
      man/ili_weekly_activity_indicators.Rd
  33. 54
      man/pi_mortality.Rd
  34. 14
      man/state_data_providers.Rd
  35. 14
      man/surveillance_areas.Rd
  36. 2
      man/who_nrevss.Rd

13
.Rbuildignore

@ -8,3 +8,16 @@
^\.codecov\.yml$
^README_files$
^doc$
^CONDUCT\.md$
^codecov\.yml$
^.*\.Rproj$
^\.Rproj\.user$
^\.travis\.yml$
^.*md$
^crunch/
^crunch/.*
^README_files/
^README_files/.*
^README-.*
^cran-comments\.md$
^codecov\.yml$

25
.travis.yml

@ -1,31 +1,16 @@
language: r
warnings_are_errors: true
# R for travis: see documentation at https://docs.travis-ci.com/user/languages/r
language: R
sudo: required
cache: packages
apt_packages:
- libv8-dev
r:
- oldrel
- release
- devel
apt_packages:
- libv8-dev
- xclip
env:
global:
- CRAN: http://cran.rstudio.com
after_success:
- Rscript -e 'covr::codecov()'
notifications:
email:
- bob@rud.is
irc:
channels:
- "104.236.112.222#builds"
nick: travisci

25
CONDUCT.md

@ -0,0 +1,25 @@
# Contributor Code of Conduct
As contributors and maintainers of this project, we pledge to respect all people who
contribute through reporting issues, posting feature requests, updating documentation,
submitting pull requests or patches, and other activities.
We are committed to making participation in this project a harassment-free experience for
everyone, regardless of level of experience, gender, gender identity and expression,
sexual orientation, disability, personal appearance, body size, race, ethnicity, age, or religion.
Examples of unacceptable behavior by participants include the use of sexual language or
imagery, derogatory comments or personal attacks, trolling, public or private harassment,
insults, or other unprofessional conduct.
Project maintainers have the right and responsibility to remove, edit, or reject comments,
commits, code, wiki edits, issues, and other contributions that are not aligned to this
Code of Conduct. Project maintainers who do not follow the Code of Conduct may be removed
from the project team.
Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by
opening an issue or contacting one or more of the project maintainers.
This Code of Conduct is adapted from the Contributor Covenant
(http://contributor-covenant.org), version 1.0.0, available at
http://contributor-covenant.org/version/1/0/0/

19
DESCRIPTION

@ -1,18 +1,25 @@
Package: cdcfluview
Type: Package
Title: cdcfluview title goes here otherwise CRAN checks fail
Version: 0.1.0
Title: Retrieve 'U.S.' Flu Season Data from the 'CDC' 'FluView' Portal
Version: 0.7.0
Date: 2017-11-04
Authors@R: c(
person("Bob", "Rudis", email = "bob@rud.is", role = c("aut", "cre"),
comment = c(ORCID = "0000-0001-5670-2640"))
comment = c(ORCID = "0000-0001-5670-2640")),
person("Craig", "McGowan", email = "mcgowan.cj@gmail.com", role = "ctb")
)
Author: Bob Rudis (bob@rud.is)
Maintainer: Bob Rudis <bob@rud.is>
Description: A good description goes here otherwise CRAN checks fail.
Description: The U.S. Centers for Disease Control (CDC) maintains a portal
<http://gis.cdc.gov/grasp/fluview/fluportaldashboard.html> for
accessing state, regional and national influenza statistics as well as
Mortality Surveillance Data. The Flash interface makes it difficult and
time-consuming to select and retrieve influenza data. This package
provides functions to access the data provided by the portal's underlying API.
URL: https://github.com/hrbrmstr/cdcfluview
BugReports: https://github.com/hrbrmstr/cdcfluview/issues
License: MIT + file LICENSE
LazyData: true
Suggests:
testthat,
covr
@ -23,5 +30,7 @@ Imports:
tools,
dplyr,
jsonlite,
stats
stats,
utils,
sf
RoxygenNote: 6.0.1

11
NAMESPACE

@ -1,13 +1,24 @@
# Generated by roxygen2: do not edit by hand
export(agd_ipt)
export(cdc_coverage_map)
export(geographic_spread)
export(hospitalizations)
export(ili_weekly_activity_indicators)
export(ilinet)
export(pi_mortality)
export(state_data_providers)
export(surveillance_areas)
export(who_nrevss)
import(httr)
importFrom(dplyr,"%>%")
importFrom(dplyr,bind_rows)
importFrom(dplyr,filter)
importFrom(dplyr,left_join)
importFrom(dplyr,mutate)
importFrom(jsonlite,fromJSON)
importFrom(sf,st_read)
importFrom(stats,setNames)
importFrom(tools,file_path_sans_ext)
importFrom(utils,read.csv)
importFrom(utils,unzip)

30
NEWS.md

@ -1,2 +1,28 @@
0.1.0
* Initial release
# cdcfluview 0.7.0
* The CDC changed most of their API endpoints to support a new HTML interface.
There are many breaking changes but also many new data endpoints.
# cdcfluview 0.5.2
* Modified behavior of `get_flu_data()` to actually grab current flu season
year if a single year was specified and it is the current year and the
return is a 0 length data frame (fixes #7)
* Added code coverage tests for all API functions.
# cdcfluview 0.5.1
* Replaced `http` URLs with `https` as `http` ones no longer work (fixes #6)
* Fixed State data download (CDC changed the hidden API)
# cdcfluview 0.5.0
* Fixed issue with WHO data format change
* Added Mortality Surveillance Data retrieval function
* Switched to readr::read_csv() and since it handles column names
better this will break your scripts until you use the new
column names.
# cdcfluview 0.4.0
* First CRAN release

13
R/aaa.R

@ -1,8 +1,15 @@
# CDC U.S. region names to ID map
.region_map <- c(national=3, hhs=1, census=2, state=5)
# CDC hospital surveillance region map
.hosp_surv_map <- c(flusurv=1, eip=2, ihsp=3)
# CDC hospital surveillance area name to internal pkg use map
.surv_map <- c(`FluSurv-NET`="flusurv", `EIP`="eip", `IHSP`="ihsp")
.surv_rev_map <- c(flusurv="FluSurv-NET", eip="EIP", ihsp="IHSP")
# CDC P&I mortality GeoID mapping
.geoid_map <- c(national="1", state="2", region="3")
# Our bot's user-agent string
.cdcfluview_ua <- "Mozilla/5.0 (compatible; R-cdcvluview Bot/2.0; https://github.com/hrbrmstr/cdcfluview)"
.cdcfluview_ua <- "Mozilla/5.0 (compatible; R-cdcvluview Bot/2.0; https://github.com/hrbrmstr/cdcfluview)"
# CDC Basemap
.cdc_basemap <- "https://gis.cdc.gov/grasp/fluview/FluView1References/data/US_States_w_PR_labels.json"

3
R/agd-ipt.r

@ -8,6 +8,9 @@
#' - [CDC FluView Portal](https://gis.cdc.gov/grasp/fluview/fluportaldashboard.html)
#' - [AGD IPT Portal](https://gis.cdc.gov/grasp/fluview/flu_by_age_virus.html)
#' @export
#' @examples \dontrun{
#' agd_ipt()
#' }
agd_ipt <- function() {
httr::GET(
url = "https://gis.cdc.gov/grasp/fluView6/GetFlu6AllDataP",

14
R/cdcfluview-package.R

@ -1,11 +1,21 @@
#' ...
#' Retrieve 'U.S.' Flu Season Data from the 'CDC' 'FluView' Portal
#'
#' The U.S. Centers for Disease Control (CDC) maintains a portal
#' <http://gis.cdc.gov/grasp/fluview/fluportaldashboard.html> for
#' accessing state, regional and national influenza statistics as well as
#' Mortality Surveillance Data. The Flash interface makes it difficult and
#' time-consuming to select and retrieve influenza data. This package
#' provides functions to access the data provided by the portal's underlying API.
#'
#' @md
#' @name cdcfluview
#' @docType package
#' @author Bob Rudis (bob@@rud.is)
#' @import httr
#' @importFrom tools file_path_sans_ext
#' @importFrom dplyr left_join bind_rows %>%
#' @importFrom dplyr left_join bind_rows mutate filter %>%
#' @importFrom jsonlite fromJSON
#' @importFrom stats setNames
#' @importFrom sf st_read
#' @importFrom utils read.csv unzip
NULL

28
R/coverage-map.r

@ -0,0 +1,28 @@
#' Retrieve CDC U.S. Coverage Map
#'
#' The CDC FluView application uses a composite basemap of coverage areas
#' within the United States that elides and scales Alaska, Hawaii and
#' Puerto Rico and provides elided and scaled breakouts for New York City
#' and the District of Columbia.\cr
#' \cr
#' The basemap provides polygon identifiers by:
#' \cr
#' - `STATE_FIPS`
#' - `STATE_ABBR`
#' - `STATE_NAME`
#' - `HHS_Region`
#' - `FIPSTXT`
#' \cr
#' This function retrieves the shapefile, projects to EPSG:5069 and
#' returns it as an `sf` (simple features) object.
#'
#' @md
#' @export
#' @examples \dontrun{
#' plot(cdc_coverage_map())
#' }
cdc_coverage_map <- function() {
xsf <- sf::st_read(.cdc_basemap, quiet=TRUE, stringsAsFactors=FALSE)
sf::st_crs(xsf) <- 4326
sf::st_transform(xsf, 5069)
}

42
R/datasets.r

@ -0,0 +1,42 @@
#' @title HHS Region Table
#' @description This dataset contains the names, numbers, regional offices for-,
#' and states/territories belonging to the (presently) 10 HHS U.S.
#' regions in "long" format. It consists of a \code{data.frame}
#' with the following columns:
#'
#' \itemize{
#' \item \code{region}: the official HHS region name (e.g. "\code{Region 1}")
#' \item \code{region_number}: the associated region number
#' \item \code{regional_office}: the HHS regional office for the entire region
#' \item \code{state_or_territory}: state or territory belonging to the region
#' }
#'
#' @docType data
#' @keywords datasets
#' @name hhs_regions
#'
#' @references \url{https://www.hhs.gov/about/agencies/iea/regional-offices/index.html}
#' @usage data(hhs_regions)
#' @note Last updated 2015-08-09.
#' @format A data frame with 59 rows and 4 variables
NULL
#' @title Census Region Table
#' @description This dataset contains the states belonging to the (presently) 4
#' U.S. Census regions in "long" format. It consists of a \code{data.frame}
#' with the following columns:
#'
#' \itemize{
#' \item \code{region}: the official Census region name (e.g. "\code{East}")
#' \item \code{state}: state belonging to the region
#' }
#'
#' @docType data
#' @keywords datasets
#' @name census_regions
#'
#' @references \url{https://www.cdc.gov/std/stats12/images/CensusMap.png}
#' @usage data(census_regions)
#' @note Last updated 2015-08-09.
#' @format A data frame with 51 rows and 2 variables
NULL

38
R/geographic-spread.R

@ -0,0 +1,38 @@
#' State and Territorial Epidemiologists Reports of Geographic Spread of Influenza
#'
#' @export
#' @examples \dontrun{
#' geographic_spread()
#' }
geographic_spread <- function() {
meta <- jsonlite::fromJSON("https://gis.cdc.gov/grasp/Flu8/GetPhase08InitApp?appVersion=Public")
meta$seasons$seasonid
httr::POST(
url = "https://gis.cdc.gov/grasp/Flu8/PostPhase08DownloadData",
httr::user_agent(.cdcfluview_ua),
httr::add_headers(
Origin = "https://gis.cdc.gov",
Accept = "application/json, text/plain, */*",
Referer = "https://gis.cdc.gov/grasp/fluview/FluView8.html"
),
encode = "json",
body = list(
AppVersion = "Public",
SeasonIDs = paste0(meta$seasons$seasonid, collapse=",")
),
httr::timeout(60),
httr::verbose()
) -> res
httr::stop_for_status(res)
res <- httr::content(res, as="parsed", flatten=TRUE)
xdf <- dplyr::bind_rows(res$datadownload)
xdf$weekend <- as.Date(xdf$weekend, format="%B-%d-%Y")
xdf
}

53
R/hospital.r

@ -1,20 +1,38 @@
#' Laboratory-Confirmed Influenza Hospitalizations
#'
#' @md
#' @param surveillance_area one of "`flusurv`", "`eip`", or "`ihsp`"
#' @param region Using "`all`" mimics selecting "Entire Network" from the
#' CDC FluView application drop down. Individual regions for each
#' surveillance area can also be selected. Use [surveillance_areas()] to
#' see a list of valid sub-regions for each surveillance area.
#' @references
#' - [Hospital Portal](https://gis.cdc.gov/GRASP/Fluview/FluHospRates.html)
#' @export
#' @examples
#' @examples \dontrun{
#' hosp_fs <- hospitalizations("flusurv")
#' hosp_eip <- hospitalizations("eip")
#' hosp_ihsp <- hospitalizations("ihsp")
hospitalizations <- function(surveillance_area=c("flusurv", "eip", "ihsp")) {
surveillance_area <- match.arg(tolower(surveillance_area), c("flusurv", "eip", "ihsp"))
#' }
hospitalizations <- function(surveillance_area=c("flusurv", "eip", "ihsp"),
region="all") {
network_id <- .hosp_surv_map[surveillance_area]
sarea <- match.arg(tolower(surveillance_area), choices = c("flusurv", "eip", "ihsp"))
sarea <- .surv_rev_map[sarea]
meta <- jsonlite::fromJSON("https://gis.cdc.gov/GRASP/Flu3/GetPhase03InitApp?appVersion=Public")
areas <- setNames(meta$catchments[,c("networkid", "name", "area", "catchmentid")],
c("networkid", "surveillance_area", "region", "id"))
reg <- region
if (reg == "all") reg <- "Entire Network"
tgt <- dplyr::filter(areas, (surveillance_area == sarea) & (region == reg))
if (nrow(tgt) == 0) {
stop("Region not found. Use `surveillance_areas()` to see a list of valid inputs.",
call.=FALSE)
}
httr::POST(
url = "https://gis.cdc.gov/GRASP/Flu3/PostPhase03GetData",
@ -27,9 +45,9 @@ hospitalizations <- function(surveillance_area=c("flusurv", "eip", "ihsp")) {
encode = "json",
body = list(
appversion = "Public",
networkid = network_id,
cacthmentid = "22"
),
networkid = tgt$networkid,
cacthmentid = tgt$id
),
httr::verbose()
) -> res
@ -62,6 +80,23 @@ hospitalizations <- function(surveillance_area=c("flusurv", "eip", "ihsp")) {
dplyr::left_join(xdf, mmwr_df, c("mmwrid", "weeknumber")) %>%
dplyr::left_join(age_df, "age") %>%
dplyr::left_join(sea_df, "seasonid")
dplyr::left_join(sea_df, "seasonid") %>%
dplyr::mutate(
surveillance_area = sarea,
region = reg
)
}
#' Retrieve a list of valid sub-regions for each surveillance area.
#'
#' @md
#' @export
#' @examples
#' surveillance_areas()
surveillance_areas <- function() {
meta <- jsonlite::fromJSON("https://gis.cdc.gov/GRASP/Flu3/GetPhase03InitApp?appVersion=Public")
xdf <- setNames(meta$catchments[,c("name", "area")], c("surveillance_area", "region"))
xdf$surveillance_area <- .surv_map[xdf$surveillance_area]
xdf
}
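
The argument handling in `hospitalizations()` can be sketched offline. This is a minimal illustration, not the package's code path: `surv_rev_map` copies the values of `.surv_rev_map` from `R/aaa.R`, while `resolve_area()` and the two-row `areas` table are hypothetical stand-ins for the metadata the CDC endpoint returns.

```r
# Copy of the internal-name -> CDC-name map from R/aaa.R
surv_rev_map <- c(flusurv = "FluSurv-NET", eip = "EIP", ihsp = "IHSP")

# Hypothetical helper: validate the user input, then translate it
resolve_area <- function(surveillance_area = c("flusurv", "eip", "ihsp")) {
  key <- match.arg(tolower(surveillance_area),
                   choices = c("flusurv", "eip", "ihsp"))
  unname(surv_rev_map[key])
}

# Hypothetical stand-in for the catchment metadata
areas <- data.frame(
  surveillance_area = c("FluSurv-NET", "EIP"),
  region = c("Entire Network", "Entire Network"),
  id = c("22", "22"),
  stringsAsFactors = FALSE
)

sarea <- resolve_area("EIP")
reg <- "Entire Network"  # what region = "all" is rewritten to

# Base R equivalent of the dplyr::filter() call in the function
tgt <- areas[areas$surveillance_area == sarea & areas$region == reg, ]
if (nrow(tgt) == 0) stop("Region not found.", call. = FALSE)
```

Because `match.arg()` does the validation, an unknown area fails before any network request would be made.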

80
R/ili-weekly-state.r

@ -0,0 +1,80 @@
#' Retrieve weekly state-level ILI indicators per-state for a given season
#'
#' @md
#' @param season_start_year numeric; start year for flu season (e.g. 2017 for 2017-2018 season)
#' @references
#' - [ILI Activity Indicator Map Portal](https://gis.cdc.gov/grasp/fluview/main.html)
#' @note These statistics use the proportion of outpatient visits to healthcare providers
#' for influenza-like illness to measure the ILI activity level within a state. They do
#' not, however, measure the extent of geographic spread of flu within a state. Therefore,
#' outbreaks occurring in a single city could cause the state to display high activity levels.\cr
#' \cr
#' Data collected in ILINet may disproportionately represent certain populations within
#' a state, and therefore may not accurately depict the full picture of influenza activity
#' for the whole state.\cr
#' \cr
#' All summary statistics are based on either data collected in ILINet, or reports from
#' state and territorial epidemiologists. Differences in the summary data presented by
#' CDC and state health departments likely represent differing levels of data completeness
#' with data presented by the state likely being the more complete.
#' @export
#' @examples \dontrun{
#' ili_weekly_activity_indicators(2016)
#' }
ili_weekly_activity_indicators <- function(season_start_year) {
jsonlite::fromJSON("https://gis.cdc.gov/grasp/fluView1/Phase1IniP") %>%
jsonlite::fromJSON() -> meta
season <- season_start_year - 1960
res <- httr::GET(sprintf("https://gis.cdc.gov/grasp/fluView1/Phase1SeasonDataP/%s",
season))
httr::stop_for_status(res)
res <- httr::content(res, as="parsed")
res <- jsonlite::fromJSON(res)
setNames(
meta$ili_intensity[,c("iliActivityid", "ili_activity_label", "legend")],
c("iliactivityid", "ili_activity_label", "ili_activity_group")
) -> iliact
dplyr::left_join(res$busdata, meta$stateinfo, "stateid") %>%
dplyr::left_join(res$mmwr, "mmwrid") %>%
dplyr::left_join(iliact, "iliactivityid") -> xdf
xdf <- xdf[,c("statename", "ili_activity_label", "ili_activity_group",
"statefips", "stateabbr", "weekend", "weeknumber", "year", "seasonid")]
xdf$statefips <- trimws(xdf$statefips)
xdf$stateabbr <- trimws(xdf$stateabbr)
xdf$weekend <- as.Date(xdf$weekend)
xdf$ili_activity_label <- factor(xdf$ili_activity_label,
levels=iliact$ili_activity_label)
class(xdf) <- c("tbl_df", "tbl", "data.frame")
xdf
}
#' Retrieve metadata about U.S. State CDC Provider Data
#'
#' @md
#' @export
#' @examples
#' state_data_providers()
state_data_providers <- function() {
jsonlite::fromJSON("https://gis.cdc.gov/grasp/fluView1/Phase1IniP") %>%
jsonlite::fromJSON() -> meta
state_info <- meta$stateinfo
state_info <- state_info[,c("statename", "statehealthdeptname", "url", "statewebsitename", "statefluphonenum")]
class(state_info) <- c("tbl_df", "tbl", "data.frame")
state_info
}
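
`ili_weekly_activity_indicators()` derives the season identifier it sends to the endpoint by subtracting 1960 from the season's start year. A small sketch of that arithmetic follows; `season_id_from_year()` is a hypothetical helper name, and the input check is an addition that the function above does not perform.

```r
# Hypothetical helper mirroring `season <- season_start_year - 1960`
season_id_from_year <- function(season_start_year) {
  # Added guard (assumption): require a single numeric year
  stopifnot(is.numeric(season_start_year), length(season_start_year) == 1)
  season_start_year - 1960
}

# The 2017-2018 season starts in 2017
season_2017 <- season_id_from_year(2017)

# Endpoint URL as the function above would build it
url_2017 <- sprintf("https://gis.cdc.gov/grasp/fluView1/Phase1SeasonDataP/%s",
                    season_2017)
```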

135
R/pi-mortality.r

@ -0,0 +1,135 @@
#' Pneumonia and Influenza Mortality Surveillance
#'
#' The National Center for Health Statistics (NCHS) collects and disseminates the Nation's
#' official vital statistics. NCHS collects death certificate data from state vital
#' statistics offices for virtually all deaths occurring in the United States. Pneumonia
#' and influenza (P&I) deaths are identified based on ICD-10
#' multiple cause of death codes.\cr
#' \cr
#' NCHS Mortality Surveillance System data are presented by the week the death occurred
#' at the national, state, and HHS Region levels. Data on the percentage of deaths due
#' to P&I on a national level are released two weeks after the week of death to allow
#' for collection of enough data to produce a stable percentage. States and HHS regions
#' with less than 20% of the expected total deaths (average number of total deaths
#' reported by week during 2008-2012) will be marked as insufficient data. Collection
#' of complete data is not expected at the time of initial report, and a reliable
#' percentage of deaths due to P&I is not anticipated at the U.S. Department of Health
#' and Human Services region or state level within this two week period. The data for
#' earlier weeks are continually revised and the proportion of deaths due to P&I may
#' increase or decrease as new and updated death certificate data are received by NCHS.\cr
#' \cr
#' The seasonal baseline of P&I deaths is calculated using a periodic regression model
#' that incorporates a robust regression procedure applied to data from the previous
#' five years. An increase of 1.645 standard deviations above the seasonal baseline
#' of P&I deaths is considered the "epidemic threshold," i.e., the point at which
#' the observed proportion of deaths attributed to pneumonia or influenza was
#' significantly higher than would be expected at that time of the year in the
#' absence of substantial influenza-related mortality. Baselines and thresholds are
#' calculated at the national and regional level and by age group.
#'
#' @md
#' @param coverage_area coverage area for data (national, state or region)
#' @note Queries for "state" and "region" are not "instantaneous" and can have retrieval delays near or over 30s.
#' @references
#' - [Pneumonia and Influenza Mortality Surveillance Portal](https://gis.cdc.gov/grasp/fluview/mortality.html)
#' @export
#' @examples \dontrun{
#' ndf <- pi_mortality()
#' sdf <- pi_mortality("state")
#' rdf <- pi_mortality("region")
#' }
pi_mortality <- function(coverage_area=c("national", "state", "region")) {
coverage_area <- match.arg(tolower(coverage_area), choices = c("national", "state", "region"))
us_states <- read.csv("https://gis.cdc.gov/grasp/fluview/Flu7References/Data/USStates.csv",
stringsAsFactors=FALSE)
us_states <- setNames(us_states, c("region_name", "subgeoid", "state_abbr"))
us_states <- us_states[,c("region_name", "subgeoid")]
us_states$subgeoid <- as.character(us_states$subgeoid)
meta <- jsonlite::fromJSON("https://gis.cdc.gov/grasp/flu7/GetPhase07InitApp?appVersion=Public")
mapcode_df <- setNames(meta$nchs_mapcode[,c("mapcode", "description")], c("map_code", "callout"))
mapcode_df$map_code <- as.character(mapcode_df$map_code)
geo_df <- meta$nchs_geo_dim
geo_df$geoid <- as.character(geo_df$geoid)
age_df <- setNames(meta$nchs_ages, c("ageid", "age_label"))
age_df$ageid <- as.character(age_df$ageid)
mwmr_df <- meta$mmwr
mwmr_df$mmwrid <- as.character(mwmr_df$mmwrid)
mwmr_df <- setNames(mwmr_df,
c("mmwrid", "weekend", "mwmr_weeknumber", "weekstart",
"year", "yearweek", "mwmr_seasonid", "mwmr_label", "weekendlabel"))
sum_df <- meta$nchs_summary
sum_df$seasonid <- as.character(sum_df$seasonid)
sum_df$ageid <- as.character(sum_df$ageid)
sum_df$geoid <- as.character(sum_df$geoid)
httr::POST(
url = "https://gis.cdc.gov/grasp/flu7/PostPhase07DownloadData",
httr::user_agent(.cdcfluview_ua),
httr::add_headers(
Origin = "https://gis.cdc.gov",
Accept = "application/json, text/plain, */*",
Referer = "https://gis.cdc.gov/grasp/fluview/mortality.html"
),
encode = "json",
body = list(
AppVersion = "Public",
AreaParameters = list(list(ID=.geoid_map[coverage_area])),
SeasonsParameters = lapply(meta$seasons$seasonid, function(.x) { list(ID=as.integer(.x)) }),
AgegroupsParameters = list(list(ID="1"))
),
httr::timeout(60),
httr::verbose()
) -> res
httr::stop_for_status(res)
res <- httr::content(res, as="parsed", flatten=TRUE)
dplyr::bind_rows(res$seasons) %>%
dplyr::left_join(mapcode_df, "map_code") %>%
dplyr::left_join(geo_df, "geoid") %>%
dplyr::left_join(age_df, "ageid") %>%
dplyr::left_join(mwmr_df, "mmwrid") -> xdf
xdf <- dplyr::mutate(xdf, coverage_area = coverage_area)
if (coverage_area == "state") {
xdf <- dplyr::left_join(xdf, us_states, "subgeoid")
} else if (coverage_area == "region") {
xdf$region_name <- sprintf("Region %s", xdf$subgeoid)
} else {
xdf$region_name <- NA_character_
}
xdf[,c("seasonid", "baseline", "threshold", "percent_pni",
"percent_complete", "number_influenza", "number_pneumonia",
"all_deaths", "Total_PnI", "weeknumber", "geo_description",
"age_label", "weekend", "weekstart", "year", "yearweek",
"coverage_area", "region_name", "callout")] -> xdf
suppressWarnings(xdf$baseline <- to_num(xdf$baseline))
suppressWarnings(xdf$threshold <- to_num(xdf$threshold))
suppressWarnings(xdf$percent_pni <- to_num(xdf$percent_pni) / 100)
suppressWarnings(xdf$percent_complete <- to_num(xdf$percent_complete) / 100)
suppressWarnings(xdf$number_influenza <- to_num(xdf$number_influenza))
suppressWarnings(xdf$number_pneumonia <- to_num(xdf$number_pneumonia))
suppressWarnings(xdf$all_deaths <- to_num(xdf$all_deaths))
suppressWarnings(xdf$Total_PnI <- to_num(xdf$Total_PnI))
suppressWarnings(xdf$weekend <- as.Date(xdf$weekend))
suppressWarnings(xdf$weekstart <- as.Date(xdf$weekstart))
xdf <- .mcga(xdf)
xdf
}
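
The documentation above defines the epidemic threshold as 1.645 standard deviations above the seasonal baseline. The sketch below only illustrates that arithmetic. The baseline, standard deviation and observed values are made-up numbers, not CDC data, and the package itself returns the threshold computed by the CDC rather than recomputing it.

```r
# 1.645 is approximately the 95th percentile of the standard normal
z <- qnorm(0.95)

# Illustrative values only (assumptions, not CDC data)
baseline <- 0.065   # expected proportion of deaths due to P&I
sd_est   <- 0.004   # standard deviation around the baseline
observed <- 0.073   # observed proportion for the week

# Threshold as described in the documentation
threshold <- baseline + 1.645 * sd_est

# TRUE when the observed proportion is above the epidemic threshold
exceeds <- observed > threshold
```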

9
R/utils.r

@ -13,3 +13,12 @@
tbl
}
to_num <- function(x) {
x <- gsub("%", "", x, fixed=TRUE)
x <- gsub(">", "", x, fixed=TRUE)
x <- gsub("<", "", x, fixed=TRUE)
x <- gsub(",", "", x, fixed=TRUE)
x <- gsub(" ", "", x, fixed=TRUE)
as.numeric(x)
}
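
A quick way to see what `to_num()` does is to run it on a few strings. The function body below is copied from the hunk above. The sample inputs are assumptions about what the API might return, including the `"Insufficient Data"` label, which is a hypothetical value.

```r
# Copied from R/utils.r in this commit
to_num <- function(x) {
  x <- gsub("%", "", x, fixed=TRUE)
  x <- gsub(">", "", x, fixed=TRUE)
  x <- gsub("<", "", x, fixed=TRUE)
  x <- gsub(",", "", x, fixed=TRUE)
  x <- gsub(" ", "", x, fixed=TRUE)
  as.numeric(x)
}

# Hypothetical raw values
raw <- c("7.5%", "1,234", "<10", "> 20", "Insufficient Data")

# Non-numeric leftovers become NA with a coercion warning, which is
# why pi_mortality() wraps each call in suppressWarnings()
res <- suppressWarnings(to_num(raw))
```

The first four elements of `res` are 7.5, 1234, 10 and 20, and the last is `NA`.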

3
R/who-nrvess.r

@ -24,11 +24,12 @@
#' - [ILINet Portal](https://wwwn.cdc.gov/ilinet/) (Login required)
#' - [WHO/NREVSS](https://www.cdc.gov/surveillance/nrevss/index.html)
#' @export
#' @examples
#' @examples \dontrun{
#' national_who <- who_nrevss("national")
#' hhs_who <- who_nrevss("hhs")
#' census_who <- who_nrevss("census")
#' state_who <- who_nrevss("state")
#' }
who_nrevss <- function(region=c("national", "hhs", "census", "state")) {
region <- match.arg(tolower(region), c("national", "hhs", "census", "state"))

11
R/zzz.r

@ -0,0 +1,11 @@
# this is only used during active development phases before/after CRAN releases
.onAttach <- function(...) {
if (!interactive()) return()
packageStartupMessage(paste0("cdcfluview is under *active* development. ",
"There are *MASSIVE* breaking changes. ",
"See https://github.com/hrbrmstr/cdcfluview for info/news."))
}

43
README.Rmd

@ -1,15 +1,50 @@
---
title: ""
pagetitle: ""
output: rmarkdown::github_document
---
# cdcfluview
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/cdcfluview)](https://cran.r-project.org/package=cdcfluview)
[![Travis-CI Build Status](https://travis-ci.org/hrbrmstr/cdcfluview.svg?branch=master)](https://travis-ci.org/hrbrmstr/cdcfluview)
[![Coverage Status](https://img.shields.io/codecov/c/github/hrbrmstr/cdcfluview/master.svg)](https://codecov.io/github/hrbrmstr/cdcfluview?branch=master)
# I M P O R T A N T
The CDC migrated to a new non-Flash portal and back-end APIs changed. This is a complete reimagining of the package and --- as such --- all your code is going to break. Please use GitHub issues to identify previous API functionality you would like ported over. There's a [release candidate for 0.5.2](https://github.com/hrbrmstr/cdcfluview/releases/tag/v0.5.2) which uses the old API but is likely to break in the near future given the changes to the hidden API. You can install it with `devtools::install_github("hrbrmstr/cdcfluview", ref="58c172b")`.
All folks providing feedback, code or suggestions will be added to the DESCRIPTION file. Please include how you would prefer to be cited in any issues you file.
If there's a particular data set from https://www.cdc.gov/flu/weekly/fluviewinteractive.htm that you want and that isn't in the package, please file it as an issue and be as specific as you can (screen shot if possible).
# :mask: cdcfluview
Retrieve U.S. Flu Season Data from the CDC FluView Portal
## Description
The U.S. Centers for Disease Control (CDC) maintains a portal <http://gis.cdc.gov/grasp/fluview/fluportaldashboard.html> for accessing state, regional and national influenza statistics as well as Mortality Surveillance Data. The Flash interface makes it difficult and time-consuming to select and retrieve influenza data. This package provides functions to access the data provided by the portal's underlying API.
## What's Inside The Tin
The following functions are implemented:
- `agd_ipt`: Age Group Distribution of Influenza Positive Tests Reported by Public Health Laboratories
- `cdcfluview`: Tools to Work with the 'CDC' 'FluView' 'API'
- `cdc_coverage_map`: Retrieve CDC U.S. Coverage Map
- `geographic_spread`: State and Territorial Epidemiologists Reports of Geographic Spread of Influenza
- `hospitalizations`: Laboratory-Confirmed Influenza Hospitalizations
- `ilinet`: Retrieve ILINet Surveillance Data
- `ili_weekly_activity_indicators`: Retrieve weekly state-level ILI indicators per-state for a given season
- `pi_mortality`: Pneumonia and Influenza Mortality Surveillance
- `state_data_providers`: Retrieve metadata about U.S. State CDC Provider Data
- `surveillance_areas`: Retrieve a list of valid sub-regions for each surveillance area.
- `who_nrevss`: Retrieve WHO/NREVSS Surveillance Data
The following data sets are included:
- `hhs_regions` HHS Region Table (a data frame with 59 rows and 4 variables)
- `census_regions` Census Region Table (a data frame with 51 rows and 2 variables)
## Installation
```{r eval=FALSE}
@ -27,6 +62,10 @@ library(cdcfluview)
# current version
packageVersion("cdcfluview")
```
### EXAMPLES COMING SOON
## Code of Conduct
Please note that this project is released with a [Contributor Code of Conduct](CONDUCT.md). By participating in this project you agree to abide by its terms.

101
README.md

@ -0,0 +1,101 @@
[![CRAN\_Status\_Badge](http://www.r-pkg.org/badges/version/cdcfluview)](https://cran.r-project.org/package=cdcfluview)
[![Travis-CI Build
Status](https://travis-ci.org/hrbrmstr/cdcfluview.svg?branch=master)](https://travis-ci.org/hrbrmstr/cdcfluview)
[![Coverage
Status](https://img.shields.io/codecov/c/github/hrbrmstr/cdcfluview/master.svg)](https://codecov.io/github/hrbrmstr/cdcfluview?branch=master)
I M P O R T A N T
=================
The CDC migrated to a new non-Flash portal and back-end APIs changed.
This is a complete reimagining of the package and — as such — all your
code is going to break. Please use GitHub issues to identify previous
API functionality you would like ported over. There’s a [release
candidate for
0.5.2](https://github.com/hrbrmstr/cdcfluview/releases/tag/v0.5.2) which
uses the old API but is likely to break in the near future given the
changes to the hidden API. You can install it with
`devtools::install_github("hrbrmstr/cdcfluview", ref="58c172b")`.
All folks providing feedback, code or suggestions will be added to the
DESCRIPTION file. Please include how you would prefer to be cited in any
issues you file.
If there’s a particular data set from
<https://www.cdc.gov/flu/weekly/fluviewinteractive.htm> that you want
and that isn’t in the package, please file it as an issue and be as
specific as you can (screen shot if possible).
:mask: cdcfluview
=================
Retrieve U.S. Flu Season Data from the CDC FluView Portal
Description
-----------
The U.S. Centers for Disease Control (CDC) maintains a portal
<http://gis.cdc.gov/grasp/fluview/fluportaldashboard.html> for accessing
state, regional and national influenza statistics as well as Mortality
Surveillance Data. The Flash interface makes it difficult and
time-consuming to select and retrieve influenza data. This package
provides functions to access the data provided by the portal’s
underlying API.
What’s Inside The Tin
---------------------
The following functions are implemented:
- `agd_ipt`: Age Group Distribution of Influenza Positive Tests
Reported by Public Health Laboratories
- `cdcfluview`: Tools to Work with the ‘CDC’ ‘FluView’ ‘API’
- `cdc_coverage_map`: Retrieve CDC U.S. Coverage Map
- `geographic_spread`: State and Territorial Epidemiologists Reports
of Geographic Spread of Influenza
- `hospitalizations`: Laboratory-Confirmed Influenza Hospitalizations
- `ilinet`: Retrieve ILINet Surveillance Data
- `ili_weekly_activity_indicators`: Retrieve weekly state-level ILI
indicators per-state for a given season
- `pi_mortality`: Pneumonia and Influenza Mortality Surveillance
- `state_data_providers`: Retrieve metadata about U.S. State CDC
Provider Data
- `surveillance_areas`: Retrieve a list of valid sub-regions for each
surveillance area.
- `who_nrevss`: Retrieve WHO/NREVSS Surveillance Data
The following data sets are included:
- `hhs_regions` HHS Region Table (a data frame with 59 rows and 4
variables)
- `census_regions` Census Region Table (a data frame with 51 rows and
2 variables)
Installation
------------
``` r
devtools::install_github("hrbrmstr/cdcfluview")
```
Usage
-----
``` r
library(cdcfluview)
# current version
packageVersion("cdcfluview")
```
## [1] '0.7.0'
### EXAMPLES COMING SOON
Code of Conduct
---------------
Please note that this project is released with a [Contributor Code of
Conduct](CONDUCT.md). By participating in this project you agree to
abide by its terms.

1
codecov.yml

@ -0,0 +1 @@
comment: false

37
crunch/mkdata.r

@ -0,0 +1,37 @@
hhs_regions <- read.table(text="region;region_number;regional_office;state_or_territory
Region 1;1;Boston;Connecticut, Maine, Massachusetts, New Hampshire, Rhode Island, Vermont
Region 2;2;New York;New Jersey, New York, Puerto Rico, Virgin Islands
Region 3;3;Philadelphia;Delaware, District of Columbia, Maryland, Pennsylvania, Virginia, West Virginia
Region 4;4;Atlanta;Alabama, Florida, Georgia, Kentucky, Mississippi, North Carolina, South Carolina, Tennessee
Region 5;5;Chicago;Illinois, Indiana, Michigan, Minnesota, Ohio, Wisconsin
Region 6;6;Dallas;Arkansas, Louisiana, New Mexico, Oklahoma, Texas
Region 7;7;Kansas City;Iowa, Kansas, Missouri, Nebraska
Region 8;8;Denver;Colorado, Montana, North Dakota, South Dakota, Utah, Wyoming
Region 9;9;San Francisco;Arizona, California, Hawaii, Nevada, American Samoa, Commonwealth of the Northern Mariana Islands, Federated States of Micronesia, Guam, Marshall Islands, Republic of Palau
Region 10;10;Seattle;Alaska, Idaho, Oregon, Washington", sep=";", stringsAsFactors=FALSE, header=TRUE)
library(stringr)
do.call(rbind.data.frame, lapply(1:nrow(hhs_regions), function(i) {
x <- hhs_regions[i,]
rownames(x) <- NULL
out <- data.frame(x[, c(1:3)],
str_split(x$state_or_territory, ", ")[1],
stringsAsFactors=FALSE)
colnames(out) <- c("region", "region_number", "regional_office", "state_or_territory")
out
})) -> hhs_regions
str(hhs_regions)
library(rvest)
library(magrittr)
pg <- read_html("http://www.cdc.gov/std/stats11/census.htm") # html() is deprecated in rvest
pg %>% html_table() %>% extract2(1) %>% as.list -> cens
do.call(rbind.data.frame, lapply(names(cens), function(x) {
data.frame(region=x,
state=cens[[x]][cens[[x]]!=""],
stringsAsFactors=FALSE)
})) -> census_regions
devtools::use_data(hhs_regions, census_regions, overwrite=TRUE)
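
The reshaping step in `crunch/mkdata.r` (one row per region, expanded to one row per state or territory) can also be sketched in base R. This is a minimal illustration on a two-region toy table, not the script the package uses; the names `wide`, `members`, and `long` are made up for the example.

```r
# Toy input: one row per region, members packed into one comma-separated string
wide <- data.frame(
  region = c("Region 2", "Region 7"),
  region_number = c(2L, 7L),
  regional_office = c("New York", "Kansas City"),
  state_or_territory = c(
    "New Jersey, New York, Puerto Rico, Virgin Islands",
    "Iowa, Kansas, Missouri, Nebraska"
  ),
  stringsAsFactors = FALSE
)

# Split each packed string into a character vector of members
members <- strsplit(wide$state_or_territory, ", ", fixed = TRUE)

# Repeat each region's metadata once per member, then attach the members
long <- data.frame(
  wide[rep(seq_len(nrow(wide)), lengths(members)),
       c("region", "region_number", "regional_office")],
  state_or_territory = unlist(members),
  stringsAsFactors = FALSE
)
rownames(long) <- NULL

str(long)
```

Indexing with `rep()` avoids the per-row `lapply()` plus `rbind` pattern, and it produces the same four-column "long" layout that `hhs_regions` is documented to have.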

BIN
data/census_regions.rda

Binary file not shown.

BIN
data/hhs_regions.rda

Binary file not shown.

5
man/agd_ipt.Rd

@ -11,6 +11,11 @@ Retrieves the age group distribution of influenza positive tests that are report
public health laboratories by influenza virus type and subtype/lineage. Laboratory data
from multiple seasons and different age groups is provided.
}
\examples{
\dontrun{
agd_ipt()
}
}
\references{
- [CDC FluView Portal](https://gis.cdc.gov/grasp/fluview/fluportaldashboard.html)
- [AGD IPT Portal](https://gis.cdc.gov/grasp/fluview/flu_by_age_virus.html)

32
man/cdc_coverage_map.Rd

@ -0,0 +1,32 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/coverage-map.r
\name{cdc_coverage_map}
\alias{cdc_coverage_map}
\title{Retrieve CDC U.S. Coverage Map}
\usage{
cdc_coverage_map()
}
\description{
The CDC FluView application uses a composite basemap of coverage areas
within the United States that elides and scales Alaska, Hawaii and
Puerto Rico and provides elided and scaled breakouts for New York City
and the District of Columbia.\cr
\cr
The basemap provides polygon identifiers by:
\cr
\itemize{
\item \code{STATE_FIPS}
\item \code{STATE_ABBR}
\item \code{STATE_NAME}
\item \code{HHS_Region}
\item \code{FIPSTXT}
}
\cr
This function retrieves the shapefile, projects it to EPSG:5069 and
returns it as an \code{sf} (simple features) object.
}
\examples{
\dontrun{
plot(cdc_coverage_map())
}
}

9
man/cdcfluview.Rd

@ -4,9 +4,14 @@
\name{cdcfluview}
\alias{cdcfluview}
\alias{cdcfluview-package}
\title{...}
\title{Retrieve 'U.S.' Flu Season Data from the 'CDC' 'FluView' Portal}
\description{
...
The U.S. Centers for Disease Control (CDC) maintains a portal
\url{http://gis.cdc.gov/grasp/fluview/fluportaldashboard.html} for
accessing state, regional and national influenza statistics as well as
Mortality Surveillance Data. The Flash interface makes it difficult and
time-consuming to select and retrieve influenza data. This package
provides functions to access the data provided by the portal's underlying API.
}
\author{
Bob Rudis (bob@rud.is)

27
man/census_regions.Rd

@ -0,0 +1,27 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/datasets.r
\docType{data}
\name{census_regions}
\alias{census_regions}
\title{Census Region Table}
\format{A data frame with 51 rows and 2 variables}
\usage{
data(census_regions)
}
\description{
This dataset contains the states belonging to the (presently) 4
U.S. Census regions in "long" format. It consists of a \code{data.frame}
with the following columns:
\itemize{
\item \code{region}: the official Census region name (e.g. "\code{Northeast}")
\item \code{state}: state belonging to the region
}
}
\note{
Last updated 2015-08-09.
}
\references{
\url{https://www.cdc.gov/std/stats12/images/CensusMap.png}
}
\keyword{datasets}

16
man/geographic_spread.Rd

@ -0,0 +1,16 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/geographic-spread.R
\name{geographic_spread}
\alias{geographic_spread}
\title{State and Territorial Epidemiologists Reports of Geographic Spread of Influenza}
\usage{
geographic_spread()
}
\description{
State and Territorial Epidemiologists Reports of Geographic Spread of Influenza
}
\examples{
\dontrun{
geographic_spread()
}
}

30
man/hhs_regions.Rd

@ -0,0 +1,30 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/datasets.r
\docType{data}
\name{hhs_regions}
\alias{hhs_regions}
\title{HHS Region Table}
\format{A data frame with 59 rows and 4 variables}
\usage{
data(hhs_regions)
}
\description{
This dataset contains the name, number, regional office, and member
states/territories of each of the (presently) 10 U.S. HHS
regions in "long" format. It consists of a \code{data.frame}
with the following columns:
\itemize{
\item \code{region}: the official HHS region name (e.g. "\code{Region 1}")
\item \code{region_number}: the associated region number
\item \code{regional_office}: the HHS regional office for the entire region
\item \code{state_or_territory}: state or territory belonging to the region
}
}
\note{
Last updated 2015-08-09.
}
\references{
\url{https://www.hhs.gov/about/agencies/iea/regional-offices/index.html}
}
\keyword{datasets}

16
man/hospitalizations.Rd

@ -4,19 +4,29 @@
\alias{hospitalizations}
\title{Laboratory-Confirmed Influenza Hospitalizations}
\usage{
hospitalizations(surveillance_area = c("flusurv", "eip", "ihsp"))
hospitalizations(surveillance_area = c("flusurv", "eip", "ihsp"),
region = "all")
}
\arguments{
\item{surveillance_area}{one of "`flusurv`", "`eip`", or "`ihsp`"}
\item{surveillance_area}{one of "\code{flusurv}", "\code{eip}", or "\code{ihsp}"}
\item{region}{Using "\code{all}" mimics selecting "Entire Network" from the
CDC FluView application drop down. Individual regions for each
surveillance area can also be selected. Use \code{\link[=surveillance_areas]{surveillance_areas()}} to
see a list of valid sub-regions for each surveillance area.}
}
\description{
Laboratory-Confirmed Influenza Hospitalizations
}
\examples{
\dontrun{
hosp_fs <- hospitalizations("flusurv")
hosp_eip <- hospitalizations("eip")
hosp_ihsp <- hospitalizations("ihsp")
}
}
\references{
- [Hospital Portal](https://gis.cdc.gov/GRASP/Fluview/FluHospRates.html)
\itemize{
\item \href{https://gis.cdc.gov/GRASP/Fluview/FluHospRates.html}{Hospital Portal}
}
}

39
man/ili_weekly_activity_indicators.Rd

@ -0,0 +1,39 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/ili-weekly-state.r
\name{ili_weekly_activity_indicators}
\alias{ili_weekly_activity_indicators}
\title{Retrieve weekly state-level ILI indicators per-state for a given season}
\usage{
ili_weekly_activity_indicators(season_start_year)
}
\arguments{
\item{season_start_year}{numeric; start year for flu season (e.g. 2017 for 2017-2018 season)}
}
\description{
Retrieve weekly state-level ILI indicators per-state for a given season
}
\note{
These statistics use the proportion of outpatient visits to healthcare providers
for influenza-like illness to measure the ILI activity level within a state. They do
not, however, measure the extent of geographic spread of flu within a state. Therefore,
outbreaks occurring in a single city could cause the state to display high activity levels.\cr
\cr
Data collected in ILINet may disproportionately represent certain populations within
a state, and therefore may not accurately depict the full picture of influenza activity
for the whole state.\cr
\cr
All summary statistics are based on either data collected in ILINet, or reports from
state and territorial epidemiologists. Differences in the summary data presented by
CDC and state health departments likely represent differing levels of data completeness
with data presented by the state likely being the more complete.
}
\examples{
\dontrun{
ili_weekly_activity_indicators(2016)
}
}
\references{
\itemize{
\item \href{https://gis.cdc.gov/grasp/fluview/main.html}{ILI Activity Indicator Map Portal}
}
}

54
man/pi_mortality.Rd

@ -0,0 +1,54 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/pi-mortality.r
\name{pi_mortality}
\alias{pi_mortality}
\title{Pneumonia and Influenza Mortality Surveillance}
\usage{
pi_mortality(coverage_area = c("national", "state", "region"))
}
\arguments{
\item{coverage_area}{coverage area for data (national, state or region)}
}
\description{
The National Center for Health Statistics (NCHS) collects and disseminates the Nation's
official vital statistics. NCHS collects death certificate data from state vital
statistics offices for virtually all deaths occurring in the United States. Pneumonia
and influenza (P&I) deaths are identified based on ICD-10
multiple cause of death codes.\cr
\cr
NCHS Mortality Surveillance System data are presented by the week the death occurred
at the national, state, and HHS Region levels. Data on the percentage of deaths due
to P&I on a national level are released two weeks after the week of death to allow
for collection of enough data to produce a stable percentage. States and HHS regions
with less than 20% of the expected total deaths (average number of total deaths
reported by week during 2008-2012) will be marked as insufficient data. Collection
of complete data is not expected at the time of initial report, and a reliable
percentage of deaths due to P&I is not anticipated at the U.S. Department of Health
and Human Services region or state level within this two week period. The data for
earlier weeks are continually revised and the proportion of deaths due to P&I may
increase or decrease as new and updated death certificate data are received by NCHS.\cr
\cr
The seasonal baseline of P&I deaths is calculated using a periodic regression model
that incorporates a robust regression procedure applied to data from the previous
five years. An increase of 1.645 standard deviations above the seasonal baseline
of P&I deaths is considered the "epidemic threshold," i.e., the point at which
the observed proportion of deaths attributed to pneumonia or influenza was
significantly higher than would be expected at that time of the year in the
absence of substantial influenza-related mortality. Baselines and thresholds are
calculated at the national and regional level and by age group.
}
\note{
Queries for "state" and "region" are not instantaneous and can take close to, or more than, 30 seconds to return.
}
\examples{
\dontrun{
ndf <- pi_mortality()
sdf <- pi_mortality("state")
rdf <- pi_mortality("region")
}
}
\references{
\itemize{
\item \href{https://gis.cdc.gov/grasp/fluview/mortality.html}{Pneumonia and Influenza Mortality Surveillance Portal}
}
}
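
The epidemic threshold described in `pi_mortality.Rd` is the seasonal baseline plus 1.645 standard deviations. The sketch below works through that arithmetic on made-up numbers. The values of `baseline`, `resid_sd`, and `observed` are hypothetical and are not CDC data, and the code does not reproduce the regression model that yields the real baseline.

```r
# Hypothetical weekly values (percent of all deaths attributed to P&I); not real CDC data
baseline <- c(6.8, 7.0, 7.3, 7.5)  # assumed seasonal baseline for four weeks
resid_sd <- 0.4                    # assumed standard deviation around the baseline
observed <- c(6.9, 7.4, 8.1, 8.3)  # assumed observed percentages

# 1.645 is approximately the 95th percentile of the standard normal distribution
z <- 1.645
threshold <- baseline + z * resid_sd

# Flag weeks where the observed percentage exceeds the epidemic threshold
above <- observed > threshold
data.frame(baseline, threshold, observed, above)
```

With these inputs only the last two weeks exceed their thresholds. `qnorm(0.95)` returns roughly 1.6449, which is where the multiplier comes from if the threshold is read as a one-sided 95% bound.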

14
man/state_data_providers.Rd

@ -0,0 +1,14 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/ili-weekly-state.r
\name{state_data_providers}
\alias{state_data_providers}
\title{Retrieve metadata about U.S. State CDC Provider Data}
\usage{
state_data_providers()
}
\description{
Retrieve metadata about U.S. State CDC Provider Data
}
\examples{
state_data_providers()
}

14
man/surveillance_areas.Rd

@ -0,0 +1,14 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/hospital.r
\name{surveillance_areas}
\alias{surveillance_areas}
\title{Retrieve a list of valid sub-regions for each surveillance area.}
\usage{
surveillance_areas()
}
\description{
Retrieve a list of valid sub-regions for each surveillance area.
}
\examples{
surveillance_areas()
}

2
man/who_nrevss.Rd

@ -35,11 +35,13 @@ laboratories are presented separately in the weekly influenza update. This is
the reason why a list of data frames is returned.
}
\examples{
\dontrun{
national_who <- who_nrevss("national")
hhs_who <- who_nrevss("hhs")
census_who <- who_nrevss("census")
state_who <- who_nrevss("state")
}
}
\references{
\itemize{
\item \href{https://gis.cdc.gov/grasp/fluview/fluportaldashboard.html}{CDC FluView Portal}
