---
output: rmarkdown::github_document
editor_options:
  chunk_output_type: console
---
`hgr` : Tools to Work with the 'Postlight' 'Mercury' 'API'

Mercury takes any web article and returns only the relevant content (headline, author, body text, relevant images and more), free from any clutter. You need an API key, which you can get from [Postlight](https://mercury.postlight.com).

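One common way to keep an API key out of scripts is to read it from an environment variable. The sketch below is illustrative only: the helper `get_api_key()` and the variable name `MERCURY_API_KEY` are assumptions made for this example, not something this README documents, so check the package help for how `just_the_facts()` actually expects the key.

```r
# Hypothetical helper (not part of hgr): fetch an API key from the environment.
# Assumption: the variable name MERCURY_API_KEY is illustrative only.
get_api_key <- function(var = "MERCURY_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    # Fail early with a readable message instead of sending an empty key
    stop("Set ", var, " (for example in ~/.Renviron) before calling the API",
         call. = FALSE)
  }
  key
}
```

Storing the key in `~/.Renviron` keeps it out of version control and makes it available in every new R session.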
The following functions are implemented:
- `just_the_facts`: Retrieve parsed content of a URL processed by the Postlight Mercury API
- `clean_text`: Remove all HTML/XML tags from an HTML document/atomic character vector
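
To give a rough idea of what tag removal on a character vector looks like, here is a minimal base R sketch. It is not how `clean_text` is implemented; `strip_tags()` is a hypothetical helper that uses a crude regular expression, which a real HTML parser would handle far more robustly (comments, scripts, entities and so on).

```r
# Hypothetical helper (NOT the hgr implementation): regex-based tag stripping.
strip_tags <- function(x) {
  x <- gsub("<[^>]+>", " ", x)       # replace anything that looks like a tag with a space
  x <- gsub("[[:space:]]+", " ", x)  # collapse runs of whitespace
  trimws(x)                          # drop leading/trailing whitespace
}

html <- c("<p>Hello <b>world</b></p>", "<div>plain</div>")
strip_tags(html)
```

Because `gsub()` is vectorised, the helper works element-wise on an atomic character vector, which mirrors the kind of input `clean_text` is described as accepting.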
### Installation
```{r eval=FALSE}
devtools::install_github("hrbrmstr/hgr")
```
```{r message=FALSE, warning=FALSE, error=FALSE, include=FALSE}
options(width=120)
```
### Usage
```{r message=FALSE, warning=FALSE, error=FALSE}
library(hgr)
# current version
packageVersion("hgr")
story <- "https://www.nytimes.com/2017/04/18/world/asia/aircraft-carrier-north-korea-carl-vinson.html?hp&action=click&pgtype=Homepage&clickSource=story-heading&module=first-column-region&region=top-news&WT.nav=top-news&_r=0"
doc <- just_the_facts(story)
dplyr::glimpse(doc)
substr(doc$content, 1, 100)
plain <- clean_text(doc$content)
substr(plain, 1, 100)
```
### Test Results
```{r message=FALSE, warning=FALSE, error=FALSE}
library(hgr)
library(testthat)
date()
test_dir("tests/")
```