Identify the Crux of an Article
You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
 
 

2.5 KiB

---
output: rmarkdown::github_document
editor_options:
chunk_output_type: inline
---
```{r pkg-knitr-opts, include=FALSE}
knitr::opts_chunk$set(collapse=TRUE, fig.retina=2, message=FALSE, warning=FALSE)
options(width=120)
```

[![Travis-CI Build Status](https://travis-ci.org/hrbrmstr/crux.svg?branch=master)](https://travis-ci.org/hrbrmstr/crux)
[![Coverage Status](https://codecov.io/gh/hrbrmstr/crux/branch/master/graph/badge.svg)](https://codecov.io/gh/hrbrmstr/crux)
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/crux)](https://cran.r-project.org/package=crux)

# crux

Identify the Crux of an Article

## Description

Methods are provided to retrieve HTML content and return extracted
metadata and summarised plain text. Further methods are provided to classify
URLs with or without making network calls. Based on <https://github.com/chimbori/crux>.

## What's Inside The Tin

The following functions are implemented:

- `classify_url`: Classify a URL with or without making network calls
- `is_ad_image`: Classify a URL with or without making network calls
- `is_likely_archive`: Classify a URL with or without making network calls
- `is_likely_article`: Classify a URL with or without making network calls
- `is_likely_audio`: Classify a URL with or without making network calls
- `is_likely_binary_doc`: Classify a URL with or without making network calls
- `is_likely_executable`: Classify a URL with or without making network calls
- `is_likely_image`: Classify a URL with or without making network calls
- `is_likely_video`: Classify a URL with or without making network calls
- `is_web_scheme`: Classify a URL with or without making network calls
- `summarise_url`: Summarise the contents at a URL to essential bits

## Installation

```{r install-ex, eval=FALSE}
install.packages(c("cruxjars", "crux"), repos = "https://cinc.rud.is/")
```

## Usage

```{r lib-ex}
library(crux)

# current version
packageVersion("crux")

```

```{r}
str(
summarise_url("http://time.com/5541738/joe-biden-backtracks-pence-praise-criticism/"), 1
)
```

```{r}
str(
classify_url("https://www.washingtonpost.com/powerpost/house-democrats-explode-in-recriminations-as-liberals-lash-out-at-moderates/2019/02/28/c3d163fe-3b87-11e9-a06c-3ec8ed509d15_story.html")
)
```

## crux Metrics

```{r cloc, echo=FALSE}
cloc::cloc_pkg_md()
```

## Code of Conduct

Please note that this project is released with a [Contributor Code of Conduct](CONDUCT.md).
By participating in this project you agree to abide by its terms.