boB Rudis 878bb7f045
initial commit
7 years ago
R initial commit 7 years ago
man initial commit 7 years ago
src initial commit 7 years ago
tests initial commit 7 years ago
.Rbuildignore initial commit 7 years ago
.codecov.yml initial commit 7 years ago
.gitignore initial commit 7 years ago
.travis.yml initial commit 7 years ago
CONDUCT.md initial commit 7 years ago
DESCRIPTION initial commit 7 years ago
LICENSE initial commit 7 years ago
NAMESPACE initial commit 7 years ago
NEWS.md initial commit 7 years ago
README.Rmd initial commit 7 years ago
README.md initial commit 7 years ago
rep.Rproj initial commit 7 years ago

README.md

rep : Tools to Parse and Test Robots Exclusion Protocol Files and Rules

The 'Robots Exclusion Protocol' (http://www.robotstxt.org/orig.html) documents a set of standards for allowing or excluding robot/spider crawling of different areas of site content. Tools are provided which wrap the 'rep-cpp' C++ library (https://github.com/seomoz/rep-cpp) for processing these 'robots.txt' files.
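
As a rough illustration of what an exclusion rule means, the base R sketch below applies the prefix matching described in the original protocol document to a made-up rule set. It is a deliberately simplified assumption, not how 'rep-cpp' or this package works internally: it ignores per-agent groups and any rule type other than `Disallow`, and `robots_txt`, `disallowed` and `is_allowed` are hypothetical names used only here.

```r
# Simplified sketch of 'Disallow' prefix matching (base R only).
# NOT the algorithm used by 'rep' or 'rep-cpp'; user-agent groups are ignored.

# Made-up rules for illustration
robots_txt <- c(
  "User-agent: *",
  "Disallow: /private/",
  "Disallow: /tmp/"
)

# Keep the path prefix from each "Disallow:" line
disallowed <- sub("^Disallow:\\s*", "", grep("^Disallow:", robots_txt, value = TRUE))
disallowed <- disallowed[nzchar(disallowed)]  # an empty Disallow value blocks nothing

# Hypothetical helper: a path is allowed if it starts with no disallowed prefix
is_allowed <- function(path, prefixes = disallowed) {
  !any(startsWith(path, prefixes))
}

is_allowed("/private/report.html")
is_allowed("/public/index.html")
```

The functions in this package should be preferred for real use, since they delegate parsing and matching to the C++ library.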

The following functions are implemented:

  • robxp: Create a robots.txt object
  • can_fetch: Test URL path against robots.txt

Installation

devtools::install_github("hrbrmstr/rep")

Usage

library(rep)
library(robotstxt)

# current version
packageVersion("rep")
## [1] '0.1.0'
rt <- robxp(get_robotstxt("https://cdc.gov"))

print(rt)
## <Robots Exclusion Protocol Object>
can_fetch(rt, "/asthma/asthma_stats/default.htm", "*")
## [1] TRUE
can_fetch(rt, "/_borders", "*")
## [1] FALSE
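
The same two functions can also be tried without a network request. The sketch below assumes that robxp() accepts a single character string holding a complete robots.txt file, which is what get_robotstxt() appears to supply in the example above; the rules and paths are invented for illustration. Because it is not stated here whether can_fetch() is vectorised over paths, the sketch calls it once per path.

```r
library(rep)

# Illustrative rules, not taken from any real site
rules <- paste(
  "User-agent: *",
  "Disallow: /private/",
  sep = "\n"
)

# Assumption: robxp() takes the robots.txt content as one string
rt <- robxp(rules)

# Hypothetical paths to check against the rules above
paths <- c("/index.html", "/private/report.html")

# One can_fetch() call per path, collected into a named logical vector
allowed <- vapply(paths, function(p) can_fetch(rt, p, "*"), logical(1))
print(allowed)
```

Keeping the rules inline like this can be handy for tests, since the result does not depend on a remote site changing its robots.txt.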

Test Results

library(rep)
library(testthat)

date()
## [1] "Mon Aug 14 15:00:16 2017"
test_dir("tests/")
## testthat results ========================================================================================================
## OK: 3 SKIPPED: 0 FAILED: 0
## 
## DONE ===================================================================================================================

Code of Conduct

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.