mirror of https://git.sr.ht/~hrbrmstr/jericho
31 lines
1.2 KiB
Package: jericho
Type: Package
Title: Break Down the Walls of 'HTML' Tags into Usable Text
Version: 0.2.0
Date: 2019-03-01
Authors@R: c(
    person("Bob", "Rudis", role = c("aut", "cre"), email = "bob@rud.is"),
    person("Martin", "Jericho", role = c("ctb"), comment = "Jericho HTML Parser")
  )
Maintainer: Bob Rudis <bob@rud.is>
SystemRequirements: Java
Description: Structured 'HTML' content can be useful when you need to parse data tables
    or other tagged data from within a document. However, it is also useful to obtain
    "just the text" from a document, free from the walls of tags that surround it. Tools
    are provided that wrap methods in the 'Jericho HTML Parser' Java library by
    Martin Jericho <http://jericho.htmlparser.net/docs/index.html>. Martin's library
    is used in many at-scale projects, including the 'Internet Archive'.
URL: https://gitlab.com/hrbrmstr/jericho
BugReports: https://gitlab.com/hrbrmstr/jericho/issues
License: Apache License 2.0 | file LICENSE
Encoding: UTF-8
Suggests:
    testthat,
    covr
Depends:
    R (>= 3.2.0),
    rJava,
    jerichojars
RoxygenNote: 6.1.1
Remotes:
    url::https://git.sr.ht/~hrbrmstr/jerichojars