
merge conflicts

Merge branch 'master' of https://github.com/hrbrmstr/htmltidy

# Conflicts:
#	NEWS.md
master
boB Rudis 5 years ago
parent
commit
904b36c76c
No known key found for this signature in database. GPG Key ID: 1D7529BE14E2BBA9
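  1. .Rbuildignore (1 line changed)
  2. DESCRIPTION (1 line changed)
  3. NEWS.md (15 lines changed)
  4. R/xmltreeview.R (4 lines changed)
  5. R/xmlview.R (3 lines changed)
  6. cran-comments.md (12 lines changed)
  7. docs/index.html (292 lines changed)
  8. docs/news/index.html (121 lines changed)
  9. docs/pkgdown.css (59 lines changed)
  10. docs/pkgdown.js (13 lines changed)
  11. docs/reference/highlight_styles.html (156 lines changed)
  12. docs/reference/htmltidy.html (115 lines changed)
  13. docs/reference/index.html (114 lines changed)
  14. docs/reference/renderXmlview.html (108 lines changed)
  15. docs/reference/tidy_html.html (284 lines changed)
  16. docs/reference/xml_tree_view.html (154 lines changed)
  17. docs/reference/xml_view.html (185 lines changed)
  18. docs/reference/xmltreeview-shiny.html (118 lines changed)
  19. docs/reference/xmlviewOutput.html (108 lines changed)
  20. man/xml_tree_view.Rd (3 lines changed)
  21. man/xml_view.Rd (2 lines changed)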

1
.Rbuildignore

@@ -7,3 +7,4 @@
^README\.html$
^cran-comments\.md$
^appveyor\.yml$
^docs$

1
DESCRIPTION

@@ -42,6 +42,7 @@ Depends:
R (>= 3.2.0)
License: MIT + file LICENSE
LazyData: true
Encoding: UTF-8
NeedsCompilation: yes
Suggests:
testthat,

15
NEWS.md

@@ -1,17 +1,18 @@
# htmltidy 0.3.1
htmltidy 0.3.1
====================
* Fix warnings coming from URL redirection in examples
# htmltidy 0.3.0
htmltidy 0.3.0
====================
* Better error handling (fixed crashing bug in #1)
* New option to display document errors
* Support for directly tidying httr::response objects
* Added XML/HTML viewer & XPath query widgets
# htmltidy 0.2.0
htmltidy 0.2.0
====================
* Bundled tidy-html5 library with the package
* Windows compatibility
* Options handling
@@ -19,8 +20,8 @@
* Modified tests
# htmltidy 0.1.0
htmltidy 0.1.0
====================
* Added a `NEWS.md` file to track changes to the package.
* Added Debian & Ubuntu compatibility
* Added basic error checking

4
R/xmltreeview.R

@@ -19,8 +19,8 @@
#' or used in a browser context vs an IDE viewer context.
#' @export
#' @references \href{https://github.com/juliangruber/xml-viewer}{xml-viewer}
#' @examples \dontrun{
#' library(htmltidy)
#' @examples
#' if (interactive()) {
#'
#' # from ?xml2::read_xml
#' cd <- xml2::read_xml("http://www.xmlfiles.com/examples/cd_catalog.xml")

3
R/xmlview.R

@@ -27,7 +27,8 @@
#' @export
#' @references \href{https://highlightjs.org/}{highlight.js},
#' \href{http://www.eslinstructor.net/vkbeautify/}{vkbeautify}
#' @examples \dontrun{
#' @examples
#' if (interactive()) {
#' library(xml2)
#'
#' # plain text

12
cran-comments.md

@@ -10,17 +10,13 @@
0 errors | 0 warnings | 2 notes
* This is a new release.
* XHTML is a valid and widely used acronym
This is a new release, so there are no reverse dependencies.
---
This fixes a fairly nasty bug that was
user-identified fairly early after release
but I didn't want to bug the CRAN team
so quickly after the CRAN acceptange. This
also addes new functionality and (optionally)
so quickly after the CRAN acceptance. This
also addes new functionality (widgets for
viewing & querying XML/HTML) and (optionally)
provides more information on the tidying
process.
process.

292
docs/index.html

@@ -0,0 +1,292 @@
<!-- Generated by pkgdown: do not edit by hand -->
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Home. htmltidy</title>
<!-- jquery -->
<script src="https://code.jquery.com/jquery-3.1.0.min.js" integrity="sha384-nrOSfDHtoPMzJHjVTdCopGqIqeYETSXhZDFyniQ8ZHcVy08QesyHcnOUpMpqnmWq" crossorigin="anonymous"></script>
<!-- Bootstrap -->
<link href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-BVYiiSIFeK1dGmJRAkycuHAHRg32OmUcww7on3RYdg4Va+PmSTsz/K68vbdEjh4u" crossorigin="anonymous">
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script>
<!-- Font Awesome icons -->
<link href="https://maxcdn.bootstrapcdn.com/font-awesome/4.6.3/css/font-awesome.min.css" rel="stylesheet" integrity="sha384-T8Gy5hrqNKT+hzMclPo118YTQO6cYprQmhrYwIiQ/3axmI1hQomh7Ud2hPOy8SP1" crossorigin="anonymous">
<!-- pkgdown -->
<link href="pkgdown.css" rel="stylesheet">
<script src="pkgdown.js"></script>
<!-- mathjax -->
<script src='https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML'></script>
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
</head>
<body>
<div class="container">
<header>
<div class="navbar navbar-default navbar-fixed-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar">
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="index.html">htmltidy</a>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav">
<li>
<a href="index.html">Home</a>
</li>
<li>
<a href="reference/index.html">Reference</a>
</li>
<li>
<a href="news/index.html">News</a>
</li>
</ul>
<ul class="nav navbar-nav navbar-right">
<li>
<a href="https://github.com/hrbrmstr/htmltidy">
<span class="fa fa-github fa-lg"></span>
</a>
</li>
</ul>
</div><!--/.nav-collapse -->
</div><!--/.container -->
</div><!--/.navbar -->
</header>
<div class="row">
<div class="col-md-9">
<p><a href="https://travis-ci.org/hrbrmstr/htmltidy"><img src="https://travis-ci.org/hrbrmstr/htmltidy.svg?branch=master" alt="Travis-CI Build Status"></a> <a href="https://ci.appveyor.com/project/hrbrmstr/htmltidy"><img src="https://ci.appveyor.com/api/projects/status/github/hrbrmstr/htmltidy?branch=master&amp;svg=true" alt="AppVeyor Build Status"></a> <a href="https://cran.r-project.org/package=htmltidy"><img src="http://www.r-pkg.org/badges/version/htmltidy" alt="CRAN_Status_Badge"></a> <img src="http://cranlogs.r-pkg.org/badges/grand-total/htmltidy" alt="downloads"></p>
<!-- README.md is generated from README.Rmd. Please edit that file -->
<p><code>htmltidy</code> &mdash; Tidy Up and Test XPath Queries on HTML and XML Content</p>
<p>Partly inspired by <a href="http://stackoverflow.com/questions/37061873/identify-a-weblink-in-bold-in-r">this SO question</a> and because there&rsquo;s a great deal of cruddy HTML out there that needs fixing to use properly when scraping data.</p>
<p>It relies on a locally included version of <a href="http://www.html-tidy.org/"><code>libtidy</code></a> and works on macOS, Linux &amp; Windows.</p>
<p>It also incorporates an <code>htmlwidget</code> to view and test XPath queries on HTML/XML content.</p>
<p>The following functions are implemented:</p>
<ul>
<li>
<code>tidy_html</code>: Tidy or &ldquo;Pretty Print&rdquo; HTML/XHTML Documents</li>
<li>
<code>html_view</code>: HTML/XML pretty printer and viewer</li>
<li>
<code>xml_view</code>: HTML/XML pretty printer and viewer</li>
<li>
<code>html_tree_view</code>: HTML/XML tree viewer</li>
<li>
<code>xml_tree_view</code>: HTML/XML tree viewer</li>
</ul>
<h3 id="installation">Installation</h3>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">devtools::<span class="kw">install_github</span>(<span class="st">"hrbrmstr/htmltidy"</span>)</code></pre></div>
<h3 id="usage">Usage</h3>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">library</span>(htmltidy)
<span class="co"># current version</span>
<span class="kw">packageVersion</span>(<span class="st">"htmltidy"</span>)
## [1] '0.3.0'
<span class="kw">library</span>(XML)
<span class="kw">library</span>(xml2)
<span class="kw">library</span>(httr)
<span class="kw">library</span>(purrr)</code></pre></div>
<p>This is really &ldquo;un-tidy&rdquo; content:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">res &lt;-<span class="st"> </span><span class="kw">GET</span>(<span class="st">"http://rud.is/test/untidy.html"</span>)
<span class="kw">cat</span>(<span class="kw">content</span>(res, <span class="dt">as=</span><span class="st">"text"</span>))
## &lt;head&gt;
## &lt;style&gt;
## body { font-family: sans-serif; }
## &lt;/style&gt;
## &lt;/head&gt;
## &lt;body&gt;
## &lt;b&gt;This is &lt;b&gt;some &lt;i&gt;really &lt;/i&gt; poorly formatted HTML&lt;/b&gt;
##
## as is this &lt;span id="sp"&gt;portion&lt;div&gt;</code></pre></div>
<p>Let&rsquo;s see what <code><a href="reference/tidy_html.response.html">tidy_html()</a></code> does to it.</p>
<p>It can handle the <code>response</code> object directly:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">cat</span>(<span class="kw"><a href="reference/tidy_html.response.html">tidy_html</a></span>(res, <span class="kw">list</span>(<span class="dt">TidyDocType=</span><span class="st">"html5"</span>, <span class="dt">TidyWrapLen=</span><span class="dv">200</span>)))
## &lt;!DOCTYPE html&gt;
## &lt;html&gt;
## &lt;head&gt;
## &lt;meta name="generator" content="HTML Tidy for HTML5 for R version 5.0.0"&gt;
## &lt;style&gt;
## body { font-family: sans-serif; }
## &lt;/style&gt;
## &lt;title&gt;&lt;/title&gt;
## &lt;/head&gt;
## &lt;body&gt;
## &lt;b&gt;This is some &lt;i&gt;really&lt;/i&gt; poorly formatted HTML as is this &lt;span id="sp"&gt;portion&lt;/span&gt;&lt;/b&gt;
## &lt;div&gt;&lt;span id="sp"&gt;&lt;/span&gt;&lt;/div&gt;
## &lt;/body&gt;
## &lt;/html&gt;</code></pre></div>
<p>But, you&rsquo;ll probably mostly use it on HTML you&rsquo;ve identified as gnarly and already have that HTML text content handy:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">cat</span>(<span class="kw"><a href="reference/tidy_html.response.html">tidy_html</a></span>(<span class="kw">content</span>(res, <span class="dt">as=</span><span class="st">"text"</span>), <span class="kw">list</span>(<span class="dt">TidyDocType=</span><span class="st">"html5"</span>, <span class="dt">TidyWrapLen=</span><span class="dv">200</span>)))
## &lt;!DOCTYPE html&gt;
## &lt;html&gt;
## &lt;head&gt;
## &lt;meta name="generator" content="HTML Tidy for HTML5 for R version 5.0.0"&gt;
## &lt;style&gt;
## body { font-family: sans-serif; }
## &lt;/style&gt;
## &lt;title&gt;&lt;/title&gt;
## &lt;/head&gt;
## &lt;body&gt;
## &lt;b&gt;This is some &lt;i&gt;really&lt;/i&gt; poorly formatted HTML as is this &lt;span id="sp"&gt;portion&lt;/span&gt;&lt;/b&gt;
## &lt;div&gt;&lt;span id="sp"&gt;&lt;/span&gt;&lt;/div&gt;
## &lt;/body&gt;
## &lt;/html&gt;</code></pre></div>
<p>NOTE: you could also just have done:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">cat</span>(<span class="kw"><a href="reference/tidy_html.response.html">tidy_html</a></span>(<span class="kw">url</span>(<span class="st">"http://rud.is/test/untidy.html"</span>),
<span class="kw">list</span>(<span class="dt">TidyDocType=</span><span class="st">"html5"</span>, <span class="dt">TidyWrapLen=</span><span class="dv">200</span>)))
## &lt;!DOCTYPE html&gt;
## &lt;html&gt;
## &lt;head&gt;
## &lt;meta name="generator" content="HTML Tidy for HTML5 for R version 5.0.0"&gt;
## &lt;style&gt;
## body { font-family: sans-serif; }
## &lt;/style&gt;
## &lt;title&gt;&lt;/title&gt;
## &lt;/head&gt;
## &lt;body&gt;
## &lt;b&gt;This is some &lt;i&gt;really&lt;/i&gt; poorly formatted HTMLas is this &lt;span id="sp"&gt;portion&lt;/span&gt;&lt;/b&gt;
## &lt;div&gt;&lt;span id="sp"&gt;&lt;/span&gt;&lt;/div&gt;
## &lt;/body&gt;
## &lt;/html&gt;</code></pre></div>
<p>You&rsquo;ll see that this differs substantially from the mangling <code>libxml2</code> does (via <code>read_html()</code>):</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">pg &lt;-<span class="st"> </span><span class="kw">read_html</span>(<span class="st">"http://rud.is/test/untidy.html"</span>)
<span class="kw">cat</span>(<span class="kw">toString</span>(pg))
## &lt;?xml version="1.0" standalone="yes"?&gt;
## &lt;!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd"&gt;
## &lt;html&gt;&lt;head&gt;&lt;style&gt;&lt;![CDATA[
## body { font-family: sans-serif; }
## ]]&gt;&lt;/style&gt;&lt;/head&gt;&lt;body&gt;
## &lt;b&gt;This is &lt;b&gt;some &lt;i&gt;really &lt;/i&gt; poorly formatted HTML&lt;/b&gt;
##
## as is this &lt;span id="sp"&gt;portion&lt;div/&gt;&lt;/span&gt;&lt;/b&gt;&lt;/body&gt;&lt;/html&gt;</code></pre></div>
<p>It can also deal with &ldquo;raw&rdquo; and parsed objects:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw"><a href="reference/tidy_html.response.html">tidy_html</a></span>(<span class="kw">content</span>(res, <span class="dt">as=</span><span class="st">"raw"</span>))
## [1] 3c 21 44 4f 43 54 59 50 45 20 68 74 6d 6c 3e 0a 3c 68 74 6d 6c 20 78 6d 6c 6e 73 3d 22 68 74 74 70 3a 2f 2f 77 77
## [39] 77 2e 77 33 2e 6f 72 67 2f 31 39 39 39 2f 78 68 74 6d 6c 22 3e 0a 3c 68 65 61 64 3e 0a 3c 6d 65 74 61 20 6e 61 6d
## [77] 65 3d 22 67 65 6e 65 72 61 74 6f 72 22 20 63 6f 6e 74 65 6e 74 3d 0a 22 48 54 4d 4c 20 54 69 64 79 20 66 6f 72 20
## [115] 48 54 4d 4c 35 20 66 6f 72 20 52 20 76 65 72 73 69 6f 6e 20 35 2e 30 2e 30 22 20 2f 3e 0a 3c 74 69 74 6c 65 3e 3c
## [153] 2f 74 69 74 6c 65 3e 0a 3c 2f 68 65 61 64 3e 0a 3c 62 6f 64 79 3e 0a 3c 2f 62 6f 64 79 3e 0a 3c 2f 68 74 6d 6c 3e
## [191] 0a
<span class="kw"><a href="reference/tidy_html.response.html">tidy_html</a></span>(<span class="kw">content</span>(res, <span class="dt">as=</span><span class="st">"text"</span>, <span class="dt">encoding=</span><span class="st">"UTF-8"</span>))
## [1] "&lt;!DOCTYPE html&gt;\n&lt;html xmlns=\"http://www.w3.org/1999/xhtml\"&gt;\n&lt;head&gt;\n&lt;meta name=\"generator\" content=\n\"HTML Tidy for HTML5 for R version 5.0.0\" /&gt;\n&lt;style&gt;\n&lt;![CDATA[\nbody { font-family: sans-serif; }\n]]&gt;\n&lt;/style&gt;\n&lt;title&gt;&lt;/title&gt;\n&lt;/head&gt;\n&lt;body&gt;\n&lt;b&gt;This is some &lt;i&gt;really&lt;/i&gt; poorly formatted HTML as is this\n&lt;span id=\"sp\"&gt;portion&lt;/span&gt;&lt;/b&gt;\n&lt;div&gt;&lt;span id=\"sp\"&gt;&lt;/span&gt;&lt;/div&gt;\n&lt;/body&gt;\n&lt;/html&gt;\n"
<span class="kw"><a href="reference/tidy_html.response.html">tidy_html</a></span>(<span class="kw">content</span>(res, <span class="dt">as=</span><span class="st">"parsed"</span>, <span class="dt">encoding=</span><span class="st">"UTF-8"</span>))
## {xml_document}
## &lt;html xmlns="http://www.w3.org/1999/xhtml"&gt;
## [1] &lt;head&gt;\n &lt;meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /&gt;\n &lt;meta name="generator" content ...
## [2] &lt;body&gt;\n&lt;b&gt;This is some &lt;i&gt;really&lt;/i&gt; poorly formatted HTML as is this\n&lt;span id="sp"&gt;portion&lt;/span&gt;&lt;/b&gt;\n&lt;/body&gt;
<span class="kw"><a href="reference/tidy_html.response.html">tidy_html</a></span>(<span class="kw">htmlParse</span>(<span class="st">"http://rud.is/test/untidy.html"</span>))
## &lt;!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"&gt;
## &lt;html xmlns="http://www.w3.org/1999/xhtml"&gt;
## &lt;head&gt;
## &lt;meta name="generator" content="HTML Tidy for HTML5 for R version 5.0.0"&gt;
## &lt;style&gt;
## &lt;![CDATA[
## body { font-family: sans-serif; }
## ]]&gt;
## &lt;/style&gt;
## &lt;title&gt;&lt;/title&gt;
## &lt;/head&gt;
## &lt;body&gt;
## &lt;b&gt;This is some &lt;i&gt;really&lt;/i&gt; poorly formatted HTML as is this
## &lt;span id="sp"&gt;portion&lt;/span&gt;&lt;/b&gt;
## &lt;div&gt;&lt;span id="sp"&gt;&lt;/span&gt;&lt;/div&gt;
## &lt;/body&gt;
## &lt;/html&gt;
## </code></pre></div>
<p>And, show the markup errors:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"><span class="kw">invisible</span>(<span class="kw"><a href="reference/tidy_html.response.html">tidy_html</a></span>(<span class="kw">url</span>(<span class="st">"http://rud.is/test/untidy.html"</span>), <span class="dt">verbose=</span><span class="ot">TRUE</span>))
## line 1 column 1 - Warning: missing &lt;!DOCTYPE&gt; declaration
## line 1 column 68 - Warning: nested emphasis &lt;b&gt;
## line 1 column 138 - Warning: missing &lt;/span&gt; before &lt;div&gt;
## line 1 column 68 - Warning: missing &lt;/b&gt; before &lt;div&gt;
## line 1 column 164 - Warning: inserting implicit &lt;span&gt;
## line 1 column 164 - Warning: missing &lt;/span&gt;
## line 1 column 159 - Warning: missing &lt;/div&gt;
## line 1 column 1 - Warning: inserting missing 'title' element
## line 1 column 164 - Warning: &lt;span&gt; anchor "sp" already defined
## Info: Document content looks like XHTML5
## Tidy found 9 warnings and 0 errors!</code></pre></div>
<h3 id="testing-options">Testing Options</h3>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">
opts &lt;-<span class="st"> </span><span class="kw">list</span>(<span class="dt">TidyDocType=</span><span class="st">"html5"</span>,
<span class="dt">TidyMakeClean=</span><span class="ot">TRUE</span>,
<span class="dt">TidyHideComments=</span><span class="ot">TRUE</span>,
<span class="dt">TidyIndentContent=</span><span class="ot">FALSE</span>,
<span class="dt">TidyWrapLen=</span><span class="dv">200</span>)
txt &lt;-<span class="st"> "&lt;html&gt;</span>
<span class="st">&lt;head&gt;</span>
<span class="st"> &lt;style&gt;</span>
<span class="st"> p { color: red; }</span>
<span class="st"> &lt;/style&gt;</span>
<span class="st"> &lt;body&gt;</span>
<span class="st"> &lt;!-- ===== body ====== --&gt;</span>
<span class="st"> &lt;p&gt;Test&lt;/p&gt;</span>
<span class="st"> &lt;/body&gt;</span>
<span class="st"> &lt;!--Default Zone</span>
<span class="st"> --&gt;</span>
<span class="st"> &lt;!--Default Zone End--&gt;</span>
<span class="st">&lt;/html&gt;"</span>
<span class="kw">cat</span>(<span class="kw"><a href="reference/tidy_html.response.html">tidy_html</a></span>(txt, <span class="dt">option=</span>opts))
## &lt;!DOCTYPE html&gt;
## &lt;html&gt;
## &lt;head&gt;
## &lt;meta name="generator" content="HTML Tidy for HTML5 for R version 5.0.0"&gt;
## &lt;style&gt;
## p { color: red; }
## &lt;/style&gt;
## &lt;title&gt;&lt;/title&gt;
## &lt;/head&gt;
## &lt;body&gt;
## &lt;p&gt;Test&lt;/p&gt;
## &lt;/body&gt;
## &lt;/html&gt;</code></pre></div>
<p>But, you&rsquo;re probably better off running it on plain HTML source.</p>
<p>Since it&rsquo;s C/C++-backed, it&rsquo;s pretty fast:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">book &lt;-<span class="st"> </span><span class="kw">readLines</span>(<span class="st">"http://singlepageappbook.com/single-page.html"</span>)
<span class="kw">sum</span>(<span class="kw">map_int</span>(book, nchar))
## [1] 207501
<span class="kw">system.time</span>(tidy_book &lt;-<span class="st"> </span><span class="kw"><a href="reference/tidy_html.response.html">tidy_html</a></span>(book))
## user system elapsed
## 0.021 0.001 0.022</code></pre></div>
<p>(It&rsquo;s usually between 20 &amp; 25 milliseconds to process those 202 kilobytes of HTML.) Not too shabby.</p>
<h3 id="code-of-conduct">Code of Conduct</h3>
<p>Please note that this project is released with a <a href="CONDUCT.md">Contributor Code of Conduct</a>. By participating in this project you agree to abide by its terms.</p>
</div>
</div>
<footer>
<p>Built by <a href="http://hadley.github.io/pkgdown/">pkgdown</a>. Styled with <a href="http://getbootstrap.com">Bootstrap 3</a>.</p>
</footer>
</div>
</body>
</html>

121
docs/news/index.html

@@ -0,0 +1,121 @@
<!-- Generated by pkgdown: do not edit by hand -->
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>All news. htmltidy</title>
<!-- jquery -->
<script src="https://code.jquery.com/jquery-3.1.0.min.js" integrity="sha384-nrOSfDHtoPMzJHjVTdCopGqIqeYETSXhZDFyniQ8ZHcVy08QesyHcnOUpMpqnmWq" crossorigin="anonymous"></script>
<!-- Bootstrap -->
<link href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-BVYiiSIFeK1dGmJRAkycuHAHRg32OmUcww7on3RYdg4Va+PmSTsz/K68vbdEjh4u" crossorigin="anonymous">
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script>
<!-- Font Awesome icons -->
<link href="https://maxcdn.bootstrapcdn.com/font-awesome/4.6.3/css/font-awesome.min.css" rel="stylesheet" integrity="sha384-T8Gy5hrqNKT+hzMclPo118YTQO6cYprQmhrYwIiQ/3axmI1hQomh7Ud2hPOy8SP1" crossorigin="anonymous">
<!-- pkgdown -->
<link href="../pkgdown.css" rel="stylesheet">
<script src="../pkgdown.js"></script>
<!-- mathjax -->
<script src='https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML'></script>
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
</head>
<body>
<div class="container">
<header>
<div class="navbar navbar-default navbar-fixed-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar">
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="../index.html">htmltidy</a>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav">
<li>
<a href="../index.html">Home</a>
</li>
<li>
<a href="../reference/index.html">Reference</a>
</li>
<li>
<a href="../news/index.html">News</a>
</li>
</ul>
<ul class="nav navbar-nav navbar-right">
<li>
<a href="https://github.com/hrbrmstr/htmltidy">
<span class="fa fa-github fa-lg"></span>
</a>
</li>
</ul>
</div><!--/.nav-collapse -->
</div><!--/.container -->
</div><!--/.navbar -->
</header>
<div class="row">
<div class="col-md-9">
<div id="htmltidy-0.3.0" class="section level1">
<h1>htmltidy 0.3.0</h1>
<ul><li>Better error handling (fixed crashing bug in #1)</li>
<li>New option to display document errors</li>
<li>Support for directly tidying httr::response objects</li>
<li>Added XML/HTML viewer &amp; XPath query widgets</li>
</ul></div>
<div id="htmltidy-0.2.0" class="section level1">
<h1>htmltidy 0.2.0</h1>
<ul><li>Bundled tidy-html5 library with the package</li>
<li>Windows compatibility</li>
<li>Options handling</li>
<li>Enabled generics</li>
<li>Modified tests</li>
</ul></div>
<div id="htmltidy-0.1.0" class="section level1">
<h1>htmltidy 0.1.0</h1>
<ul><li>Added a <code>NEWS.md</code> file to track changes to the package.</li>
<li>Added Debian &amp; Ubuntu compatibility</li>
<li>Added basic error checking</li>
<li>Added basic test harness</li>
</ul></div>
</div>
<div class="col-md-3 hidden-xs">
<div id="tocnav">
<h2>Contents</h2>
<ul class="nav nav-pills nav-stacked">
<li><a href="#htmltidy-0.3.0">0.3.0</a></li>
<li><a href="#htmltidy-0.2.0">0.2.0</a></li>
<li><a href="#htmltidy-0.1.0">0.1.0</a></li>
</ul>
</div>
</div>
</div>
<footer>
<p>Built by <a href="http://hadley.github.io/pkgdown/">pkgdown</a>. Styled with <a href="http://getbootstrap.com">Bootstrap 3</a>.</p>
</footer>
</div>
</body>
</html>

59
docs/pkgdown.css

@@ -0,0 +1,59 @@
body {
position: relative;
}
.icon img {
float: right;
border: 1px solid #ccc;
}
.index .internal {display: none;}
ul.index li {margin-bottom: 0.5em; clear: both;}
footer {
margin-top: 45px;
padding: 35px 0 36px;
border-top: 1px solid #e5e5e5;
}
footer p {
margin-bottom: 0;
color: #555;
}
/* Fixes for fixed navbar --------------------------*/
body {
position: relative;
padding-top: 60px;
}
.section h1, .section h2, .section h3, .section h4 {
padding-top: 60px;
margin-top: -60px;
}
/* Table of contents --------------------------*/
#tocnav h2 {
margin-top: 0;
font-size: 1.5em;
}
/* Syntax highlighting ---------------------------------------------------- */
.fl,.number {color:rgb(21,20,181);}
.fu,.functioncall {color:#264D66 ;}
.ch,.st,.string {color:#375D81 ;}
.kw,.keyword {font-weight:bolder ;color:black;}
.argument {color:#264D66 ;}
.co,.comment {color: #333;}
.formalargs {color: #264D66;}
.eqformalargs {color:#264D66;}
.slot {font-style:italic;}
.symbol {color:black ;}
.prompt {color:black ;}
pre img {
background-color: #fff;
display: block;
}

13
docs/pkgdown.js

@@ -0,0 +1,13 @@
$(function() {
$('#tocnav').affix({
offset: {
top: $('#tocnav').offset().top - 80
}
});
$('body').scrollspy({
target: '#tocnav',
offset: 80
});
});

156
docs/reference/highlight_styles.html

@@ -0,0 +1,156 @@
<!-- Generated by pkgdown: do not edit by hand -->
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>highlight_styles. htmltidy</title>
<!-- jquery -->
<script src="https://code.jquery.com/jquery-3.1.0.min.js" integrity="sha384-nrOSfDHtoPMzJHjVTdCopGqIqeYETSXhZDFyniQ8ZHcVy08QesyHcnOUpMpqnmWq" crossorigin="anonymous"></script>
<!-- Bootstrap -->
<link href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-BVYiiSIFeK1dGmJRAkycuHAHRg32OmUcww7on3RYdg4Va+PmSTsz/K68vbdEjh4u" crossorigin="anonymous">
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script>
<!-- Font Awesome icons -->
<link href="https://maxcdn.bootstrapcdn.com/font-awesome/4.6.3/css/font-awesome.min.css" rel="stylesheet" integrity="sha384-T8Gy5hrqNKT+hzMclPo118YTQO6cYprQmhrYwIiQ/3axmI1hQomh7Ud2hPOy8SP1" crossorigin="anonymous">
<!-- pkgdown -->
<link href="../pkgdown.css" rel="stylesheet">
<script src="../pkgdown.js"></script>
<!-- mathjax -->
<script src='https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML'></script>
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
</head>
<body>
<div class="container">
<header>
<div class="navbar navbar-default navbar-fixed-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar">
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="../index.html">htmltidy</a>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav">
<li>
<a href="../index.html">Home</a>
</li>
<li>
<a href="../reference/index.html">Reference</a>
</li>
<li>
<a href="../news/index.html">News</a>
</li>
</ul>
<ul class="nav navbar-nav navbar-right">
<li>
<a href="https://github.com/hrbrmstr/htmltidy">
<span class="fa fa-github fa-lg"></span>
</a>
</li>
</ul>
</div><!--/.nav-collapse -->
</div><!--/.container -->
</div><!--/.navbar -->
</header>
<div class="page-header">
<h1>List available HTML/XML highlight styles</h1>
</div>
<div class="row">
<div class="col-md-9">
<p>Returns a character vector of available style sheets to use when displaying
an XML document.</p>
<pre><span class='fu'>highlight_styles</span>()</pre>
<div class="References">
<h2>References</h2>
<p>See <a href = 'https://highlightjs.org/static/demo/'>https://highlightjs.org/static/demo/</a> for a demo of all
highlight.js styles</p>
</div>
<h2 id="examples">Examples</h2>
<pre class="examples"><div class='input'><span class='fu'>highlight_styles</span>()</div><div class='output co'>#&gt; [1] &quot;agate&quot; &quot;androidstudio&quot;
#&gt; [3] &quot;arta&quot; &quot;ascetic&quot;
#&gt; [5] &quot;atelier-cave-dark&quot; &quot;atelier-cave-light&quot;
#&gt; [7] &quot;atelier-cave.dark&quot; &quot;atelier-cave.light&quot;
#&gt; [9] &quot;atelier-dune-dark&quot; &quot;atelier-dune-light&quot;
#&gt; [11] &quot;atelier-dune.dark&quot; &quot;atelier-dune.light&quot;
#&gt; [13] &quot;atelier-estuary-dark&quot; &quot;atelier-estuary-light&quot;
#&gt; [15] &quot;atelier-estuary.dark&quot; &quot;atelier-estuary.light&quot;
#&gt; [17] &quot;atelier-forest-dark&quot; &quot;atelier-forest-light&quot;
#&gt; [19] &quot;atelier-forest.dark&quot; &quot;atelier-forest.light&quot;
#&gt; [21] &quot;atelier-heath-dark&quot; &quot;atelier-heath-light&quot;
#&gt; [23] &quot;atelier-heath.dark&quot; &quot;atelier-heath.light&quot;
#&gt; [25] &quot;atelier-lakeside-dark&quot; &quot;atelier-lakeside-light&quot;
#&gt; [27] &quot;atelier-lakeside.dark&quot; &quot;atelier-lakeside.light&quot;
#&gt; [29] &quot;atelier-plateau-dark&quot; &quot;atelier-plateau-light&quot;
#&gt; [31] &quot;atelier-plateau.dark&quot; &quot;atelier-plateau.light&quot;
#&gt; [33] &quot;atelier-savanna-dark&quot; &quot;atelier-savanna-light&quot;
#&gt; [35] &quot;atelier-savanna.dark&quot; &quot;atelier-savanna.light&quot;
#&gt; [37] &quot;atelier-seaside-dark&quot; &quot;atelier-seaside-light&quot;
#&gt; [39] &quot;atelier-seaside.dark&quot; &quot;atelier-seaside.light&quot;
#&gt; [41] &quot;atelier-sulphurpool-dark&quot; &quot;atelier-sulphurpool-light&quot;
#&gt; [43] &quot;atelier-sulphurpool.dark&quot; &quot;atelier-sulphurpool.light&quot;
#&gt; [45] &quot;brown_paper&quot; &quot;brown-paper&quot;
#&gt; [47] &quot;codepen-embed&quot; &quot;color-brewer&quot;
#&gt; [49] &quot;dark&quot; &quot;darkula&quot;
#&gt; [51] &quot;default&quot; &quot;docco&quot;
#&gt; [53] &quot;far&quot; &quot;foundation&quot;
#&gt; [55] &quot;github-gist&quot; &quot;github&quot;
#&gt; [57] &quot;googlecode&quot; &quot;grayscale&quot;
#&gt; [59] &quot;hopscotch&quot; &quot;hybrid&quot;
#&gt; [61] &quot;idea&quot; &quot;ir_black&quot;
#&gt; [63] &quot;ir-black&quot; &quot;kimbie.dark&quot;
#&gt; [65] &quot;kimbie.light&quot; &quot;magula&quot;
#&gt; [67] &quot;mono-blue&quot; &quot;monokai_sublime&quot;
#&gt; [69] &quot;monokai-sublime&quot; &quot;monokai&quot;
#&gt; [71] &quot;obsidian&quot; &quot;paraiso-dark&quot;
#&gt; [73] &quot;paraiso-light&quot; &quot;paraiso.dark&quot;
#&gt; [75] &quot;paraiso.light&quot; &quot;pojoaque&quot;
#&gt; [77] &quot;railscasts&quot; &quot;rainbow&quot;
#&gt; [79] &quot;school_book&quot; &quot;school-book&quot;
#&gt; [81] &quot;solarized_dark&quot; &quot;solarized_light&quot;
#&gt; [83] &quot;solarized-dark&quot; &quot;solarized-light&quot;
#&gt; [85] &quot;sunburst&quot; &quot;tomorrow-night-blue&quot;
#&gt; [87] &quot;tomorrow-night-bright&quot; &quot;tomorrow-night-eighties&quot;
#&gt; [89] &quot;tomorrow-night&quot; &quot;tomorrow&quot;
#&gt; [91] &quot;vs&quot; &quot;xcode&quot;
#&gt; [93] &quot;zenburn&quot;
#&gt; </div></pre>
</div>
<div class="col-md-3">
</div>
</div>
<footer>
<p>Built by <a href="http://hadley.github.io/pkgdown/">pkgdown</a>. Styled with <a href="http://getbootstrap.com">Bootstrap 3</a>.</p>
</footer>
</div>
</body>
</html>

115
docs/reference/htmltidy.html

@@ -0,0 +1,115 @@
<!-- Generated by pkgdown: do not edit by hand -->
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>htmltidy. htmltidy</title>
<!-- jquery -->
<script src="https://code.jquery.com/jquery-3.1.0.min.js" integrity="sha384-nrOSfDHtoPMzJHjVTdCopGqIqeYETSXhZDFyniQ8ZHcVy08QesyHcnOUpMpqnmWq" crossorigin="anonymous"></script>
<!-- Bootstrap -->
<link href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-BVYiiSIFeK1dGmJRAkycuHAHRg32OmUcww7on3RYdg4Va+PmSTsz/K68vbdEjh4u" crossorigin="anonymous">
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script>
<!-- Font Awesome icons -->
<link href="https://maxcdn.bootstrapcdn.com/font-awesome/4.6.3/css/font-awesome.min.css" rel="stylesheet" integrity="sha384-T8Gy5hrqNKT+hzMclPo118YTQO6cYprQmhrYwIiQ/3axmI1hQomh7Ud2hPOy8SP1" crossorigin="anonymous">
<!-- pkgdown -->
<link href="../pkgdown.css" rel="stylesheet">
<script src="../pkgdown.js"></script>
<!-- mathjax -->
<script src='https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML'></script>
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
</head>
<body>
<div class="container">
<header>
<div class="navbar navbar-default navbar-fixed-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar">
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="../index.html">htmltidy</a>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav">
<li>
<a href="../index.html">Home</a>
</li>
<li>
<a href="../reference/index.html">Reference</a>
</li>
<li>
<a href="../news/index.html">News</a>
</li>
</ul>
<ul class="nav navbar-nav navbar-right">
<li>
<a href="https://github.com/hrbrmstr/htmltidy">
<span class="fa fa-github fa-lg"></span>
</a>
</li>
</ul>
</div><!--/.nav-collapse -->
</div><!--/.container -->
</div><!--/.navbar -->
</header>
<div class="page-header">
<h1>Tidy Up and Test XPath Queries on HTML and XML Content</h1>
</div>
<div class="row">
<div class="col-md-9">
<p>HTML documents can be beautiful and pristine. They can also be
wretched, evil, malformed demon-spawn. Now, you can tidy up that HTML and XHTML
before processing it with your favorite angle-bracket crunching tools, going beyond
the limited tidying that &#39;libxml2&#39; affords in the &#39;XML&#39; and &#39;xml2&#39; packages and
taming even the ugliest HTML code generated by the likes of Google Docs and Microsoft
Word. It&#39;s also possible to use the functions provided to format or &quot;pretty print&quot;
HTML content as it is being tidied. Utilities are also included that make it
possible to view formatted and &quot;pretty printed&quot; HTML/XML
content from HTML/XML document objects, nodes, node sets and plain character HTML/XML
using &#39;vkbeautify&#39; (by Vadim Kiryukhin) and &#39;highlight.js&#39; (by Ivan Sagalaev).
The viewers can optionally filter nodes via XPath, and an XML document can be shown
in a &quot;tree&quot; view using &#39;xml-viewer&#39; (by Julian Gruber). See
<a href = 'https://github.com/vkiryukhin/vkBeautify'>https://github.com/vkiryukhin/vkBeautify</a> and
<a href = 'https://github.com/juliangruber/xml-viewer'>https://github.com/juliangruber/xml-viewer</a> for more information about &#39;vkbeautify&#39;
and &#39;xml-viewer&#39;, respectively.</p>
</div>
<div class="col-md-3">
<h2>Author</h2>
Bob Rudis (bob@rud.is)
</div>
</div>
<footer>
<p>Built by <a href="http://hadley.github.io/pkgdown/">pkgdown</a>. Styled with <a href="http://getbootstrap.com">Bootstrap 3</a>.</p>
</footer>
</div>
</body>
</html>

114
docs/reference/index.html

@ -0,0 +1,114 @@
<!-- Generated by pkgdown: do not edit by hand -->
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Function reference. htmltidy</title>
<!-- jquery -->
<script src="https://code.jquery.com/jquery-3.1.0.min.js" integrity="sha384-nrOSfDHtoPMzJHjVTdCopGqIqeYETSXhZDFyniQ8ZHcVy08QesyHcnOUpMpqnmWq" crossorigin="anonymous"></script>
<!-- Bootstrap -->
<link href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-BVYiiSIFeK1dGmJRAkycuHAHRg32OmUcww7on3RYdg4Va+PmSTsz/K68vbdEjh4u" crossorigin="anonymous">
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script>
<!-- Font Awesome icons -->
<link href="https://maxcdn.bootstrapcdn.com/font-awesome/4.6.3/css/font-awesome.min.css" rel="stylesheet" integrity="sha384-T8Gy5hrqNKT+hzMclPo118YTQO6cYprQmhrYwIiQ/3axmI1hQomh7Ud2hPOy8SP1" crossorigin="anonymous">
<!-- pkgdown -->
<link href="../pkgdown.css" rel="stylesheet">
<script src="../pkgdown.js"></script>
<!-- mathjax -->
<script src='https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML'></script>
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
</head>
<body>
<div class="container">
<header>
<div class="navbar navbar-default navbar-fixed-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar">
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="../index.html">htmltidy</a>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav">
<li>
<a href="../index.html">Home</a>
</li>
<li>
<a href="../reference/index.html">Reference</a>
</li>
<li>
<a href="../news/index.html">News</a>
</li>
</ul>
<ul class="nav navbar-nav navbar-right">
<li>
<a href="https://github.com/hrbrmstr/htmltidy">
<span class="fa fa-github fa-lg"></span>
</a>
</li>
</ul>
</div><!--/.nav-collapse -->
</div><!--/.container -->
</div><!--/.navbar -->
</header>
<div class="page-header">
<h1>Function reference <small>version&nbsp;0.3.0</small></h1>
</div>
<div class="row">
<div class="col-md-9">
<div class="section ">
<h2>All functions</h2>
<p class="section-desc"></p>
<dl class="dl-horizontal">
<dt><a href="highlight_styles.html">highlight_styles</a></dt>
<dd>List available HTML/XML highlight styles</dd>
<dt><a href="htmltidy.html">htmltidy</a></dt><dt><a href="htmltidy.html">htmltidy-package</a></dt>
<dd>Tidy Up and Test XPath Queries on HTML and XML Content</dd>
<dt><a href="renderXmlview.html">renderXmlview</a></dt>
<dd>Widget render function for use in Shiny</dd>
<dt><a href="tidy_html.html">tidy_html</a></dt><dt><a href="tidy_html.html">tidy_html.HTMLInternalDocument</a></dt><dt><a href="tidy_html.html">tidy_html.character</a></dt><dt><a href="tidy_html.html">tidy_html.connection</a></dt><dt><a href="tidy_html.html">tidy_html.default</a></dt><dt><a href="tidy_html.html">tidy_html.raw</a></dt><dt><a href="tidy_html.html">tidy_html.response</a></dt><dt><a href="tidy_html.html">tidy_html.xml_document</a></dt>
<dd>Tidy or &quot;Pretty Print&quot; HTML/XHTML Documents</dd>
<dt><a href="xml_tree_view.html">html_tree_view</a></dt><dt><a href="xml_tree_view.html">xml_tree_view</a></dt>
<dd>HTML/XML tree viewer</dd>
<dt><a href="xml_view.html">html_view</a></dt><dt><a href="xml_view.html">xml_view</a></dt>
<dd>HTML/XML pretty printer and viewer</dd>
<dt><a href="xmltreeview-shiny.html">renderXmltreeview</a></dt><dt><a href="xmltreeview-shiny.html">xmltreeview-shiny</a></dt><dt><a href="xmltreeview-shiny.html">xmltreeviewOutput</a></dt>
<dd>Shiny bindings for xmltreeview</dd>
<dt><a href="xmlviewOutput.html">xmlviewOutput</a></dt>
<dd>Widget output function for use in Shiny</dd>
</dl>
</div>
</div>
</div>
<footer>
<p>Built by <a href="http://hadley.github.io/pkgdown/">pkgdown</a>. Styled with <a href="http://getbootstrap.com">Bootstrap 3</a>.</p>
</footer>
</div>
</body>
</html>

108
docs/reference/renderXmlview.html

@ -0,0 +1,108 @@
<!-- Generated by pkgdown: do not edit by hand -->
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>renderXmlview. htmltidy</title>
<!-- jquery -->
<script src="https://code.jquery.com/jquery-3.1.0.min.js" integrity="sha384-nrOSfDHtoPMzJHjVTdCopGqIqeYETSXhZDFyniQ8ZHcVy08QesyHcnOUpMpqnmWq" crossorigin="anonymous"></script>
<!-- Bootstrap -->
<link href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-BVYiiSIFeK1dGmJRAkycuHAHRg32OmUcww7on3RYdg4Va+PmSTsz/K68vbdEjh4u" crossorigin="anonymous">
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script>
<!-- Font Awesome icons -->
<link href="https://maxcdn.bootstrapcdn.com/font-awesome/4.6.3/css/font-awesome.min.css" rel="stylesheet" integrity="sha384-T8Gy5hrqNKT+hzMclPo118YTQO6cYprQmhrYwIiQ/3axmI1hQomh7Ud2hPOy8SP1" crossorigin="anonymous">
<!-- pkgdown -->
<link href="../pkgdown.css" rel="stylesheet">
<script src="../pkgdown.js"></script>
<!-- mathjax -->
<script src='https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML'></script>
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
</head>
<body>
<div class="container">
<header>
<div class="navbar navbar-default navbar-fixed-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar">
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="../index.html">htmltidy</a>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav">
<li>
<a href="../index.html">Home</a>
</li>
<li>
<a href="../reference/index.html">Reference</a>
</li>
<li>
<a href="../news/index.html">News</a>
</li>
</ul>
<ul class="nav navbar-nav navbar-right">
<li>
<a href="https://github.com/hrbrmstr/htmltidy">
<span class="fa fa-github fa-lg"></span>
</a>
</li>
</ul>
</div><!--/.nav-collapse -->
</div><!--/.container -->
</div><!--/.navbar -->
</header>
<div class="page-header">
<h1>Widget render function for use in Shiny</h1>
</div>
<div class="row">
<div class="col-md-9">
<p>Widget render function for use in Shiny</p>
<pre><span class='fu'>renderXmlview</span>(<span class='no'>expr</span>, <span class='kw'>env</span> <span class='kw'>=</span> <span class='fu'>parent.frame</span>(), <span class='kw'>quoted</span> <span class='kw'>=</span> <span class='fl'>FALSE</span>)</pre>
<h2>Arguments</h2>
<dl class="dl-horizontal">
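<dt>expr</dt>
<dd>an expression that generates an <code>xmlview</code> widget</dd>
<dt>env</dt>
<dd>the environment in which to evaluate <code>expr</code></dd>
<dt>quoted</dt>
<dd>is <code>expr</code> a quoted expression (with <code>quote()</code>)? (default: <code>FALSE</code>)</dd>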
</dl>
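<p>A minimal sketch of how this render function might be paired with <code>xmlviewOutput()</code> in a Shiny app. The output id <code>"viewer"</code> and the sample XML string are made up for illustration, and the sketch assumes <code>xml_view()</code> accepts plain character XML, as the package description states.</p>

```r
library(shiny)
library(htmltidy)

# Hypothetical sample document, used only for illustration
sample_xml <- "<catalog><book id='b1'><title>Example</title></book></catalog>"

ui <- fluidPage(
  # Placeholder that the widget is rendered into; "viewer" is an arbitrary id
  xmlviewOutput("viewer")
)

server <- function(input, output, session) {
  # The expression passed to renderXmlview() should return an xml_view() widget
  output$viewer <- renderXmlview({
    xml_view(sample_xml)
  })
}

app <- shinyApp(ui, server)

# Launch only in an interactive session:
# if (interactive()) runApp(app)
```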
</div>
<div class="col-md-3">
</div>
</div>
<footer>
<p>Built by <a href="http://hadley.github.io/pkgdown/">pkgdown</a>. Styled with <a href="http://getbootstrap.com">Bootstrap 3</a>.</p>
</footer>
</div>
</body>
</html>

284
docs/reference/tidy_html.html

@ -0,0 +1,284 @@
<!-- Generated by pkgdown: do not edit by hand -->
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>tidy_html.response. htmltidy</title>
<!-- jquery -->
<script src="https://code.jquery.com/jquery-3.1.0.min.js" integrity="sha384-nrOSfDHtoPMzJHjVTdCopGqIqeYETSXhZDFyniQ8ZHcVy08QesyHcnOUpMpqnmWq" crossorigin="anonymous"></script>
<!-- Bootstrap -->
<link href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-BVYiiSIFeK1dGmJRAkycuHAHRg32OmUcww7on3RYdg4Va+PmSTsz/K68vbdEjh4u" crossorigin="anonymous">
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha384-Tc5IQib027qvyjSMfHjOMaLkfuWVxZxUPnCJA7l2mCWNIpG9mGCD8wGNIcPD7Txa" crossorigin="anonymous"></script>
<!-- Font Awesome icons -->
<link href="https://maxcdn.bootstrapcdn.com/font-awesome/4.6.3/css/font-awesome.min.css" rel="stylesheet" integrity="sha384-T8Gy5hrqNKT+hzMclPo118YTQO6cYprQmhrYwIiQ/3axmI1hQomh7Ud2hPOy8SP1" crossorigin="anonymous">
<!-- pkgdown -->
<link href="../pkgdown.css" rel="stylesheet">
<script src="../pkgdown.js"></script>
<!-- mathjax -->
<script src='https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML'></script>
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
</head>
<body>
<div class="container">
<header>
<div class="navbar navbar-default navbar-fixed-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar">
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="../index.html">htmltidy</a>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav">
<li>
<a href="../index.html">Home</a>
</li>
<li>
<a href="../reference/index.html">Reference</a>
</li>
<li>
<a href="../news/index.html">News</a>
</li>
</ul>
<ul class="nav navbar-nav navbar-right">
<li>
<a href="https://github.com/hrbrmstr/htmltidy">
<span class="fa fa-github fa-lg"></span>
</a>
</li>
</ul>
</div><!--/.nav-collapse -->
</div><!--/.container -->
</div><!--/.navbar -->
</header>
<div class="page-header">
<h1>Tidy or &quot;Pretty Print&quot; HTML/XHTML Documents</h1>
</div>
<div class="row">
<div class="col-md-9">
<p>Pass in HTML content as plain text, as a raw vector, as a parsed document (from the
<code>XML</code> or <code>xml2</code> packages) or as an <code>httr</code> <code>response</code> object,
along with an options list that specifies how the content should be tidied. The function
returns tidied content of the same object type that was passed in.</p>
<pre><span class='co'># S3 method for response</span>
<span class='fu'>tidy_html</span>(<span class='no'>content</span>, <span class='kw'>options</span> <span class='kw'>=</span> <span class='fu'>list</span>(<span class='kw'>TidyXhtmlOut</span> <span class='kw'>=</span> <span class='fl'>TRUE</span>),
<span class='kw'>verbose</span> <span class='kw'>=</span> <span class='fl'>FALSE</span>)
<span class='fu'>tidy_html</span>(<span class='no'>content</span>, <span class='kw'>options</span> <span class='kw'>=</span> <span class='fu'>list</span>(<span class='kw'>TidyXhtmlOut</span> <span class='kw'>=</span> <span class='fl'>TRUE</span>), <span class='kw'>verbose</span> <span class='kw'>=</span> <span class='fl'>FALSE</span>)
<span class='co'># S3 method for default</span>
<span class='fu'>tidy_html</span>(<span class='no'>content</span>, <span class='kw'>options</span> <span class='kw'>=</span> <span class='fu'>list</span>(<span class='kw'>TidyXhtmlOut</span> <span class='kw'>=</span> <span class='fl'>TRUE</span>),
<span class='kw'>verbose</span> <span class='kw'>=</span> <span class='fl'>FALSE</span>)
<span class='co'># S3 method for character</span>
<span class='fu'>tidy_html</span>(<span class='no'>content</span>, <span class='kw'>options</span> <span class='kw'>=</span> <span class='fu'>list</span>(<span class='kw'>TidyXhtmlOut</span> <span class='kw'>=</span> <span class='fl'>TRUE</span>),
<span class='kw'>verbose</span> <span class='kw'>=</span> <span class='fl'>FALSE</span>)
<span class='co'># S3 method for raw</span>
<span class='fu'>tidy_html</span>(<span class='no'>content</span>, <span class='kw'>options</span> <span class='kw'>=</span> <span class='fu'>list</span>(<span class='kw'>TidyXhtmlOut</span> <span class='kw'>=</span> <span class='fl'>TRUE</span>),
<span class='kw'>verbose</span> <span class='kw'>=</span> <span class='fl'>FALSE</span>)
<span class='co'># S3 method for xml_document</span>
<span class='fu'>tidy_html</span>(<span class='no'>content</span>, <span class='kw'>options</span> <span class='kw'>=</span> <span class='fu'>list</span>(<span class='kw'>TidyXhtmlOut</span> <span class='kw'>=</span> <span class='fl'>TRUE</span>),
<span class='kw'>verbose</span> <span class='kw'>=</span> <span class='fl'>FALSE</span>)
<span class='co'># S3 method for HTMLInternalDocument</span>
<span class='fu'>tidy_html</span>(<span class='no'>content</span>, <span class='kw'>options</span> <span class='kw'>=</span> <span class='fu'>list</span>(<span class='kw'>TidyXhtmlOut</span> <span class='kw'>=</span> <span class='fl'>TRUE</span>),
<span class='kw'>verbose</span> <span class='kw'>=</span> <span class='fl'>FALSE</span>)
<span class='co'># S3 method for connection</span>
<span class='fu'>tidy_html</span>(<span class='no'>content</span>, <span class='kw'>options</span> <span class='kw'>=</span> <span class='fu'>list</span>(<span class='kw'>TidyXhtmlOut</span> <span class='kw'>=</span> <span class='fl'>TRUE</span>),
<span class='kw'>verbose</span> <span class='kw'>=</span> <span class='fl'>FALSE</span>)</pre>
<h2>Arguments</h2>
<dl class="dl-horizontal">
<dt>content</dt>
<dd>accepts a character vector, a raw vector, parsed content from the <code>xml2</code>
or <code>XML</code> packages, an <code>httr</code> <code>response</code> object or a connection.</dd>
<dt>options</dt>
<dd>named list of options</dd>
<dt>verbose</dt>
<dd>output document errors? (default: <code>FALSE</code>)</dd>
</dl>
<div class="Value">
<h2>Value</h2>
<p>Tidied HTML/XHTML content. The object type will be the same as that of the input type
except when it is a <code>connection</code>, then a character vector will be returned.</p>
</div>
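<p>The type-preserving behaviour described above can be checked with a short sketch. The fragment below is invented for illustration, and the code assumes the <code>xml2</code> package is installed.</p>

```r
library(htmltidy)
library(xml2)

# Hypothetical untidy fragment: neither <p> is closed
fragment <- "<p>one<p>two"

# Character in, character out
as_text <- tidy_html(fragment)

# Parsed xml2 document in, parsed xml2 document out
as_doc <- tidy_html(read_html(fragment))

class(as_text)
class(as_doc)
```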
<div class="Details">
<h2>Details</h2>
<p>The default option <code>TidyXhtmlOut</code> will convert the input content to XHTML.</p>
<p>Currently supported options:</p>
<p><ul>
<li>Ones taking a logical value: <code>TidyAltText</code>, <code>TidyBodyOnly</code>, <code>TidyBreakBeforeBR</code>,
<code>TidyCoerceEndTags</code>, <code>TidyDropEmptyElems</code>, <code>TidyDropEmptyParas</code>,
<code>TidyFixBackslash</code>, <code>TidyFixComments</code>, <code>TidyGDocClean</code>, <code>TidyHideComments</code>,
<code>TidyHtmlOut</code>, <code>TidyIndentContent</code>, <code>TidyJoinClasses</code>, <code>TidyJoinStyles</code>,
<code>TidyLogicalEmphasis</code>, <code>TidyMakeBare</code>, <code>TidyMakeClean</code>, <code>TidyMark</code>,
<code>TidyOmitOptionalTags</code>, <code>TidyReplaceColor</code>, <code>TidyUpperCaseAttrs</code>,
<code>TidyUpperCaseTags</code>, <code>TidyWord2000</code>, <code>TidyXhtmlOut</code>
</li>
<li>Ones taking a character value: <code>TidyDoctype</code>, <code>TidyInlineTags</code>, <code>TidyBlockTags</code>,
<code>TidyEmptyTags</code>, <code>TidyPreTags</code>
</li>
<li>Ones taking an integer value: <code>TidyIndentSpaces</code>, <code>TidyTabSize</code>, <code>TidyWrapLen</code>
</li>
</ul></p>
<p>File <a href = 'https://github.com/hrbrmstr/htmltidy/issues'>an issue</a> if there are other <code>libtidy</code>
options you&#39;d like supported.</p>
<p>It is likely that the most used options will be:</p>
<p><ul>
<li><code>TidyXhtmlOut</code> (logical),
</li>
<li><code>TidyHtmlOut</code> (logical) and
</li>
<li><code>TidyDocType</code> which should be one of &quot;<code>omit</code>&quot;,
&quot;<code>html5</code>&quot;, &quot;<code>auto</code>&quot;, &quot;<code>strict</code>&quot; or &quot;<code>loose</code>&quot;.
</li>
</ul></p>
<p>You can clean up Microsoft Word (2000) and Google Docs HTML via logical settings for
<code>TidyWord2000</code> and <code>TidyGDocClean</code>, respectively.</p>
<p>It may also be advantageous to remove all comments with <code>TidyHideComments</code>.</p>
</div>
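<p>As a rough illustration of how logical and integer options from the lists above might be combined, here is a sketch. The input string is made up, and the exact layout of the result depends on the bundled tidy library, so no output is shown.</p>

```r
library(htmltidy)

# Hypothetical sloppy input: a comment, unclosed <p> and <b>, no <head>
untidy <- "<html><body><!-- scratch note --><p>Hello <b>world</body></html>"

opts <- list(
  TidyHtmlOut       = TRUE,  # logical: ask for HTML rather than XHTML output
  TidyHideComments  = TRUE,  # logical: remove comments
  TidyIndentContent = TRUE,  # logical: indent nested content
  TidyIndentSpaces  = 2,     # integer: spaces per indent level
  TidyWrapLen       = 80     # integer: column at which to wrap lines
)

# Supplying `options` replaces the default list(TidyXhtmlOut = TRUE)
tidied <- tidy_html(untidy, options = opts)
cat(tidied)
```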
<div class="Note">
<h2>Note</h2>
<p>If document parsing errors are severe enough, <code>tidy_html()</code> will not be able
to fully clean the document. In that case it displays the errors (this output can be captured with
<code>sink()</code> or <code>capture.output()</code>), issues a warning and returns a &quot;best effort&quot;
cleaned version of the document.</p>
</div>
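<p>One way the displayed errors might be collected with <code>capture.output()</code> is sketched below. The input string is invented, and whether any diagnostics are printed for it depends on the bundled tidy library, so the sketch only inspects how many lines were captured.</p>

```r
library(htmltidy)

# Hypothetical input with mis-nested inline tags
untidy <- "<b>bold <i>nested</b> text"

# capture.output() collects whatever tidy_html() prints while it runs.
# The assignment inside the call still takes effect in the calling
# environment, so `tidied` holds the cleaned document afterwards.
diagnostics <- capture.output(
  tidied <- tidy_html(untidy, verbose = TRUE)
)

length(diagnostics)  # number of captured lines, possibly zero
cat(tidied)
```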
<div class="References">
<h2>References</h2>
<p><a href = 'http://api.html-tidy.org/tidy/quickref_5.1.25.html'>http://api.html-tidy.org/tidy/quickref_5.1.25.html</a> &amp;
<a href = 'https://github.com/htacg/tidy-html5/blob/master/include/tidyenum.h'>https://github.com/htacg/tidy-html5/blob/master/include/tidyenum.h</a>
for definitions of the options supported above and <a href = 'https://www.w3.org/People/Raggett/tidy/'>https://www.w3.org/People/Raggett/tidy/</a>
for an explanation of what &quot;tidy&quot; HTML is and some canonical examples of what it can do.</p>
</div>
<h2 id="examples">Examples</h2>
<pre class="examples"><div class='input'><span class='no'>opts</span> <span class='kw'>&lt;-</span> <span class='fu'>list</span>(
<span class='kw'>TidyDocType</span><span class='kw'>=</span><span class='st'>"html5"</span>,
<span class='kw'>TidyMakeClean</span><span class='kw'>=</span><span class='fl'>TRUE</span>,
<span class='kw'>TidyHideComments</span><span class='kw'>=</span><span class='fl'>TRUE</span>,
<span class='kw'>TidyIndentContent</span><span class='kw'>=</span><span class='fl'>TRUE</span>,
<span class='kw'>TidyWrapLen</span><span class='kw'>=</span><span class='fl'>200</span>
)
<span class='no'>txt</span> <span class='kw'>&lt;-</span> <span class='fu'>paste0</span>(
<span class='fu'>c</span>(<span class='st'>"&lt;html&gt;&lt;head&gt;&lt;style&gt;p { color: red; }&lt;/style&gt;&lt;body&gt;&lt;!-- ===== body ====== --&gt;"</span>,
<span class='st'>"&lt;p&gt;Test&lt;/p&gt;&lt;/body&gt;&lt;!--Default Zone --&gt; &lt;!--Default Zone End--&gt;&lt;/html&gt;"</span>),
<span class='kw'>collapse</span><span class='kw'>=</span><span class='st'>""</span>)
<span class='fu'>cat</span>(<span class='fu'>tidy_html</span>(<span class='no'>txt</span>, <span class='kw'>options</span><span class='kw'>=</span><span class='no'>opts</span>))</div><div class='output co'>#&gt; &lt;!DOCTYPE html&gt;
#&gt; &lt;html&gt;
#&gt; &lt;head&gt;
#&gt; &lt;meta name=&quot;generator&quot; content=&quot;HTML Tidy for HTML5 for R version 5.0.0&quot;&gt;
#&gt; &lt;style&gt;
#&gt; p { color: red; }
#&gt; &lt;/style&gt;
#&gt; &lt;title&gt;&lt;/title&gt;
#&gt; &lt;/head&gt;
#&gt; &lt;body&gt;
#&gt; &lt;p&gt;
#&gt; Test
#&gt; &lt;/p&gt;
#&gt; &lt;/body&gt;
#&gt; &lt;/html&gt;
#&gt; </div><div class='input'>
<span class='fu'>library</span>(<span class='no'>httr</span>)
<span class='no'>res</span> <span class='kw'>&lt;-</span> <span class='fu'>GET</span>(<span class='st'>"http://rud.is/test/untidy.html"</span>)
<span class='co'># look at the original, un-tidy source</span>
<span class='fu'>cat</span>(<span class='fu'>content</span>(<span class='no'>res</span>, <span class='kw'>as</span><span class='kw'>=</span><span class='st'>"text"</span>, <span class='kw'>encoding</span><span class='kw'>=</span><span class='st'>"UTF-8"</span>))</div><div class='output co'>#&gt; &lt;head&gt;
#&gt; &lt;style&gt;
#&gt; body { font-family: sans-serif; }
#&gt; &lt;/style&gt;
#&gt; &lt;/head&gt;
#&gt; &lt;body&gt;
#&gt; &lt;b&gt;This is &lt;b&gt;some &lt;i&gt;really &lt;/i&gt; poorly formatted HTML&lt;/b&gt;
#&gt;
#&gt; as is this &lt;span id=&quot;sp&quot;&gt;portion&lt;div&gt;
#&gt; </div><div class='input'>
<span class='co'># see the tidied version</span>
<span class='fu'>cat</span>(<span class='fu'>tidy_html</span>(<span class='fu'>content</span>(<span class='no'>res</span>, <span class='kw'>as</span><span class='kw'>=</span><span class='st'>"text"</span>, <span class='kw'>encoding</span><span class='kw'>=</span><span class='st'>"UTF-8"</span>),
<span class='fu'>list</span>(<span class='kw'>TidyDocType</span><span class='kw'>=</span><span class='st'>"html5"</span>, <span class='kw'>TidyWrapLen</span><span class='kw'>=</span><span class='fl'>200</span>)))</div><div class='output co'>#&gt; &lt;!DOCTYPE html&gt;
#&gt; &lt;html&gt;
#&gt; &lt;head&gt;
#&gt; &lt;meta name=&quot;generator&quot; content=&quot;HTML Tidy for HTML5 for R version 5.0.0&quot;&gt;
#&gt; &lt;style&gt;
#&gt; body { font-family: sans-serif; }
#&gt; &lt;/style&gt;
#&gt; &lt;title&gt;&lt;/title&gt;
#&gt; &lt;/head&gt;
#&gt; &lt;body&gt;
#&gt; &lt;b&gt;This is some &lt;i&gt;really&lt;/i&gt; poorly formatted HTML as is this &lt;span id=&quot;sp&quot;&gt;portion&lt;/span&gt;&lt;/b&gt;
#&gt; &lt;div&gt;&lt;span id=&quot;sp&quot;&gt;&lt;/span&gt;&lt;/div&gt;
#&gt; &lt;/body&gt;
#&gt; &lt;/html&gt;
#&gt; </div><div class='input'>
<span class='co'># but, you could also just do:</span>
<span class='fu'>cat</span>(<span class='fu'>tidy_html</span>(<span class='fu'>url</span>(<span class='st'>"http://rud.is/test/untidy.html"</span>)))</div><div class='output co'>#&gt; &lt;!DOCTYPE html&gt;
#&gt; &lt;html xmlns=&quot;http://www.w3.org/1999/xhtml&quot;&gt;
#&gt; &lt;head&gt;
#&gt; &lt;meta name=&quot;generator&quot; content=
#&gt; &quot;HTML Tidy for HTML5 for R version 5.0.0&quot; /&gt;
#&gt; &lt;style&gt;
#&gt; &lt;![CDATA[
#&gt; body { font-family: sans-serif; }
#&gt; ]]&gt;
#&gt; &lt;/style&gt;
#&gt; &lt;title&gt;&lt;/title&gt;
#&gt; &lt;/head&gt;
#&gt; &lt;body&gt;
#&gt; &lt;b&gt;This is some &lt;i&gt;really&lt;/i&gt; poorly formatted HTMLas is this
#&gt; &lt;span id=&quot;sp&quot;&gt;portion&lt;/span&gt;&lt;/b&gt;
#&gt; &lt;div&gt;&lt;span id=&quot;sp&quot;&gt;&lt;/span&gt;&lt;/div&gt;