`htmltidy`: Clean up gnarly HTML/XML

Inspired by [this SO question](http://stackoverflow.com/questions/37061873/identify-a-weblink-in-bold-in-r) and because there's a great deal of cruddy HTML out there that needs fixing before it can be used properly when scraping data.

NOTE: Requires [`libtidy`](http://www.html-tidy.org/) and is presently super-basic (there is no way to set options, and it pretty much only handles HTML).

The following functions are implemented:

- `tidy` : Clean up gnarly HTML/XML

### Installation

``` r
devtools::install_github("hrbrmstr/htmltidy")
```

### Usage

``` r
library(htmltidy)

# current version
packageVersion("htmltidy")
#> [1] '0.0.0.9000'

cat(tidy("google >
"))
#>
#>
#>
#> "HTML Tidy for HTML5 for Mac OS X version 5.2.0" />
#>
#>
#>
#>

google >

#>
#>
```

### Code of Conduct

Please note that this project is released with a [Contributor Code of Conduct](CONDUCT.md). By participating in this project you agree to abide by its terms.