% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/clean.r
\name{clean_text}
\alias{clean_text}
\title{Remove all tags from a document}
\usage{
clean_text(doc)
}
\arguments{
\item{doc}{atomic character vector (i.e. plain text) or an \code{html_document}}
}
\value{
atomic character vector of cleaned text
}
\description{
Designed to be run on the \code{$content} component of the \code{data.frame} returned
by \code{just_the_facts()}, but it also works on any \code{html_document} or atomic character
vector (which it parses into an \code{html_document}). It returns an atomic character
vector containing only plain text, i.e. with all tags removed.
}
\note{
The XSLT can be a bit aggressive for some URLs, so this function first
tries the XSLT and tests for an empty result. If the result is empty,
it falls back to a plain text conversion using \code{rvest::html_text()}.
}
\examples{
clean_text(system.file("extdata", "raw.html", package="hgr"))
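
# A hedged sketch of the two documented input types. The inline HTML
# snippet is made up for illustration, and xml2 is assumed to be installed.
snippet <- "<html><body><p>Hello <b>world</b></p></body></html>"

# Atomic character vector input (parsed into an html_document internally)
clean_text(snippet)

# Already-parsed html_document input
clean_text(xml2::read_html(snippet))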
}