@ -22,109 +22,106 @@ Let's see what extra goodies `splashr` provides to make our lives easier.
## Handling `splashr` Objects
One of the most powerful functions in `splashr` is `render_har()`. You get every component loaded by a dynamic web page, and some sites have upwards of 100 elements for any given page. How can you get to the bits that you want?
Let's use a different example that's a bit gnarly (i.e. you may need to work through it a couple times).
The U.K. government has an open data portal and one of the sections contains map tiles for various grid quadrants. It's a really nice site, but it's designed for interactive use and we want to be able to get to all the tile files programmatically. For our example, we'll be grabbing data from <http://environment.data.gov.uk/ds/survey/index.jsp#/survey?grid=TQ38>.
Since we don't know what we need, let's use `render_har()` to get everything back into R:
We'll check <https://apple.com/> first since Apple claims to care about our privacy. If that's true, then they'll load little or no third-party content.
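One way to put a number on "third-party" is to tally the hosts behind each request. This is a minimal sketch rather than the vignette's own code: it assumes a Splash instance is reachable at the default local address, and the regular expression used to pull out `hosts` is an illustrative shortcut, not a full URL parser.

```r
library(splashr)
library(purrr)

# Assumption: a Splash server is running locally (the default `splash_obj`)
har <- render_har(url = "https://apple.com/", response_body = TRUE)

# One request URL per HAR entry
urls <- map_chr(har_entries(har), get_request_url)

# Keep only http(s) requests so inline data: URIs do not clutter the tally
urls <- urls[grepl("^https?://", urls)]

# Illustrative host extraction: everything between "://" and the next "/" or ":"
hosts <- sub("^[a-z]+://([^/:]+).*$", "\\1", urls)

# Requests per host, busiest first
sort(table(hosts), decreasing = TRUE)
```

Any host in the tally that does not belong to the site itself is a candidate third-party resource.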
Now, we're getting somewhere. The `har_entries()` function makes it easy to get to the individual elements and we can use the `is_json()` helper with `purrr` functions to slice and dice at will. Here are all the `is_` functions you can use with HAR objects:
So, locale metadata and something to do with on-page links/suggestions.
As demonstrated, the `har_entries()` function makes it easy to get to the individual elements and we used the `is_json()` helper with `purrr` functions to slice and dice the structure at will. Here are all the `is_` functions you can use with HAR objects:
@ -145,60 +142,141 @@ You can also use various `get_` helpers to avoid gnarly `$` or `[]` constructs
- `get_body_size()` --- Retrieve the body size
- `get_content_size()` --- Retrieve the content size
- `get_content_type()` --- Retrieve or test content type of a HAR request object
- `get_headers()` --- Retrieve response headers as a data frame
- `get_headers_size()` --- Retrieve the headers size
- `get_request_type()` --- Retrieve or test request type
- `get_request_url()` --- Retrieve request URL
- `get_response_url()` --- Retrieve response URL
- `get_response_body()` --- Retrieve the body content of a HAR entry
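
As a rough sketch of how several of these helpers combine, the following builds a per-request summary without any `$` or `[]` digging. It assumes a Splash instance at the default local address and that every entry reports a MIME type and a body size; `req_summary` is a name made up for this illustration.

```r
library(splashr)
library(purrr)

# Assumption: a Splash server is running locally (the default `splash_obj`)
har <- render_har(url = "https://apple.com/", response_body = TRUE)
entries <- har_entries(har)

# Hypothetical summary table: one row per HAR entry.
# This may fail if an entry lacks one of these fields.
req_summary <- data.frame(
  url          = map_chr(entries, get_request_url),
  content_type = map_chr(entries, get_content_type),
  body_size    = map_dbl(entries, get_body_size),
  stringsAsFactors = FALSE
)

# Largest responses first
head(req_summary[order(req_summary$body_size, decreasing = TRUE), ])
```
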
We've seen one example of them already; here's another: