Access and Query Amazon Athena via the Tidyverse
You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

121 lines
2.9 KiB

7 years ago
---
output: rmarkdown::github_document
editor_options:
chunk_output_type: console
7 years ago
---
5 years ago
```{r include=FALSE}
knitr::opts_chunk$set(
echo = TRUE,
message = FALSE,
warning = FALSE,
fig.retina = 2
)
Sys.setenv(
AWS_S3_STAGING_DIR = "s3://aws-athena-query-results-569593279821-us-east-1"
)
6 years ago
5 years ago
options(width=120)
5 years ago
```
7 years ago
5 years ago
[![Travis-CI Build Status](https://travis-ci.org/hrbrmstr/metis-tidy.svg?branch=master)](https://travis-ci.org/hrbrmstr/metis-tidy)
[![Coverage Status](https://codecov.io/gh/hrbrmstr/metis-tidy/branch/master/graph/badge.svg)](https://codecov.io/gh/hrbrmstr/metis-tidy
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/metis.tidy)](https://cran.r-project.org/package=metis.tidy)
5 years ago
# metis.tidy
6 years ago
5 years ago
Access and Query Amazon Athena via the Tidyverse
6 years ago
5 years ago
## Description
6 years ago
5 years ago
Methods are provided to use the 'metis' JDBC/DBI interface via the Tidyverse
(e.g. 'dbplyr'/'dplyr' idioms).
7 years ago
5 years ago
## What's Inside The Tin?
6 years ago
5 years ago
Lightweight helpers to make it easier to `filter` and `mutate` plus type support for Athena `BIGINT` (64-bit integers).
6 years ago
## Installation
7 years ago
```{r eval=FALSE}
5 years ago
devtools::install_git("https://git.sr.ht/~hrbrmstr/metis-tidy")
# OR
5 years ago
devtools::install_gitlab("hrbrmstr/metis-tidy")
# OR
5 years ago
devtools::install_github("hrbrmstr/metis-tidy")
7 years ago
```
6 years ago
## Usage
7 years ago
5 years ago
```{r}
library(metis.tidy)
7 years ago
# current verison
5 years ago
packageVersion("metis.tidy")
7 years ago
```
5 years ago
### Basic Setup (using an alternate provider)
6 years ago
5 years ago
```{r}
library(metis.tidy)
library(tidyverse)
metis::dbConnect(
metis::Athena(),
Schema = "sampledb",
AwsCredentialsProviderClass = "com.simba.athena.amazonaws.auth.PropertiesFileCredentialsProvider",
AwsCredentialsProviderArguments = path.expand("~/.aws/athenaCredentials.props")
) -> con
6 years ago
5 years ago
elb_logs <- tbl(con, "elb_logs")
5 years ago
glimpse(elb_logs)
```
5 years ago
#### Using custom Athena functions
```{r}
5 years ago
filter(elb_logs, elbresponsecode == "200") %>%
mutate(
tsday = as.Date(substring(timestamp, 1L, 10L)),
host = url_extract_host(url),
proto_version = regexp_extract(protocol, "([[:digit:]\\.]+)"),
) %>%
select(tsday, host, receivedbytes, requestprocessingtime, proto_version) %>%
glimpse()
```
5 years ago
#### All the types work. Some are useful.
```{r}
tbl(con, sql("
SELECT
CAST('chr' AS CHAR(4)) achar,
CAST('varchr' AS VARCHAR) avarchr,
CAST(SUBSTR(timestamp, 1, 10) AS DATE) AS tsday,
CAST(100.1 AS DOUBLE) AS justadbl,
CAST(127 AS TINYINT) AS asmallint,
CAST(100 AS INTEGER) AS justanint,
CAST(100000000000000000 AS BIGINT) AS abigint,
CAST(('GET' = 'GET') AS BOOLEAN) AS is_get,
ARRAY[1, 2, 3] AS arr,
ARRAY['1', '2, 3', '4'] AS arr,
MAP(ARRAY['foo', 'bar'], ARRAY[1, 2]) AS mp,
CAST(ROW(1, 2.0) AS ROW(x BIGINT, y DOUBLE)) AS rw,
CAST('{\"a\":1}' AS JSON) js
FROM elb_logs
LIMIT 1
")) %>%
glimpse()
7 years ago
```
6 years ago
5 years ago
```{r cloc}
cloc::cloc_pkg_md()
```
6 years ago
## Code of Conduct
5 years ago
Please note that this project is released with a
[Contributor Code of Conduct](CONDUCT.md). By participating in this project you
agree to abide by its terms.