Access and Query Amazon Athena via DBI/JDBC
You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

157 lines
4.1 KiB

пре 7 година
---
output: rmarkdown::github_document
editor_options:
chunk_output_type: console
пре 7 година
---
пре 6 година
пре 5 година
# metis
пре 7 година
пре 5 година
Access and Query Amazon Athena via DBI/JDBC
пре 7 година
пре 6 година
## Description
пре 5 година
In Greek mythology, Metis was Athena's "helper" so methods are provided to help you accessing and querying Amazon Athena via DBI/JDBC and/or `dplyr`.
#' Methods are provides to connect to 'Amazon' 'Athena', lookup schemas/tables,
пре 7 година
пре 5 година
## IMPORTANT
Since R 3.5 (I don't remember this happening in R 3.4.x) signals sent from interrupting Athena JDBC calls crash the R interpreter. You need to set the `-Xrs` option to avoid signals being passed on to the JVM owner. That has to be done _before_ `rJava` is loaded so you either need to remember to put it at the top of all scripts _or_ stick this in your local `~/.Rprofile` and/or sitewide `Rprofile`:
```r
if (!grepl("-Xrs", getOption("java.parameters", ""))) {
options(
пре 5 година
"java.parameters" = c(getOption("java.parameters", default = NULL), "-Xrs")
пре 5 година
)
}
```
пре 6 година
## What's Inside The Tin?
пре 7 година
The following functions are implemented:
пре 6 година
Easy-interface connection helper:
пре 5 година
- `athena_connect` Simplified Athena JDBC connection helper
пре 6 година
Custom JDBC Classes:
- `Athena`: AthenaJDBC (make a new Athena con obj)
пре 7 година
- `AthenaConnection-class`: AthenaJDBC
- `AthenaDriver-class`: AthenaJDBC
- `AthenaResult-class`: AthenaJDBC
пре 6 година
Custom JDBC Class Methods:
пре 5 година
- `dbConnect-method`
- `dbExistsTable-method`
- `dbGetQuery-method`
- `dbListFields-method`
- `dbListTables-method`
- `dbReadTable-method`
- `dbSendQuery-method`
пре 7 година
пре 6 година
Pulled in from other `cloudyr` pkgs:
пре 6 година
- `read_credentials`: Use Credentials from .aws/credentials File
- `use_credentials`: Use Credentials from .aws/credentials File
пре 6 година
## Installation
пре 7 година
```{r eval=FALSE}
пре 5 година
devtools::install_git("https://git.sr.ht/~hrbrmstr/metis-lite")
# OR
devtools::install_gitlab("hrbrmstr/metis-lite")
# OR
devtools::install_github("hrbrmstr/metis-lite")
пре 7 година
```
пре 5 година
```{r message=FALSE, warning=FALSE, include=FALSE}
пре 7 година
options(width=120)
```
пре 6 година
## Usage
пре 7 година
пре 5 година
```{r message=FALSE, warning=FALSE}
library(metis.lite)
пре 7 година
# current verison
пре 5 година
packageVersion("metis.lite")
пре 7 година
```
пре 5 година
```{r message=FALSE, warning=FALSE}
library(rJava)
library(RJDBC)
library(metis.lite)
library(magrittr)
library(dbplyr)
library(dplyr)
пре 7 година
пре 5 година
dbConnect(
drv = metis.lite::Athena(),
schema_name = "sampledb",
provider = "com.simba.athena.amazonaws.auth.PropertiesFileCredentialsProvider",
AwsCredentialsProviderArguments = path.expand("~/.aws/athenaCredentials.props"),
s3_staging_dir = "s3://aws-athena-query-results-569593279821-us-east-1",
) -> con
пре 7 година
пре 5 година
dbListTables(con, schema="sampledb")
пре 6 година
пре 5 година
dbExistsTable(con, "elb_logs", schema="sampledb")
пре 6 година
пре 5 година
dbListFields(con, "elb_logs", "sampledb")
dbGetQuery(con, "SELECT * FROM sampledb.elb_logs LIMIT 10") %>%
glimpse()
```
### Check types
```{r}
dbGetQuery(con, "
SELECT
CAST('chr' AS CHAR(4)) achar,
CAST('varchr' AS VARCHAR) avarchr,
CAST(SUBSTR(timestamp, 1, 10) AS DATE) AS tsday,
CAST(100.1 AS DOUBLE) AS justadbl,
CAST(127 AS TINYINT) AS asmallint,
CAST(100 AS INTEGER) AS justanint,
CAST(100000000000000000 AS BIGINT) AS abigint,
CAST(('GET' = 'GET') AS BOOLEAN) AS is_get,
ARRAY[1, 2, 3] AS arr1,
ARRAY['1', '2, 3', '4'] AS arr2,
MAP(ARRAY['foo', 'bar'], ARRAY[1, 2]) AS mp,
CAST(ROW(1, 2.0) AS ROW(x BIGINT, y DOUBLE)) AS rw,
CAST('{\"a\":1}' AS JSON) js
FROM elb_logs
LIMIT 1
") %>%
glimpse()
```
пре 7 година
пре 5 година
#### dplyr
```{r}
tbl(con, sql("
SELECT
CAST('chr' AS CHAR(4)) achar,
CAST('varchr' AS VARCHAR) avarchr,
CAST(SUBSTR(timestamp, 1, 10) AS DATE) AS tsday,
CAST(100.1 AS DOUBLE) AS justadbl,
CAST(127 AS TINYINT) AS asmallint,
CAST(100 AS INTEGER) AS justanint,
CAST(100000000000000000 AS BIGINT) AS abigint,
CAST(('GET' = 'GET') AS BOOLEAN) AS is_get,
ARRAY[1, 2, 3] AS arr,
ARRAY['1', '2, 3', '4'] AS arr,
MAP(ARRAY['foo', 'bar'], ARRAY[1, 2]) AS mp,
CAST(ROW(1, 2.0) AS ROW(x BIGINT, y DOUBLE)) AS rw,
CAST('{\"a\":1}' AS JSON) js
FROM elb_logs
LIMIT 1
")) %>%
пре 7 година
glimpse()
пре 7 година
```
пре 6 година
## Code of Conduct
Please note that this project is released with a [Contributor Code of Conduct](CONDUCT.md). By participating in this project you agree to abide by its terms.