--- output: rmarkdown::github_document editor_options: chunk_output_type: console --- # `metis` Helpers for Accessing and Querying Amazon Athena Including a lightweight RJDBC shim. In Greek mythology, Metis was Athena's "helper". ## Description Still fairly beta-quality level but getting there. The goal will be to get around enough of the "gotchas" that are preventing raw RJDBC Athena connections from "just working" with `dplyr` v0.6.0+ and also get around the [`fetchSize` problem](https://www.reddit.com/r/aws/comments/6aq22b/fetchsize_limit/) without having to not use `dbGetQuery()`. The `AthenaJDBC42_2.0.2.jar` JAR file is included out of convenience but that will likely move to a separate package as this gets closer to prime time if this goes on CRAN. NOTE that the updated driver *REQUIRES JDK 1.8+*. See the **Usage** section for an example. ## IMPORTANT Since R 3.5 (I don't remember this happening in R 3.4.x) signals sent from interrupting Athena JDBC calls crash the R interpreter. You need to set the `-Xrs` option to avoid signals being passed on to the JVM owner. That has to be done _before_ `rJava` is loaded so you either need to remember to put it at the top of all scripts _or_ stick this in your local `~/.Rprofile` and/or sitewide `Rprofile`: ```r if (!grepl("-Xrs", getOption("java.parameters", ""))) { options( "java.parameters" = paste0( c(getOption("java.parameters", default = NULL), "-Xrs"), collapse=" " ) ) } ``` ## What's Inside The Tin? The following functions are implemented: Easy-interface connection helper: - `athena_connect` Make a JDBC connection to Athena Custom JDBC Classes: - `Athena`: AthenaJDBC (make a new Athena con obj) - `AthenaConnection-class`: AthenaJDBC - `AthenaDriver-class`: AthenaJDBC - `AthenaResult-class`: AthenaJDBC Custom JDBC Class Methods: - `dbConnect-method`: AthenaJDBC - `dbExistsTable-method`: AthenaJDBC - `dbGetQuery-method`: AthenaJDBC - `dbListFields-method`: AthenaJDBC - `dbListTables-method`: AthenaJDBC - `dbReadTable-method`: AthenaJDBC - `dbSendQuery-method`: AthenaJDBC Pulled in from other `cloudyr` pkgs: - `read_credentials`: Use Credentials from .aws/credentials File - `use_credentials`: Use Credentials from .aws/credentials File ## Installation ```{r eval=FALSE} devtools::install_github("hrbrmstr/metis") ``` ```{r message=FALSE, warning=FALSE, error=FALSE, include=FALSE} options(width=120) ``` ## Usage ```{r message=FALSE, warning=FALSE, error=FALSE} library(metis) library(tidyverse) # current verison packageVersion("metis") ``` ```{r message=FALSE, warning=FALSE, error=FALSE} use_credentials("default") athena_connect( default_schema = "sampledb", s3_staging_dir = "s3://accessible-bucket", log_path = "/tmp/athena.log", log_level = "DEBUG" ) -> ath dbListTables(ath, schema="sampledb") dbExistsTable(ath, "elb_logs", schema="sampledb") dbListFields(ath, "elb_logs", "sampledb") dbGetQuery(ath, "SELECT * FROM sampledb.elb_logs LIMIT 10") %>% type_convert() %>% glimpse() ``` ## Code of Conduct Please note that this project is released with a [Contributor Code of Conduct](CONDUCT.md). By participating in this project you agree to abide by its terms.