Access and Query Amazon Athena via DBI/JDBC
Ви не можете вибрати більше 25 тем Теми мають розпочинатися з літери або цифри, можуть містити дефіси (-) і не повинні перевищувати 35 символів.

2.9 KiB

---
output: rmarkdown::github_document
editor_options:
chunk_output_type: console
---

# metis

Access and Query Amazon Athena via DBI/JDBC

## Description

In Greek mythology, Metis was Athena's "helper" so...

Methods are provided to connect to 'Amazon' 'Athena', lookup schemas/tables,
perform queries and retrieve query results via the included JDBC DBI driver.

## What's Inside The Tin?

The following functions are implemented:

Easy-interface connection helper:

- `athena_connect` Simplified Athena JDBC connection helper

Custom JDBC Classes:

- `Athena`: AthenaJDBC (make a new Athena con obj)
- `AthenaConnection-class`: AthenaJDBC
- `AthenaDriver-class`: AthenaJDBC
- `AthenaResult-class`: AthenaJDBC

Custom JDBC Class Methods:

- `dbConnect-method`
- `dbExistsTable-method`
- `dbGetQuery-method`
- `dbListFields-method`
- `dbListTables-method`
- `dbReadTable-method`
- `dbSendQuery-method`

Pulled in from other `cloudyr` pkgs:

- `read_credentials`: Use Credentials from .aws/credentials File
- `use_credentials`: Use Credentials from .aws/credentials File

## Installation

```{r eval=FALSE}
devtools::install_git("https://git.sr.ht/~hrbrmstr/metis")
# OR
devtools::install_gitlab("hrbrmstr/metis")
# OR
devtools::install_github("hrbrmstr/metis")
```

```{r message=FALSE, warning=FALSE, include=FALSE}
options(width=120)
```

## Usage

```{r message=FALSE, warning=FALSE}
library(metis)

# current verison
packageVersion("metis")
```

```{r message=FALSE, warning=FALSE}
library(rJava)
library(RJDBC)
library(metis)
library(magrittr)

dbConnect(
drv = metis::Athena(),
schema_name = "sampledb",
provider = "com.simba.athena.amazonaws.auth.PropertiesFileCredentialsProvider",
AwsCredentialsProviderArguments = path.expand("~/.aws/athenaCredentials.props"),
s3_staging_dir = "s3://aws-athena-query-results-569593279821-us-east-1",
) -> con

dbListTables(con, schema="sampledb")

dbExistsTable(con, "elb_logs", schema="sampledb")

dbListFields(con, "elb_logs", "sampledb")

dbGetQuery(con, "SELECT * FROM sampledb.elb_logs LIMIT 10") %>%
dplyr::glimpse()
```

### Check types

```{r}
dbGetQuery(con, "
SELECT
CAST('chr' AS CHAR(4)) achar,
CAST('varchr' AS VARCHAR) avarchr,
CAST(SUBSTR(timestamp, 1, 10) AS DATE) AS tsday,
CAST(100.1 AS DOUBLE) AS justadbl,
CAST(127 AS TINYINT) AS asmallint,
CAST(100 AS INTEGER) AS justanint,
CAST(100000000000000000 AS BIGINT) AS abigint,
CAST(('GET' = 'GET') AS BOOLEAN) AS is_get,
ARRAY[1, 2, 3] AS arr1,
ARRAY['1', '2, 3', '4'] AS arr2,
MAP(ARRAY['foo', 'bar'], ARRAY[1, 2]) AS mp,
CAST(ROW(1, 2.0) AS ROW(x BIGINT, y DOUBLE)) AS rw,
CAST('{\"a\":1}' AS JSON) js
FROM elb_logs
LIMIT 1
") %>%
dplyr::glimpse()
```

```{r cloc}
cloc::cloc_pkg_md()
```

## Code of Conduct

Please note that this project is released with a [Contributor Code of Conduct](CONDUCT.md). By participating in this project you agree to abide by its terms.