Access and Query Amazon Athena via DBI/JDBC

[![Travis-CI Build Status](https://travis-ci.org/hrbrmstr/metis.svg?branch=master)](https://travis-ci.org/hrbrmstr/metis)
[![Coverage Status](https://codecov.io/gh/hrbrmstr/metis/branch/master/graph/badge.svg)](https://codecov.io/gh/hrbrmstr/metis)
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/metis)](https://cran.r-project.org/package=metis)

# metis

Access and Query Amazon Athena via DBI/JDBC

## Description

In Greek mythology, Metis was Athena’s “helper” so…

Methods are provided to connect to ‘Amazon’ ‘Athena’, look up
schemas/tables, perform queries and retrieve query results via the
included JDBC DBI driver.

## What’s Inside The Tin?

The following functions are implemented:

Easy-interface connection helper:

- `athena_connect`: Simplified Athena JDBC connection helper

Custom JDBC Classes:

- `Athena`: AthenaJDBC (make a new Athena con obj)
- `AthenaConnection-class`: AthenaJDBC
- `AthenaDriver-class`: AthenaJDBC
- `AthenaResult-class`: AthenaJDBC

Custom JDBC Class Methods:

- `dbConnect-method`
- `dbExistsTable-method`
- `dbGetQuery-method`
- `dbListFields-method`
- `dbListTables-method`
- `dbReadTable-method`
- `dbSendQuery-method`

Pulled in from other `cloudyr` pkgs:

- `read_credentials`: Use Credentials from .aws/credentials File
- `use_credentials`: Use Credentials from .aws/credentials File

## Installation

``` r
devtools::install_git("https://git.sr.ht/~hrbrmstr/metis")
# OR
devtools::install_gitlab("hrbrmstr/metis")
# OR
devtools::install_github("hrbrmstr/metis")
```

## Usage

``` r
library(metis)

# current version
packageVersion("metis")
```

    ## [1] '0.3.0'

``` r
library(rJava)
library(RJDBC)
library(metis)
library(magrittr) # for piping b/c I'm addicted
```

``` r
# connect, reading AWS credentials from a properties file
dbConnect(
  metis::Athena(),
  Schema = "sampledb",
  AwsCredentialsProviderClass = "com.simba.athena.amazonaws.auth.PropertiesFileCredentialsProvider",
  AwsCredentialsProviderArguments = path.expand("~/.aws/athenaCredentials.props")
) -> con

dbListTables(con, schema="sampledb")
```

    ## [1] "elb_logs"
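
The `dbConnect()` call above hands the driver a properties file through `AwsCredentialsProviderArguments`. As a rough sketch, and assuming the bundled provider follows the AWS SDK for Java `PropertiesFileCredentialsProvider` convention of `accessKey` and `secretKey` entries (worth confirming against the driver documentation), such a file could be generated from R along these lines. The helper name `write_athena_props` and the placeholder key values are hypothetical and not part of metis.

```r
# Hypothetical helper: write a Java-style properties file holding AWS keys.
# Assumption: the credentials provider expects `accessKey` and `secretKey`.
write_athena_props <- function(path, access_key, secret_key) {
  writeLines(
    c(
      paste0("accessKey=", access_key),
      paste0("secretKey=", secret_key)
    ),
    con = path
  )
  # restrict the file to the current user (has limited effect on Windows)
  Sys.chmod(path, mode = "0600")
  invisible(path)
}

# Placeholder values written to a temporary file, not real credentials
props_path <- tempfile(fileext = ".props")
write_athena_props(props_path, "MY_ACCESS_KEY_ID", "MY_SECRET_ACCESS_KEY")
props_lines <- readLines(props_path)
```

In practice the path would be something like the `~/.aws/athenaCredentials.props` used in the connection example, and the key values would come from a secure source rather than being typed into a script.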

``` r
dbExistsTable(con, "elb_logs", schema="sampledb")
```

    ## [1] TRUE

``` r
dbListFields(con, "elb_logs", "sampledb")
```

    ##  [1] "timestamp"             "elbname"               "requestip"             "requestport"
    ##  [5] "backendip"             "backendport"           "requestprocessingtime" "backendprocessingtime"
    ##  [9] "clientresponsetime"    "elbresponsecode"       "backendresponsecode"   "receivedbytes"
    ## [13] "sentbytes"             "requestverb"           "url"                   "protocol"

``` r
dbGetQuery(con, "SELECT * FROM sampledb.elb_logs LIMIT 10") %>%
  dplyr::glimpse()
```

    ## Observations: 10
    ## Variables: 16
    ## $ timestamp             <chr> "2014-09-27T00:00:25.424956Z", "2014-09-27T00:00:56.439218Z", "2014-09-27T00:01:27.4417…
    ## $ elbname               <chr> "lb-demo", "lb-demo", "lb-demo", "lb-demo", "lb-demo", "lb-demo", "lb-demo", "lb-demo",…
    ## $ requestip             <chr> "241.230.198.83", "252.26.60.51", "250.244.20.109", "247.59.58.167", "254.64.224.54", "…
    ## $ requestport           <int> 27026, 27026, 27026, 27026, 27026, 27026, 27026, 27026, 27026, 27026
    ## $ backendip             <chr> "251.192.40.76", "249.89.116.3", "251.111.156.171", "251.139.91.156", "251.111.156.171"…
    ## $ backendport           <int> 443, 8888, 8888, 8888, 8000, 8888, 8888, 8888, 8888, 8888
    ## $ requestprocessingtime <dbl> 9.1e-05, 9.4e-05, 8.4e-05, 9.7e-05, 9.1e-05, 9.3e-05, 9.4e-05, 8.3e-05, 9.0e-05, 9.0e-05
    ## $ backendprocessingtime <dbl> 0.046598, 0.038973, 0.047054, 0.039845, 0.061461, 0.037791, 0.047035, 0.048792, 0.04572…
    ## $ clientresponsetime    <dbl> 4.9e-05, 4.7e-05, 4.9e-05, 4.9e-05, 4.0e-05, 7.7e-05, 7.5e-05, 7.3e-05, 4.0e-05, 6.7e-05
    ## $ elbresponsecode       <chr> "200", "200", "200", "200", "200", "200", "200", "200", "200", "200"
    ## $ backendresponsecode   <chr> "200", "200", "200", "200", "200", "400", "400", "200", "200", "200"
    ## $ receivedbytes         <S3: integer64> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
    ## $ sentbytes             <S3: integer64> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2
    ## $ requestverb           <chr> "GET", "GET", "GET", "GET", "GET", "GET", "GET", "GET", "GET", "GET"
    ## $ url                   <chr> "http://www.abcxyz.com:80/jobbrowser/?format=json&state=running&user=20g578y", "http://…
    ## $ protocol              <chr> "HTTP/1.1", "HTTP/1.1", "HTTP/1.1", "HTTP/1.1", "HTTP/1.1", "HTTP/1.1", "HTTP/1.1", "HT…
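
Queries like the one above are plain SQL strings handed to `dbGetQuery()`. When the schema, table, or row limit varies, a small string-building helper can keep them consistent. The sketch below is illustrative only: `athena_preview_sql` is a hypothetical name, not part of metis, and it deliberately accepts only simple unquoted identifiers.

```r
# Hypothetical helper: build a "preview the first n rows" query string.
athena_preview_sql <- function(schema, table, n = 10L) {
  n <- as.integer(n)
  ident <- "^[A-Za-z_][A-Za-z0-9_]*$"
  stopifnot(
    length(n) == 1L, !is.na(n), n > 0L,
    length(schema) == 1L, grepl(ident, schema),
    length(table) == 1L, grepl(ident, table)
  )
  sprintf("SELECT * FROM %s.%s LIMIT %d", schema, table, n)
}

preview_sql <- athena_preview_sql("sampledb", "elb_logs", 10)

# With a live connection this could then be run as:
# dbGetQuery(con, preview_sql)
```

Rejecting anything other than letters, digits, and underscores is a blunt way to avoid pasting arbitrary text into SQL; names that need quoting would require a more careful approach.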

### Check types

``` r
dbGetQuery(con, "
SELECT
  CAST('chr' AS CHAR(4)) achar,
  CAST('varchr' AS VARCHAR) avarchr,
  CAST(SUBSTR(timestamp, 1, 10) AS DATE) AS tsday,
  CAST(100.1 AS DOUBLE) AS justadbl,
  CAST(127 AS TINYINT) AS asmallint,
  CAST(100 AS INTEGER) AS justanint,
  CAST(100000000000000000 AS BIGINT) AS abigint,
  CAST(('GET' = 'GET') AS BOOLEAN) AS is_get,
  ARRAY[1, 2, 3] AS arr1,
  ARRAY['1', '2, 3', '4'] AS arr2,
  MAP(ARRAY['foo', 'bar'], ARRAY[1, 2]) AS mp,
  CAST(ROW(1, 2.0) AS ROW(x BIGINT, y DOUBLE)) AS rw,
  CAST('{\"a\":1}' AS JSON) js
FROM elb_logs
LIMIT 1
") %>%
  dplyr::glimpse()
```

    ## Observations: 1
    ## Variables: 13
    ## $ achar     <chr> "chr "
    ## $ avarchr   <chr> "varchr"
    ## $ tsday     <date> 2014-09-29
    ## $ justadbl  <dbl> 100.1
    ## $ asmallint <int> 127
    ## $ justanint <int> 100
    ## $ abigint   <S3: integer64> 100000000000000000
    ## $ is_get    <lgl> TRUE
    ## $ arr1      <chr> "1, 2, 3"
    ## $ arr2      <chr> "1, 2, 3, 4"
    ## $ mp        <chr> "{bar=2, foo=1}"
    ## $ rw        <chr> "{x=1, y=2.0}"
    ## $ js        <chr> "\"{\\\"a\\\":1}\""
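
As the output above shows, `ARRAY`, `MAP`, and `ROW` values arrive in R as plain character strings. One rough way to turn a simple array string back into a vector is to split on the `", "` delimiter, sketched below with a hypothetical helper `split_athena_array` (not a metis function). The limitation visible in `arr2` applies: an element that itself contains `", "` cannot be told apart from two separate elements, so the three-element array comes back as four.

```r
# Hypothetical helper: split Athena array strings such as "1, 2, 3"
# into a list of vectors, optionally converting the element type.
split_athena_array <- function(x, convert = identity) {
  lapply(strsplit(x, ", ", fixed = TRUE), convert)
}

# Integer array from the `arr1` column
arr1 <- split_athena_array("1, 2, 3", convert = as.integer)[[1]]

# String array from the `arr2` column: the original had 3 elements
# ('1', '2, 3', '4') but the flattened string splits into 4
arr2 <- split_athena_array("1, 2, 3, 4")[[1]]
```

If exact element boundaries matter, serializing the array on the Athena side before it reaches R (for example by casting it to JSON in the query) may be a safer route than splitting strings afterwards.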

``` r
cloc::cloc_pkg_md()
```

| Lang | \# Files |  (%) | LoC |  (%) | Blank lines |  (%) | \# Lines |  (%) |
| :--- | -------: | ---: | --: | ---: | ----------: | ---: | -------: | ---: |
| R    |        8 | 0.89 | 250 | 0.83 |          83 | 0.72 |      194 | 0.79 |
| Rmd  |        1 | 0.11 |  50 | 0.17 |          32 | 0.28 |       53 | 0.21 |

## Code of Conduct

Please note that this project is released with a [Contributor Code of
Conduct](CONDUCT.md). By participating in this project you agree to
abide by its terms.