README

5 years ago · c6e0bb7e89
4 changed files with 291 additions and 300 deletions
--- a/.Rbuildignore
+++ b/.Rbuildignore
@@ -12,3 +12,4 @@
 ^apache-drill-1\.10\.0\.tar\.gz$
 ^cdh4-repository_1\.0_all\.deb$
 ^cran-comments\.md$
+^pre$
--- a/README.md
+++ b/README.md
@@ -6,7 +6,7 @@
 Status](https://travis-ci.org/hrbrmstr/sergeant.svg?branch=master)](https://travis-ci.org/hrbrmstr/sergeant)
 [![Coverage
 Status](https://codecov.io/gh/hrbrmstr/sergeant/branch/master/graph/badge.svg)](https://codecov.io/gh/hrbrmstr/sergeant)
-[![CRAN\_Status\_Badge](http://www.r-pkg.org/badges/version/sergeant)](https://cran.r-project.org/package=sergeant)
+[![CRAN\_Status\_Badge](https://www.r-pkg.org/badges/version/sergeant)](https://cran.r-project.org/package=sergeant)

 # 💂 sergeant

@@ -14,26 +14,12 @@ Tools to Transform and Query Data with ‘Apache’ ‘Drill’

 ## \*\* IMPORTANT \*\*

-Version 0.7.0 splits off the JDBC interface into a separate package
-`sergeant.caffeinated`
-([sr.ht](https://git.sr.ht/~hrbrmstr/sergeant);
+Version 0.7.0 (a.k.a. the main branch) splits off the JDBC interface
+into a separate package `sergeant.caffeinated`
 ([GitLab](https://gitlab.com/hrbrmstr/sergeant-caffeinated);
 [GitHub](https://github.com/hrbrmstr/sergeant-caffeinated)).

-If you want to try all the new features coming in 0.8.0 please install from the 0.8.0 branch via:
-
-``` r
-# sr.ht
-devtools::install_git("https://git.sr.ht/~hrbrmstr/sergeant", ref="0.8.0")
-
-# GitLab
-devtools::install_git("https://gitlab.com/hrbrmstr/sergeant", ref="0.8.0")
-
-# GitHub
-devtools::install_git("https://github.com/hrbrmstr/sergeant", ref="0.8.0")
-```
-
 ## Description

 Drill + `sergeant` is (IMO) a streamlined alternative to Spark +
 `sparklyr` if you don’t need the ML components of Spark (i.e. just need
@@ -133,14 +119,28 @@ function mappings.
 # Installation

 ``` r
+install.packages("sergeant", repos = "https://cinc.rud.is")
+# or
+devtools::install_git("https://git.rud.is/hrbrmstr/sergeant.git")
+# or
+devtools::install_git("https://git.sr.ht/~hrbrmstr/sergeant")
+# or
+devtools::install_gitlab("hrbrmstr/sergeant")
+# or
 devtools::install_github("hrbrmstr/sergeant")
 ```

+\`\`{r echo=FALSE, message=FALSE, warning=FALSE, error=FALSE}
+options(width=120)
+
+```` 
+
 ## Usage

 ### `dplyr` interface

-``` r
+
+```r
 library(sergeant)
 library(tidyverse)

@@ -198,30 +198,32 @@ arrange(db, desc(employee_id)) %>% print(n = 20)
 ##  # Source:     table<cp.`employee.json`> [?? x 20]
 ##  # Database:   DrillConnection
 ##  # Ordered by: desc(employee_id)
-##     employee_id full_name first_name last_name position_id position_title store_id department_id birth_date hire_date
-##     <chr>       <chr>     <chr>      <chr>     <chr>       <chr>          <chr>    <chr>         <chr>      <chr>    
-##   1 999         Beverly … Beverly    Dittmar   17          Store Permane… 8        17            1914-02-02 1998-01-…
-##   2 998         Elizabet… Elizabeth  Jantzer   17          Store Permane… 8        17            1914-02-02 1998-01-…
-##   3 997         John Swe… John       Sweet     17          Store Permane… 8        17            1914-02-02 1998-01-…
-##   4 996         William … William    Murphy    17          Store Permane… 8        17            1914-02-02 1998-01-…
-##   5 995         Carol Li… Carol      Lindsay   17          Store Permane… 8        17            1914-02-02 1998-01-…
-##   6 994         Richard … Richard    Burke     17          Store Permane… 8        17            1914-02-02 1998-01-…
-##   7 993         Ethan Bu… Ethan      Bunosky   17          Store Permane… 8        17            1914-02-02 1998-01-…
-##   8 992         Claudett… Claudette  Cabrera   17          Store Permane… 8        17            1914-02-02 1998-01-…
-##   9 991         Maria Te… Maria      Terry     17          Store Permane… 8        17            1914-02-02 1998-01-…
-##  10 990         Stacey C… Stacey     Case      17          Store Permane… 8        17            1914-02-02 1998-01-…
-##  11 99          Elizabet… Elizabeth  Horne     18          Store Tempora… 6        18            1976-10-05 1997-01-…
-##  12 989         Dominick… Dominick   Nutter    17          Store Permane… 8        17            1914-02-02 1998-01-…
-##  13 988         Brian Wi… Brian      Willeford 17          Store Permane… 8        17            1914-02-02 1998-01-…
-##  14 987         Margaret… Margaret   Clendenen 17          Store Permane… 8        17            1914-02-02 1998-01-…
-##  15 986         Maeve Wa… Maeve      Wall      17          Store Permane… 8        17            1914-02-02 1998-01-…
-##  16 985         Mildred … Mildred    Morrow    16          Store Tempora… 8        16            1914-02-02 1998-01-…
-##  17 984         French W… French     Wilson    16          Store Tempora… 8        16            1914-02-02 1998-01-…
-##  18 983         Elisabet… Elisabeth  Duncan    16          Store Tempora… 8        16            1914-02-02 1998-01-…
-##  19 982         Linda An… Linda      Anderson  16          Store Tempora… 8        16            1914-02-02 1998-01-…
-##  20 981         Selene W… Selene     Watson    16          Store Tempora… 8        16            1914-02-02 1998-01-…
-##  # … with more rows, and 6 more variables: salary <chr>, supervisor_id <chr>, education_level <chr>,
-##  #   marital_status <chr>, gender <chr>, management_role <chr>
+##     employee_id full_name first_name last_name position_id position_title
+##     <chr>       <chr>     <chr>      <chr>     <chr>       <chr>         
+##   1 999         Beverly … Beverly    Dittmar   17          Store Permane…
+##   2 998         Elizabet… Elizabeth  Jantzer   17          Store Permane…
+##   3 997         John Swe… John       Sweet     17          Store Permane…
+##   4 996         William … William    Murphy    17          Store Permane…
+##   5 995         Carol Li… Carol      Lindsay   17          Store Permane…
+##   6 994         Richard … Richard    Burke     17          Store Permane…
+##   7 993         Ethan Bu… Ethan      Bunosky   17          Store Permane…
+##   8 992         Claudett… Claudette  Cabrera   17          Store Permane…
+##   9 991         Maria Te… Maria      Terry     17          Store Permane…
+##  10 990         Stacey C… Stacey     Case      17          Store Permane…
+##  11 99          Elizabet… Elizabeth  Horne     18          Store Tempora…
+##  12 989         Dominick… Dominick   Nutter    17          Store Permane…
+##  13 988         Brian Wi… Brian      Willeford 17          Store Permane…
+##  14 987         Margaret… Margaret   Clendenen 17          Store Permane…
+##  15 986         Maeve Wa… Maeve      Wall      17          Store Permane…
+##  16 985         Mildred … Mildred    Morrow    16          Store Tempora…
+##  17 984         French W… French     Wilson    16          Store Tempora…
+##  18 983         Elisabet… Elisabeth  Duncan    16          Store Tempora…
+##  19 982         Linda An… Linda      Anderson  16          Store Tempora…
+##  20 981         Selene W… Selene     Watson    16          Store Tempora…
+##  # … with more rows, and 10 more variables: store_id <chr>,
+##  #   department_id <chr>, birth_date <chr>, hire_date <chr>, salary <chr>,
+##  #   supervisor_id <chr>, education_level <chr>, marital_status <chr>,
+##  #   gender <chr>, management_role <chr>

 mutate(db, position_title = tolower(position_title)) %>%
  mutate(salary = as.numeric(salary)) %>%
@@ -244,7 +246,7 @@ mutate(db, position_title = tolower(position_title)) %>%
 ##   9 6                            4
 ##  10 36                           2
 ##  # … with 102 more rows
-```
+````

 ### REST API

@@ -258,57 +260,60 @@ drill_version(dc)
 ##  [1] "1.15.0"

 drill_storage(dc)$name
-##   [1] "cp"       "dfs"      "drilldat" "hbase"    "hdfs"     "hive"     "kudu"     "mongo"    "my"       "s3"
+##   [1] "cp"       "dfs"      "drilldat" "hbase"    "hdfs"     "hive"    
+##   [7] "kudu"     "mongo"    "my"       "s3"

 drill_query(dc, "SELECT * FROM cp.`employee.json` limit 100")
 ##  # A tibble: 100 x 16
-##     employee_id full_name first_name last_name position_id position_title store_id department_id birth_date hire_date
-##     <chr>       <chr>     <chr>      <chr>     <chr>       <chr>          <chr>    <chr>         <chr>      <chr>    
-##   1 1           Sheri No… Sheri      Nowmer    1           President      0        1             1961-08-26 1994-12-…
-##   2 2           Derrick … Derrick    Whelply   2           VP Country Ma… 0        1             1915-07-03 1994-12-…
-##   3 4           Michael … Michael    Spence    2           VP Country Ma… 0        1             1969-06-20 1998-01-…
-##   4 5           Maya Gut… Maya       Gutierrez 2           VP Country Ma… 0        1             1951-05-10 1998-01-…
-##   5 6           Roberta … Roberta    Damstra   3           VP Informatio… 0        2             1942-10-08 1994-12-…
-##   6 7           Rebecca … Rebecca    Kanagaki  4           VP Human Reso… 0        3             1949-03-27 1994-12-…
-##   7 8           Kim Brun… Kim        Brunner   11          Store Manager  9        11            1922-08-10 1998-01-…
-##   8 9           Brenda B… Brenda     Blumberg  11          Store Manager  21       11            1979-06-23 1998-01-…
-##   9 10          Darren S… Darren     Stanz     5           VP Finance     0        5             1949-08-26 1994-12-…
-##  10 11          Jonathan… Jonathan   Murraiin  11          Store Manager  1        11            1967-06-20 1998-01-…
-##  # … with 90 more rows, and 6 more variables: salary <chr>, supervisor_id <chr>, education_level <chr>,
-##  #   marital_status <chr>, gender <chr>, management_role <chr>
+##     employee_id full_name first_name last_name position_id position_title
+##     <chr>       <chr>     <chr>      <chr>     <chr>       <chr>         
+##   1 1           Sheri No… Sheri      Nowmer    1           President     
+##   2 2           Derrick … Derrick    Whelply   2           VP Country Ma…
+##   3 4           Michael … Michael    Spence    2           VP Country Ma…
+##   4 5           Maya Gut… Maya       Gutierrez 2           VP Country Ma…
+##   5 6           Roberta … Roberta    Damstra   3           VP Informatio…
+##   6 7           Rebecca … Rebecca    Kanagaki  4           VP Human Reso…
+##   7 8           Kim Brun… Kim        Brunner   11          Store Manager 
+##   8 9           Brenda B… Brenda     Blumberg  11          Store Manager 
+##   9 10          Darren S… Darren     Stanz     5           VP Finance    
+##  10 11          Jonathan… Jonathan   Murraiin  11          Store Manager 
+##  # … with 90 more rows, and 10 more variables: store_id <chr>,
+##  #   department_id <chr>, birth_date <chr>, hire_date <chr>, salary <chr>,
+##  #   supervisor_id <chr>, education_level <chr>, marital_status <chr>,
+##  #   gender <chr>, management_role <chr>

 drill_query(dc, "SELECT COUNT(gender) AS gct FROM cp.`employee.json` GROUP BY gender")

 drill_options(dc)
 ##  # A tibble: 179 x 6
-##     name                                                        value    defaultValue accessibleScopes kind   optionScope
-##     <chr>                                                       <chr>    <chr>        <chr>            <chr>  <chr>      
-##   1 debug.validate_iterators                                    FALSE    false        ALL              BOOLE… BOOT       
-##   2 debug.validate_vectors                                      FALSE    false        ALL              BOOLE… BOOT       
-##   3 drill.exec.functions.cast_empty_string_to_null              FALSE    false        ALL              BOOLE… BOOT       
-##   4 drill.exec.hashagg.fallback.enabled                         FALSE    false        ALL              BOOLE… BOOT       
-##   5 drill.exec.hashjoin.fallback.enabled                        FALSE    false        ALL              BOOLE… BOOT       
-##   6 drill.exec.memory.operator.output_batch_size                16777216 16777216     SYSTEM           LONG   BOOT       
-##   7 drill.exec.memory.operator.output_batch_size_avail_mem_fac… 0.1      0.1          SYSTEM           DOUBLE BOOT       
-##   8 drill.exec.storage.file.partition.column.label              dir      dir          ALL              STRING BOOT       
-##   9 drill.exec.storage.implicit.filename.column.label           filename filename     ALL              STRING BOOT       
-##  10 drill.exec.storage.implicit.filepath.column.label           filepath filepath     ALL              STRING BOOT       
+##     name              value  defaultValue accessibleScopes kind  optionScope
+##     <chr>             <chr>  <chr>        <chr>            <chr> <chr>      
+##   1 debug.validate_i… FALSE  false        ALL              BOOL… BOOT       
+##   2 debug.validate_v… FALSE  false        ALL              BOOL… BOOT       
+##   3 drill.exec.funct… FALSE  false        ALL              BOOL… BOOT       
+##   4 drill.exec.hasha… FALSE  false        ALL              BOOL… BOOT       
+##   5 drill.exec.hashj… FALSE  false        ALL              BOOL… BOOT       
+##   6 drill.exec.memor… 16777… 16777216     SYSTEM           LONG  BOOT       
+##   7 drill.exec.memor… 0.1    0.1          SYSTEM           DOUB… BOOT       
+##   8 drill.exec.stora… dir    dir          ALL              STRI… BOOT       
+##   9 drill.exec.stora… filen… filename     ALL              STRI… BOOT       
+##  10 drill.exec.stora… filep… filepath     ALL              STRI… BOOT       
 ##  # … with 169 more rows

 drill_options(dc, "json")
 ##  # A tibble: 10 x 6
-##     name                                                    value defaultValue accessibleScopes kind    optionScope
-##     <chr>                                                   <chr> <chr>        <chr>            <chr>   <chr>      
-##   1 store.hive.maprdb_json.optimize_scan_with_native_reader FALSE false        ALL              BOOLEAN BOOT       
-##   2 store.json.all_text_mode                                TRUE  false        ALL              BOOLEAN SYSTEM     
-##   3 store.json.extended_types                               TRUE  false        ALL              BOOLEAN SYSTEM     
-##   4 store.json.read_numbers_as_double                       FALSE false        ALL              BOOLEAN BOOT       
-##   5 store.json.reader.allow_nan_inf                         TRUE  true         ALL              BOOLEAN BOOT       
-##   6 store.json.reader.print_skipped_invalid_record_number   TRUE  false        ALL              BOOLEAN SYSTEM     
-##   7 store.json.reader.skip_invalid_records                  TRUE  false        ALL              BOOLEAN SYSTEM     
-##   8 store.json.writer.allow_nan_inf                         TRUE  true         ALL              BOOLEAN BOOT       
-##   9 store.json.writer.skip_null_fields                      TRUE  true         ALL              BOOLEAN BOOT       
-##  10 store.json.writer.uglify                                TRUE  false        ALL              BOOLEAN SYSTEM
+##     name               value defaultValue accessibleScopes kind  optionScope
+##     <chr>              <chr> <chr>        <chr>            <chr> <chr>      
+##   1 store.hive.maprdb… FALSE false        ALL              BOOL… BOOT       
+##   2 store.json.all_te… TRUE  false        ALL              BOOL… SYSTEM     
+##   3 store.json.extend… TRUE  false        ALL              BOOL… SYSTEM     
+##   4 store.json.read_n… FALSE false        ALL              BOOL… BOOT       
+##   5 store.json.reader… TRUE  true         ALL              BOOL… BOOT       
+##   6 store.json.reader… TRUE  false        ALL              BOOL… SYSTEM     
+##   7 store.json.reader… TRUE  false        ALL              BOOL… SYSTEM     
+##   8 store.json.writer… TRUE  true         ALL              BOOL… BOOT       
+##   9 store.json.writer… TRUE  true         ALL              BOOL… BOOT       
+##  10 store.json.writer… TRUE  false        ALL              BOOL… SYSTEM
 ```

 ## Working with parquet files
@@ -375,7 +380,7 @@ select columns[2] as city, columns[4] as lon, columns[3] as lat
 | Lang | \# Files |  (%) |  LoC |  (%) | Blank lines |  (%) | \# Lines |  (%) |
 | :--- | -------: | ---: | ---: | ---: | ----------: | ---: | -------: | ---: |
 | R    |       18 | 0.95 | 1212 | 0.96 |         349 | 0.86 |      716 | 0.89 |
-| Rmd  |        1 | 0.05 |   54 | 0.04 |          56 | 0.14 |       92 | 0.11 |
+| Rmd  |        1 | 0.05 |   56 | 0.04 |          55 | 0.14 |       90 | 0.11 |

 ## Code of Conduct

--- a/pre/README.Rmd
+++ b/pre/README.Rmd
@@ -19,7 +19,7 @@ options(sergeant.bigint.warnonce = FALSE)
 [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.1248912.svg)](https://doi.org/10.5281/zenodo.1248912) 
 [![Travis-CI Build Status](https://travis-ci.org/hrbrmstr/sergeant.svg?branch=master)](https://travis-ci.org/hrbrmstr/sergeant) 
 [![Coverage Status](https://codecov.io/gh/hrbrmstr/sergeant/branch/master/graph/badge.svg)](https://codecov.io/gh/hrbrmstr/sergeant)
-[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/sergeant)](https://cran.r-project.org/package=sergeant)
+[![CRAN_Status_Badge](https://www.r-pkg.org/badges/version/sergeant)](https://cran.r-project.org/package=sergeant)

 # 💂 sergeant

@@ -29,21 +29,7 @@ Tools to Transform and Query Data with 'Apache' 'Drill'

 Version 0.7.0 (a.k.a. the main branch) splits off the JDBC interface into a separate package `sergeant.caffeinated` ([GitLab](https://gitlab.com/hrbrmstr/sergeant-caffeinated); [GitHub](https://github.com/hrbrmstr/sergeant-caffeinated)).

-If you want to try all the new features coming in 0.8.0 please install from the 0.8.0 branch via:
-
-```{r eval=FALSE}
-# sr.ht
-devtools::install_git("https://git.sr.ht/~hrbrmstr/sergeant", ref="0.8.0")
-
-# GitLab
-devtools::install_git("https://gitlab.com/hrbrmstr/sergeant", ref="0.8.0")
-
-# GitHub
-devtools::install_git("https://github.com/hrbrmstr/sergeant", ref="0.8.0")
-```
-
-
 ## Description

 Drill + `sergeant` is (IMO) a streamlined alternative to Spark + `sparklyr` if you don't need the ML components of Spark (i.e. just need to query "big data" sources, need to interface with parquet, need to combine disparate data source types — json, csv, parquet, rdbms - for aggregation, etc). Drill also has support for spatial queries.

@@ -107,11 +93,10 @@ Note that a number of Drill SQL functions have been mapped to R functions (e.g.

 # Installation

-```{r eval=FALSE}
-devtools::install_github("hrbrmstr/sergeant")
-```
-
-```{r echo=FALSE, message=FALSE, warning=FALSE, error=FALSE}
+```{r einstall-ex, results='asis', echo = FALSE}
+hrbrpkghelpr::install_block()
+```
+```{r echo=FALSE, message=FALSE, warning=FALSE, error=FALSE}
 options(width=120)
 ```
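
Note on the re-wrapped output: the tibble listings in `README.md` go from ten visible columns to six, which is what one would expect if the `options(width=120)` chunk was not evaluated when the file was knit and printing fell back to a narrower width. That reading is an inference from the diff, not something the commit states. A minimal base-R sketch of the effect, needing no Drill connection (the helper name `lines_at_width` is hypothetical and not part of `sergeant`):

```r
# Hypothetical helper: count how many lines print() needs at a given width.
lines_at_width <- function(x, width) {
  old <- options(width = width)  # set the console width, remember the old value
  on.exit(options(old))          # restore the previous width on exit
  length(capture.output(print(x)))
}

x <- seq_len(100)
wide   <- lines_at_width(x, 120)
narrow <- lines_at_width(x, 60)

# The same vector wraps onto more lines when the width is smaller.
narrow > wide
```

Tibble printing follows the same `width` option by default, so a wider setting shows more columns before the remainder is listed under "more variables".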