Use `src_drill()` to connect to a Drill cluster and `tbl()` to connect to a
fully-qualified "table reference". The vast majority of Drill SQL functions have
also been made available to the dplyr interface. If you have custom Drill SQL
functions that need to be implemented, please file an issue on GitHub.
src_drill(host = Sys.getenv("DRILL_HOST", "localhost"),
  port = as.integer(Sys.getenv("DRILL_PORT", 8047L)), ssl = FALSE)

# S3 method for src_drill
tbl(src, from, ...)
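Argument | Description |
---|---|
host | Drill host (will pick up the value from the `DRILL_HOST` environment variable, falling back to `"localhost"`) |
port | Drill port (will pick up the value from the `DRILL_PORT` environment variable, falling back to `8047`) |
ssl | Use SSL? Defaults to `FALSE` |
src | A Drill "src" created with `src_drill()` |
from | A Drill view or table specification |
... | Extra parameters |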
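The default `host` and `port` values come from `Sys.getenv()` calls with a fallback, so they can be changed per session without editing code. The sketch below reproduces that lookup in base R only. `resolve_drill_target()` is a hypothetical helper written for illustration and is not part of the package, and `drill.example.internal` is a made-up host name.

```r
# Hypothetical helper (not part of the package): resolve connection settings
# the same way the default arguments of src_drill() do.
resolve_drill_target <- function() {
  list(
    host = Sys.getenv("DRILL_HOST", "localhost"),
    port = as.integer(Sys.getenv("DRILL_PORT", 8047L))
  )
}

# With neither variable set, the fallbacks apply.
Sys.unsetenv(c("DRILL_HOST", "DRILL_PORT"))
str(resolve_drill_target())

# Setting the variables for the current session overrides the fallbacks.
# The host name here is a placeholder.
Sys.setenv(DRILL_HOST = "drill.example.internal", DRILL_PORT = "8048")
str(resolve_drill_target())
```

Because the environment variables are read when the defaults are evaluated, they need to be set before `src_drill()` is called without explicit arguments.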
This is a DBI wrapper around the Drill REST API.

TODO: username/password support.
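For orientation, the Drill REST API accepts SQL as a JSON document posted to the `/query.json` endpoint. The following is a rough sketch of such a request and does not show how this package implements its wrapper. `drill_rest_request()` and `drill_rest_query()` are hypothetical helper names, and the sketch assumes the `httr` package is installed and that Drill is reachable over plain HTTP.

```r
# Hypothetical helper: build the URL and JSON body for a Drill REST query.
drill_rest_request <- function(sql, host = "localhost", port = 8047L) {
  list(
    url  = sprintf("http://%s:%d/query.json", host, as.integer(port)),
    body = list(queryType = "SQL", query = sql)
  )
}

# Hypothetical helper: send the request with httr and return the parsed JSON.
# Assumes an unauthenticated Drill instance.
drill_rest_query <- function(sql, host = "localhost", port = 8047L) {
  req <- drill_rest_request(sql, host, port)
  res <- httr::POST(req$url, body = req$body, encode = "json")
  httr::stop_for_status(res)
  httr::content(res, as = "parsed")
}

# Example call (requires a running Drill instance):
# drill_rest_query("SELECT full_name FROM cp.`employee.json` LIMIT 3")
```

Separating request construction from the HTTP call keeps the first helper usable, and checkable, without a running cluster.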
not_run({
db <- src_drill("localhost", "8047")

print(db)

emp <- tbl(db, "cp.`employee.json`")

count(emp, gender, marital_status)

# Drill-specific SQL functions are also available
select(emp, full_name) %>%
  mutate(
    loc = strpos(full_name, "a"),
    first_three = substr(full_name, 1L, 3L),
    len = length(full_name),
    rx = regexp_replace(full_name, "[aeiouAEIOU]", "*"),
    rnd = rand(),
    pos = position("en", full_name),
    rpd = rpad(full_name, 20L),
    rpdw = rpad_with(full_name, 20L, "*"))
})