Aggregates data using generic dplyr code. Because of this approach, the calculations run inside the database automatically when `data` is backed by a database or sparklyr connection. The `class()` of such a table in R includes one of: tbl_sql, tbl_dbi, tbl_spark
db_compute_count(data, x, ..., y = n())
Argument | Description |
---|---|
data | A table (tbl) |
x | A discrete variable |
... | A set of named or unnamed aggregations |
y | The aggregation formula. Defaults to count (n) |
# Returns the row count per am
mtcars %>% db_compute_count(am)
#> # A tibble: 2 x 2
#>      am `n()`
#>   <dbl> <int>
#> 1     0    19
#> 2     1    13

#> # A tibble: 2 x 2
#>      am `mean(mpg)`
#>   <dbl>       <dbl>
#> 1     0        17.1
#> 2     1        24.4

#> # A tibble: 2 x 3
#>      am `mean(mpg)` `sum(mpg)`
#>   <dbl>       <dbl>      <dbl>
#> 1     0        17.1       326.
#> 2     1        24.4       317.
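
The "generic dplyr code" mentioned in the description can be pictured as a plain `group_by()` followed by `summarise()`. The sketch below is an illustration under that assumption, not the package's actual source: `count_sketch` and the column names `n` and `avg_mpg` are hypothetical names introduced here. Because it relies only on dplyr verbs, the same pipeline would be translated to SQL by dbplyr when `data` is a remote table rather than a local data frame.

```r
library(dplyr)

# Hypothetical helper for illustration only; not part of the package.
# Groups by one discrete variable and applies the supplied aggregations.
count_sketch <- function(data, x, ...) {
  data %>%
    group_by({{ x }}) %>%   # embrace the bare column name passed as `x`
    summarise(...) %>%      # forward the aggregation expressions unchanged
    ungroup()
}

# Row count per am (19 rows for am = 0 and 13 for am = 1, as in the output above)
counts <- count_sketch(mtcars, am, n = n())

# Average mpg per am
means <- count_sketch(mtcars, am, avg_mpg = mean(mpg))
```

Naming the aggregations (`n = n()`) gives tidier column names than the backticked `n()` shown in the output above; that naming is a choice made in this sketch only.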