Uses dplyr operations to aggregate data. Because of this approach, the calculations automatically run inside the database if `data` has a database or sparklyr connection. The `class()` of such tables in R are: tbl_sql, tbl_dbi, tbl_spark
Usage
db_compute_count(data, x, ..., y = n())Value
An ungrouped data.frame with the discrete variable and one or more aggregation columns. The first column is the grouping variable (x), followed by the aggregated values.
Examples
# \dontrun{
library(DBI)
library(dplyr)
con <- dbConnect(duckdb::duckdb(), ":memory:")
db_mtcars <- copy_to(con, mtcars, "mtcars")
# Returns the row count per am
db_mtcars |>
db_compute_count(am)
#> # A tibble: 2 × 2
#> am `n()`
#> <dbl> <dbl>
#> 1 0 19
#> 2 1 13
# Returns the average mpg per am
db_mtcars |>
db_compute_count(am, mean(mpg))
#> Warning: Missing values are always removed in SQL aggregation functions.
#> Use `na.rm = TRUE` to silence this warning
#> This warning is displayed once every 8 hours.
#> # A tibble: 2 × 2
#> am `mean(mpg)`
#> <dbl> <dbl>
#> 1 0 17.1
#> 2 1 24.4
# Returns the average and sum of mpg per am
db_mtcars |>
db_compute_count(am, mean(mpg), sum(mpg))
#> # A tibble: 2 × 3
#> am `mean(mpg)` `sum(mpg)`
#> <dbl> <dbl> <dbl>
#> 1 0 17.1 326.
#> 2 1 24.4 317.
dbDisconnect(con)
# }