Calculates histogram bins

Uses generic dplyr code to create histogram bins. Because of this approach, the calculations automatically run inside the database if `data` has a database or sparklyr connection. In R, such tables have one of the following classes: tbl_sql, tbl_dbi, tbl_spark.
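To make the binning arithmetic concrete, here is a minimal sketch in plain dplyr. It is not the package's source: `sketch_bins()` and the intermediate columns `lo`, `width`, `idx` and `bin` are hypothetical names chosen for illustration, and the sketch assumes each bin is labelled by its lower edge and that the maximum value is folded into the last bin.

```r
library(dplyr)

# Hypothetical helper, not part of the package: splits the range of `x`
# into `bins` equal-width intervals and counts the rows in each one.
sketch_bins <- function(data, x, bins = 30) {
  data %>%
    mutate(
      lo    = min({{ x }}, na.rm = TRUE),
      width = (max({{ x }}, na.rm = TRUE) - lo) / bins,
      # Zero-based index of the bin each value falls into
      idx   = floor(({{ x }} - lo) / width),
      # The maximum would otherwise start a bin of its own (assumption:
      # it belongs in the last bin instead)
      idx   = ifelse(idx >= bins, bins - 1, idx),
      # Label each bin by its lower edge
      bin   = lo + idx * width
    ) %>%
    count(bin, name = "count")
}

binned <- sketch_bins(mtcars, mpg, bins = 30)
```

Only bins that contain at least one row appear in the result, so the number of rows returned can be smaller than `bins`.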

db_compute_bins(data, x, bins = 30, binwidth = NULL)

Arguments

data	A table (tbl)
x	A continuous variable
bins	Number of bins. Defaults to 30.
binwidth	Single value that sets the size of the bins; it overrides bins

Examples


# Returns record counts for 30 bins in mpg (empty bins are not returned)
mtcars %>%
  db_compute_bins(mpg)
#> # A tibble: 19 x 2
#>      mpg count
#>    <dbl> <int>
#>  1  10.4     2
#>  2  12.8     1
#>  3  13.5     1
#>  4  14.3     2
#>  5  15.1     4
#>  6  15.9     1
#>  7  16.7     1
#>  8  17.4     2
#>  9  18.2     1
#> 10  19.0     3
#> 11  20.6     2
#> 12  21.4     3
#> 13  22.2     2
#> 14  23.7     1
#> 15  25.3     1
#> 16  26.8     1
#> 17  30.0     2
#> 18  32.3     1
#> 19  33.1     1

# Returns record count for bins of size 10
mtcars %>%
  db_compute_bins(mpg, binwidth = 10)
#> # A tibble: 2 x 2
#>     mpg count
#>   <dbl> <int>
#> 1  10.4    18
#> 2  20.4    14
