Intro

Create pivot tables with commonly used terms as commands such as: pivot_rows(), pivot_columns() and pivot_values(), and string them together with a pipe (%>%).

The idea is that the creation of a pivot table is done using code, as opposed to drag-and-drop. This means that actions such as pivot_flip() and pivot_drill() are also possible, and performed using R commands.

Another goal of pivotable is to provide a framework to easily define prep_measures() and prep_dimensions() of your data. The resulting R object can then be used as the source of the pivot table. This should make it possible to create consistent analysis, and subsequent reporting of the data.

Installation

Pivot table

Values

The pivot_values() function is used to add an aggregation in the pivot table. When used against a data frame, you will have to provide an aggregation formula of a field, or fields, within the data. If using a pre-defined set of dimensions and measures, then simply call the desired measure field, no need to re-aggregate. For more info see Define dimensions and measures.

Multiple aggregations are supported by pivot_values()

The aggregations can also be named inside pivot_values()

Pivot table operations

Switch rows to columns

Instead of “manually” switching the content of pivot_rows() and pivot_columns(), specially during data exploration, simply pipe the code to the pivot_flip() command.

pivotable also includes support for the t() method from base R. It will perform the exact same operation as pivot_flip()

Filter

To limit the pivot table to display only a subset of the pivot table, use pivot_focus()

pivotable also supports dplyr’s filter() command.

Drill

Another powerful thing of pivot tables is the ability to drill down into the data. To do this in pivotable, you will need to define a hierarchy dimension using the dim_hierarchy() command. That command is made to be called within one of the dimension definition functions, such as pivot_rows() or pivot_columns(). The order of the hierarchy is defined by the order in which the variables is passed to the function.

The pivot_drill() command will add the next level of the hierarchy dimension to the pivot table.

A helper function called dim_hierarchy_mqy() creates a three level date hierarchy: year, quarter and month. The function will create the formulas to calculate each level, but those formulas will not be evaluated until the drilling into the level. The formulas are generic enough to work on database back-ends.

Totals

The display of the totals can be controlled using pivot_totals(). It is possible to control the display of row and column totals individually. Currently, sub-totals and grand totals are sum() aggregates, so if the actual calculations are not record counts, or another sum() consider leaving them off. For example, if the calculation is a mean(), then the sub-total and grand total will be the aggregate sum() total of the averages, which would be incorrect.

A default can be set to control if the column or row totals are displayed. The default is set to not show.

pivot_default_totals(FALSE, FALSE)

Default values

By themselves, pivot_rows() and pivot_columns() will only provide a list of the unique values of the data frame. There is a way to setup a default aggregation by using pivot_default_values()

Define dimensions and measures

With pivotable, it is possible to pre-define a set of dimensions and measures that can then be easily accesses and re-used by pivot tables. The idea is to provide a way to centralize data definitions, which creates a consistent reporting.

Database, Spark and data.table connections

Because pivotable uses dplyr commands to create the aggregations. This allows pivotable to take advantage of the same integration that dplyr has, such as Spark, databases and data.table.

Measures and dimensions

It is also possible create a data definition against a database connection. The prep_dimensions() and prep_measures() calculations will not be send to the database until used in the pivot table.

orders_db <- tbl_sales %>%
  prep_dimensions(
    status, 
    country
  ) %>%
  prep_measures(
    orders_qty = n(), 
    order_total = sum(sales, na.rm = TRUE),
    sales_qty = sum(ifelse(status %in% c("In Process", "Shipped"), 1, 0), na.rm = TRUE),
    sales_total = sum(ifelse(status %in% c("In Process", "Shipped"), sales, 0), na.rm = TRUE)
    )

pivottabler

pivotable uses pivottabler to print the pivot table into the R console. The to_pivottabler() returns the actual pivottabler object. This allows you to further customize the pivot table using that package’s API.