šŸ“£ Dash Pivottable Released

Summary:

Introduction

Iā€™m happy to announce the initial release of Dash Pivottable (v0.0.1). Dash Pivottable is Dash component that wraps react-pivottable, a library created by @nicolaskruchten. You can find an example here, which will look like this:

The library has gained traction with the Dash Community after it was published as a Show & Tell, so we felt it was the right time to publish it as a library.

Dash Pivottable is already a complete and self-contained component for manipulating tabular datasets, but the project itself is the perfect example of how the Dash community can create their own components. It was created using the component boilerplate and written in less than 100 lines of React code (if we donā€™t count prop typing), since it is wrapping an existing React library. This makes it easy to modify (if you want to create a custom pivottable component), or serve as a template for building a completely different component.

Quickstart

All you need to do to get started is to install the package from PyPi:

pip install dash-pivottable

and create an app in the following way:

import dash
import dash_html_components as html
import dash_pivottable

app = dash.Dash(__name__)
server = app.server

app.layout = html.Div(
    dash_pivottable.PivotTable(
        data=[
            ['Animal', 'Count', 'Location'],
            ['Zebra', 5, 'SF Zoo'],
            ['Tiger', 3, 'SF Zoo'],
            ['Zebra', 2, 'LA Zoo'],
            ['Tiger', 4, 'LA Zoo'],
        ],
        cols=["Animal"],
        rows=["Location"],
        vals=["Count"]
    )
)

if __name__ == "__main__":
    app.run_server(debug=True)

And you will get the following pivot table:

Dash for R and CRAN

The release v0.0.2 includes support for R, and it will soon be contributed to CRAN. You will be able to find the download instructions in the README.md.

Changelog

In PR #21, I updated and improved the code, so that we would have automated testing, templates, and more. Below is a summary of the changes:

Added

  • Up-to-date Contributing instructions
  • Up-to-date Readme
  • Issue and PR templates
  • Dash testing with Pytest fixtures
  • CircleCI deployment and Percy Snapshots
  • Create package-lock.json instead of yarn.lock
  • Up-to-date license
  • Up-to-date setup.py
17 Likes

This feature is a deal breaker. Canā€™t wait to try it out! Will it have issues when dealing with large datasets?

3 Likes

Thanks! It is a nice ready-to-use component, it speeds up many data exploration processes, I have recently digging the tool.

By the way, is it possible to create some kind of calculated fields like in Excel.
For instance, suposse, we have 4 columns: store, salesman, sales and profit. Letā€™s say we are interested about markup, its formula is markup = sum(profit)/sum(sales).
So, basically, if you want to know about stores markup you have to group by stores, then sum each profit and sales and finally perform the quotient. If you want to know the salesman markup it is almost the same but group by salesman.
Nowadays, is it posible to get the markup dinamically depending on the level desired? Thanks a lot!

I donā€™t think you can compute something like markup = sum(profit)/sum(sales) directly with pivottable, but itā€™s probably easy to do using a Dash callback and Dash DataTable.

Actually you can! I didnā€™t realize a built-in option was included: Sum over Sum. It is exactly what I was looking for.
Thanks anyway!

1 Like

Thanks very much for this! Iā€™ve been using the initial Show and Tell version (which I imagine isnā€™t very different to v0.0.1 that has been officially released?) and getting some great results with it. A question I previously posted which Iā€™ll bump hereā€¦

I think it will amount to a dynamic calculation on a subset of data based on your chosen groupings, so potentially out of scope for now, but if there is an easy-ish workaround that anyone can suggest that would be much appreciated!

Hello @xhlu, thanks for the great component!
Iā€™m having trouble to dynamically change the data property. I made a dropdown to select from various different datasets and once one column gets dropped into the pivot table I cannot get the component to work properly after changing to other dataset. Am I missing something?

Hi, happy you appreciate it! Have you tried compute what you want with pandas? If itā€™s possible with pandas (or any other data wrangling software for that matter), it should be easy to wrap it inside a callback that takes as input the dash-pivottable, and outputs the correctly formatted table as a Dash table.

Hi, Iā€™m not sure I correctly understood your question. Do you have a short, runnable example you could share so I can see if itā€™s a bug or itā€™s an intended feature? Thanks!

Of course, here it is:

data does not get reloaded once the pivot table gets used with other dataset. Example:

import dash
from dash.dependencies import Input, Output
import dash_html_components as html
import dash_core_components as dcc
import dash_pivottable

data_0 = [
    ["Total Bill", "Tip", "Payer Gender", "Payer Smoker", "Day of Week", "Meal", "Party Size"],
    [16.99, 1.01, "Female", "Non-Smoker", "Sunday", "Dinner", 2],
    [10.34, 1.66, "Male", "Non-Smoker", "Monday", "Dinner", 3],
    [21.01, 3.5, "Male", "Non-Smoker", "Saturday", "Dinner", 3],
    [23.68, 3.31, "Male", "Non-Smoker", "Sunday", "Dinner", 2],
]

data_1 = [["animal", "is_good_boy", "kg_of_food"],
          ["dog", "yes", 100],
          ["cat", "no", 12],
          ["bird", "no", 4]]

data_2 = [["book", "pages", "liked_it"],
          ["dogs are cool", 400, 1],
          ["cats are bad", 350, 0],
          ["birds fly", 20, 0]]


data_dict = {
    "data_0": data_0,
    "data_1": data_1,
    "data_2": data_2
}

app = dash.Dash(__name__)
app.title = 'My Dash example'

app.layout = html.Div([
    dcc.Dropdown(
        id='data_selection_dropdown',
        options=[{"label": e, "value": e} for e in data_dict],
        value=list(data_dict.keys())[0]
    ),
    dash_pivottable.PivotTable(
        id='table',
        colOrder="key_a_to_z",
        rowOrder="key_a_to_z",
        rendererName="Table",
        aggregatorName="Count",
    ),
    html.Div(
        id='output'
    )
])

@app.callback(Output("table", "data"),
              [Input("data_selection_dropdown", "value")])
def select_data(selected_data):
    return data_dict[selected_data]


@app.callback(Output('output', 'children'),
              [Input('table', 'cols'),
               Input('table', 'rows'),
               Input('table', 'rowOrder'),
               Input('table', 'colOrder'),
               Input('table', 'aggregatorName'),
               Input('table', 'rendererName')])
def display_props(cols, rows, row_order, col_order, aggregator, renderer):
    return [
        html.P(str(cols), id='columns'),
        html.P(str(rows), id='rows'),
        html.P(str(row_order), id='row_order'),
        html.P(str(col_order), id='col_order'),
        html.P(str(aggregator), id='aggregator'),
        html.P(str(renderer), id='renderer'),
    ]


if __name__ == '__main__':
    app.run_server(debug=True)

Thanks for the help!

3 Likes

Well, I was able to go around the issue by creating the PivotTable component again each time the data changes. But this brought me to other issue. How to set the inside plots width and height? They grow bigger than the parent PivotTable container.

1 Like

how do you solve it? please advise.

This definitely seems weird. If you remove the second callback, do you still get this problem?

Hi @xhlu, absolutely amazed by this powerful feature and was looking for this in a while!

Wonder if thereā€™s a way to have multiple values column, e.g. displaying ā€œheightā€ and ā€œweightā€ as 2 value columns simultaneously in the ā€œTableā€ mode, grouped by the rows?

Thanks in advance! Sorry if this is already covered in other threads.

Iā€™m not sure if I understand correctly, but do you mean something like this:

But rather than ā€œFemaleā€ and ā€œMaleā€, itā€™d be ā€œheightā€ and ā€œweightā€? Feel free to share a diagram and dataset of what you have in mind!

@xhlu Thanks for the reply. Iā€™m trying to display the aggregate results of 2 value columns in the same table, similar to how the PivotTable works in Excel below. In the current world, I think it is only achievable if the user toggle manually in the dropdown between ā€œHeightā€ and ā€œWeightā€, but I want to see if there is a way to display the aggregated result without having to toggle the dropdown for values column in Dash.

In Excel (expected output with 2 aggregated values column):

Currently in Dash (have to toggle in dropdown):

import dash
import dash_html_components as html
import dash_pivottable

app = dash.Dash(__name__)
server = app.server

app.layout = html.Div(
    dash_pivottable.PivotTable(
        data=[
            ['Name', 'Country', 'Sex', 'Height', 'Weight'],
            ['John', 'US','M', 180, 70],
            ['Paul', 'US','M', 200, 65],
            ['Peter', 'Canada','M', 160, 60],
            ['Mark', 'Canada','M', 190, 80],
            ['Mary', 'Canada','F', 190, 50],
            ['Jane', 'Canada','F', 160, 70],
            ['Susan', 'US','F', 170, 70],
            ['Mandy', 'US','F', 150, 90],
        ],
        rows=["Sex","Country"],
        vals=["Height"],
        aggregatorName='Average'
    )
)

if __name__ == "__main__":
    app.run_server(debug=True)

Hello
I have similar requirement, an example is attached


Regards

Would you mind sharing your solution? I am having the same problem but recreating the element doesnā€™t seem to work. I am not sure if it is because I am calling it as page content within a dcc.location or there is some other issue. The pivot table is updated if I refresh the page so I assume the element needs to be ā€˜redrawnā€™ in some manner for changes from callbacks to be reflected.

@bphillip, @yts61 This is the code:

import dash
from dash.dependencies import Input, Output
import dash_html_components as html
import dash_core_components as dcc
import dash_pivottable

data_0 = [
   ["Total Bill", "Tip", "Payer Gender", "Payer Smoker", "Day of Week", "Meal", "Party Size"],
   [16.99, 1.01, "Female", "Non-Smoker", "Sunday", "Dinner", 2],
   [10.34, 1.66, "Male", "Non-Smoker", "Monday", "Dinner", 3],
   [21.01, 3.5, "Male", "Non-Smoker", "Saturday", "Dinner", 3],
   [23.68, 3.31, "Male", "Non-Smoker", "Sunday", "Dinner", 2],
]

data_1 = [["animal", "is_good_boy", "kg_of_food"],
         ["dog", "yes", 100],
         ["cat", "no", 12],
         ["bird", "no", 4]]

data_2 = [["book", "pages", "liked_it"],
         ["dogs are cool", 400, 1],
         ["cats are bad", 350, 0],
         ["birds fly", 20, 0]]

data_dict = {
   "data_0": data_0,
   "data_1": data_1,
   "data_2": data_2
}

app = dash.Dash(__name__)
app.title = 'My Dash example'

app.layout = html.Div([
   dcc.Dropdown(
       id='data_selection_dropdown',
       options=[{"label": e, "value": e} for e in data_dict],
       value=list(data_dict.keys())[0]
   ),
   html.Div(id="pivot_table_container"),
])


@app.callback(Output("pivot_table_container", "children"),
             [Input("data_selection_dropdown", "value")])
def select_data(selected_data):
   return make_pivot_table(data_dict[selected_data], f"{selected_data}_pivot")


def make_pivot_table(data, component_id):
   component = dash_pivottable.PivotTable(
       id=component_id,
       data=data,
       colOrder="key_a_to_z",
       rowOrder="key_a_to_z",
       rendererName="Table",
       aggregatorName="Count",
   )
   return component


if __name__ == '__main__':
   app.run_server(debug=True)

If you find out how to control the size of the inner plots or to limit the options in the dropdowns Iā€™ll really appreciate you share it also.

BR,

1 Like

Thank you, I wasnā€™t redefining the id each time I created the PivotTable so it wasnā€™t actually changing. This works for switching tables, now I will have to come up with a workaround to take date-selector as input and make a unique id each time a new date range is selected. That or find a way to reload the element. As for the inner plots and options, those are defined in the original js library that this wraps. I remember seeing the drop down defined in the original here https://github.com/plotly/react-pivottable/blob/a3a8070d895b3731d0d4b0f60bec98c7423af06b/src/Utilities.js#L422