How to apply cross-filtering/linked brushing to aggregated data?

Hi!
I guess this post is more about what kind of approach I should take rather than straight code. So I’ve made this dynamic dashboard which can have multiple kinds of basic charts (line, bar, etc.) with linked brushing, i.e. if you select points in one graph, it highlights them in all the other graphs. For charts where you can’t directly select points, such as a pie chart, it just makes a new pie chart with only the selected points. All the graphs are made from a single dataframe.

Now, I did this by maintaining a central dcc.Store which just stores the point indexes of whatever’s highlighted. This works pretty well if the data is used as is, e.g. “col1, col2, col3”. But what I generally need to do is handle aggregated data, meaning I’ll usually be dealing with aggregated measures, e.g. “col1, max(col2), min(col3)”. And each chart can have different basic aggregations (min, max, sum, avg, count). So I want to know how I should go about handling this kind of problem. What should it even look like?

Bump! (Hope that’s allowed under forum rules. Apologies if not.) Still looking for an answer to this.

Is the number of aggregations few or numerous? Fixed or dynamic? Could you give a small example?

Thank you for the reply! What do you mean by “number of aggregations”? I’m working at quite a basic level, so it’s probably fixed, I would say. The aggregation functions are set (min, max, sum, count, avg), so there aren’t any others. Each column would only have one aggregation applied to it, so it can only be sum(col2); there’s no possibility of chaining functions or anything (if that’s even a thing).

If the number of aggregations is limited and fixed, my first thought would be to create some kind of reverse mapping from the aggregate index to the original data index. I was asking for a small example to assess if the idea makes sense for your use case (:
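For illustration, here is a minimal sketch of such a reverse mapping using pandas. The dataframe contents and the helper name `aggregate_with_row_map` are made up for this example, and it assumes each chart groups by a single column. The idea is to keep, next to the aggregated values, a dictionary from each group key to the original row indexes that went into it.

```python
import pandas as pd

# Hypothetical data; column names follow the "col1, col2, col3" example in the thread.
df = pd.DataFrame({
    "col1": ["a", "b", "a", "c", "b"],
    "col2": [1, 5, 3, 2, 4],
    "col3": [10, 20, 30, 40, 50],
})

def aggregate_with_row_map(frame, group_col, value_col, agg):
    """Aggregate value_col per group and remember which rows fed each group.

    agg is a pandas aggregation name ("min", "max", "sum", "count", "mean").
    Note that pandas calls the average "mean", not "avg".
    """
    grouped = frame.groupby(group_col)
    aggregated = grouped[value_col].agg(agg).reset_index()
    # GroupBy.groups maps each group key to the original index labels.
    row_map = {key: list(idx) for key, idx in grouped.groups.items()}
    return aggregated, row_map

agg_df, row_map = aggregate_with_row_map(df, "col1", "col2", "max")
print(agg_df)
print(row_map)
```

With this in place, a selected aggregated point (identified by its group key) can be translated back into original row indexes via `row_map`, and those indexes are what would go into the central store.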

That was my immediate thought process too, “Can I just create a map that somehow links the new aggregations to the original dataframe indexes?”. I just wasn’t sure if that’s ‘correct’, or if that’s the way other BI tools handle it.

I made a little dashboard on Plotly Chart Studio to show you what I am trying to replicate. I’m trying to replicate this exact behavior: https://plotly.com/~MegamanEXE/27/dashboard/

This is the dummy data for it:
11111111

I’ve noticed that what’s happening is indeed that the original row indexes are somehow maintained. For example, all points on this dashboard are aggregated using sum(). So if you select any point, it somehow knows which points were used to make that aggregated point; then it somehow uses the original dataset (without aggregation), looks for those exact points in all the other graphs, and highlights those aggregated points which were composed of these ‘selected’ points.
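The behavior described above could be sketched roughly like this in pandas. This is only a guess at the logic, not how Chart Studio actually implements it; the data, the column names and both helper functions are hypothetical. A selection in one chart is first expanded to original row indexes, and every other chart then highlights the groups that contain at least one of those rows.

```python
import pandas as pd

# Hypothetical data: two charts aggregate the same rows by different columns.
df = pd.DataFrame({
    "region": ["N", "N", "S", "S", "E"],
    "product": ["x", "y", "x", "z", "y"],
    "sales": [10, 20, 30, 40, 50],
})

def rows_for_selected_groups(frame, group_col, selected_keys):
    """Original row indexes behind the selected aggregated points."""
    return frame.index[frame[group_col].isin(selected_keys)].tolist()

def groups_touched_by_rows(frame, group_col, row_ids):
    """Group keys of another chart that contain at least one selected row."""
    return sorted(frame.loc[row_ids, group_col].unique())

# The user selects the aggregated point "N" in the chart grouped by region.
selected_rows = rows_for_selected_groups(df, "region", ["N"])

# The chart grouped by product highlights every group built from those rows.
highlight = groups_touched_by_rows(df, "product", selected_rows)

# Optionally, the selected part of each highlighted aggregate (here with sum).
partial = df.loc[selected_rows].groupby("product")["sales"].sum()
print(selected_rows, highlight, partial.to_dict())
```

Storing `selected_rows` rather than the aggregated points keeps the store independent of which aggregation each chart uses, since every chart can re-derive its own highlighted groups from the same row indexes.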

It would also be nice if someone who worked on the Chart Studio cross-filtering feature could clarify how it actually works, or if someone knows of an efficient way to extract the underlying points from an aggregation.

Thanks in advance!