Here is my original SO question.
I work with sensitive data and I would like to protect it as much as I can.
Let’s say I have age, gender, and location among other things in a dataset.
I am building a dashboard with Dash & Plotly to allow practitioners to gain insights from the data.
It is really easy to retrieve the raw data from a Plotly graph; after all, the data has to be sent to the browser for the graph to be interactive.
My problem is that when we have multiple Plotly graphs (such as univariate distributions), one just has to concatenate the underlying arrays side by side to retrieve the original dataset. Tada: you have age, gender, and location, and you may now identify pretty much everyone!
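To make the concern concrete, here is a minimal sketch of the reconstruction. The three dicts are simplified, hypothetical stand-ins for the figure JSON each graph receives in the browser, and the values are made up; `extract_x` and `rebuild_rows` are illustrative helper names, not part of Plotly or Dash.

```python
# Hypothetical, simplified stand-ins for the figure JSON sent to the browser.
# All values are invented for illustration.
age_fig = {"data": [{"type": "histogram", "x": [34, 71, 25]}]}
gender_fig = {"data": [{"type": "histogram", "x": ["F", "M", "F"]}]}
location_fig = {"data": [{"type": "histogram", "x": ["Lyon", "Paris", "Nice"]}]}


def extract_x(fig):
    """Return the raw x array of the first trace in a figure dict."""
    return list(fig["data"][0]["x"])


def rebuild_rows(*figs):
    """Zip the per-graph arrays back together into per-person rows."""
    return list(zip(*(extract_x(fig) for fig in figs)))


# If every graph keeps the original row order, the rows line up again.
rows = rebuild_rows(age_fig, gender_fig, location_fig)
```

Each element of `rows` is then one person's (age, gender, location) triple, which is exactly what I would like to avoid exposing.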
The dashboard is not meant to be public, but one can never be too careful with security and privacy.
As of today, we shuffle the data independently for each graph, for example:
go.Histogram(x=df["age"].sample(frac=1))
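For reference, the shuffle can be wrapped in a small helper so that every column gets its own permutation. This is only a sketch of the workaround described above: `shuffled_columns`, the `seed` parameter, and the sample data are hypothetical names and values introduced for illustration.

```python
import pandas as pd


def shuffled_columns(df, columns, seed=None):
    """Return {column: independently shuffled values} for the given columns."""
    out = {}
    for offset, col in enumerate(columns):
        # A different random_state per column, so the permutations differ.
        state = None if seed is None else seed + offset
        # .to_numpy() drops the index; otherwise pandas could realign the
        # shuffled Series by index and silently undo the shuffle.
        out[col] = df[col].sample(frac=1, random_state=state).to_numpy()
    return out


# Made-up data for illustration only.
df = pd.DataFrame(
    {
        "age": [34, 71, 25, 48],
        "gender": ["F", "M", "F", "M"],
        "location": ["Lyon", "Paris", "Nice", "Lille"],
    }
)
shuffled = shuffled_columns(df, ["age", "gender", "location"], seed=0)
```

Each array can then be passed to its own trace, e.g. `go.Histogram(x=shuffled["age"])`. Note that this only breaks the link between columns: every individual value is still sent to the browser.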
Is anyone else using Plotly/Dash with sensitive data? How do you deal with this issue?