Difference in performance between local website and server

I am building a dashboard on a website with Dash to visualise data from a database. It involves creating a query, gathering the data, processing those data and then plotting them. The process is quite time-consuming, but that’s not the main concern here.

I am running the dashboard both on my local computer and on a server to compare the performance. While the computation time needed to visualise the data is more or less equal, there is a huge difference in the network time; the server is around 4 to 5 times slower than my local server.

What could be the root cause of the weak performance of the website on the server?

I understand that I probably did not provide enough information for the question to be answered completely, so I would like to know what further details are necessary to do so. Which characteristic of the server could be the cause of the weak performance?

It is most likely due to the increase in network overhead. Apart from storing intermediate results on the server, there is not much you can do.

Would it thus make sense to use dcc.Store + clientside callback?

Those two components could be used in the following way:

  • dcc.Store to transfer the data onto the server
  • clientside callback to create the plot

However, I have got some doubts about how fast it could be. Uploading the data to the server takes some time as well; this step might be hindered by the long network time too.

The data in the dcc.Store will only be used once (it only comes in handy when some slight changes are made to the plot). Normally, new data needs to be queried from the database every time, which will reduce the time savings as well.

In most cases (and in particular if the amount of data is large), I would prefer to

  • Query the data from the server and save the data on the server
  • Send only the filtered data to the client

If the data is not too large, you could achieve even better performance by

  • Saving the data on the client
  • Filtering the data on the client using client side callbacks

For the latter solution to perform optimally, you need to convert all callbacks that use the data (including plotting etc.) to client side callbacks, i.e. rewrite them in JavaScript, so it might require some work.

Just to get things right: with ‘server’ you mean the server that is hosting the website, right? With ‘client’ you mean the web browser of the website visitor? And what you basically suggest is to have as many clientside-operations as possible?

My idea based on your suggestion is the following:

  • On the server-side I could query the data from the database, process those data, store the data in vector-form so that they are ready to plot and then store those vectors on the clientside using the dcc.Store component from Dash.
  • Those data can then be plotted in the next step by a clientside callback on the client-side. This way, the security of the data is not an issue (the data in the database belong to the company for which I work).

Would this be a useful implementation of your idea?

Yes, you are correct about the definitions.

If you can do all operations clientside (including plotting; I have never done this, but I guess it is possible), that is probably the fastest. But doing only “most of them” clientside can be problematic, as you might end up transferring the data between the server and the client, potentially multiple times. Therefore, unless you do it all clientside, I would recommend storing all intermediate results (i.e. your data vectors) serverside.

Thank you so much for your input!

So the data vectors should be saved on the server-side (not the client-side) and later transferred to the client-side, where they will be plotted with a client-side callback. How can I do this? dcc.Store is only meant to store the data in the browser and is therefore not suited. Should I implement the ideas from your thread about server-side caching? (Show and Tell - Server Side Caching)

Yes, one option is to use the ServersideOutput component when you want data to be kept serverside. Since it is basically a drop-in replacement for Output, you could easily try it out to get an idea of how it impacts performance in your case. Then you can always move on to clientside callbacks later, if you need to.

I read through the documentation on GitHub and two questions popped up:

  • do I need to replace every callback by two new callbacks, with the first new callback responsible for querying, processing and storing the data and the second new one responsible for the creation of the plots?
  • how do I store the data of the ServersideOutput? It appears that dcc.Store components are used for this purpose.

No, you only need the ServersideOutput in the case where you are storing an intermediate result. Typically you do this to avoid redoing expensive database queries and/or to enable sharing of data between callbacks. It might be easier to explain if you post some code/pseudo code to illustrate your use case.

I implemented the ServersideOutput, but an error keeps on occurring. I used the ServersideOutput in the following way:

from dash_extensions.enrich import Dash, ServersideOutput, Output, Input

app = Dash(...)
server = app.server


@app.callback(
    ServersideOutput('graph_1', 'figure'),
    Input('primary_key', 'value'),
)
def make_graph(primary_key):
    # Here is the code that turns the database data into a plot
    return figure1

The error I receive is the following:

Invalid argument `figure` passed into Graph with ID "graph_1".
Expected `object`.
Was supplied type `string`.
Value provided: "9fa150dc6e7045c37bcc157cc8e85261"

Is the ServersideOutput compatible with a figure as output? Or can it only be used for intermediate data that is afterwards processed by another callback?

It only makes sense for intermediate data. When you use it with a figure, you are keeping the figure on the server and only a lookup key is sent to the browser (the string in your error message), so there is nothing for the client to render.
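The idea can be illustrated with the simplified sketch below. This is not the actual dash-extensions implementation; `SERVER_CACHE`, `store_serverside` and `load_serverside` are hypothetical names that only mimic the principle of keeping the value on the server and handing a key to the client.

```python
import json
import uuid

SERVER_CACHE = {}  # lives in server memory; never sent to the browser


def store_serverside(value):
    """Keep the value on the server and return only a short lookup key."""
    key = uuid.uuid4().hex
    SERVER_CACHE[key] = value
    return key


def load_serverside(key):
    """What a later callback would do with the key it receives as input."""
    return SERVER_CACHE[key]


figure = {"data": [{"x": list(range(5000)), "y": list(range(5000))}]}
key = store_serverside(figure)

# Size of what travels to the client in each case (JSON-encoded bytes).
sent_with_key = len(json.dumps(key).encode())
sent_with_value = len(json.dumps(figure).encode())
```

A Graph that is handed `key` sees a 32-character string rather than a figure object, which matches the error above. A second callback that takes the key as input and calls something like `load_serverside` is what turns it back into data.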

I implemented the ServersideOutput by splitting the callback responsible for each plot into two. The first callback is responsible for the database call and the data processing, the second one for the plot itself. The intermediate data are stored in a dcc.Store component.

This way, the number of bytes downloaded by the first callback and the number of bytes uploaded by the second callback are reduced drastically (from about 10^5 down to about 10^2). The idea is that the database call does not have to be repeated if only an aesthetic aspect of the plot changes. However, it does not yet result in a measurable decrease in execution time. Could that be because the number of bytes was too low to begin with to experience an effect? Or might there be a different reason?
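A rough estimate of the orders of magnitude involved can be sketched as follows. The bandwidth and round-trip values are assumed for illustration, not measured, and `transfer_time` is a hypothetical helper that ignores everything except latency and payload size.

```python
def transfer_time(payload_bytes, bandwidth_bits_per_s, round_trip_s):
    """Rough time for one request: fixed latency plus payload over bandwidth."""
    return round_trip_s + payload_bytes * 8 / bandwidth_bits_per_s


BANDWIDTH = 10_000_000   # assumed 10 Mbit/s link
ROUND_TRIP = 0.05        # assumed 50 ms round trip

before = transfer_time(100_000, BANDWIDTH, ROUND_TRIP)  # about 10^5 bytes
after = transfer_time(100, BANDWIDTH, ROUND_TRIP)       # about 10^2 bytes
saving = before - after
```

Under these assumptions the saving is roughly 0.08 s per callback, which is easy to miss next to a time-consuming query. On a faster link the fixed round trip would dominate even more.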

Are there other alternatives to speed up the performance of the website on the server?