There are also:
c) Re-run the query every x minutes in a separate process and save the result to a .csv file. Replace your dataframe variable (e.g. df) in your code with a function like the one below (a sketch of the refresh process itself follows after d)):
def df():
    return pd.read_csv(...)
d) Cache the query with flask-caching, see https://plot.ly/dash/performance. Replace your dataframe variable (e.g. df) in your code with functions like these (pseudocode, I haven't run this myself; a sketch of the required cache setup follows below):
@cache.memoize(timeout=60 * 5)  # cache the result for 5 minutes
def compute_df():
    # [...] run query into a DataFrame `df`
    return df.to_json()  # serialize so that it can be easily written to a file for caching

def get_df():
    return pd.read_json(compute_df())

df = get_df()
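For c), the separate refresh process could be as simple as a small standalone script like this (just a sketch; run_query(), data.csv and the 5-minute interval are placeholder names and values I made up for illustration):

import time
import pandas as pd

def run_query():
    # placeholder for the real database query; must return a DataFrame
    return pd.DataFrame({'value': [1, 2, 3]})

while True:
    run_query().to_csv('data.csv', index=False)  # overwrite the file the Dash app reads
    time.sleep(60 * 5)  # wait 5 minutes before the next refresh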
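For d), the cache object used in the decorator still has to be created against the Dash app's Flask server. A minimal sketch following the flask-caching pattern from the linked performance guide (the filesystem backend and the cache-directory path are just example choices):

import dash
from flask_caching import Cache

app = dash.Dash(__name__)
cache = Cache(app.server, config={
    'CACHE_TYPE': 'filesystem',   # example backend; Redis etc. also works
    'CACHE_DIR': 'cache-directory',
})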
Sorry for reviving this old thread, but I have a very similar problem where the data of interest is coming from Kafka. I had a look at this code https://github.com/renardeinside/rtvis-proj/blob/master/visualizer/app/server.py which is linked in the “Show and Tell” thread, but it is also using a global DF that gets updated. Just for clarification: this is not the way to go, right?
Thanks for the clarification, Chris! Yes, your (btw great) user guide made me realise that I should do it in a different way, but the project linked in “Show & Tell” made me wonder whether it would be okay in some cases. The solution based on Redis and Celery looks interesting, thanks!