@nedned, the original solution you posted is probably good for my purposes. My real ‘computationally expensive function’ is not really CPU- or I/O-intensive. It just spends 20 to 60 s fetching data from slow web sources, and I don’t want to add that wait to the app start-up time a user sees. Using multiple threads is fine, and if I use the write-to-file option I have no lingering concerns about using global variables in Dash.
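For reference, here is a minimal sketch of how the thread plus write-to-file option could look. It is not the code from this thread: the names `write_data_atomically`, `refresh_forever`, `start_refresher` and the file name `data1.json` are made up for illustration, and I have swapped `np.save` for the standard-library `json` module so the snippet runs without NumPy. The temp-file-then-rename step is an extra precaution (my assumption, not something proposed above) so that a page load never reads a half-written file.

```python
import json
import os
import random
import tempfile
import threading

DATA_FILE = "data1.json"  # hypothetical file name, for illustration only


def write_data_atomically(values, path=DATA_FILE):
    """Write to a temp file in the same directory, then rename it over 'path'."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as tmp:
        json.dump(values, tmp)
    # os.replace swaps the file in one step, so readers see old or new data
    os.replace(tmp_path, path)


def refresh_forever(period, stop_event, path=DATA_FILE):
    """Regenerate the data every 'period' seconds until stop_event is set."""
    while not stop_event.is_set():
        # stand-in for the slow web request
        values = [random.gauss(0, 1) for _ in range(1000)]
        write_data_atomically(values, path)
        stop_event.wait(period)


def start_refresher(period=5, path=DATA_FILE):
    """Start the refresh loop in a daemon thread; return the thread and its stop event."""
    stop_event = threading.Event()
    thread = threading.Thread(
        target=refresh_forever, args=(period, stop_event, path), daemon=True
    )
    thread.start()
    return thread, stop_event
```

The layout function would then read the file on each page load, the same way `make_layout` reads `data1.npy` in the code further down.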
Using multiple processes would be nice, as it would allow me to use the scheduling solution proposed by @chriddyp, which seems tidy to me. I have got multiple processes working (sort of) using the code below. I had to guard the statements with if __name__ == '__main__', which apparently is required. When I use the multiple-processes option I don’t see the output of any print statements in the functions called by that process (e.g. get_new_data_every), though I can see that it is producing output. The strange thing is that if I repeatedly refresh the web page (app) in my browser, the plotted data is updated every 5 seconds, but sometimes it is updated twice in quick succession at a 5-second mark. I have no idea why. As I say, your original solution is a good one, but I am curious whether this is a coding problem or something to do with my Python setup. If something is screamingly obviously wrong with my code below it would be good to know, but otherwise you’ve given me heaps of help already and I’m happy to accept your solution.
import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

import dash
import dash_html_components as html
import dash_core_components as dcc
import plotly.graph_objs as go
import numpy as np

# number of seconds between re-calculating the data
UPDATE_INTERVAL = 5


def get_new_data():
    """Generate new data and save it to 'data1.npy'"""
    print('get_new_data')
    data = np.random.normal(size=1000)
    # np.save appends the '.npy' extension to the file name
    np.save('data1', data)


def get_new_data_every(period=UPDATE_INTERVAL):
    """Update the data every 'period' seconds"""
    print('get_new_data_every')
    while True:
        get_new_data()
        print("data updated")
        time.sleep(period)


def make_layout():
    # read the most recently saved data on every page load
    data = np.load('data1.npy')
    chart_title = "data updates server-side every {} seconds".format(UPDATE_INTERVAL)
    return html.Div(
        dcc.Graph(
            id='chart',
            figure={
                'data': [go.Histogram(x=data)],
                'layout': {'title': chart_title}
            }
        )
    )


app = dash.Dash(__name__)

# get initial data
get_new_data()

# we need to set layout to be a function so that for each new page load
# the layout is re-created with the current data, otherwise they will see
# data that was generated when the Dash app was first initialised
app.layout = make_layout


def start_multi():
    # run the update loop in a separate worker process
    executor = ProcessPoolExecutor(max_workers=1)
    executor.submit(get_new_data_every)


if __name__ == '__main__':
    start_multi()
    app.run_server(debug=True)
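
One possible cause of the double update that seems worth ruling out: with debug=True, the Flask/Werkzeug reloader runs the script a second time in a child process, so the `if __name__ == '__main__'` block, and with it `start_multi()`, would execute twice and leave two update loops running. I have not confirmed that this is what happens here. The sketch below shows one way to start the worker in only one of those processes. The helper name `should_start_background_worker` is made up for illustration; `WERKZEUG_RUN_MAIN` is the environment variable Werkzeug sets to "true" in the reloaded child.

```python
import os


def should_start_background_worker(reloader_active, environ=None):
    """Return True in the one process that should start the update loop.

    reloader_active: True when the server will run with the reloader
    (assumption: debug=True enables it by default).
    """
    if environ is None:
        environ = os.environ
    if not reloader_active:
        # no reloader, so the script runs only once
        return True
    # with the reloader, only the child process has this variable set
    return environ.get("WERKZEUG_RUN_MAIN") == "true"


# Hypothetical use in the script above:
#
# if __name__ == '__main__':
#     if should_start_background_worker(reloader_active=True):
#         start_multi()
#     app.run_server(debug=True)
```

A simpler check is to run once with debug=False and see whether the double update disappears.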