Dash app not reading modified file on refreshing the browser

Hi
I have a simple Dash app which reads one file and displays its content in a DataTable.

Another program updates the file.

If I refresh the Dash app, it doesn't read the updated file. It keeps displaying the old data even though the file has been modified. It shows the modified data only if I restart the app.

So how can I read the modified data without restarting? It should read the modified data on refresh.

Hi @dipak96, https://dash.plot.ly/live-updates is your friend for this case.

Thanks, @byronz, for the suggestion, but the thing is I can't refresh the page again and again. People will be reading information, and a few will be copying information from the app, so I can't refresh the page automatically. The page needs to refresh only when the person using the app explicitly refreshes it. It's a requirement, so I can't do anything about it.

Hi @byronz
Code sample is attached below.

import dash
from dash.dependencies import Input, Output, State
import dash_core_components as dcc
import dash_html_components as html
import dash_table
import pandas as pd

df = pd.read_csv("data.csv")
print(df)

external_stylesheets = ['https://codepen.io/chriddyp/pen/bWLwgP.css']

app = dash.Dash(__name__, external_stylesheets=external_stylesheets)

app.layout = html.Div([
    dash_table.DataTable(
        data=df.to_dict('records'),
        columns=[{'name': i, 'id': i} for i in df.columns]
    ),
])

if __name__ == '__main__':
    app.run_server(port=8082)

Here the data.csv file is being modified by another program. When I refresh the page, the app doesn't read the modified data.csv file. It shows the old data only.

Can anyone explain the reason behind this?

When I close the app and restart it, it reads the modified file.

To control that, you can either use State or the new no_update feature (see 📣 Dash 0.41.0 released).

So in your callback you can, for example, check whether the file has been modified using os.stat or something else, and return no_update if it hasn't changed.

@dipak96 from your sample code, the refresh action will not trigger a reload of the CSV file with pandas.

You need an interval to regularly check the file timestamp. If the file has been modified, your callback needs to return the reloaded data to the DataTable component; otherwise just return no_update to prevent an unnecessary update.

@waly01 a sample could be like this


import os
import pandas as pd
from dash import Dash, no_update
from dash.dependencies import Input, Output
from dash_table import DataTable
import dash_core_components as dcc
import dash_html_components as html

path = "demo.csv"
df = pd.read_csv(path)
# Modification time of the file when the app starts.
lastmt = os.stat(path).st_mtime
print(lastmt)
app = Dash(__name__)
app.layout = html.Div(
    [
        DataTable(
            id="table",
            columns=[{"name": i, "id": i} for i in df.columns],
            data=df.to_dict("records"),
            export_format="csv",
        ),
        # Fire the callback below once per second.
        dcc.Interval(id='interval', interval=1000, n_intervals=0)
    ]
)

@app.callback(Output('table', 'data'), [Input('interval', 'n_intervals')])
def trigger_by_modify(n):
    # lastmt is reassigned below, so it must be declared global;
    # without this the comparison raises UnboundLocalError.
    global lastmt
    mtime = os.stat(path).st_mtime
    if mtime > lastmt:
        print("modified")
        # Remember the new timestamp so the next tick compares against it.
        lastmt = mtime
        return pd.read_csv(path).to_dict('records')
    return no_update

if __name__ == "__main__":
    app.run_server(debug=True)


@byronz Super helpful thank you.

If I understand this correctly, the code block below won't be run at each interval, right? So we'd have to update the lastmt variable in the Dash app inside the if block after each update?

path = "demo.csv"
df = pd.read_csv(path)
lastmt = os.stat(path).st_mtime
print(lastmt)

That's just the initialization for the app. Whenever n_intervals changes, the check is executed inside the callback. The callback still fires every interval milliseconds, but when it returns no_update the response underneath is just an HTTP 204 (No Content).

Forgive me, I must be misunderstanding.

When the app is initialized, it'll set lastmt to the last time the file was modified. Let's say it was last modified at 11:30 when we first run this code. Five minutes later, it's modified again. When the interval fires and it does the check, 11:35 will always be greater than 11:30 and will remain greater even if nothing else changes, so wouldn't it print "modified" at every interval going forward even if nothing is modified?

os.stat(path).st_mtime > lastmt:

You are right, my bad. We should reset lastmt to the new modification time after each reload.
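The point agreed on here can be checked outside Dash. Below is a minimal sketch using only the standard library; load_if_modified and its return convention are hypothetical names made up for this illustration, and in a real callback the None branch would correspond to returning no_update.

```python
import csv
import os


def load_if_modified(path, last_mtime):
    """Return (rows, mtime) if the file is newer than last_mtime,
    otherwise (None, last_mtime).

    Hypothetical helper for illustration; not part of Dash or pandas.
    """
    mtime = os.stat(path).st_mtime
    if mtime <= last_mtime:
        # Nothing new on disk: the caller should skip the update.
        return None, last_mtime
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    # Hand back the new timestamp so the next check compares against it.
    return rows, mtime
```

Because the caller stores the returned timestamp, a file modified once is reported as modified once, not on every later check.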

Ahh ok. So if I have one file that's controlling multiple components on my dashboard, is there a workaround for this? If we set lastmt to the latest time in the first component's callback, none of the others will run, since they'll see the last modified time as equal to the lastmt set by the first component.

@dipak96 can you share code?

@dipak96 It's because when you execute your code the first time, it tells pandas to read your data.csv file and use the data in it to populate your Dash table. However, you only tell it to do that one time. That's why you have to keep restarting your app for it to re-read the data file and pick up the latest data.

The best way forward would probably be to have Dash check the CSV file at a timed interval and return the data in it to your dashboard.

For that you have to add a dcc.Interval as well as a callback. The interval sets an amount of time after which you can take action. The callback defines what action is taken when that amount of time has passed. In this case, it updates the rows and the columns of your table.


filename = "data.csv"  # path to the CSV file from the question

app.layout = html.Div([
    dash_table.DataTable(
        id='your_data_table',
        data=df.to_dict('records'),
        columns=[{'name': i, 'id': i} for i in df.columns],
    ),
    dcc.Interval(
        id='interval_component',
        interval=1200000,  # 20 minutes, in milliseconds
        n_intervals=0,
    ),
])


@app.callback(Output('your_data_table', 'data'),
              [Input('interval_component', 'n_intervals')])
def update_rows(n_intervals):
    # Re-read the file on every tick and send the rows to the table.
    data = pd.read_csv(filename)
    return data.to_dict('records')


@app.callback(Output('your_data_table', 'columns'),
              [Input('interval_component', 'n_intervals')])
def update_cols(n_intervals):
    # Re-read the file so added or removed columns are picked up too.
    data = pd.read_csv(filename)
    return [{'name': i, 'id': i} for i in data.columns]

As someone else suggested, you can add the time check to only read the CSV if the file has been modified.

Also, you need to give your data table an id so the callback knows where to output the data to.

But the crux of what's happening here is that every time n_intervals changes (in this example the interval is set to 1.2e6 milliseconds, or 20 minutes), it'll read the CSV and send the rows and columns to your dashboard.

Thanks, @byronz, and @waly01.

no_update feature solved my issue.

Thank you @byronz for the super helpful explanation.