Reading an Updated File on Dash Enterprise

I am using Dash to build a search-engine tool for company reports, with Whoosh for indexing and searching. I am fairly confident Whoosh is not the issue, though I can't rule it out entirely.

Either way, up to now each morning I run a batch script that gathers the new reports from the previous day, writes/appends them to the search index file, and redeploys the app to the server with the updated data. Dash Enterprise supports automatic data updating, so I followed their examples to create a tasks.py that retrieves reports from our database and writes them to an index file. Note: I am not writing anything to the on-server Postgres DB; I only write to the index file referenced in the project folder. I kept their connection code because I did not know what could go and what needed to stay. I have shortened my task below, but in full it connects to a DB, pulls data, writes to the referenced filename, and prints a second check:

# Copy of their tasks.py (imports added for completeness)
import os
from urllib.parse import urlparse

from celery import Celery
from sqlalchemy import create_engine
from sqlalchemy.pool import NullPool

if os.environ.get("DASH_ENTERPRISE_ENV") == "WORKSPACE":
    # In a workspace, shift to the next Redis database so the workspace
    # broker does not collide with the deployed app's broker.
    parsed_url = urlparse(os.environ.get("REDIS_URL"))
    if parsed_url.path in ("", "/"):
        i = 0
    else:
        try:
            i = int(parsed_url.path[1:])
        except ValueError:
            raise Exception("Redis database should be a number")
    parsed_url = parsed_url._replace(path="/{}".format((i + 1) % 16))

    updated_url = parsed_url.geturl()
    REDIS_URL = "redis://%s" % (updated_url.split("://")[1])
else:
    REDIS_URL = os.environ.get("REDIS_URL", "redis://dataupdateredis:99429b1023166af3d3de765f24a9d06398b95f9bd88b83aebd969a65fa216fc0@dokku-redis-dataupdateredis:6379")

celery_app = Celery(
    "Celery App", broker=REDIS_URL
)


# Swap the postgres:// scheme for postgresql+pg8000:// so SQLAlchemy
# uses the pg8000 driver.
connection_string = "postgresql+pg8000" + os.environ.get(
    "DATABASE_URL", "postgres://postgres:d6832cb53032a6819d0c01b707d571c5@dokku-postgres-dataupdatetest:5432/dataupdatetest"
).lstrip("postgresql")


postgres_engine = create_engine(connection_string, poolclass=NullPool)

@celery_app.on_after_configure.connect
def setup_periodic_tasks(sender, **kwargs):

    # Invoke the update_data task every 5 minutes (300 seconds); change as needed.
    sender.add_periodic_task(300, update_data.s(), name="Update data")

# MY TASK

@celery_app.task
def update_data(if_exists="append"):
    filename = os.path.join('assets', 'testUpdate2023')
    print("TEST 1")
    indexTest()  # helper defined elsewhere in my project
    print("Filename Check: ", filename)
    modtime = modification_date(filename)  # helper defined elsewhere
    modtime = modtime.strftime('%m/%d/%Y %H:%M:%S')
    print(modtime)
    # Database connect
    # Pull data
    # Write to the file above
    # Print second check (still inside task)
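For context, the elided steps look roughly like the sketch below. The query, table, and schema field names are simplified stand-ins for my real code, and write_reports_to_index is just an illustrative helper name:

from sqlalchemy import text
from whoosh.index import open_dir

def write_reports_to_index(filename):
    # Pull the previous day's reports; table and column names are illustrative.
    with postgres_engine.connect() as conn:
        rows = conn.execute(
            text("SELECT report_id, title, body FROM reports "
                 "WHERE report_date = CURRENT_DATE - 1")
        ).fetchall()

    # Append the new documents to the existing Whoosh index directory.
    ix = open_dir(filename)
    writer = ix.writer()
    for report_id, title, body in rows:
        writer.add_document(report_id=str(report_id), title=title, body=body)
    writer.commit()
    print("Second check: wrote", len(rows), "documents to", filename)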

Long-winded to get here, but all of this works. In the server logs I can see the data coming in and the second check containing the new data…

However, when I go to the app, the new data does not appear. Following Plotly's recommendations, I reference both the index file and the layout as functions so they are re-read on page load, with no luck. This is a multipage app, so @app and some other pieces may look a little different from a single-page app; everything works as expected apart from the new data not showing up.

def serveIndex():
    filename = os.path.join('assets', 'testUpdate2023')
    print(os.path.getmtime(filename))
    myIndex = open_dir(filename)  # whoosh.index.open_dir
    print("Last Mod", myIndex.last_modified())
    print("Up to date", myIndex.up_to_date())
    return myIndex

def layout():
    return html.Div([...])

@callback(
    [Output(component_id='outTableTrend', component_property='data'),
     Output(component_id='outTableTrend', component_property='columns')],
    [Input(component_id='dateRange', component_property='startDate'),
     Input(component_id='dateRange', component_property='endDate')])
def allCrs(start_date, end_date):
    myIndex2 = serveIndex()
    # do stuff to index to create desired DF
    return data, columns
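The elided "do stuff" portion is essentially a Whoosh search filtered to the date range. Roughly, with report_date standing in for my real stored date field:

from whoosh.query import Every

def searchRange(myIndex, start_date, end_date):
    # Match every document, then keep those whose stored date falls in range.
    with myIndex.searcher() as searcher:
        results = searcher.search(Every(), limit=None)
        return [hit.fields() for hit in results
                if start_date <= hit["report_date"] <= end_date]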

I still do not get new rows in the app or in the backend logs for the page callbacks. Any ideas what is going on here? I also tried setting up gunicorn to reload/restart, with no luck. The page only ever reads the data from the last deploy, even though I can see the updated data and modification times in the task output.
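For completeness, the deployment wiring is a Procfile along the lines of the Dash Enterprise data-update example; the process names and module paths here are placeholders for mine, and the --reload flag was my attempted fix:

web: gunicorn app:server --workers 4 --reload
queue: celery --app tasks:celery_app worker --loglevel=INFO --concurrency=2
scheduler: celery --app tasks:celery_app beat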

Hi @samdean332

I think @michaelbabyn can help you with this Dash Enterprise question 🙂
