Best Practices for Caching Generative AI Outputs in Dash Apps Without Breaking Callback Logic

Hello

I am building a Dash app that uses a backend generative AI model to create charts, text summaries, and layout elements based on user input. To improve performance, I added a caching layer (using flask_caching), but I am running into issues where the callback logic becomes inconsistent: Dash doesn't reflect updated outputs when a user prompt changes slightly, likely due to stale cache keys or callback memoization confusion.

Has anyone implemented caching for AI-generated content within Dash without interfering with the reactivity of Dash callbacks? I am trying to prevent unnecessary regeneration from the AI when prompts are similar, but I still want the app to remain responsive. Additionally, I want to avoid full layout refreshes, since parts of the UI are static.
I have checked the Performance and Background Callbacks guides in the Dash for Python documentation for reference.
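
For concreteness, here is a stripped-down sketch of what I am doing now (component IDs simplified; `generate_ai_output` is a placeholder for my actual model call):

```python
from dash import Dash, Input, Output, dcc, html
from flask_caching import Cache

app = Dash(__name__)
cache = Cache(app.server, config={
    "CACHE_TYPE": "SimpleCache",
    "CACHE_DEFAULT_TIMEOUT": 3600,
})

app.layout = html.Div([
    dcc.Input(id="prompt", type="text", debounce=True),
    html.Div(id="ai-output"),
])

def normalize(prompt: str) -> str:
    # Collapse case and whitespace so cosmetically different prompts
    # share one cache entry, while real edits produce a new key.
    return " ".join(prompt.lower().split())

@cache.memoize()
def generate_cached(normalized_prompt: str):
    # Memoized on the normalized prompt only; generate_ai_output is a
    # placeholder for the actual (expensive) model call.
    return generate_ai_output(normalized_prompt)

@app.callback(Output("ai-output", "children"), Input("prompt", "value"))
def update_output(prompt):
    if not prompt:
        return "Enter a prompt."
    return generate_cached(normalize(prompt))
```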

Part of the context here is that generative AI responses are expensive and non-deterministic, which introduces new caching challenges in Dash. I would love any architectural suggestions or code examples that solve this elegantly.

Thank you!

Hey @tasosad, welcome to the forums.

Are you trying to cache the user input on the Dash side? What is the aim: to return the already available LLM answer from a previous, similar question and skip the LLM call?

Why don’t you do the caching on the LLM side?
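
If it helps, this is roughly what I mean; a minimal sketch where the cache lives next to the model call, and `call_llm` is just a stand-in for whatever client you actually use:

```python
import hashlib
import threading

_llm_cache: dict[str, str] = {}
_lock = threading.Lock()

def _key(prompt: str) -> str:
    # Normalize before hashing so cosmetic differences hit the same entry.
    return hashlib.sha256(" ".join(prompt.lower().split()).encode()).hexdigest()

def generate(prompt: str) -> str:
    k = _key(prompt)
    with _lock:
        if k in _llm_cache:
            return _llm_cache[k]
    result = call_llm(prompt)  # illustrative: your actual model/client call
    with _lock:
        _llm_cache[k] = result
    return result
```

The Dash callback then calls `generate(prompt)` directly and never memoizes anything itself, so callback reactivity stays intact; the only thing a cache hit skips is the expensive model call.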