Monitoring Dashboard with Dash & MongoDB - New to Dash

Hello,

I want to create a monitoring dashboard with Dash that uses MongoDB as its data source, but I'm new to Dash and wanted to ask whether what I have in mind is achievable.

I want to be able to do the following (a rough sketch follows the list):

- use pymongo to connect to MongoDB
- query MongoDB documents
- perform the needed manipulations (mainly json_normalize, plus a few more steps) to get the data into the right shape for pandas dataframes
- plot graphs from a number of dataframes
- update the dataframes and the graphs at least once a day
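To make this concrete, here is a minimal sketch of the pipeline I have in mind. The connection string, database/collection names, and the `date`/`status` fields are just placeholders for my actual data:

```python
import pandas as pd
import plotly.express as px
from pymongo import MongoClient

# Placeholder connection details -- not my real setup.
client = MongoClient("mongodb://localhost:27017/")
collection = client["monitoring"]["events"]

# Query the documents (here: everything, with MongoDB's _id excluded).
docs = list(collection.find({}, {"_id": 0}))

# Flatten the nested documents into a pandas dataframe.
df = pd.json_normalize(docs)

# Any further manipulation is plain pandas from here on.
daily_counts = df.groupby("date")["status"].count().reset_index()

# The resulting dataframe plugs straight into a Plotly figure.
fig = px.line(daily_counts, x="date", y="status", title="Events per day")
```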

What troubles me is:

- I cannot find many examples or references to using Dash with MongoDB, but I believe it should work, right?
- All the examples of Dash dashboards I've seen online connect to a DB whose data is already in the desired structure, with no further manipulation of the dataframe. Is it possible to do what I'm planning?
- Querying the documents and running json_normalize seems a bit slow. Does that matter?
- The database contains more than 2M documents, but different graphs need different parts of the DB and different keys. Is it better to query all documents with all the needed keys at once and build the dataframes from that, or to run smaller queries per dataframe? Example (sketched below): for graph_1 I need all instances (2M) but only 3 keys, while for graph_2 I need instances from a certain timeframe (e.g., at most 1 month old, roughly 150k instances) but 40 keys, so querying the whole database produces many datapoints that won't be used. I suspect it's better to query just the parts that correspond closely to what each dataframe needs, and I hope the answer to my next question is yes: can I use live updates but refresh the data source and graph for graph_1 only once a month or every few days, while refreshing the source and graph for graph_2 every day?
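This is roughly how I picture the two queries, continuing from the snippet above (same `collection`; `key_a`, `key_b`, `key_c`, and `created_at` are stand-ins for my real field names). A MongoDB projection would let graph_1 fetch only the keys it needs:

```python
from datetime import datetime, timedelta, timezone

import pandas as pd

# graph_1: all ~2M documents, but a projection keeps it to the 3 needed keys.
graph1_docs = collection.find(
    {}, {"_id": 0, "key_a": 1, "key_b": 1, "key_c": 1}
)

# graph_2: only documents from the last month, with all 40 keys.
one_month_ago = datetime.now(timezone.utc) - timedelta(days=30)
graph2_docs = collection.find({"created_at": {"$gte": one_month_ago}})

df1 = pd.json_normalize(list(graph1_docs))
df2 = pd.json_normalize(list(graph2_docs))
```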

Hope my questions are clear enough. Thanks for any help!

I'm working on something similar at work, trying to implement a data warehouse monitoring dashboard that allows daily monitoring of various aspects of our data platform (ingestion pipelines, data quality, load performance, data delivery, etc.). I'm heavily repurposing this awesome Dash app, which is aesthetically pleasing and contains multiple tabs, each focusing on an individual subject area: http://cepel-medical-charges.herokuapp.com/

The code: GitHub - tolgahancepel/medical-charges-prediction (https://github.com/tolgahancepel/medical-charges-prediction)

From what I have learned so far, Dash doesn't impose any specific requirements on how you implement your business logic. Everything is Python, so querying MongoDB documents and transforming them into a pandas dataframe is perfectly normal. As long as you end up with a data structure that Dash objects can consume, the rest of the implementation details won't matter much.
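As a minimal illustration of that point, here's a sketch of a single-graph app where a `dcc.Interval` fires once a day and the callback re-queries MongoDB. The connection details, field names, and the interval are assumptions to adapt to your own data:

```python
import pandas as pd
import plotly.express as px
from dash import Dash, dcc, html, Input, Output
from pymongo import MongoClient

# Assumed connection details -- replace with your own.
collection = MongoClient("mongodb://localhost:27017/")["monitoring"]["events"]

app = Dash(__name__)
app.layout = html.Div([
    dcc.Graph(id="daily-graph"),
    # Fires every 24 hours (interval is in milliseconds).
    dcc.Interval(id="daily-interval", interval=24 * 60 * 60 * 1000),
])

@app.callback(Output("daily-graph", "figure"), Input("daily-interval", "n_intervals"))
def refresh_graph(_):
    # Arbitrary querying and wrangling can happen here; Dash only cares
    # about the figure the callback returns.
    df = pd.json_normalize(list(collection.find({}, {"_id": 0})))
    return px.line(df, x="date", y="value")

if __name__ == "__main__":
    app.run(debug=True)
```

Since each graph can have its own `dcc.Interval`, nothing stops graph_1 from refreshing every few days or monthly while graph_2 refreshes daily.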

Regarding data volumes and performance, do read through the official guide on how data volumes affect Dash, along with its best practices and suggestions: Performance | Dash for Python Documentation | Plotly (https://dash.plotly.com/performance)
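One pattern from that guide worth calling out is memoizing expensive queries with Flask-Caching, so repeated callback runs don't hit MongoDB every time. A rough sketch, reusing the hypothetical `app` and `collection` from the snippet above:

```python
import pandas as pd
from flask_caching import Cache

# `app` and `collection` are the Dash instance and pymongo collection
# from the earlier sketch.
cache = Cache(app.server, config={
    "CACHE_TYPE": "FileSystemCache",
    "CACHE_DIR": "cache-directory",
})

@cache.memoize(timeout=24 * 60 * 60)  # re-run the query at most once a day
def load_graph1_data():
    docs = collection.find({}, {"_id": 0, "key_a": 1, "key_b": 1, "key_c": 1})
    return pd.json_normalize(list(docs))
```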

Lastly, if your MongoDB data is too voluminous for repeated querying, you could pre-build the necessary filtered views/aggregates in the backend and serve those to your app, since it doesn't sound like your graphs will change much throughout the day. Bulk data operations like these are better handled by the database itself than re-done in Python at run time.
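For example, with MongoDB itself you could schedule an aggregation pipeline that materializes a small summary collection for the dashboard to read. A sketch; the grouping and field names are invented for illustration, and `$merge` requires MongoDB 4.2+:

```python
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017/")["monitoring"]

# Run the heavy aggregation inside MongoDB and write the result into a
# small summary collection.
db["events"].aggregate([
    {"$group": {
        "_id": "$date",
        "event_count": {"$sum": 1},
        "avg_load_seconds": {"$avg": "$load_seconds"},
    }},
    {"$merge": {"into": "daily_summary", "whenMatched": "replace"}},
])

# The dashboard then reads the tiny pre-aggregated collection instead of
# re-scanning 2M+ documents on every refresh.
summary_docs = list(db["daily_summary"].find())
```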