Any application you build with the Dash framework is invisible to LLMs & Bots
Most AI crawlers cannot execute JavaScript. This is the fundamental issue for Dash applications. While Google’s Gemini can render JavaScript through Googlebot, ChatGPT’s GPTBot, Anthropic’s ClaudeBot, and Perplexity all only fetch static HTML. When they encounter your Dash app, they see empty <div> containers because all content renders client-side via React.
Behind all the Python and all the dependencies lives JavaScript and React, the fundamental tools that give Dash its interactive frontend. The issue is that JavaScript rendered in the browser is something most LLMs cannot execute or fully comprehend the way a person does. What a person interacts with to understand an application's contents is completely foreign to how an LLM comprehends the content of a URL.
Test it out yourself
# Test what AI sees (disable JS in browser, or:)
curl -s https://yoursite.com | less
# Test specific bot
curl -A "Mozilla/5.0 (compatible; GPTBot/1.0)" https://yoursite.com
# Check files exist
curl https://yoursite.com/robots.txt
curl https://yoursite.com/llms.txt
curl https://yoursite.com/sitemap.xml
This is a major problem!
In this era of development, just as important as designing an application to be mobile-friendly is designing it to be AI-friendly. The largest user of your software isn't people; it's likely that the largest user base for new software will be bots, agents, and algorithms.
Bots make up at least 51% of all internet traffic
Mobile devices accounted for approximately 60-65% of global web traffic in 2025
If web traffic were a pie chart, the majority of your user base would not be human, and the majority of the human minority would be visiting from a phone. My approach to development used to be mobile-first design; with the recent advances in artificial intelligence, we need to design AI-first, then mobile, and lastly the desktop user interface.
How do we fix this problem?
Well, there are a few paths we could take. I've been exploring this topic and the effects it has on Dash developers and the applications they host.
- llms.txt - This has already been adopted by dash-mantine-components (dmc), which is leading the charge in modernizing its documentation (a minimal serving sketch follows below)
  - A markdown file at /llms.txt that provides curated navigation
  - Simple format: H1 title, blockquote summary, H2 sections with links and descriptions
  - Major adopters: Anthropic, Vercel, Cloudflare, LangChain
  - Impact: Vercel reports 10% of signups now come from ChatGPT (vs traditional search)
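To make the format concrete, here is a minimal sketch of serving a hand-written llms.txt straight from the Flask server underneath Dash. The page names, descriptions, and example.com URLs are placeholders, not anything from a real site:

```python
# Minimal sketch: serve a hand-written llms.txt from the underlying Flask server.
# The page names, descriptions, and example.com URLs below are placeholders.
from dash import Dash
from flask import Response

app = Dash(__name__)

LLMS_TXT = """# Example Dashboard

> Interactive analytics dashboard built with Plotly Dash. The pages below are listed
> with short descriptions so LLMs can navigate without executing JavaScript.

## Pages

- [Home](https://example.com/): Overview and key metrics
- [Analytics](https://example.com/analytics): Time-series charts and filters

## Docs

- [About](https://example.com/about): What the dashboard does and who maintains it
"""

@app.server.route('/llms.txt')
def serve_llms_txt():
    # Plain text response, so crawlers that never execute JavaScript can still read it
    return Response(LLMS_TXT, mimetype='text/plain')
```

Static hosting works just as well; the only requirement is that /llms.txt returns plain markdown.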
- Server-Side Rendering Solution
The more effective approach: detect bot traffic and serve pre-rendered HTML.
How I see this working is something similar to:
from dash import Dash
from flask import request, Response

app = Dash(__name__)

AI_BOTS = ['gptbot', 'chatgpt-user', 'claudebot', 'perplexitybot', 'google-extended']

def is_ai_bot():
    user_agent = request.headers.get('User-Agent', '').lower()
    return any(bot in user_agent for bot in AI_BOTS)

@app.server.before_request
def handle_bots():
    if is_ai_bot() and not request.path.startswith('/assets'):
        # Serve static HTML with your content
        html = '''<!DOCTYPE html>
<html>
<head>
    <title>Dashboard Name</title>
    <meta name="description" content="Description">
</head>
<body>
    <h1>Dashboard Title</h1>
    <p>Description of what your dashboard does</p>
    <nav>
        <ul>
            <li><a href="/">Home</a></li>
            <li><a href="/dashboard">Dashboard</a></li>
            <li><a href="/api">API</a></li>
        </ul>
    </nav>
</body>
</html>'''
        return Response(html, mimetype='text/html')
- robots.txt is another good tool to help direct AI crawlers and LLMs on how to interact with your application. An example file is below, followed by a sketch of serving it from Dash:
# Robots.txt for Plotly.pro
# This file tells search engine crawlers which pages or files they can or can't request from your site.
# Allow all crawlers
User-agent: *
Allow: /
# Disallow authentication endpoints
Disallow: /api/
Disallow: /_dash-
Disallow: /_routes/
# Disallow cart and checkout pages (these are user-specific)
Disallow: /cart
Disallow: /checkout
Disallow: /dashboard
# Allow specific important pages
Allow: /
Allow: /about
Allow: /tutorials
Allow: /tutorial/
Allow: /product_details/
# Crawl delay (in seconds) - optional, helps prevent server overload
Crawl-delay: 1
# Sitemap location
Sitemap: https://plotly.pro/sitemap.xml
# Specific rules for major search engines
# Google
User-agent: Googlebot
Allow: /
Crawl-delay: 0
# Bing
User-agent: Bingbot
Allow: /
Crawl-delay: 1
# Block bad bots (optional but recommended)
User-agent: AhrefsBot
Disallow: /
User-agent: SemrushBot
Disallow: /
User-agent: DotBot
Disallow: /
User-agent: MJ12bot
Disallow: /
# Block AI/LLM crawlers if you want to prevent AI training on your content
User-agent: GPTBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: anthropic-ai
Disallow: /
User-agent: Claude-Web
Disallow: /
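Like the sitemap route shown later, a robots.txt can be served directly from the Dash app instead of relying on the web server's static file configuration. A minimal sketch, with placeholder rules and an example.com domain:

```python
# Minimal sketch: serve robots.txt from the Dash app itself.
# The rules and the example.com domain are placeholders; adapt them to your site.
from dash import Dash
from flask import Response

app = Dash(__name__)

ROBOTS_TXT = """User-agent: *
Allow: /
Disallow: /_dash-

Sitemap: https://example.com/sitemap.xml
"""

@app.server.route('/robots.txt')
def serve_robots_txt():
    # Crawlers request this path by convention before crawling the rest of the site
    return Response(ROBOTS_TXT, mimetype='text/plain')
```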
- sitemap.xml is a file that lists a website's URLs, helping search engines like Google discover, crawl, and index content more efficiently. It looks like:
<?xml version="1.0" encoding="UTF-8"?>
<!--
    SITEMAP.XML STRUCTURE EXPLANATION
    This file tells search engines and AI crawlers about all the pages on your site.
    It's like a table of contents for your website.
-->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">

    <!-- Each <url> represents one page on your site -->

    <!-- Homepage - Usually highest priority -->
    <url>
        <loc>https://example.com/</loc>
        <!-- The full URL to the page -->
        <lastmod>2025-01-15</lastmod>
        <!-- When this page was last modified (YYYY-MM-DD format) -->
        <changefreq>daily</changefreq>
        <!-- How often this page changes: always, hourly, daily, weekly, monthly, yearly, never -->
        <priority>1.0</priority>
        <!-- Priority relative to other pages on YOUR site (0.0 to 1.0, where 1.0 is most important) -->
    </url>

    <!-- Main Dashboard Page -->
    <url>
        <loc>https://example.com/dashboard</loc>
        <lastmod>2025-01-15</lastmod>
        <changefreq>daily</changefreq>
        <priority>0.9</priority>
        <!-- High priority, updates daily -->
    </url>

    <!-- Analytics Page -->
    <url>
        <loc>https://example.com/analytics</loc>
        <lastmod>2025-01-10</lastmod>
        <changefreq>weekly</changefreq>
        <priority>0.8</priority>
        <!-- Medium-high priority, updates weekly -->
    </url>

    <!-- Equipment/Catalog Page -->
    <url>
        <loc>https://example.com/equipment</loc>
        <lastmod>2025-01-05</lastmod>
        <changefreq>weekly</changefreq>
        <priority>0.8</priority>
    </url>

    <!-- API Documentation -->
    <url>
        <loc>https://example.com/api</loc>
        <lastmod>2024-12-20</lastmod>
        <changefreq>monthly</changefreq>
        <priority>0.7</priority>
        <!-- Lower priority, rarely changes -->
    </url>

    <!-- Reports Section -->
    <url>
        <loc>https://example.com/reports</loc>
        <lastmod>2025-01-12</lastmod>
        <changefreq>weekly</changefreq>
        <priority>0.7</priority>
    </url>

    <!-- About Page -->
    <url>
        <loc>https://example.com/about</loc>
        <lastmod>2024-11-01</lastmod>
        <changefreq>yearly</changefreq>
        <priority>0.5</priority>
        <!-- Low priority, rarely changes -->
    </url>

    <!--
    NOTES:
    1. ALL URLs MUST be absolute (include https://)
    2. Keep file under 50MB and 50,000 URLs
    3. If you have more than 50,000 URLs, create multiple sitemaps
       and use a sitemap index file
    4. changefreq and priority are HINTS, not commands
    5. Search engines may ignore these hints and crawl as they wish
    6. lastmod is very important - helps crawlers know what changed

    MINIMAL VALID SITEMAP (only <loc> is required):
    <url>
        <loc>https://example.com/page</loc>
    </url>

    SITEMAP INDEX (for multiple sitemaps):
    <?xml version="1.0" encoding="UTF-8"?>
    <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
        <sitemap>
            <loc>https://example.com/sitemap-main.xml</loc>
            <lastmod>2025-01-15</lastmod>
        </sitemap>
        <sitemap>
            <loc>https://example.com/sitemap-reports.xml</loc>
            <lastmod>2025-01-10</lastmod>
        </sitemap>
    </sitemapindex>
    -->
</urlset>
This can be paired with Dash using app.server.route, like:
"""
Example: How to dynamically generate sitemap.xml in a Dash application
"""
from dash import Dash
from flask import Response
from datetime import datetime
app = Dash(__name__)
# Define your routes with metadata
ROUTES = [
{
'path': '/',
'priority': '1.0',
'changefreq': 'daily',
'lastmod': '2025-01-15'
},
{
'path': '/equipment',
'priority': '0.9',
'changefreq': 'daily',
'lastmod': '2025-01-15'
},
{
'path': '/analytics',
'priority': '0.8',
'changefreq': 'weekly',
'lastmod': '2025-01-10'
},
]
# Simple version - static routes
@app.server.route('/sitemap.xml')
def sitemap_simple():
"""Basic sitemap with static routes"""
xml = ['<?xml version="1.0" encoding="UTF-8"?>']
xml.append('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">')
for route in ROUTES:
xml.append(' <url>')
xml.append(f' <loc>https://example.com{route["path"]}</loc>')
xml.append(f' <lastmod>{route["lastmod"]}</lastmod>')
xml.append(f' <changefreq>{route["changefreq"]}</changefreq>')
xml.append(f' <priority>{route["priority"]}</priority>')
xml.append(' </url>')
xml.append('</urlset>')
return Response('\n'.join(xml), mimetype='application/xml')
# More realistic version - with dynamic routes from database
@app.server.route('/sitemap-dynamic.xml')
def sitemap_dynamic():
"""Sitemap with both static and dynamic routes"""
xml = ['<?xml version="1.0" encoding="UTF-8"?>']
xml.append('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">')
# Static routes
static_routes = [
{'path': '/', 'priority': '1.0', 'changefreq': 'daily'},
{'path': '/dashboard', 'priority': '0.9', 'changefreq': 'daily'},
{'path': '/api', 'priority': '0.7', 'changefreq': 'monthly'},
]
today = datetime.now().strftime('%Y-%m-%d')
for route in static_routes:
xml.append(' <url>')
xml.append(f' <loc>https://example.com{route["path"]}</loc>')
xml.append(f' <lastmod>{today}</lastmod>')
xml.append(f' <changefreq>{route["changefreq"]}</changefreq>')
xml.append(f' <priority>{route["priority"]}</priority>')
xml.append(' </url>')
# Dynamic routes (e.g., from database)
# In real app, you'd query your database:
# reports = db.query("SELECT slug, updated_at FROM reports WHERE published=true")
# Example dynamic reports
dynamic_reports = [
{'slug': 'quarterly-sales', 'updated_at': '2025-01-10'},
{'slug': 'annual-review', 'updated_at': '2025-01-01'},
]
for report in dynamic_reports:
xml.append(' <url>')
xml.append(f' <loc>https://example.com/reports/{report["slug"]}</loc>')
xml.append(f' <lastmod>{report["updated_at"]}</lastmod>')
xml.append(' <changefreq>monthly</changefreq>')
xml.append(' <priority>0.6</priority>')
xml.append(' </url>')
xml.append('</urlset>')
return Response('\n'.join(xml), mimetype='application/xml')
# Minimal version - just URLs (perfectly valid!)
@app.server.route('/sitemap-minimal.xml')
def sitemap_minimal():
"""Minimal valid sitemap with just URLs"""
routes = ['/', '/dashboard', '/analytics', '/equipment', '/api']
xml = ['<?xml version="1.0" encoding="UTF-8"?>']
xml.append('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">')
for path in routes:
xml.append(' <url>')
xml.append(f' <loc>https://example.com{path}</loc>')
xml.append(' </url>')
xml.append('</urlset>')
return Response('\n'.join(xml), mimetype='application/xml')
if __name__ == '__main__':
print("Sitemap will be available at:")
print(" http://localhost:8050/sitemap.xml")
print(" http://localhost:8050/sitemap-dynamic.xml")
print(" http://localhost:8050/sitemap-minimal.xml")
app.run_server(debug=True)
Emerging Standards to Watch
- W3C AI Agent Protocol Community Group (formed May 2025) - Working on formal standards
- Model Context Protocol (MCP) by Anthropic - For dynamic context exposure
- Agent-to-Agent (A2A) Protocol by Google - References llms.txt
Let's put a hook into this
The problem: AI crawlers can’t execute JavaScript, so they see:
<div id="react-entry-point">
<div class="_dash-loading">Loading...</div>
</div>
My solution to this problem is to create an all-encompassing Dash hook that allows bots and AI to actually see your Dash applications: dash-improve-my-llms.
With a simple pip install dash-improve-my-llms you can expand the footprint of your application by 60%+:
from dash import Dash
from dash_improve_my_llms import add_llms_routes
app = Dash(__name__, use_pages=True)
add_llms_routes(app) # ✨ That's it!
app.run(debug=True)
What this does is allow your Dash application to automatically set up pages that improve SEO and AI crawling (a quick way to check the generated endpoints follows the lists below). It automatically creates:
Automatic Documentation
- llms.txt - Comprehensive, context-rich markdown optimized for LLM understanding
- page.json - Detailed technical architecture with interactivity and data flow
- architecture.txt - ASCII art representation of the entire application
Bot Management & SEO
- robots.txt - Intelligent bot control with AI training bot blocking
- sitemap.xml - SEO-optimized sitemap with intelligent priority inference
- Static HTML - Bot-friendly pages with structured data
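Assuming the hook exposes these files at the app root under the same names (my assumption based on the lists above), a quick local check once the app is running might look like:

```python
# Quick local check (assumed port 8050 and root-level routes) that the
# generated endpoints respond; adjust the paths to whatever the hook registers.
from urllib.request import urlopen

for path in ('/llms.txt', '/robots.txt', '/sitemap.xml'):
    with urlopen(f'http://localhost:8050{path}') as resp:
        print(path, resp.status)
```

Each route should return a 200 with plain text or XML rather than the React loading shell.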
I could dive further into this topic, but to keep it simple, this Dash hook makes your invisible Dash applications visible to the algorithms, bots & AI web scrapers. There is much more context and configuration within the GitHub repo:
I built this project out with a limited working example that you can explore to help you understand what this hook actually does and is capable of:
Wait a moment! What about an MCP server? What is that, and how does this hook compare?
Ah yes, MCP servers are a recent development that is establishing itself as a gateway for AI to have bidirectional access within your application. They are built for much more complexity, requiring authentication and integrated access so the AI can use the MCP server to interact with your application in real time. dash-improve-my-llms could be considered the inverse of an MCP server: its purpose is to let LLMs read and understand your application's contents. It is not designed for bidirectional interaction; the LLM stays in a read-only role and can't be an active participant in the forms, fields, or dynamic dashboard.
Comparison Table
| Feature | MCP Server | dash-improve-my-llms |
|---|---|---|
| Communication | Bidirectional | One-way (read-only) |
| Connection | Persistent | Stateless HTTP |
| Authentication | Required | Optional (public) |
| AI Capability | Execute actions | Understand structure |
| Real-time Data | ✅ | ❌ |
| Function Calling | ✅ | ❌ |
| Web Crawlers | ❌ | ✅ |
| SEO Benefits | ❌ | ✅ |
| Setup Complexity | Higher | Lower |
| Use Case | Active AI agent | Passive discovery |
Final commentary
This is an initial release of a hook that was primarily designed to provide a read-only URL for LLMs to access your Dash application's contents. It is early in the development cycle and everything is subject to change. This wasn't designed to be a complex hook; it's a simple, one-line wrapper that makes your applications visible to LLMs and bots while also expanding the footprint and usefulness of your software. Please jump in, contribute, and share your thoughts on the "AI crawlers cannot execute JavaScript" issue outlined in this post.
Check out my other work and publicly hosted applications:
My digital shop / tutorial platform
Open Source Components and documentation
AI Canvas Project
and follow me on GitHub:
