Any application you build with the Dash framework is invisible to LLMs & Bots
Most AI crawlers cannot execute JavaScript. This is the fundamental issue for Dash applications. While Google’s Gemini can render JavaScript through Googlebot, ChatGPT’s GPTBot, Anthropic’s ClaudeBot, and Perplexity all only fetch static HTML. When they encounter your Dash app, they see empty <div> containers because all content renders client-side via React.
Behind all the Python and all the dependencies lives JavaScript and React, the fundamental tools that give Dash its interactive frontend. The issue is that JavaScript rendered in the browser is something most LLMs cannot execute or fully comprehend the way a person does. What a person interacts with to understand an application's contents is completely foreign to how an LLM comprehends the content of a URL.
Test it out yourself
# Test what AI sees (disable JS in browser, or:)
curl -s https://yoursite.com | less
# Test specific bot
curl -A "Mozilla/5.0 (compatible; GPTBot/1.0)" https://yoursite.com
# Check files exist
curl https://yoursite.com/robots.txt
curl https://yoursite.com/llms.txt
curl https://yoursite.com/sitemap.xml
This is a major problem!
In this era of development, just as important as designing an application to be mobile-friendly is designing it to be AI-friendly. The largest user of your software isn't people; it's likely that the largest user base for new software will be bots, agents, and algorithms.
Bots make up at least 51% of all internet traffic
Mobile devices accounted for approximately 60-65% of global web traffic in 2025
If web traffic were a pie chart, the majority of your user base would not be human, and the majority of the human minority would be visiting from a phone. My approach to development used to be mobile-first design; with the recent advances in artificial intelligence, we need to design AI-first, then mobile, and lastly the desktop user interface.
How do we fix this problem?
Well, there are a few paths we could take. I've been exploring this topic and the effects it has on Dash developers and the applications they host.
- llms.txt - This has already been adopted by dash-mantine-components (dmc), which is leading the charge in modernizing its documentation (a minimal serving sketch follows below)
  - A markdown file at /llms.txt that provides curated navigation
  - Simple format: H1 title, blockquote summary, H2 sections with links and descriptions
  - Major adopters: Anthropic, Vercel, Cloudflare, LangChain
  - Impact: Vercel reports 10% of signups now come from ChatGPT (vs traditional search)
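To make the format concrete, here is a minimal sketch of serving a hand-written llms.txt straight from the Flask server underneath Dash. The page names, descriptions, and example.com URLs are placeholders, not anything from a real site:

```python
# Minimal sketch: serve a hand-written llms.txt from the underlying Flask server.
# The page names, descriptions, and example.com URLs below are placeholders.
from dash import Dash
from flask import Response

app = Dash(__name__)

LLMS_TXT = """# Example Dashboard

> Interactive analytics dashboard built with Plotly Dash. The pages below are listed
> with short descriptions so LLMs can navigate without executing JavaScript.

## Pages

- [Home](https://example.com/): Overview and key metrics
- [Analytics](https://example.com/analytics): Time-series charts and filters

## Docs

- [About](https://example.com/about): What the dashboard does and who maintains it
"""

@app.server.route('/llms.txt')
def serve_llms_txt():
    # Plain text response, so crawlers that never execute JavaScript can still read it
    return Response(LLMS_TXT, mimetype='text/plain')
```

Static hosting works just as well; the only requirement is that /llms.txt returns plain markdown.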
- Server-Side Rendering Solution
The more effective approach: detect bot traffic and serve pre-rendered HTML.
How I see this working is something similar to:
from dash import Dash
from flask import request, Response

app = Dash(__name__)

AI_BOTS = ['gptbot', 'chatgpt-user', 'claudebot', 'perplexitybot', 'google-extended']

def is_ai_bot():
    user_agent = request.headers.get('User-Agent', '').lower()
    return any(bot in user_agent for bot in AI_BOTS)

@app.server.before_request
def handle_bots():
    if is_ai_bot() and not request.path.startswith('/assets'):
        # Serve static HTML with your content
        html = '''<!DOCTYPE html>
<html>
<head>
    <title>Dashboard Name</title>
    <meta name="description" content="Description">
</head>
<body>
    <h1>Dashboard Title</h1>
    <p>Description of what your dashboard does</p>
    <nav>
        <ul>
            <li><a href="/">Home</a></li>
            <li><a href="/dashboard">Dashboard</a></li>
            <li><a href="/api">API</a></li>
        </ul>
    </nav>
</body>
</html>'''
        return Response(html, mimetype='text/html')
- robots.txt is another good tool to help direct AI crawlers and LLMs on how to interact with your application. An example file is below, followed by a sketch of serving it from Dash:
# Robots.txt for Plotly.pro
# This file tells search engine crawlers which pages or files they can or can't request from your site.
# Allow all crawlers
User-agent: *
Allow: /
# Disallow authentication endpoints
Disallow: /api/
Disallow: /_dash-
Disallow: /_routes/
# Disallow cart and checkout pages (these are user-specific)
Disallow: /cart
Disallow: /checkout
Disallow: /dashboard
# Allow specific important pages
Allow: /
Allow: /about
Allow: /tutorials
Allow: /tutorial/
Allow: /product_details/
# Crawl delay (in seconds) - optional, helps prevent server overload
Crawl-delay: 1
# Sitemap location
Sitemap: https://plotly.pro/sitemap.xml
# Specific rules for major search engines
# Google
User-agent: Googlebot
Allow: /
Crawl-delay: 0
# Bing
User-agent: Bingbot
Allow: /
Crawl-delay: 1
# Block bad bots (optional but recommended)
User-agent: AhrefsBot
Disallow: /
User-agent: SemrushBot
Disallow: /
User-agent: DotBot
Disallow: /
User-agent: MJ12bot
Disallow: /
# Block AI/LLM crawlers if you want to prevent AI training on your content
User-agent: GPTBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: anthropic-ai
Disallow: /
User-agent: Claude-Web
Disallow: /
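Like the sitemap route shown later, a robots.txt can be served directly from the Dash app instead of relying on the web server's static file configuration. A minimal sketch, with placeholder rules and an example.com domain:

```python
# Minimal sketch: serve robots.txt from the Dash app itself.
# The rules and the example.com domain are placeholders; adapt them to your site.
from dash import Dash
from flask import Response

app = Dash(__name__)

ROBOTS_TXT = """User-agent: *
Allow: /
Disallow: /_dash-

Sitemap: https://example.com/sitemap.xml
"""

@app.server.route('/robots.txt')
def serve_robots_txt():
    # Crawlers request this path by convention before crawling the rest of the site
    return Response(ROBOTS_TXT, mimetype='text/plain')
```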
- sitemap.xml is a file that lists a website's URLs, helping search engines like Google discover, crawl, and index content more efficiently. It looks like:
<?xml version="1.0" encoding="UTF-8"?>
<!--
    SITEMAP.XML STRUCTURE EXPLANATION
    This file tells search engines and AI crawlers about all the pages on your site.
    It's like a table of contents for your website.
-->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">

    <!-- Each <url> represents one page on your site -->

    <!-- Homepage - Usually highest priority -->
    <url>
        <loc>https://example.com/</loc>
        <!-- The full URL to the page -->
        <lastmod>2025-01-15</lastmod>
        <!-- When this page was last modified (YYYY-MM-DD format) -->
        <changefreq>daily</changefreq>
        <!-- How often this page changes: always, hourly, daily, weekly, monthly, yearly, never -->
        <priority>1.0</priority>
        <!-- Priority relative to other pages on YOUR site (0.0 to 1.0, where 1.0 is most important) -->
    </url>

    <!-- Main Dashboard Page -->
    <url>
        <loc>https://example.com/dashboard</loc>
        <lastmod>2025-01-15</lastmod>
        <changefreq>daily</changefreq>
        <priority>0.9</priority>
        <!-- High priority, updates daily -->
    </url>

    <!-- Analytics Page -->
    <url>
        <loc>https://example.com/analytics</loc>
        <lastmod>2025-01-10</lastmod>
        <changefreq>weekly</changefreq>
        <priority>0.8</priority>
        <!-- Medium-high priority, updates weekly -->
    </url>

    <!-- Equipment/Catalog Page -->
    <url>
        <loc>https://example.com/equipment</loc>
        <lastmod>2025-01-05</lastmod>
        <changefreq>weekly</changefreq>
        <priority>0.8</priority>
    </url>

    <!-- API Documentation -->
    <url>
        <loc>https://example.com/api</loc>
        <lastmod>2024-12-20</lastmod>
        <changefreq>monthly</changefreq>
        <priority>0.7</priority>
        <!-- Lower priority, rarely changes -->
    </url>

    <!-- Reports Section -->
    <url>
        <loc>https://example.com/reports</loc>
        <lastmod>2025-01-12</lastmod>
        <changefreq>weekly</changefreq>
        <priority>0.7</priority>
    </url>

    <!-- About Page -->
    <url>
        <loc>https://example.com/about</loc>
        <lastmod>2024-11-01</lastmod>
        <changefreq>yearly</changefreq>
        <priority>0.5</priority>
        <!-- Low priority, rarely changes -->
    </url>

    <!--
    NOTES:
    1. ALL URLs MUST be absolute (include https://)
    2. Keep file under 50MB and 50,000 URLs
    3. If you have more than 50,000 URLs, create multiple sitemaps
       and use a sitemap index file
    4. changefreq and priority are HINTS, not commands
    5. Search engines may ignore these hints and crawl as they wish
    6. lastmod is very important - helps crawlers know what changed

    MINIMAL VALID SITEMAP (only <loc> is required):
    <url>
        <loc>https://example.com/page</loc>
    </url>

    SITEMAP INDEX (for multiple sitemaps):
    <?xml version="1.0" encoding="UTF-8"?>
    <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
        <sitemap>
            <loc>https://example.com/sitemap-main.xml</loc>
            <lastmod>2025-01-15</lastmod>
        </sitemap>
        <sitemap>
            <loc>https://example.com/sitemap-reports.xml</loc>
            <lastmod>2025-01-10</lastmod>
        </sitemap>
    </sitemapindex>
    -->
</urlset>
This can be paired with Dash using app.server.route, like:
"""
Example: How to dynamically generate sitemap.xml in a Dash application
"""
from dash import Dash
from flask import Response
from datetime import datetime
app = Dash(__name__)
# Define your routes with metadata
ROUTES = [
{
'path': '/',
'priority': '1.0',
'changefreq': 'daily',
'lastmod': '2025-01-15'
},
{
'path': '/equipment',
'priority': '0.9',
'changefreq': 'daily',
'lastmod': '2025-01-15'
},
{
'path': '/analytics',
'priority': '0.8',
'changefreq': 'weekly',
'lastmod': '2025-01-10'
},
]
# Simple version - static routes
@app.server.route('/sitemap.xml')
def sitemap_simple():
"""Basic sitemap with static routes"""
xml = ['<?xml version="1.0" encoding="UTF-8"?>']
xml.append('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">')
for route in ROUTES:
xml.append(' <url>')
xml.append(f' <loc>https://example.com{route["path"]}</loc>')
xml.append(f' <lastmod>{route["lastmod"]}</lastmod>')
xml.append(f' <changefreq>{route["changefreq"]}</changefreq>')
xml.append(f' <priority>{route["priority"]}</priority>')
xml.append(' </url>')
xml.append('</urlset>')
return Response('\n'.join(xml), mimetype='application/xml')
# More realistic version - with dynamic routes from database
@app.server.route('/sitemap-dynamic.xml')
def sitemap_dynamic():
"""Sitemap with both static and dynamic routes"""
xml = ['<?xml version="1.0" encoding="UTF-8"?>']
xml.append('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">')
# Static routes
static_routes = [
{'path': '/', 'priority': '1.0', 'changefreq': 'daily'},
{'path': '/dashboard', 'priority': '0.9', 'changefreq': 'daily'},
{'path': '/api', 'priority': '0.7', 'changefreq': 'monthly'},
]
today = datetime.now().strftime('%Y-%m-%d')
for route in static_routes:
xml.append(' <url>')
xml.append(f' <loc>https://example.com{route["path"]}</loc>')
xml.append(f' <lastmod>{today}</lastmod>')
xml.append(f' <changefreq>{route["changefreq"]}</changefreq>')
xml.append(f' <priority>{route["priority"]}</priority>')
xml.append(' </url>')
# Dynamic routes (e.g., from database)
# In real app, you'd query your database:
# reports = db.query("SELECT slug, updated_at FROM reports WHERE published=true")
# Example dynamic reports
dynamic_reports = [
{'slug': 'quarterly-sales', 'updated_at': '2025-01-10'},
{'slug': 'annual-review', 'updated_at': '2025-01-01'},
]
for report in dynamic_reports:
xml.append(' <url>')
xml.append(f' <loc>https://example.com/reports/{report["slug"]}</loc>')
xml.append(f' <lastmod>{report["updated_at"]}</lastmod>')
xml.append(' <changefreq>monthly</changefreq>')
xml.append(' <priority>0.6</priority>')
xml.append(' </url>')
xml.append('</urlset>')
return Response('\n'.join(xml), mimetype='application/xml')
# Minimal version - just URLs (perfectly valid!)
@app.server.route('/sitemap-minimal.xml')
def sitemap_minimal():
"""Minimal valid sitemap with just URLs"""
routes = ['/', '/dashboard', '/analytics', '/equipment', '/api']
xml = ['<?xml version="1.0" encoding="UTF-8"?>']
xml.append('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">')
for path in routes:
xml.append(' <url>')
xml.append(f' <loc>https://example.com{path}</loc>')
xml.append(' </url>')
xml.append('</urlset>')
return Response('\n'.join(xml), mimetype='application/xml')
if __name__ == '__main__':
print("Sitemap will be available at:")
print(" http://localhost:8050/sitemap.xml")
print(" http://localhost:8050/sitemap-dynamic.xml")
print(" http://localhost:8050/sitemap-minimal.xml")
app.run_server(debug=True)
Emerging Standards to Watch
- W3C AI Agent Protocol Community Group (formed May 2025) - Working on formal standards
- Model Context Protocol (MCP) by Anthropic - For dynamic context exposure
- Agent-to-Agent (A2A) Protocol by Google - References llms.txt
Let's put a hook into this
The problem: AI crawlers can’t execute JavaScript, so they see:
<div id="react-entry-point">
<div class="_dash-loading">Loading...</div>
</div>
My solution to this problem is to create an all-encompassing Dash hook that allows bots and AI to actually see your Dash applications: dash-improve-my-llms.
With a simple pip install dash-improve-my-llms you can expand the footprint of your application by 60%+:
from dash import Dash
from dash_improve_my_llms import add_llms_routes
app = Dash(__name__, use_pages=True)
add_llms_routes(app) # ✨ That's it!
app.run(debug=True)
What this does is allow your Dash application to automatically set up pages that improve SEO and AI crawling (a quick way to check the generated endpoints follows the lists below). It automatically creates:
Automatic Documentation
- llms.txt - Comprehensive, context-rich markdown optimized for LLM understanding
- page.json - Detailed technical architecture with interactivity and data flow
- architecture.txt - ASCII art representation of the entire application
Bot Management & SEO
- robots.txt - Intelligent bot control with AI training bot blocking
- sitemap.xml - SEO-optimized sitemap with intelligent priority inference
- Static HTML - Bot-friendly pages with structured data
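Assuming the hook exposes these files at the app root under the same names (my assumption based on the lists above), a quick local check once the app is running might look like:

```python
# Quick local check (assumed port 8050 and root-level routes) that the
# generated endpoints respond; adjust the paths to whatever the hook registers.
from urllib.request import urlopen

for path in ('/llms.txt', '/robots.txt', '/sitemap.xml'):
    with urlopen(f'http://localhost:8050{path}') as resp:
        print(path, resp.status)
```

Each route should return a 200 with plain text or XML rather than the React loading shell.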
I could dive further into this topic, but to keep it simple, this Dash hook makes your invisible Dash applications visible to the algorithms, bots & AI web scrapers. There is much more context and configuration within the GitHub repo:
I built this project out with a limited working example that you can explore to help you understand what this hook actually does and is capable of:
Wait a moment! What about an MCP server? What is that, and how does this hook compare?
Ah yes, MCP servers are a recent development that is establishing itself as a gateway for AI to have bidirectional access within your application. They are built for much more complexity, requiring authentication and integrated access so the AI can use the MCP server to interact with your application in real time. dash-improve-my-llms could be considered the inverse of an MCP server: its purpose is to let LLMs read and understand your application's contents. It is not designed for bidirectional interaction; the LLM stays in a read-only role and can't be an active participant in the forms, fields, or dynamic dashboard.
Comparison Table
| Feature | MCP Server | dash-improve-my-llms |
|---|---|---|
| Communication | Bidirectional | One-way (read-only) |
| Connection | Persistent | Stateless HTTP |
| Authentication | Required | Optional (public) |
| AI Capability | Execute actions | Understand structure |
| Real-time Data | ✅ | ❌ |
| Function Calling | ✅ | ❌ |
| Web Crawlers | ❌ | ✅ |
| SEO Benefits | ❌ | ✅ |
| Setup Complexity | Higher | Lower |
| Use Case | Active AI agent | Passive discovery |
Final commentary
This is an initial release of a hook that was primarily designed to provide a read-only URL for LLMs to access your Dash application's contents. It is early in the development cycle and everything is subject to change. This wasn't designed to be a complex hook; it's a simple, one-line wrapper that makes your applications visible to LLMs and bots while also expanding the footprint and usefulness of your software. Please jump in, contribute, and share your thoughts on the "AI crawlers cannot execute JavaScript" issue outlined in this post.
Check out my other work and publicly hosted applications:
My digital shop / tutorial platform
Open Source Components and documentation
AI Canvas Project
and follow me on GitHub:
