Writing Secure Dash Apps - Community Thread

Hey folks –

Here’s a community thread to kick off security-related patterns & concerns when writing Dash apps.

The Dash framework itself is architected in a way to avoid many security issues outright (like many XSS issues encountered when rendering arbitrary HTML) and our commercial Dash Enterprise platform provides strict security controls beyond the application code (with authorization, authentication, sandboxing apps in containers away from the host server, security audits, and more).

However, it is still possible to write insecure code. This list is in no way exhaustive, and the scope is small - we're not considering things like the operating system, authentication or authorization, securing remote connections, etc.

Please comment below if you’d like something added to the list!


Security Background

OWASP’s Top Ten Security Issues is a great place to start to learn about security. Note that not all of these issues are areas of risk in Dash (due to how Dash is architected).

1. Treat All Callback Inputs as Untrusted

Your UI might have a finite set of options (e.g. in a dcc.Dropdown) but malicious users can construct HTTP requests that fire your callbacks with any inputs that they want.

So, treat your callback input arguments as untrusted. This means:
a. Avoid calling eval() or exec() on anything that comes from a callback input.
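For example, rather than eval'ing an expression that arrives from a dropdown, map the allowed values to functions explicitly. A minimal sketch (the component IDs, the df dataframe, and the ALLOWED_AGGREGATIONS mapping are all made up for illustration):

from dash.exceptions import PreventUpdate

ALLOWED_AGGREGATIONS = {'mean': lambda s: s.mean(), 'sum': lambda s: s.sum()}

@app.callback(Output('output', 'children'), Input('aggregation-dropdown', 'value'))
def aggregate(untrusted_aggregation):
    # eval(untrusted_aggregation) here would let an attacker run arbitrary Python - DON'T DO THIS
    agg = ALLOWED_AGGREGATIONS.get(untrusted_aggregation)
    if agg is None:
        raise PreventUpdate  # the request contained a value the UI never offers
    return str(agg(df['y']))  # THIS IS OK - only whitelisted functions can run (df is a hypothetical dataframe)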

b. Use werkzeug.utils.secure_filename(filename) when constructing a file path that comes from a callback input. This prevents malicious users from reading or accessing files on your system that live outside of the project folder.

Instead of this:

@app.callback(Output('graph', 'figure'), Input('dropdown', 'value'))
def update_figure(file_path):
    with open(file_path, 'r') as f:  # DON'T DO THIS
        f.read()

Do this:

@app.callback(Output('graph', 'figure'), Input('dropdown', 'value'))
def update_figure(untrusted_file_path):
    safe_file_path = werkzeug.utils.secure_filename(untrusted_file_path)
    with open(safe_file_path, 'r') as f:  # THIS IS OK
        f.read()
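Note that secure_filename only sanitizes the name itself, so a common follow-up (sketched here with an assumed DATA_DIR constant) is to join the sanitized name onto a fixed base directory, guaranteeing the callback can only ever open files from that folder:

import os
import werkzeug.utils

DATA_DIR = 'data'  # assumed: the only folder the app should read from

@app.callback(Output('graph', 'figure'), Input('dropdown', 'value'))
def update_figure(untrusted_file_path):
    safe_name = werkzeug.utils.secure_filename(untrusted_file_path)
    with open(os.path.join(DATA_DIR, safe_name), 'r') as f:  # file can only come from DATA_DIR
        f.read()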

c. Escape SQL parameters using whatever methods for parameterizing queries are available in the library:
Instead of this:

@app.callback(Output('graph', 'figure'), Input('dropdown', 'value'))
def update_figure(untrusted_value):
    query = 'SELECT x, y FROM data WHERE city = {}'.format(untrusted_value) # DON'T DO THIS
    query = 'SELECT x, y FROM data WHERE city = %s' % untrusted_value       # DON'T DO THIS
    return cur.execute(query)

Do this:

@app.callback(Output('graph', 'figure'), Input('dropdown', 'value'))
def update_figure(untrusted_value):
    query = 'SELECT x, y FROM data WHERE city = %s'

    return cur.execute(query, (untrusted_value,)) # GOOD, untrusted_value WILL BE ESCAPED BY execute()
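The placeholder syntax depends on the driver: %s above is what psycopg2-style libraries use, while the standard-library sqlite3 module uses ?. A rough sketch with sqlite3 (the data.db file and data table are assumed):

import sqlite3

def query_city(untrusted_value):
    con = sqlite3.connect('data.db')  # assumed database file containing a 'data' table
    # the query string never contains user input; sqlite3 binds the parameter safely
    rows = con.execute('SELECT x, y FROM data WHERE city = ?', (untrusted_value,)).fetchall()
    con.close()
    return rows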

d. Avoid passing callback inputs into dangerously_set_inner_html, or into any other component property (in community packages) that renders arbitrary HTML. Or better yet, avoid dangerously_set_inner_html altogether.

Rendering arbitrary HTML from untrusted user inputs is an XSS issue.
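If you just need to display user-provided text, a safer pattern is to return it as a component's children, which React escapes, rather than injecting it as raw HTML. A minimal sketch (the comment-display and comment-input IDs are made up):

import dash_html_components as html  # `from dash import html` on Dash 2.x

@app.callback(Output('comment-display', 'children'), Input('comment-input', 'value'))
def show_comment(untrusted_text):
    # React escapes plain strings passed as children, so a <script> tag is shown as text, not executed
    return html.Div(untrusted_text)  # THIS IS OK
    # returning the same string through dangerously_set_inner_html would be an XSS risk - DON'T DO THIS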

2. dcc.Store Data Is Accessible, Even If It's Not Visible

All of the data that is returned in a callback can be accessed by the current user of the app, even if that data isn’t necessarily visible.

So, be careful not to return secret or sensitive data from callbacks or save it in dcc.Store if you don't intend all users of the app to be able to access that data.

For example, if you have a dataframe that has several unused but sensitive columns (e.g. data about a particular client), then remove those columns before saving it in a dcc.Store. Even if your app doesn’t use those columns, the data is still accessible in the browser session.
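Roughly (the store and dropdown IDs, the load_client_data helper, and the column names are all hypothetical):

@app.callback(Output('client-store', 'data'), Input('client-dropdown', 'value'))
def store_client_data(client_id):
    df = load_client_data(client_id)  # hypothetical helper returning a pandas DataFrame
    # drop sensitive columns on the server, before the data is ever sent to the browser
    public_df = df.drop(columns=['ssn', 'account_number'])
    return public_df.to_dict('records')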

3. Avoid Storing Secrets in Code

API keys, database passwords, etc. - these should be saved in environment variables outside of your application code. If they are in your application code, then they can easily and inadvertently end up in other systems like GitHub or be shared with other people.
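For example (the DATABASE_PASSWORD variable name here is just an illustration):

import os

DB_PASSWORD = os.environ['DATABASE_PASSWORD']  # THIS IS OK - set outside the codebase
# DB_PASSWORD = 'hunter2'                      # DON'T DO THIS - ends up in version control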

If you are using Dash Enterprise, then use the built-in Environment Variable manager.

4. Don’t Roll Your Own Authentication or User Management

It is very easy to inadvertently introduce security issues when writing authentication or user management code. Leave authentication to a separate industry-accepted platform, vendor, or library.

If you use Dash Enterprise, you don't need to worry about this. Dash Enterprise connects to several IdPs (Active Directory FS, Azure AD, Ping Federate, Okta) through several standard protocols (LDAP, SAML). Authentication (who is logged in) and authorization (who has access to the app) happen before the request even hits the Dash app, so no logic is required in the Dash app code.

14 Likes

Hey @chriddyp, this is an important topic and thanks for starting this thread. It's so helpful to see specific examples for making Dash apps more secure. I have a couple of questions about security when using HTML in Markdown and when making or using custom components.

HTML in Markdown

In the next release of Dash (1.21.0) it will be possible to allow HTML in Markdown in the Dash DataTable. Previously, this was not allowed for security reasons. In the review of this PR @alexcjohnson provided some excellent information on XSS attacks, and I think it’s worth repeating here (lightly edited):

The general pattern is:

  • A malicious user enters information that gets saved in a database or file

  • An authorized user logs into the app

  • If the malicious content is rendered as html, it can contain JavaScript that executes

  • This JavaScript is executed with the permissions of the authorized user, so it can make requests to the server to read and/or change information that public users normally are not allowed to access.

It obviously depends on the details of the app but there can be a lot of ways - often unexpected - that a malicious user can get data into a system. So typically the easiest way to plug this attack is by not rendering content directly as html.

See also the README of plotly/dash-dangerously-set-inner-html (a Dash component to dangerously set inner raw HTML).

This is great info, but it still leaves me with a lot of questions. Are there ways to mitigate the risk of using HTML tags in markdown? With app security, it's common to talk about “untrusted data”, but what exactly is untrusted data in this situation?

  • The Dash DataTable uses Remarkable to render its Markdown. Given that Remarkable escapes the HTML, does that mean it’s not rendering content directly as HTML, making it “safe” to use?

  • Is certain data riskier than others? For example, is it better to use an image from a local source (ie in the assets folder) rather than an external link to a public source?

  • Is it safe to use Fontawesome icons? For example: <i class="fa fa-circle"></i>

  • If a DataTable is editable and HTML is allowed, then a user has the ability to enter HTML tags or scripts. Would this be considered “trusted data” if the users are known and the app is password protected?

  • If there is no sensitive or confidential information in an app, does that make security less of an issue?

Bottom line is: Can you provide examples of the difference between trusted and untrusted HTML and more specific guidelines on how to use this new feature safely?

Creating your own components

I’d also like to know if there is a potential security risk from using custom components.

I’ve created a few new Dash components by wrapping published React components. For those unfamiliar with the process, after creating a project with the Dash Component Boilerplate the next step is to install the React component, for example:

$ npm install rc-color-picker

When I run this, I see:

found 12 vulnerabilities (1 low, 6 moderate, 5 high)
run npm audit fix to fix them, or npm audit for details

There are guidelines on what to do in the npm documentation (Auditing package dependencies for security vulnerabilities | npm Docs), and with the components I’ve created so far (excluding this one), I’ve been able to fix the vulnerabilities.

However, if you use a community-created Dash component in your app, how do you know whether the developer of that component addressed any vulnerabilities like these in the underlying React component?

I suppose you could clone the package and run npm audit, but how alarmed should you be if you see vulnerabilities? For example, I ran npm audit on the most current dash-table, v4.12.0, built yesterday:

found 28 vulnerabilities (1 low, 21 moderate, 6 high) in 2413 scanned packages
28 vulnerabilities require manual review. See the full report for details.

(Since this library is maintained by Plotly, I expect that if anything were truly an issue in production, it would have been fixed.)

Can you provide more guidance on security issues regarding creating and using custom components?

5 Likes

@AnnMarieW great questions! The short answer is that in many situations there isn’t a real risk but it can be very tricky to tell, so we try to always make our tools as powerful as possible in ways we know are safe, so that as few people as possible have any reason to use the risky patterns.

Given that Remarkable escapes the HTML

By default Remarkable escapes the HTML, meaning it doesn’t render it as HTML but as text, so it’s safe. But that’s exactly what we’re allowing users to turn off in the new version, so it will render HTML, in which case it’s not safe.
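Concretely, the switch in question is the table's markdown_options setting - roughly like this sketch (the column and cell content are made up):

import dash_table  # `from dash import dash_table` on Dash 2.x

dash_table.DataTable(
    columns=[{'name': 'note', 'id': 'note', 'presentation': 'markdown'}],
    data=[{'note': '**bold** and <b>html bold</b>'}],
    # left at the default ({'html': False}), the <b> tag stays escaped and shows up as literal text
    # opting in renders the tag as real HTML, so untrusted cell content could carry scripts
    markdown_options={'html': True},
)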

Is certain data riskier than others?

For sure, and there are other risks worse than XSS. SQL injection was mentioned above; that's a very common class of problem, but it has a straightforward solution: don't assemble your own query strings with user input - pass the values as parameters and let the library you're using do the escaping.

Some of the most serious of all are remote code execution attacks, which can arise in remarkably subtle ways, but often involving user input of binary data. For example, Plotly’s Chart Studio at one point had a vulnerability involving a user-provided image that we manipulated on the server with the Python library Pillow, and a hacker showed us how this could be used to run any shell command on the server - normal image formats are fine but the hacker made a PostScript file and tricked us into thinking it was a jpeg. Pillow then called out to GhostScript (which we didn’t even know we had installed) which then executed shell commands embedded in the PostScript.

Is it safe to use Fontawesome icons

Well-used open-source libraries like this are generally considered very safe, as they're constantly attacked on many websites. In any case, front-end-only libraries are usually just fine; most are XSS-safe or (like React) will make it clear when you're doing something potentially risky.

If a DataTable is editable and HTML is allowed, then a user has the ability to enter HTML tags or scripts. Would this be considered “trusted data” if the users are known and the app is password protected?

If the HTML the user enters gets stored on the server and loaded by a different user, then this is a possible XSS vector. You're right that an app that only exists behind a password or a corporate firewall carries less risk than one that's accessible to the whole internet, but there's still the possibility of disgruntled employees, or of attackers who have gotten past the firewall and are hunting for more. And sometimes the app has access to resources that individual users aren't supposed to have.

If there is no sensitive or confidential information in an app, does that make security less of an issue?

Absolutely, though there can still be avenues to exploit - for example, if users are authenticated and an attacker manages to steal their passwords, that could give them access to other apps, or the passwords could be reused on other sites. Or even if the app itself has no sensitive data, perhaps elsewhere in the same database there is some.

found 28 vulnerabilities (1 low, 21 moderate, 6 high) in 2413 scanned packages

I ran the audit today; dash-table is already up to 31 :sweat_smile: But check out the great work @archmoj has done with Plotly.js - a clean 0 vulnerabilities :muscle:

We do try to keep these in check, but it's also important to look at where these vulnerabilities are in the dependency chain, either from the audit report itself or by calling npm ls <package-name>. Right now all the vulnerabilities in the table are in either the test environment or the build system; none of them are actually bundled into the component. There's no risk if the vulnerability is only part of the development process. Also note that npm is generally on the overcautious side: they consider Regular Expression Denial of Service to be a high-priority vulnerability, but DoS can't compromise your app, only slow it down.

Can you provide more guidance on security issues regarding creating and using custom components?

Generally the only thing you need to worry about in a custom component is XSS - that is, rendering a prop directly into HTML or calling eval() on a prop to execute JavaScript. You don't need to worry about validating props that your component allows the user to set and that may then be used as callback inputs, because an attacker can always bypass any validation you do on prop values and pass fake props to the server - that's why callback inputs themselves must be treated as untrusted (see point 1 above).

In React code you write yourself it's pretty easy to be safe just by using normal React patterns. The place to be careful is when you're integrating with large 3rd-party libraries, particularly those written in vanilla JavaScript. These may assume that their configuration is hard-coded by the developer, not coming from user input. So when you're calling someone else's code and there's a prop or parameter that looks like it may be rendered as HTML or JavaScript (i.e. strings to be handled in some rich way), the safe thing is to not expose that parameter as a prop of your new component until you know the library protects against these attacks. If it doesn't, then it'll be quite hard for you to pass on any user input safely. Not impossible, but hard.

4 Likes

chriddyp said:

If you use Dash Enterprise, you don’t need to worry about this. Dash Enterprise connects to several IdPs (Active Directory FS, Azure AD, Ping Federate, Okta) through several standard protocols (LDAP, SAML)

I have a concern regarding this aspect of Dash Enterprise. Since we pay for Dash Enterprise, we want to make sure we use its features properly and get what we pay for.

Regarding these identity providers, what if no one in our organization knows how to use them? If we pay for Dash Enterprise, will instructions or assistance be provided so that someone who knows nothing about them can make it work?

If the organization uses Google Cloud Platform and has Google Identity and Access Management (IAM), does Google IAM count as an identity provider and does Dash Enterprise connect to it?

We can help! We have an enterprise support team that assists all of our customers with installation & configuration of all parts of the product, from writing apps to configuring the software. Get in touch with us to chat about your use case in more detail by filling out this form: Get Demo :slight_smile: