Deploying your Dash app to Heroku - THE MAGICAL GUIDE

So you have your Dash app running on your local machine and you’re finally ready to share it with the world on a public site.

The problem is: words like Git, Flask, Gunicorn and Heroku sound like strange mythical creatures, even after a few drinks. Worry not: having just gone through the process of deploying Dash to Heroku myself for the first time, I’ll share what I’ve learned along the way. I’ll outline some surprising pitfalls and the solutions I found, in the hope this will save you time and effort. This is the guide I wish I had when I started.

Background

In particular, I think the process to successfully deploy a Dash app on Heroku, as an example use case, is not trivial for many first timers. For example, simply serving static files (like audio, images, video) does not work out of the box with Heroku like it does from your local machine. I didn’t realise this, along with a few other quirks.

This essay is designed to supplement the existing documentation and attempts to fill in some gaps, explain the quirks, and provide a very brief fly-over of each component and how everything fits together. I’ll share notes from my personal experience. I’ll also attempt to explain the technology stack based on my imperfect understanding of how it all works.

No doubt I’ve got many things wrong so I welcome corrections.

Assumptions

  • You have a running Dash app locally hosted (with a requirements.txt)
  • You are relatively new to Python
  • You have a strong cup of coffee
  • You have never deployed a public web app before
  • Words like Heroku and Gunicorn scare you.

The Problem

In my research I scanned many blog articles and guides about deploying Dash to Heroku but found most to be a little lacklustre or not specific to Dash. They are typically bare minimum, light on explanation, and don’t outline the key issues you will encounter in the detail you need. If you’re working with Dash, chances are you are a newbie to Flask as well, so many concepts are not well understood. I’ve yet to see a guide that outlines the core concepts and pitfalls for specifically deploying a Dash app to Heroku. I believe this lack of comprehensive guidance is a (solvable) barrier to entry for many Dash users like myself.

How much pain is this going to be?

How hard really is it to Deploy to Heroku as a first timer? How much pain is involved? What problems will I encounter? In short, I’d say optimistically it can be done in a few hours, but realistically about 10hrs for a first timer. Pain meter: medium spicy.

This is because it takes time to set up the environments like GitHub and the Heroku command line interface, add special new files to your project directory (repository), modify the code in a few spots, and get used to the commands you need in order to see what’s going on and make it all work. It can be a little daunting at first, but once you’ve done the initial setup, you’re on easy street. Deploying code running on your laptop to your live public web app is a few mouse clicks and a single command in a terminal, which is super cool. (Caveat: this does not cover security and authentication; this is purely to deploy a hobby app.)

Why Heroku?

Like myself, you have probably heard of Heroku as a well loved platform for deploying hobbyist web applications for free. Of course it’s not the only one, there are many, and it is scalable to enterprise level deployments. But it is universally known and liked by the community, so I chose to go this route and see what all the fuss was about. Verdict so far: loving it. Key reasons the community loves Heroku (as far as I know):

  • It’s free for hobby apps, which is great to get started and for demos
  • It’s got clear, concise documentation
  • It natively supports Python web apps (read: minimal config needed)

What is Heroku?

It’s a platform-as-a-service (PaaS) for deploying and hosting web applications. In the context of your Dash app, this means Heroku provides the physical hardware (storage, compute), software (linux/unix/sql) services, and dependencies (packages), in a containerised environment to deploy and host your application on a publicly accessible URL (end-point). It does this through provision of virtualised linux containers called “Dynos” which essentially act as your personal linux webserver, with ram, cpu, and linux ‘installed’. (It’s not quite like this in reality but a good analogy).

Dynos come in a variety of types and can be scaled vertically (more ram, compute, storage per instance) or horizontally (duplicate dynos in parallel) as your specific project requirements demand. This can be done almost instantaneously at the command line. The free version gets you one dyno with up to 500MB storage and 512MB ram. It sleeps after 30 minutes of inactivity, presumably so Heroku resources are not drained. So the catch with the free version is that your website can take a good 30-60 seconds to load initially, as your free dyno is provisioned on demand. If you go to a paid plan, starting at about $7USD/month, your dyno(s) stay on and ready 24hrs/day.

Why GitHub?

In short - Heroku natively supports deploying repositories that reside in GitHub. This is good news. Basically it means if your project is already in a GitHub repository (free for private/public repos) then you can easily deploy it on Heroku AFTER you have added a few additional files that are outlined in the deployment guides, and in a section further below.

If you’ve been developing your pet project on a local machine, this is an important step to take it to a public (or private) cloud repository. It’s a good move anyway because you have full versioning history, you are protected from hdd failure, and you can share publicly or privately etc. It does however come with intellectual debt, with some interesting concepts and terminology to get your head around, like clone, fork, merge, push, pull, commit.

Yet another barrier to newcomers is the issue of security and how this affects you accessing and changing the code in your cloud repository on GitHub or similar. Basically GitHub wants a secure connection between your computer and its servers before it will happily accept code changes. There are two main ways it achieves this: using credential authentication over HTTPS (requiring a username/pass every time a connection is made), or via SSH public/private key encryption, which can be fiddly to set up on Windows. This extra complication, combined with the other scary words like clone, fork and merge, can be a little overwhelming at first. Fear not, there is a desktop app that can set up a secure connection between itself and your GitHub repo, facilitating seamless easy updates to your code repository.

If you are developing on a Windows/Mac machine (which I assume the majority of first timers are), I’d highly recommend getting the GitHub Desktop application. This just makes the process of cloning, fetching and pushing changes back to your repository on GitHub MUCH easier without the need for any command line. It’s not that I’m against the command line, it’s just that this particular process can be clunky on Windows, requiring either user credential authentication or SSH keys. (If you are more hardcore you can of course install Windows Subsystem for Linux, a way to have a fully functioning Linux system on Windows without partitioning your HDD. If you do this you can set up SSH keys and enjoy the benefits of Linux for managing your code repo, but for first timers it’s really not necessary.) In short, you can avoid a lot of hassle just by using the Desktop app and setting up the security within the app; then it’s a single click of the mouse to commit changes and push updated code to your repo on GitHub.

What is GIT?

No one really knows.

What is Gunicorn?

When you figure it out, let me know. Gunicorn, to the best of my understanding, is a production-ready HTTP server specifically for Python web applications which runs natively in Unix. If you’ve been developing your Dash app purely on your local machine @ ‘localhost:8080’ or ‘http://127.0.0.1:8050/’ you will be running the lightweight development HTTP server that ships with Flask (via Werkzeug), which is installed alongside Dash. This is not Gunicorn. It’s likely you have not yet glimpsed this rare and mythical creature of the forest.

The local development server (which comes with Flask, not with Python itself) is started automatically when you run your Dash app on your local machine. The issue is, it’s not designed for handling incoming traffic from a production website, so when you deploy to the web you need a production-ready HTTP server. A popular one is Gunicorn. Notably, Heroku provides native support for Gunicorn, which makes things easy. It’s all outlined in the guides, but just to clarify, all you need to do is add a single line of code to your Dash app (‘server = app.server’), add Gunicorn to your requirements.txt so it is installed as a package (by Heroku at deployment), and reference it in a special file you will create called the Procfile. More on this later, but I think it’s worth briefly touching on the HTTP server as it’s all a bit mysterious the first time.
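To make the "HTTP server talks to your app" idea a bit more concrete, here is a minimal sketch of the kind of object a server like Gunicorn calls for every request: a WSGI callable. As far as I understand it, the Flask object behind your Dash app (app.server) is one of these. The names tiny_wsgi_app and call_once are hypothetical, made up for this illustration; the sketch uses only the Python standard library and fakes a single request instead of starting a real server.

```python
from wsgiref.util import setup_testing_defaults


def tiny_wsgi_app(environ, start_response):
    """A bare-bones WSGI application: the server calls this once per request."""
    body = f"Hello from {environ['PATH_INFO']}".encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_once(app, path="/"):
    """Play the role of the HTTP server for one fake request (illustration only)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the keys a real server would supply
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        # A real server would send these to the browser; here we just record them.
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(call_once(tiny_wsgi_app, "/dashboard"))
```

The point is only that the server and the application are separate pieces: Gunicorn does the listening and queueing, and hands each request to a callable like this one.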

Web is hard

This is a simple truth. Web is multi-layer, multi-language, multi-protocol, multi-platform, multi-user. It’s a mind boggling chain of infrastructure bolted to other infrastructure to make a modern web application run. For many non-IT people (and IT people for that matter), even the concept of a locally hosted webserver takes a bit of abstract thought, let alone understanding the true technology stack that lies underneath a real application. It’s also worth reflecting on just how new some of this technology is, so I’ve indicated the year these tools were created in the table below.

The simplified technology stack

This is imperfect, so please help me to correct it. But it’s useful, I think, to see some of the layers required to get your code actually deployed onto the web. We start with your actual code at the very top of the stack, and drill down layers all the way to Heroku.

| Layer | Created | Name | Note |
| --- | --- | --- | --- |
| User code | today | Your code | Python code for your application |
| Web application | 2017 | Dash | Allows an entire website to be written in Python by wrapping web components and facilitating 2-way communication (callbacks) |
| Javascript library | 2015 | Plotly.js | Allows access to a powerful ecosystem of data visualisations (40 chart types) that run responsively client side. Dash is built on top of this library. The library itself is built on top of d3.js and stack.gl. |
| Javascript library | 2013 | React.js | An open-source, front end, JavaScript library for building user interfaces or UI components. Notably Dash is written on top of this. |
| Web framework | 2010 | Flask | Collection of modules and libraries to abstract away hard things in web like protocols and threads. Flask is the underlying web application that sits under Dash. Dash is, in essence, a Flask application. |
| Web template engine | nfi | Jinja | Template engine for Python. Something important for serious developers. Novices can ignore. |
| WSGI toolkit | nfi | Werkzeug | List of web application libraries (used by Flask natively). Novices can ignore. |
| WSGI (HTTP) server | 2010 | Gunicorn | A popular Python HTTP server; the thing that manages incoming requests from the browser. |
| Code repository | 2008 | GitHub | A free facility to store, collaborate, manage, update and deploy your code securely in the cloud. |
| Web deployment & hosting | 2007 | Heroku | A scalable platform-as-a-service (PaaS) to deploy and physically host your web application and make it accessible on the internet. |

The point I’m trying to make here is that this grossly oversimplified web technology stack is still far from simple; to say nothing about front end layers such as javascript, CSS etc. Web is hard because of the sheer number of abstraction layers. Dash, to me, is a beautiful abstraction that builds on everything below it to simplify what is actually an insanely complex machine: the modern data-rich web application.

Dash-Heroku deployment, in a nutshell

What actually needs to be done:

  1. Dash app running on localhost
  2. Install Git
  3. Set up a GitHub account (+ recommend installing GitHub Desktop)
  4. Set up a Heroku account (+ install the command line interface)
  5. Add dependencies and special files (i.e. add Gunicorn to requirements.txt, create Procfile and runtime.txt)
  6. Clone repo from GitHub to local machine (only once)
  7. Create Heroku app linked to your repo (only once, ref deployment guides, Heroku CLI)
  8. Commit and push your code changes to GitHub repo (repetitively)
  9. Deploy/Re-deploy Heroku app by pushing changes from Heroku CLI (“git push heroku main”)

Deployment Guides

The guides below are concise and useful, and I would of course start with these. If I’m honest I think they are a little light on detail for newcomers and would benefit greatly by having a supplementary explanatory guide akin to something like this essay.

The magical ingredients to add to your project

A quick note on the special files you need to get your Python project deployed to Heroku. These are outlined in the deployment guide, so I’ve just provided a few notes from my experience:

Ingredient 1: Procfile

This strange extensionless file must reside in your project root, and tells Heroku how to handle web processes (in our case using Gunicorn HTTP server) and the name of your Python application.

Typically the Procfile would contain a single line:

web: gunicorn app:server

Where:

  • ‘web:’ tells Heroku the dyno main process is a web process
  • ‘gunicorn’ tells Heroku that the HTTP server to use is Gunicorn (for which it has native support)
  • ‘app’ references the filename of the main python file without the .py extension. So if you follow the convention of ‘app.py’ you would use ‘app’ here. But note if your main python file is ‘anything.py’, you would have ‘anything’ in place of ‘app’.
  • ‘server’ references the underlying flask app. Commonly you would define a variable ‘server = app.server’ and this references that variable, I believe. To be more confusing, the ‘app’ in this variable declaration actually refers to the dash instantiation variable in the snippet below:

import dash

app = dash.Dash(__name__)
server = app.server

Yes I know what you’re thinking, this is finicky and it’s really easy to misunderstand with all these ‘app’ references everywhere. Take home is: as long as you are using an app.py main file, as is the convention, and you declare a ‘server = app.server’ line of code after your Dash declaration, you can use the example Procfile and it should work. If you get anything with the Procfile wrong: pain and suffering will ensue.
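If the ‘app’ and ‘server’ references are still confusing, the sketch below pulls a Procfile line apart into the pieces described above. The function name parse_procfile_line is hypothetical, and the sketch assumes the simple form "web: gunicorn module:variable" with no extra Gunicorn flags; it is only meant to show which word maps to what, not how Heroku itself reads the file.

```python
def parse_procfile_line(line):
    """Split a Procfile entry like 'web: gunicorn app:server' into its parts."""
    process_type, _, command = line.partition(":")
    parts = command.split()
    # Assumption: the last token is '<module>:<variable>' and there are no extra flags.
    if len(parts) < 2 or ":" not in parts[-1]:
        raise ValueError(f"Expected '<type>: <server> <module>:<variable>', got {line!r}")
    module_name, _, variable_name = parts[-1].partition(":")
    return {
        "process_type": process_type.strip(),  # 'web' -> a web process
        "server": parts[0],                    # 'gunicorn' -> the HTTP server
        "module": module_name,                 # 'app' -> app.py
        "variable": variable_name,             # 'server' -> the 'server = app.server' line
    }


if __name__ == "__main__":
    print(parse_procfile_line("web: gunicorn app:server"))
```

So if your main file were ‘anything.py’, the "module" part would need to be ‘anything’, while the "variable" part stays ‘server’ as long as that is the name you gave the Flask object.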

To make the Procfile in Windows, from memory, you can just create a text file, enter the single line, then strip out the extension. (This worked for me, and I did not need a secondary Procfile.win, which is sometimes talked about in the documentation.)

Ingredient 2: runtime.txt

This file (which must also be in your root project folder) simply tells Heroku which Python runtime to use. Currently it can contain a single line, e.g.:

python-3.7.8

Just create this as a notepad .txt file in windows. Done.

That’s really it. It’s mainly these two files (Procfile, runtime.txt) that Heroku needs in your repo project directory in order to work. As long as you have followed the basics, and added Gunicorn to your requirements.txt etc, in theory you are good to go.
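Before pushing, it can save a failed deploy to check that the ingredients are actually in place. Below is a small pre-flight sketch under my own assumptions: the function name missing_deploy_items is hypothetical, it only looks for the three files mentioned in this guide, and it does a naive text check for a line starting with "gunicorn" in requirements.txt.

```python
from pathlib import Path


def missing_deploy_items(project_root):
    """Return a list of human-readable problems found in the project folder."""
    root = Path(project_root)
    problems = []
    for name in ("Procfile", "runtime.txt", "requirements.txt"):
        if not (root / name).is_file():
            problems.append(f"{name} is missing from the project root")
    requirements = root / "requirements.txt"
    if requirements.is_file():
        lines = [line.strip().lower() for line in requirements.read_text().splitlines()]
        # Naive check: any line beginning with 'gunicorn' counts as listed.
        if not any(line.startswith("gunicorn") for line in lines):
            problems.append("gunicorn is not listed in requirements.txt")
    return problems


if __name__ == "__main__":
    for problem in missing_deploy_items("."):
        print("WARNING:", problem)
```

Run it from your project root; no output means the basic files are there. It says nothing about whether their contents are correct.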

Ingredient 3: perseverance

Not to be underestimated, dogged perseverance and determination is a key ingredient to the potion.

It’s magic time

You’ve got your code in a GitHub repository, with the required tweaks and files created. You have Heroku CLI installed and have created a Heroku app linked to your GitHub repo. It’s 4am and the sun is coming up soon. It’s show time.

Deploy from the Heroku command line interface:

git push heroku main

These four words are the spell that makes the magic happen.

Type them into the Heroku CLI in the right conditions, sit back smugly, and enjoy the show.

For those new to Heroku, if everything has worked after your “git push heroku main” from the Heroku CLI, your app will be deployed to a Heroku subdomain like:

http://blah.herokuapp.com/

If this is the case, recommend a little dance.

Copy-paste the URL displayed in the Heroku CLI into your browser and get ready…to be disappointed. Chances are the first time you will see “APPLICATION ERROR” or something like that. Don’t panic.

The first thing you should do is bring up the log (which is effectively your python console) and see what’s going on, from the Heroku CLI. Any print statements, or logger outputs from your code will display here just as they do in the console on your local machine.

View logs:

heroku logs --tail

Check for things like “ModuleNotFoundError” and other simple problems. The most common problem I’ve found is forgetting to add packages to my requirements.txt file because I frantically installed them on my local machine with conda/pip to get something working. If you’ve found some obvious problems, fix them, re-push your code to GitHub, and then redeploy from the Heroku CLI with “git push heroku main”.
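One way to catch the forgotten-package problem before deploying is to compare what your code imports with what requirements.txt lists. This is a rough sketch with hypothetical function names, using only the standard library. It assumes simple "name==version" style requirement lines, and it will raise false alarms where the import name differs from the package name (for example ‘sklearn’ vs ‘scikit-learn’), so treat the output as a hint only.

```python
import ast
import sys


def top_level_imports(source_code):
    """Collect the top-level module names imported by a piece of Python source."""
    names = set()
    for node in ast.walk(ast.parse(source_code)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module.split(".")[0])
    return names


def requirement_names(requirements_text):
    """Extract bare package names from requirements.txt content."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#")[0].strip()  # drop comments
        if line:
            for sep in ("==", ">=", "<=", "~=", "!=", ">", "<", "["):
                line = line.split(sep)[0]
            names.add(line.strip().lower().replace("-", "_"))
    return names


def possibly_missing(source_code, requirements_text):
    """Imports that are neither standard library nor listed in requirements.txt."""
    # sys.stdlib_module_names exists from Python 3.10; older versions fall back to an empty set.
    stdlib = getattr(sys, "stdlib_module_names", set())
    listed = requirement_names(requirements_text)
    return sorted(name for name in top_level_imports(source_code)
                  if name not in stdlib and name.lower() not in listed)


if __name__ == "__main__":
    code = "import dash\nimport pandas as pd\n"
    print(possibly_missing(code, "dash\ngunicorn\n"))
```

In practice you would read your app.py and requirements.txt from disk and pass their contents in.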

Notably though, the first time is the hardest. Errors in your Procfile can still cause Heroku to deploy successfully, but the dynos will crash or fail to start, so definitely check the Procfile. By now the sun is likely coming up and it’s a work day. But it’ll all be worth it when you see your app hosted publicly, so carry on.

Special note: if there are no changes to your repo, Heroku will not deploy, which makes sense. So if you have a repo cloned onto your local machine and you are making changes, be sure to commit and push them to your GitHub repo first (either from the command line or with GitHub Desktop), then in the Heroku CLI terminal, just type in the deploy command.

HEROKU TIP: Useful commands from Heroku CLI

Below is my list of critically important tips, pitfalls, and pitfall solutions when using Heroku.

Explicitly referencing your app name:

Note Heroku can sometimes be funny about requiring you to explicitly specify your app in the command. If you just have a single heroku app, often you can avoid it. But often you may need to append “-a <yourapp>” to the command.

Display current apps:

heroku apps

Display current dynos:

heroku ps

heroku ps -a <yourapp>

Scale dynos:

heroku ps:scale web=2:standard-2x

In this case we are provisioning two standard-2x dynos to run concurrently. Special note, if WEB_CONCURRENCY=4, this means each Dyno can serve 4 simultaneous HTTP incoming requests, meaning your whole application can serve 8 concurrent requests; the benefit of horizontal scaling.
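The arithmetic is trivial, but writing it down makes the horizontal scaling trade-off explicit. A tiny sketch, with a hypothetical function name, assuming (as described above) that each worker process serves one request at a time:

```python
def total_concurrent_requests(dyno_count, web_concurrency):
    """Total requests served at once, assuming each worker handles one request at a time."""
    if dyno_count < 0 or web_concurrency < 0:
        raise ValueError("counts must not be negative")
    return dyno_count * web_concurrency


if __name__ == "__main__":
    # Two dynos, each with WEB_CONCURRENCY=4, as in the example above.
    print(total_concurrent_requests(dyno_count=2, web_concurrency=4))
```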

Run bash terminal:

heroku run bash -a <yourapp>

Restart dynos:

heroku dyno:restart

Add additional log metrics:

heroku labs:enable log-runtime-metrics

View logs:

heroku logs --tail

HEROKU TIP: Add log-runtime-metrics to log

From the Heroku CLI (once logged in) when you have deployed your app, you can view a live log tail by typing

heroku logs --tail

Repeated just in case you missed it. This essentially gives you your console output. One thing I’d suggest is enabling a feature that outputs resource statistics for your dyno(s), timestamped every 20 seconds, such as memory levels and CPU load, which is very useful. Type this in the Heroku CLI to permanently add it:

heroku labs:enable log-runtime-metrics

HEROKU PITFALL: Serving static files does not work

I repeat: serving static files DOES NOT WORK. Something of paramount importance that is not obvious, is that out-of-the-box, Heroku (I think more correctly: Gunicorn itself) does not natively support serving static files. This means, whilst your python application itself can access files in any subfolder in your project folder (such as .csv files and the like) it’s a very different story to actually serve them via http in the client browser.

This means any images, video, audio, anything you are currently serving from your ‘localhost’ webserver will fail on deployment with Heroku. I believe this is a quirk of the PaaS model in that files themselves are not stored in the traditional way you would imagine them to be on a file system so there are issues with low level headers that are attached to files, and/or Gunicorn itself does not natively support serving static files. In any regard, there is magic under the hood.

As an aside, if you don’t already know from the docs, it’s important to understand that the Heroku file system is not persistent. Like many of my past relationships, Heroku’s file system is ephemeral or transient. It lasts about as long as a one night stand. With the exception of the files you deploy with your repo (e.g. csv, json files etc.), any new files created at runtime will disappear when the dyno restarts, which Heroku does at least about once a day.

Anyway, to store and serve persistent static files, as I said any files uploaded to Heroku as part of your project file suite will be fine and persistent, and accessible by your dash app internally. BUT, the moment you want to serve static files externally to your Heroku-Dash deployment, you will rapidly run into problems. There are two main solutions, one is simple and fast.

Solutions:

  1. Host your files on a 3rd party like S3, Cloudfront and link the URL in your dash app (Worth doing if you will be hosting a serious footprint of files)
  2. Use the Whitenoise library. Quick and easy. A few lines of code and you’re serving files in the way you would imagine.

Personally I found whitenoise to be a life saver. Literally “pip install whitenoise” (and make sure it’s in your requirements.txt) and you’re almost there. Two lines of code needed in your dash app:

from whitenoise import WhiteNoise
server = app.server
server.wsgi_app = WhiteNoise(server.wsgi_app, root='static/')

You should already have the server=app.server anyway as this is needed by Gunicorn and for the Procfile. What this essentially does is set a folder (which you must create) called “/static” in your root. Everything contained within this (including subfolders) can be statically served by Heroku. Images, videos, pdfs, whatever the hell you want. Just note Heroku is extension case-sensitive. So blah.png is different to blah.PNG.
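Because the case-sensitivity issue only bites after deployment (Windows and default macOS file systems are not case-sensitive), it can help to scan the static folder locally first. A minimal sketch with a hypothetical function name; it simply lists files whose extension contains upper-case letters so you can rename them or double-check the references in your code.

```python
from pathlib import Path


def uppercase_extension_files(static_root):
    """List files under static_root whose extension contains upper-case letters."""
    root = Path(static_root)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file() and path.suffix != path.suffix.lower()
    )


if __name__ == "__main__":
    # Assumes the 'static' folder described above sits in the current directory.
    if Path("static").is_dir():
        for name in uppercase_extension_files("static"):
            print("Check references to:", name)
```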

Also, don’t try to get smart and change the ‘static’ folder name in the whitenoise code to some arbitrary name or ‘assets’ or anything like that: in my experience it had to be ‘static’, which I believe is down to an underlying Flask convention. Period.

This seems like a pretty major issue that I don’t think much documentation exists on. I spent a long time on Stackoverflow looking it up. I really think it should be a sticky thread on the Plotly Forums or something.

Also, the Whitenoise documentation is not specific to Dash, it is more focused on general Python apps which are typically Flask apps. This means that it’s still not obvious what you need to do, and the code snippets will not work without modification. For example whitenoise states for Flask apps, you must add the following code to your app:

app.wsgi_app = WhiteNoise(app.wsgi_app, root='static/')

This won’t work for your Dash app. In this case ‘app’ is the Flask app. So in a Dash app (which sits on top of Flask) you actually need to replace the ‘app’ with ‘app.server’ in the snippet above to reference the underlying Flask app and for whitenoise to work. Or simply define a variable such as ‘server = app.server’ and use the code snippet I outlined at the beginning of this section.

Again, lots of these things are a 2 second fix if you know how. But can cost you literally HOURS AND HOURS……AND HOURS of time if you don’t know. Trivial for Flask developers. Not trivial at all for newcomers.

HEROKU PITFALL: Favicon may not work

For some reason I had lots of trouble with this. Anyway, I managed to get it going by simply having a file at:

/assets/favicon.ico

From my root project directory. Special note that no other static files are served from here, it’s a stand-alone folder. In fact, don’t be lulled into thinking you can serve static files from your /assets folder on Heroku: you can’t. (see whitenoise section). Others have had problems with Heroku changing the extension name of the favicon causing it to fail. One failsafe option to note is you can in fact log into a Heroku Bash shell after you have deployed, and navigate to all your project folders/files to see what Heroku sees. See this post.

From heroku CLI:

heroku run bash -a <yourappname>

This will provision a new Dyno container running a Bash shell. Basically it’s a terminal to your deployed app.

HEROKU PITFALL: Web concurrency is important and can be configured

There is lots of ‘worker’ and ‘web’ terminology that gets confusing. Out of the box, when using Gunicorn as your Python HTTP server, Heroku essentially guesses how many concurrent web worker processes to run for each dyno instance running your web app. Typically this is 1-6 concurrent Gunicorn worker processes per dyno for the commonly used hobby to standard-2x dynos. This is how many client requests (i.e. from a web browser) can be served simultaneously by your app at any instant.

A Gunicorn worker process is a process capable of serving a single HTTP request at a time. So if you only had one, your website would become quite unresponsive with a few users making simultaneous requests, as they wait for their requests to be actioned from a queue. Essentially this is what Gunicorn does: it forks the main web process running on its dyno into multiple worker processes (separate processes by default, not threads). Web concurrency in Heroku allows each dyno instance to carve up its resources to serve multiple concurrent HTTP requests, and the number of workers is controlled by a setting called WEB_CONCURRENCY. The problem is, this can sometimes lead to underestimating the resources needed and running over dyno memory limits, causing failures, restarts, and massive slow downs due to disk swap having to be used. Basically you don’t want too much web concurrency because it might break your dyno.

As I said, you don’t need to worry about this day 1, your app will work. But as you start load testing it, you may find you run into memory overrun issues and all sorts of things like that. If you have a high horsepower python application that chews resources, suggest you manually set your WEB_CONCURRENCY variable in heroku command line.

For example:

heroku config:set WEB_CONCURRENCY=3

heroku config:set WEB_CONCURRENCY=3 -a <herokuappname>

If performance is not compromised, you can increase web concurrency to increase the number of clients you can serve in parallel, while minimising dyno cost. If you need to serve more, you can scale dynos horizontally knowing that each one can serve an explicit number of concurrent HTTP requests.

And of course you can monitor this with “heroku logs --tail” or in the Heroku dashboard METRICS section.
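To pick a sensible WEB_CONCURRENCY value, a back-of-envelope calculation is usually enough: divide the dyno memory by the memory one worker uses. The sketch below uses a hypothetical function name and made-up figures (a 512 MB dyno and 180 MB per worker are assumptions for illustration only; measure your own app from the log metrics).

```python
def workers_that_fit(dyno_memory_mb, per_worker_mb):
    """Largest worker count whose combined memory stays within the dyno limit (minimum 1)."""
    if per_worker_mb <= 0:
        raise ValueError("per_worker_mb must be positive")
    return max(1, int(dyno_memory_mb // per_worker_mb))


if __name__ == "__main__":
    # Assumed figures: 512 MB dyno, roughly 180 MB per Gunicorn worker.
    print(workers_that_fit(dyno_memory_mb=512, per_worker_mb=180))
```

If a single worker already needs more memory than the dyno has, the function still returns 1, which is your cue to slim the app down or move to a bigger dyno rather than add workers.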

HEROKU PITFALL: Hard limit 30 second request timeout

It’s important to be aware that Heroku has an unchangeable 30 second timeout for serving HTTP requests. This is a common problem, especially for Dash users, because many data science applications have long load times (see this post). These might work fine on your localhost, but be aware that your Heroku deployed app must respond within 30 seconds or the request will time out. The Heroku docs describe a few workarounds, but take special note of this problem.
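A cheap way to find out which parts of your app are at risk is to time them locally and print a warning when they get close to the limit. Here is a minimal sketch using only the standard library; the decorator name warn_if_slow and the 80% threshold are my own assumptions, and since it just prints, the warning would also show up in “heroku logs --tail” once deployed.

```python
import functools
import time

HEROKU_TIMEOUT_SECONDS = 30  # the request limit described above


def warn_if_slow(threshold_seconds=HEROKU_TIMEOUT_SECONDS * 0.8):
    """Decorator that prints a warning when the wrapped function exceeds the threshold."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = func(*args, **kwargs)
            elapsed = time.perf_counter() - start
            if elapsed > threshold_seconds:
                print(f"WARNING: {func.__name__} took {elapsed:.1f}s "
                      f"(Heroku cuts requests off at {HEROKU_TIMEOUT_SECONDS}s)")
            return result
        return wrapper
    return decorator


@warn_if_slow()
def load_big_dataset():
    """Hypothetical stand-in for a slow data-loading step in a callback."""
    time.sleep(0.01)
    return "data"


if __name__ == "__main__":
    print(load_big_dataset())
```

Note this only measures and warns; it does not stop Heroku from cutting the request off.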

HEROKU PITFALL: Develop on the master/main branch of your GitHub repository

If you are new to GitHub, just know that you can have multiple ‘branches’ of your project as you might take it in different directions. These can be merged or left as separate branches. The central branch by default is called master or main in GitHub. When you create your Heroku app it interacts with your GitHub repository to create a kind of Heroku mirror image behind the scenes. If you are developing your current code on a branch that is not master or main, prepare for pain. It’s not that it can’t be done, I just had a lot of trouble with this when trying to deploy to Heroku and found the best rule of thumb is to just develop all my code on the default ‘main/master’ branch in my GitHub repository.

Custom Domain

It’s not too difficult to setup a custom domain for your Heroku app. Obviously you need to purchase a domain first. Once you’ve done that, the provider will typically have a portal where you can login and adjust settings.

Heroku will generate a unique DNS target in the SETTINGS area of the dashboard, once logged in, such as:

Animate-salamander-8duwlndghfqbtj0t90uep8bmu.herokudns.com

What you need to do is copy this DNS target from the Heroku portal (settings page) and then login to your domain provider portal (e.g. Namecheap) and for your domain, create a new “CNAME record” with host “www” value “Animate-salamander-8duwlndghfqbtj0t90uep8bmu.herokudns.com” (your unique Heroku DNS target).

If all went well, your new domain should work within a few hours!

Essentially, all this does is: when someone types your actual domain name www.blah.com, it is pointed at the Heroku DNS target, which sends the incoming HTTP request to Heroku infrastructure, which then serves the actual page (as if you’d typed in blah.herokuapp.com). This was not entirely obvious to me as a newcomer. Again, I fumbled my way through this as a first timer and it was pretty painful despite good documentation.

Flask Caching on Heroku
If you have Flask Caching running on your local machine, it’s straightforward to set up on Heroku with a free Memcachier account, and the docs are good. You can cache to the ephemeral Heroku file system without Memcachier, noting you might max out your 500MB of dyno storage; otherwise you can get 100MB of free high performance cache via Memcachier.

Getting Fancy with security and autoscaling etc

When you want to go to the next level and set up auto-scaling of machines, proper security/authentication etc., I think this is when it starts becoming worth considering Dash Enterprise OR going down the path of provisioning your own virtual machines, setting up containerised pipelines using Docker and Kubernetes, and managing autoscaling with Rancher, for example. It’s DevOps territory.

For the newbies and hobbyists like me out there, I sincerely hope this has helped you get your project up and running faster with less pain 🙂

Cheers
Dan

Hi all

In case anyone is interested, I’ve turned this little essay into a publication on Towards Data Science, which just went live today!

Have given Dash and Dash-Enterprise a plug, and attempted to fill in some explanatory content to supplement the documentation for specifically deploying Dash to Heroku.

Please let me know if I have anything wrong! 🙂

Deploying your Dash App to Heroku — THE MAGICAL GUIDE | by Dan Baker | Nov, 2020 | Towards Data Science

Hey Dan,
Very Nice Guide. Thanks for it.
But I am having an issue with the nltk library: Heroku is not able to download it on the server, although it works fine on my local machine.
Here is the error it gives.

remote: -----> Installing requirements with pip
remote: -----> Downloading NLTK corpora…
remote:  !     'nltk.txt' not found, not downloading any corpora
remote:  !     Learn more: https://devcenter.heroku.com/articles/python-nltk

I have looked at this page but can’t figure out where to put the nltk.txt file in my Dash folder. Can you help me see what I am doing wrong?
Here is how I am doing it, as described in the link above, but I still get the same error.

@matsujju hmm

Can you share your requirements.txt?

Assuming you have nltk specified in the requirements.txt file as well? I think this tells Heroku to then open the nltk.txt file itself. I assume you do, because it says downloading NLTK and then looks for the text file.

Strange that it can’t find it in the root.

One option I’d try, after it fails to deploy, is to remote in with a bash terminal to check whether you can see the nltk.txt file in your project root on Heroku.

heroku run bash -a <yourapp>

Once you’re in, you should just be able to type:

ls

To list the root folder, and

cat nltk.txt

To display it. If it’s not there for some reason, that’s the obvious problem. No idea why it wouldn’t be.

You could also try playing around with the Heroku buildpack:

$ heroku buildpacks:set heroku/python

In case it’s some weird thing. It seems like it’s relatively experimental with Heroku so it could be some other problem as well. Good luck!

Also, try removing the .txt extension from your nltk file. If you look at your requirements.txt it doesn’t have the extension as it’s already classified as a text document.

So in reality, your current nltk file may actually be nltk.txt.txt

(see how your requirements file has no extension and the nltk file does in your screenshot)

Shooting the breeze here but it might be an annoying little thing like that. Remoting in with the bash shell in Heroku will allow you to verify 100% what Heroku sees.
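If you’d rather check from Python than eyeball a file listing, here is a minimal sketch of that idea. It assumes nothing about Heroku itself: `find_double_extensions` is a hypothetical helper (my own name, not from any library) that flags names like nltk.txt.txt in a folder, and the main block prints every name with repr() so hidden extensions or stray spaces show up.

```python
from pathlib import Path


def find_double_extensions(folder="."):
    """Return file names whose last two suffixes are identical, e.g. 'nltk.txt.txt'.

    Hypothetical helper for illustration only.
    """
    suspects = []
    for entry in Path(folder).iterdir():
        suffixes = entry.suffixes  # ['.txt', '.txt'] for 'nltk.txt.txt'
        if entry.is_file() and len(suffixes) >= 2 and suffixes[-1] == suffixes[-2]:
            suspects.append(entry.name)
    return sorted(suspects)


if __name__ == "__main__":
    # repr() makes the full name visible, including any hidden extension
    for entry in sorted(Path(".").iterdir()):
        print(repr(entry.name))
    print("Possible double extensions:", find_double_extensions("."))
```

Run it from your project root locally. The same script should also work from the Heroku bash shell, assuming Python is on the path there.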

Example here from my app:

Shell in, and list contents of my requirements.txt from Heroku

[quote="dan_baker, post:5, topic:46723"]
Also, try removing the .txt extension from your nltk file. If you look at your requirements.txt it doesn’t have the extension as it’s already classified as a text document.

So in reality, your current nltk file may actually be nltk.txt.txt
[/quote]

I feel so dumb for making this silly mistake 😔
But now my app is crashing. Here is what I see in the Heroku logs.

2021-01-16T14:03:01.333606+00:00 app[web.1]: [2021-01-16 14:03:01 +0000] [41] [INFO] Worker exiting (pid: 41)
2021-01-16T14:03:01.333612+00:00 app[web.1]: [2021-01-16 14:03:01 +0000] [42] [INFO] Worker exiting (pid: 42)
2021-01-16T14:03:04.207370+00:00 heroku[web.1]: Process running mem=969M(189.3%)
2021-01-16T14:03:04.209269+00:00 heroku[web.1]: Error R14 (Memory quota exceeded)
2021-01-16T14:03:07.378274+00:00 app[web.1]: [2021-01-16 14:03:07 +0000] [4] [INFO] Shutting down: Master
2021-01-16T14:03:07.378535+00:00 app[web.1]: [2021-01-16 14:03:07 +0000] [4] [INFO] Reason: Worker failed to boot.
2021-01-16T14:03:07.618731+00:00 heroku[web.1]: Process exited with status 3
2021-01-16T14:03:07.652068+00:00 heroku[web.1]: State changed from up to crashed

Ha, I’m glad it worked. It’s ALWAYS the little things like this. I spent 3 hours trying to get Whitenoise working for my app, and it was 3 characters stopping it from working.

Anyway, worry not, I’ve seen that memory quota error before. Your Dash app is consuming more than the 512MB of RAM available on your dyno (the log shows 969M, or 189.3% of quota). This is likely because (if you are using Gunicorn as your WSGI HTTP server) Gunicorn has forked into multiple worker processes in order to serve multiple instances of your app on one dyno, with the number of workers taken from the WEB_CONCURRENCY setting. The default has a habit of overdoing this for Dash apps, which are resource hungry, so you run over memory. If your app requires, say, 200MB of RAM to run, and Gunicorn forks the main process into 5 subprocesses, then it’s using roughly 5 x 200MB of RAM, and instantly tops out your poor dyno.
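To put rough numbers on that, here is a back-of-the-envelope sketch. The function names and the 200MB figure are illustrative assumptions, the 512MB quota is inferred from the log above (969M reported as 189.3%), and real usage will differ a bit because the Gunicorn master process and any shared memory are ignored.

```python
def total_memory_mb(per_worker_mb, workers):
    """Rough estimate: every worker process holds its own copy of the app."""
    return per_worker_mb * workers


def max_workers(per_worker_mb, dyno_quota_mb):
    """Largest worker count that stays within the dyno quota (never below 1)."""
    return max(1, dyno_quota_mb // per_worker_mb)


DYNO_QUOTA_MB = 512  # assumed quota, inferred from the log line above

print(total_memory_mb(200, 5))          # 1000
print(max_workers(200, DYNO_QUOTA_MB))  # 2
```

So under these assumptions, a 200MB app fits two workers on a 512MB dyno, and anything heavier is better off with a single worker.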

There are two super quick solutions:

  1. Set your web concurrency (WEB_CONCURRENCY) to 1 or 2, just to minimise resource drain on your dyno.
heroku config:set WEB_CONCURRENCY=1
  2. Scale up to a bigger dyno (e.g. standard-2x, which has 1GB of RAM)
heroku ps:scale web=1:standard-2x

This should at least get you up and running. You are so close!

Where should I type these commands?

Type them straight into the Heroku command line interface (your CMD window, that is logged into Heroku)

You may need to specify -a <appname> as well. Sometimes it gets funny.

But yeah, to run those commands, it must be from the CLI.

Basically from the same terminal where you were viewing the log tail.

Just hit CTRL-C to cancel the log tail, and type them straight in! Then reboot your dynos

heroku dyno:restart

Dan

I am using free-tier so can’t use this.

See if you can get web concurrency to 1.

If it still runs over memory, you will probably have to upgrade to a hobby account and actually spend a little cash, just to get it up and running.

Currently your app is consuming 969MB of RAM at boot, which is beyond the 512MB of memory on the free and standard-1x dynos.

As I mentioned, it’s likely due to web concurrency, so try that first! Then if all else fails, spend cash and go to the standard-2x dyno 🙂

I tried that and there is no longer an error about memory, but the app is still crashing.

2021-01-16T14:34:15.118448+00:00 app[web.1]: File "/app/dash_app.py", line 303, in <module>
2021-01-16T14:34:15.118449+00:00 app[web.1]: encoded_image = base64.b64encode(open(image_filename, "rb").read()).decode("ascii")
2021-01-16T14:34:15.118456+00:00 app[web.1]: FileNotFoundError: [Errno 2] No such file or directory: 'assets\\dash-logo.png'
2021-01-16T14:34:15.119872+00:00 app[web.1]: [2021-01-16 14:34:15 +0000] [9] [INFO] Worker exiting (pid: 9)
2021-01-16T14:34:15.815179+00:00 app[web.1]: [2021-01-16 14:34:15 +0000] [4] [INFO] Shutting down: Master
2021-01-16T14:34:15.815435+00:00 app[web.1]: [2021-01-16 14:34:15 +0000] [4] [INFO] Reason: Worker failed to boot.
2021-01-16T14:34:16.015595+00:00 heroku[web.1]: Process exited with status 3
2021-01-16T14:34:16.065584+00:00 heroku[web.1]: State changed from up to crashed

Ok that’s one thing down.

Now you have run into the other problem I mentioned in my guide: Heroku can’t serve static files.

So it can’t load your ‘dash-logo.png’ from the assets folder like your local webserver can.

You can get around this using the Whitenoise library (documented in my article), but for now I would just comment out anywhere in your code where you are loading images and get that working. Then at least you can deploy your app to Heroku and fix the images later.
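One more thing that might be worth ruling out from the traceback above: the path is written as 'assets\\dash-logo.png', with a Windows-style backslash. Heroku dynos run Linux, where a backslash is not a path separator, so open() would look for a single oddly named file. The sketch below shows the difference and one portable way to build the path. `asset_path` is a hypothetical helper of mine, and I’m assuming the image really does sit in an assets folder next to your script.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath

# On Windows the backslash is a separator, so the path has two parts
print(PureWindowsPath("assets\\dash-logo.png").parts)  # ('assets', 'dash-logo.png')

# On Linux the same string is treated as one file name
print(PurePosixPath("assets\\dash-logo.png").parts)    # ('assets\\dash-logo.png',)


def asset_path(filename, base_dir=None):
    """Build <base_dir>/assets/<filename> using the separator of the current OS.

    Hypothetical helper for illustration only.
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    return base / "assets" / filename


image_filename = asset_path("dash-logo.png")
```

In a real app you would probably pass Path(__file__).parent as base_dir so the path does not depend on where the process was started.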

You are making progress. It’s painful but I think you’re almost there.

Thanks Dan for all your help.
Earlier I was base64-encoding the image and then decoding it for use in Dash, but switching to get_asset_url() solved the issue.
Here is my app, finally deployed.
Do try it and share your thoughts.

Hello Dan, I followed your guide but I can’t work out exactly what to do with the whitenoise lib. In my app I use some images and, as you said, I stored them all in a separate folder called “static” inside the root. Then in my “app.py” script I put the lines of code you gave to set up whitenoise. My app.py script is just this:

import dash
from whitenoise import WhiteNoise

# meta_tags are required for the app layout to be mobile responsive
app = dash.Dash(__name__, suppress_callback_exceptions=True,
                meta_tags=[{'name': 'viewport',
                            'content': 'width=device-width, initial-scale=1.0'}]
                )
server = app.server
server.wsgi_app = WhiteNoise(server.wsgi_app, root='static/')

So, after that is done, in my main script (where I configure the layout of the page) do I need to load the images from the /static folder? I’ve done something like this:

import pathlib

PATH = pathlib.Path(__file__).parent
IMAGES_PATH = PATH.joinpath("../static").resolve()
image = IMAGES_PATH.joinpath("image.png")

The images don’t show in the app, so I guess I’ve done something wrong. Can you help me?

thanks

Hi Gatopianista

It looks like your code declaration blocks are OK. I’m guessing this is something super simple. FYI If you are building in Flask and running your Dash app inside it things get a bit more complicated and you don’t actually need Whitenoise. However, if you are simply running a naked Dash app with Whitenoise (like me and most users), then it should be super simple.

The key thing is that the ‘static/’ folder is invisible to your Dash app - it actually sees everything WITHIN the static/ folder as root. So static is your root when using Whitenoise. For example I can display /static/media/heart1.png in my Dash app with the code snippet below:

dcc.Markdown(""" Built with ![Image](media/heart1.png) in Python using [Dash](https://plotly.com/dash/).""")

So from your snippet above, remove any reference to static. If you have /static/image.png, its path is “image.png” in your Dash code.
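Here is a tiny sketch of that mapping, in case it helps to see it spelled out. `whitenoise_url` is a hypothetical helper, not part of the WhiteNoise API, and it assumes WhiteNoise was set up with root='static/' and no prefix, as in your app.py.

```python
from pathlib import PurePosixPath


def whitenoise_url(file_path, root="static"):
    """Map a project file under `root` to the URL path it should be served at.

    Hypothetical helper for illustration only; assumes no WhiteNoise prefix.
    """
    relative = PurePosixPath(file_path).relative_to(root)
    return "/" + relative.as_posix()


print(whitenoise_url("static/image.png"))         # /image.png
print(whitenoise_url("static/media/heart1.png"))  # /media/heart1.png
```

The point is that your Dash layout needs a URL string like this for the image source, rather than a filesystem path object built with pathlib.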

Try that and let me know how you go!!

Thanks for sharing, very comprehensive!

Thanks for the guide!
It’s always a treat when someone gets a detailed workflow on paper and is kind enough to share!

I deployed an app on Heroku today. Although it’s a real-time data visualization app, it only gets new data from the URL when I restart the app. Have you come across something like that?

Hi Thiel. You will have to provide some more info than that to help fault-find your problem. What is the nature of the problem? What data are you talking about and where would it usually reside on your local machine? Where is the data stored? Have you brought up the Heroku logs? I recommend you attach some screenshots. Always happy to help if I can 🙂