Heroku RAM Issue

To create my dashboard I need to query a BigQuery table which currently holds 170k rows (30.7MB).

Because the query can take some time (15-20s), I am running it in a separate worker script (similar to the boilerplate example). Both the worker and the dash app are deployed to Heroku (the worker is running on a standard-1x dyno with 512MB RAM).

In the worker script, I’m running the query using the pandas-gbq library:

import pandas_gbq as gbq

# Legacy SQL syntax; pulls the full client table into a DataFrame
query = 'SELECT * FROM [Customer_Data.' + str(clientID) + ']'
df_bq = gbq.read_gbq(query, project_id=project_id, private_key=google_apiKey)

This returns the table as a dataframe. However, in the Heroku logs I’m seeing this error…

2018-04-06T10:59:04.068333+00:00 heroku[worker.1]: Process running mem=675M(131.9%)
2018-04-06T10:59:04.068485+00:00 heroku[worker.1]: Error R14 (Memory quota exceeded)

…which means the worker process is using ~675MB and exceeding the dyno’s 512MB memory quota.

Is it unusual for a query of this size to cause such a spike in RAM? Is it likely to be a bug in the library, or am I asking too much of Heroku?

Thanks,

Hi. I have seen similar memory spikes when loading data in Python on Heroku. You notice it there because Heroku flags a warning when you run low on memory, or even shuts the process down, whereas running locally you’d never notice because your system probably has a lot more RAM and can fall back on disk swap.

I’d suggest running the app locally and monitoring how much RAM Python is using in Task Manager (Windows) or Activity Monitor (OS X). I suspect the spike is not a problem with Heroku, and might not even be a bug in the library; loading the data may just momentarily require a lot of RAM. Maybe?
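If it helps, you can also measure this from inside the script rather than watching Task Manager. Here’s a minimal sketch, assuming the third-party psutil package is installed (it’s not part of the standard library); query, project_id and google_apiKey are the same variables as in your worker:

import os
import psutil  # assumption: third-party, pip install psutil
import pandas_gbq as gbq

proc = psutil.Process(os.getpid())
print('RSS before load: %.0f MB' % (proc.memory_info().rss / 1024 ** 2))

df_bq = gbq.read_gbq(query, project_id=project_id, private_key=google_apiKey)

print('RSS after load: %.0f MB' % (proc.memory_info().rss / 1024 ** 2))
# deep=True counts the bytes held by object (string) columns, which is
# usually where a "30MB" table balloons once it's in pandas
print('DataFrame alone: %.0f MB' % (df_bq.memory_usage(deep=True).sum() / 1024 ** 2))

If the RSS after the load is well over 512MB on your machine too, then it’s simply the in-memory size of the data rather than anything Heroku-specific.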
