Fig: Incorrect error

Hey everyone! I love the product and have been following along with the documentation, but I keep hitting a snag with the syntax. For example, I was trying to follow this, and when I ran it I kept getting 2 errors. The first was that the fig syntax was incorrect. The second was that url = py.plot(fig, filename='scatterplots') was incorrect, as py is not defined. I'm at a complete loss at the moment. Much like the aforementioned scatterplot, I wanted to create a plot with a trace for each dataframe, and it's a huge file (500000+ entries). I was wondering if anyone could guide me or tell me an easier way to do it! Thanks in advance!!

import pandas as pd
import plotly.graph_objs as go
import plotly.plotly as py
import plotly

plotly.tools.set_credentials_file(username='n_husain', api_key='XXXXXXXXX')

#create dataframe of information from .csv
#NOTE: I have not found a cleaner way of doing this other than hardcoding it in :(
df = pd.read_table('/media/nadeem/apollo/MetaGenomics/final/bacterial_data.csv', sep = ',')

#Select columns that are needed, which are adapted from script.py
#df = df[["I_Begin", "Identity"]]
dfhp = df[df.Name == 'helicobacter_pylori']
dfbv = df[df.Name == 'Bacteroides_vulgatus']
dfab = df[df.Name == 'Acinetobacter_baumannii']
dflm = df[df.Name == 'listeria_monocytogenes']
dfrs = df[df.Name == 'rhodobacter_sphaeroides']
dfsp = df[df.Name == 'streptococcus_pneumoniae']
df.head(2)

#sort all descending by Identity
dfhp = dfhp.sort_values('Identity', ascending = False)
dfbv = dfbv.sort_values('Identity', ascending = False)
dfab = dfab.sort_values('Identity', ascending = False)
dflm = dflm.sort_values('Identity', ascending = False)
dfrs = dfrs.sort_values('Identity', ascending = False)
dfsp = dfsp.sort_values('Identity', ascending = False)

#create FRP using a scatterplot
fig = {
    'data': [
	        {
	            'x': dfhp.I_Begin, 
	            'y': dfhp.Identity, 
	            'text': dfhp.ReadName, 
	            'mode': 'markers', 
	            'name': 'Helicobacter_Pylori'},
	        {
	            'x': dfbv.I_Begin, 
	            'y': dfbv.Identity, 
	            'text': dfbv.ReadName, 
	            'mode': 'markers', 
	            'name': 'Bacteroides_vulgatus'},
	        {
	            'x': dfab.I_Begin, 
	            'y': dfab.Identity, 
	            'text': dfab.ReadName, 
	            'mode': 'markers', 
	            'name': 'Acinetobacter_baumannii'},
	        {
	            'x': dflm.I_Begin, 
	            'y': dflm.Identity, 
	            'text': dflm.ReadName, 
	            'mode': 'markers', 
	            'name': 'listeria_monocytogenes'},
	        {
	            'x': dfrs.I_Begin, 
	            'y': dfrs.Identity, 
	            'text': dfrs.ReadName, 
	            'mode': 'markers', 
	            'name': 'rhodobacter_sphaeroides'},
	        {
	            'x': dfsp.I_Begin, 
	            'y': dfsp.Identity, 
	            'text': dfsp.ReadName, 
	            'mode': 'markers', 
	            'name': 'streptococcus_pneumoniae'},

    ],
    'layout': {
        'xaxis': {'title': "I_Begin"},
        'yaxis': {'title': "Identity (%)"}
    }
}


url = py.plot(fig, filename='scatterplots')

Hi @nhusain,

For the documentation examples that aren't working, could you open an issue with the documentation repo at https://github.com/plotly/documentation/issues? (The docs are all open source too.) A link to the doc page and a copy of the error message you're seeing is enough.

For your particular example, if you're creating plots with that many points, you'll need to use the WebGL-accelerated scattergl trace type. You can do this by setting 'type': 'scattergl' in each trace dict. For example:

fig = {
    'data': [
	        {
	            'type': 'scattergl',
	            'x': dfhp.I_Begin, 
	            'y': dfhp.Identity, 
	            'text': dfhp.ReadName, 
	            'mode': 'markers', 
	            'name': 'Helicobacter_Pylori'},
...

It looks like you figured out that the import plotly.plotly as py line is needed. Are you still having problems displaying the figure?

Also, you may be able to create this figure by looping over the unique values of df.Name, using an approach like the one in the thread "Similar to seaborn's hue function in plotly" (you would use fig.add_scattergl rather than fig.add_scatter).
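As a rough sketch of that looping idea, here is one way it could look while staying with the plain-dict figure style from your script. The helper name build_scattergl_figure and the tiny sample frame are made up for illustration; the column names (Name, ReadName, I_Begin, Identity) are taken from your post, and I'm assuming pandas groupby is an acceptable substitute for the six hand-written filters.

```python
import pandas as pd


def build_scattergl_figure(df, group_col="Name"):
    """Return a figure dict with one scattergl trace per unique value of group_col.

    Hypothetical helper for illustration; it is not part of plotly.
    """
    traces = []
    # groupby yields (group value, sub-DataFrame) pairs, in sorted key order
    for name, group in df.groupby(group_col):
        # Highest Identity first, matching the sort in the original script
        group = group.sort_values("Identity", ascending=False)
        traces.append({
            "type": "scattergl",  # WebGL trace type for large point counts
            "x": group["I_Begin"].tolist(),
            "y": group["Identity"].tolist(),
            "text": group["ReadName"].tolist(),
            "mode": "markers",
            "name": name,
        })
    return {
        "data": traces,
        "layout": {
            "xaxis": {"title": "I_Begin"},
            "yaxis": {"title": "Identity (%)"},
        },
    }


# Made-up rows standing in for bacterial_data.csv
sample = pd.DataFrame({
    "Name": ["helicobacter_pylori", "Bacteroides_vulgatus", "helicobacter_pylori"],
    "ReadName": ["r1", "r2", "r3"],
    "I_Begin": [10, 20, 30],
    "Identity": [91.5, 88.0, 97.2],
})
fig = build_scattergl_figure(sample)
```

The resulting dict has the same shape as the fig in your script, so it should be usable in the same py.plot(fig, filename='scatterplots') call, and new organisms in the CSV get a trace without any extra code.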

Hope that helps,
-Jon