Why don't `graph_objs` have `transforms`? Why doesn't `cufflinks` support layered plotting?

It is confusing how many ways there are to plot the same thing (cufflinks + iplot, raw dicts and lists, or graph_objs). I like the idea of sticking with one and mastering it, and considering pandas’ popularity and familiarity I thought cufflinks looked like a good solution since it allows plotting of groupby aggregations. Unfortunately, I have not found a way to plot multiple traces with cufflinks, so I’ve been learning the graph_objs way.

Now I’m finding that transformations and aggregations cannot be done within these calls, as seen in “Setting Layout in Groupby Scatter”. Is there a reason why this is the case? Is there a plan to allow both transformations and layered plotting (dual axis, multiple plots, etc.) in either the cufflinks or the graph_objs approach, or am I misunderstanding the intention of each? I have not found any useful explanation of either.

Looks like I might have to resort to using the raw dicts and lists to customize plots, but this approach is extremely verbose and not as intuitive. Thoughts?

Hi @tmbluth,

Here are a few thoughts that might help:

  • The graph_objs hierarchy follows the native declarative API of Plotly.js.
  • Cufflinks is an independent project (not designed/maintained by the plotly.py team) that largely follows the pandas .plot API. Because of this, the way it names things isn’t necessarily consistent with the underlying graph_objs objects.
  • Yes, it would be possible in principle to support transforms in the graph object classes. The reason we haven’t (and don’t really have plans to) is that pandas is far more powerful than the transform system built into Plotly.js. So it’s usually more intuitive for Python users to do their data transformation in pandas rather than learn a new declarative syntax for data transformation. This is what cufflinks is doing internally.
  • Some of the use cases for cufflinks are now covered by plotly_express (see https://medium.com/@plotlygraphs/introducing-plotly-express-808df010143d). This is a new high-level plotting API developed by the plotly.py team that is going to be integrated into plotly.py for version 4. The goal for plotly.py v4 is to make it easy to start with plotly_express to build complex cufflinks-style figures and then customize them (e.g. adding traces, customizing axis ticks, etc.) in a straightforward way.

I hope that helps clear up the landscape a little,
-Jon

Hey @jmmease,

That’s great feedback, thanks. I have been promoting plotly_express since its release, but always have to qualify its awesome functionality with its apparent lack of customization for a more professionally polished finish. I’m happy to hear it’s on track to be more flexible. With that flexibility, I believe it’s going to be a game-changing tool.

In the meantime, do you suggest then to plot multiple aggregated data traces with graph_objs that I should create separate pre-aggregated DataFrames? It would be great to keep the granularity of a single non-aggregated data set to be used in all plots in a report and have the aggregations done in the plotting tool, like plotnine (ggplot for Python) does. Does using the Plotly.js API preclude graph_objs from integrating more with Pandas transformations (which is also declarative)?

Hi @tmbluth,

In the meantime, do you suggest then to plot multiple aggregated data traces with graph_objs that I should create separate pre-aggregated DataFrames?

Yeah, this is pretty much what I meant.

It would be great to keep the granularity of a single non-aggregated data set to be used in all plots in a report and have the aggregations done in the plotting tool, like plotnine (ggplot for Python) does.

Yeah, this is what plotly_express is largely aiming for.

Does using the Plotly.js API preclude graph_objs from integrating more with Pandas transformations (which is also declarative)?

Following the Plotly.js API for the graph_objs classes means that we can’t really add new custom properties to the hierarchy that aren’t present in the schema. The plotly_express layer, on the other hand, is a higher-level, Python-only API, so we have full control over it. The only constraint is that it needs to output a valid graph_objs figure in the end.

Do you have in mind what an API might look like to integrate more with pandas transformations?

-Jon

I’m more familiar with how ggplot in R integrates with data.frames and dplyr transformations, so I would not be much help with integrating pandas and JavaScript. However, I am confident that if it can be done in R, there is likely a way to do it in Python (perhaps by taking cues from plotnine, though it is not JS-based). It sounds like this desired functionality is being handled via plotly_express, which I look forward to. Thanks for all your great work!

Re transforms: they exist as part of the figure spec for execution in JavaScript, but I would say that in the vast majority of cases, if you’re calling plotly.js from Python via any of the above mechanisms, you are better off doing the aggregation in Python (in pandas or similar) rather than shipping your whole dataset to the browser to do the aggregation in JavaScript.

Re Plotly Express and customization/polish: it’s pretty easy to customize PX figures with .update_traces() and .update_layout(). You can mutate any figure attribute after the fact using these methods. Plotly Express is also fully compatible with the built-in theme/template system, so if you have made a chart that you like in terms of colors, axes, layout, fonts, etc., you can extract a theme out of it and apply it to all your PX figures with one line (px.defaults.template = <whatever>). We’re working on comprehensive documentation for all of this right now :)

@nicolaskruchten

I’ve tried updating the layout to set the plot background and paper background to the colors I’d prefer, but was unable to use .update_layout(), as seen below:

import plotly_express as px
print(px.__version__)

df = px.data.gapminder()
px_plot = px.scatter(df.loc[df.year==df.year.max(),:], x='gdpPercap', y='lifeExp', size='pop', color='continent', log_x=True)

?px_plot.update_layout

Printed:

0.3.1
Object `px_plot.update_layout` not found.

Luckily I found a workaround:

px_plot

px_plot.layout.template.layout.paper_bgcolor = 'hsl(0, 0%, 100%)'
px_plot.layout.template.layout.plot_bgcolor = 'hsl(0, 0%, 100%)'
px_plot.layout.template.layout.yaxis.gridcolor = 'hsl(0, 0%, 90%)'
px_plot.layout.template.layout.xaxis.gridcolor = 'hsl(0, 0%, 90%)'
px_plot

Could you post an example of using px.defaults.template? I wasn’t able to figure that one out, and I’m assuming it’s easier to work with than searching the px JSON.

Ah, my apologies: .update_layout() is what we’re busy documenting but it won’t be available until next week. You should be able to use px.scatter().update(layout=dict( <whatever> )) with the current version of Plotly to the same effect.

Your code to change the layout is actually changing the underlying template, which is messier than it needs to be. You could just do:

px_plot.layout.paper_bgcolor = 'hsl(0, 0%, 100%)'
px_plot.layout.plot_bgcolor = 'hsl(0, 0%, 100%)'
px_plot.layout.yaxis.gridcolor = 'hsl(0, 0%, 90%)'
px_plot.layout.xaxis.gridcolor = 'hsl(0, 0%, 90%)'

or

px_plot.update(layout=dict( paper_bgcolor = 'hsl(0, 0%, 100%)', plot_bgcolor = 'hsl(0, 0%, 100%)', <etc> ))

We are working on documentation for the templates and that will come out next Friday :)

In the meantime, you can set px.defaults.template to "plotly_white", "seaborn" or "plotly_dark", or to any go.Template object, as explained under the “Make your own” section of https://medium.com/@plotlygraphs/introducing-plotly-py-theming-b644109ac9c7 (which is what we’re basing the new documentation on).