
Major performance difference between ggplotly and plot_ly

When rendering a graph from a large dataset in Shiny, ggplotly() responds quickly (within about 1 second), while the same app using plot_ly() renders very slowly (5+ seconds).

The two Shiny apps below illustrate this difference.
Interestingly, adding dummy columns to the dataset (variables r1-r10, s1-s10) accounts for a significant share of the slowdown. I wouldn't expect them to have any impact, since these variables are dropped by the dplyr summary and shouldn't affect the plot_ly() input.
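One way to check that the dummy columns really do not reach the plotting step is to time the aggregation with and without them and compare the results. This is only a minimal sketch: it uses fewer rows and fewer dummy columns than the apps, and `summarise_bins` is a hypothetical helper name, not part of dplyr or plotly.

```r
library(dplyr)

set.seed(100)
n <- 1e5  # smaller than the 1e6 rows in the apps, to keep the check quick

slim <- data.frame(x = runif(n)) %>%
  mutate(y1 = sqrt(x), y2 = y1 + rnorm(n), y3 = y2)

# Add a few constant character columns, like r1-r10 / s1-s10 in the apps
wide <- slim %>% mutate(r1 = "a", r2 = "a", s1 = "a", s2 = "a")

# Same binning and aggregation as in the apps (hypothetical helper)
summarise_bins <- function(df, bins = 30) {
  df %>%
    mutate(x = cut(x, breaks = bins)) %>%
    group_by(x) %>%
    summarise(y1 = mean(y1), y2 = mean(y2), y3 = mean(y3))
}

time_slim <- system.time(out_slim <- summarise_bins(slim))["elapsed"]
time_wide <- system.time(out_wide <- summarise_bins(wide))["elapsed"]

print(c(slim = unname(time_slim), wide = unname(time_wide)))
print(names(out_wide))                       # columns that reach the plot
print(object.size(wide), units = "Mb")       # size of the full data
print(object.size(out_wide), units = "Kb")   # size of the summarised data
```

If the summarised data is identical in both cases, any remaining slowdown would have to come from handling the full `test` data frame (for example, copying it inside `renderPlotly()`), not from what is passed to the plotting function.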

Is there a reason for such a difference between the two functions? I would prefer plot_ly() because it supports multiple axes.
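For context, this is the kind of multiple-axis plot I have in mind. It is a minimal sketch on made-up data (the data frame `df` is only for illustration), assuming plotly 4.0 or later, where columns are mapped with formulas such as `~x`.

```r
library(plotly)

# Made-up data with two series on very different scales
df <- data.frame(x = 1:10, y1 = sqrt(1:10), y2 = (1:10)^2)

p <- plot_ly(df, x = ~x) %>%
  add_trace(y = ~y1, name = "y1", type = "scatter", mode = "markers") %>%
  # yaxis = "y2" assigns this trace to the second y axis
  add_trace(y = ~y2, name = "y2", type = "scatter", mode = "markers",
            yaxis = "y2") %>%
  # The second axis is overlaid on the first and drawn on the right
  layout(yaxis2 = list(overlaying = "y", side = "right", title = "y2"))

p
```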

Additionally, ggplotly() seems to drop a trace when it takes exactly the same values as another trace. For example, here y3 = y2, and y3 does not appear in the plot.
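To tell whether the trace is actually dropped or only hidden behind an identical one, the traces in the converted object can be listed. This is a rough sketch on a small stand-in data frame, and it assumes the converted widget stores its traces in `p$x$data`, as in plotly 4.x.

```r
library(ggplot2)
library(plotly)

# Small stand-in for the summarised data; y3 duplicates y2 as in the apps
test1 <- data.frame(x = factor(1:5), y1 = sqrt(1:5), y2 = (1:5) / 2)
test1$y3 <- test1$y2

plot <- ggplot(data = test1) +
  geom_point(aes(x = x, y = y1, col = "y1")) +
  geom_point(aes(x = x, y = y2, col = "y2")) +
  geom_point(aes(x = x, y = y3, col = "y3"))

p <- ggplotly(plot)

# Each element of p$x$data is one plotly trace
trace_names <- vapply(
  p$x$data,
  function(tr) if (is.null(tr$name)) NA_character_ else as.character(tr$name),
  character(1)
)
print(length(p$x$data))
print(trace_names)
```

If a trace for y3 is listed, it is present in the figure and is most likely just drawn exactly on top of y2.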


####ggplotly example

library(shiny)
library(plotly)
library(dplyr)
set.seed(100)
test <- data.frame(x = runif(1000000)) %>%
  mutate(y1 = sqrt(x), y2 = y1 + rnorm(1000000), y3 = y2,
         r1 = "a", r2 = "a", r3 = "a", r4 = "a", r5 = "a", r6 = "a", r7 = "a", r8 = "a", r9 = "a", r10 = "a",
         s1 = "a", s2 = "a", s3 = "a", s4 = "a", s5 = "a", s6 = "a", s7 = "a", s8 = "a", s9 = "a", s10 = "a")
ui <- shinyUI(fluidPage(
  titlePanel("ggplotly performance"),
  sidebarLayout(
    sidebarPanel(
      sliderInput("bins",
                  "Number of bins:",
                  min = 10,
                  max = 50,
                  value = 30)
    ),
    mainPanel(
      plotlyOutput("distPlot")
    )
  )
))
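server <- shinyServer(function(input, output) {
  output$distPlot <- renderPlotly({
    # Bin x into the number of intervals chosen with the slider
    test$x <- cut(test$x, breaks = input$bins)
    # Average each series within a bin; the dummy columns are dropped here
    test1 <- test %>% group_by(x) %>%
      summarise(y1 = mean(y1), y2 = mean(y2), y3 = mean(y3))
    plot <- ggplot(data = test1) +
      geom_point(aes(x = x, y = y1, col = "y1")) +
      geom_point(aes(x = x, y = y2, col = "y2")) +
      geom_point(aes(x = x, y = y3, col = "y3"))
    ggplotly(plot)
  })
})

# Launch the app
shinyApp(ui = ui, server = server)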


####plot_ly example

library(shiny)
library(plotly)
library(dplyr)

set.seed(100)
test <- data.frame(x = runif(1000000)) %>%
  mutate(y1 = sqrt(x), y2 = y1 + rnorm(1000000), y3 = y2,
         r1 = "a", r2 = "a", r3 = "a", r4 = "a", r5 = "a", r6 = "a", r7 = "a", r8 = "a", r9 = "a", r10 = "a",
         s1 = "a", s2 = "a", s3 = "a", s4 = "a", s5 = "a", s6 = "a", s7 = "a", s8 = "a", s9 = "a", s10 = "a")

ui <- shinyUI(fluidPage(
  titlePanel("plot_ly performance"),
  sidebarLayout(
    sidebarPanel(
      sliderInput("bins",
                  "Number of bins:",
                  min = 10,
                  max = 50,
                  value = 30)
    ),
    mainPanel(
      plotlyOutput("distPlot")
    )
  )
))

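server <- shinyServer(function(input, output) {
  output$distPlot <- renderPlotly({
    # Bin x into the number of intervals chosen with the slider
    test$x <- cut(test$x, breaks = input$bins)

    # Average each series within a bin; the dummy columns are dropped here
    test1 <- test %>% group_by(x) %>%
      summarise(y1 = mean(y1), y2 = mean(y2), y3 = mean(y3))

    # plotly 4.0 and later maps columns with formulas (~x).
    # "markers" is a scatter mode, not a trace type.
    plot_ly(data = test1, x = ~x, y = ~y1, name = "y1", type = "scatter", mode = "markers") %>%
      add_trace(data = test1, x = ~x, y = ~y2, name = "y2", type = "scatter", mode = "markers") %>%
      add_trace(data = test1, x = ~x, y = ~y3, name = "y3", type = "scatter", mode = "markers")
  })
})

# Launch the app
shinyApp(ui = ui, server = server)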

### If you produce the same plot in both apps, the performance difference disappears

#### I mean: try drawing the three series as lines instead of dots in the ggplotly example

Just modify the server function as follows:

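library(reshape2)  # provides melt()

server <- shinyServer(function(input, output) {
  output$distPlot <- renderPlotly({
    test$x <- cut(test$x, breaks = input$bins)
    test1 <- test %>% group_by(x) %>%
      summarise(y1 = mean(y1), y2 = mean(y2), y3 = mean(y3))
    # Reshape to long format: one row per (x, variable) pair
    d <- melt(test1, id.vars = "x")
    # group = variable is needed because x is a factor; otherwise each point is its own group
    plot <- ggplot(data = d, aes(x, value, col = variable, group = variable)) + geom_line()
    ggplotly(plot)
  })
})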

#### Do not make any changes to the plot_ly example

Both apps then take almost the same time to generate the image.
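If you want numbers instead of an impression, something like the following could be used to time the server-side build of both figures. It is only a sketch on a small stand-in for the summarised data, and it does not measure the rendering time in the browser, which may well dominate.

```r
library(plotly)

# Small stand-in for the summarised data (30 bins, three series)
set.seed(100)
test1 <- data.frame(x = factor(1:30), y1 = sqrt(1:30))
test1$y2 <- test1$y1 + rnorm(30)
test1$y3 <- test1$y2

# Long format for ggplot, built with base R
d <- data.frame(
  x = rep(test1$x, times = 3),
  variable = rep(c("y1", "y2", "y3"), each = nrow(test1)),
  value = c(test1$y1, test1$y2, test1$y3)
)

gg <- ggplot(d, aes(x, value, col = variable, group = variable)) + geom_line()
pl <- plot_ly(test1, x = ~x, y = ~y1, name = "y1", type = "scatter", mode = "lines") %>%
  add_trace(y = ~y2, name = "y2", type = "scatter", mode = "lines") %>%
  add_trace(y = ~y3, name = "y3", type = "scatter", mode = "lines")

# Time the conversion / build step of each figure on the server side
t_gg <- system.time(built_gg <- ggplotly(gg))["elapsed"]
t_pl <- system.time(built_pl <- plotly_build(pl))["elapsed"]
print(c(ggplotly = unname(t_gg), plot_ly = unname(t_pl)))
```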

Bye :)