Frame showing fewer categories than actual

Hi, I have a data frame with dimensions 5527 x 4 and I want to make a slider plot from it. Two variables are numeric, and I put them on the x and y axes respectively. The other two variables are categorical, with 33 different categories, so I wanted to color the points by one of them and put the slider on the other. I used the following commands:

df = data.frame(x = tsne$Y[,1], y = tsne$Y[,2], c = km1$cluster, f = data$Failure.Mode)
gg <- ggplot(df, aes(x, y, color = f)) +
  geom_point(aes(frame = c))
# convert to plotly; the frame aesthetic becomes the slider
ggplotly(gg)

but I got the following warning message:
Warning message:
In p$x$data[firstFrame] <- p$x$frames[[1]]$data :
  number of items to replace is not a multiple of replacement length
The resulting plot showed only 18 categories of variable f.
I do not understand why this happens.

Hi, although this is an old thread, this is exactly the issue I’ve run into. I don’t know if it’s better to revive this topic or create a new one.

I believe the following provides additional info on why this is happening, but I am still trying to figure out the best way to address it.

Additional example

The following reproduces the issue with a readily available dataset (the mpg data from ggplot2). The plot_ly() function is used directly instead of creating the ggplot object followed by the ggplotly() conversion (but the issue is the same):


library(plotly)

mpg_df <- ggplot2::mpg
mpg_df$cyl <- as.factor(mpg_df$cyl)

plot_ly(
  data = mpg_df,
  type = "scatter", mode = "markers",
  x = ~displ, y = ~cty,
  color = ~cyl, frame = ~year
) %>%
  animation_opts(transition = 0, redraw = TRUE)

This produces the following plot. It is correct when the slider is set to 1999:

But, when the slider is set to 2008, the plot is missing the points with cyl=8.

The same warning message occurs:

In p$x$data[firstFrame] <- p$x$frames[[1]]$data :
  number of items to replace is not a multiple of replacement length

This seems to be occurring because not all categorical values exist at all values of the slider. In this case, there are no instances with cyl=5 when year=1999:

> mpg_df %>% count(year,cyl) %>% spread(year,n)
# A tibble: 4 x 3
  cyl   `1999` `2008`
  <fct>  <int>  <int>
1 4         45     36
2 5         NA      4
3 6         45     34
4 8         27     43
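
A quick way to test for this condition before plotting is sketched below. It is plain Python so that it runs without any plotting library; the function name missing_categories and the row layout are made up for illustration, and the counts are copied from the table above.

```python
from collections import defaultdict

def missing_categories(rows, frame_key, category_key):
    """Return {frame: [categories absent from that frame]} for incomplete frames."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[frame_key]].add(row[category_key])
    all_categories = set().union(*seen.values())
    return {
        frame: sorted(all_categories - present)
        for frame, present in seen.items()
        if present != all_categories
    }

# Rows rebuilt from the count table above (one dict per car; only year and cyl matter here).
counts = {(1999, 4): 45, (1999, 6): 45, (1999, 8): 27,
          (2008, 4): 36, (2008, 5): 4, (2008, 6): 34, (2008, 8): 43}
rows = [{"year": year, "cyl": cyl} for (year, cyl), n in counts.items() for _ in range(n)]

print(missing_categories(rows, "year", "cyl"))  # {1999: [5]}
```

An empty result would mean every slider value has the same set of categories, which is the situation the animation code appears to expect.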

Comparison with Plotly.python

Interestingly, equivalent code in Python also drops a category from the plot:

import pandas as pd
import plotly.express as px

mpg_df = pd.read_csv('mpg.csv')  # exported from R ggplot2::mpg data set
mpg_df["cyl"] = mpg_df["cyl"].astype("category")

px.scatter(mpg_df,
           x="displ", y="cty",
           animation_frame="year", color="cyl")

This results in the following plot:

In this case when the slider value is 2008, the cyl=5 points are missing (less noticeable but also not correct).

The Python animation documentation has a limitations section (Plotly - Python - Intro to Animations - Current Animation Limitations and Caveats) that lists the following constraint:

Animations are designed to work well when each row of input is present across all animation frames, and when categorical values mapped to symbol, color and facet are constant across frames. Animations may be misleading or inconsistent if these constraints are not met.

The corresponding R documentation, Plotly - R - Intro to Animations does NOT have a similar section or caveat.

My guess is that the limitation stated in the Python documentation also applies in R, and that this is something users should be aware of and protect against. In my testing, the warning occurs exactly when the number of distinct categorical values is NOT consistent across frames, and whether the plot is correct depends on which slider value has the most distinct categories.
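
One way to protect against this, if the guess above is right, would be to pad the data so that every slider value contains every category, using placeholder rows with no coordinates. The sketch below only shows the padding step in plain Python. The function name pad_missing_combinations and the sample rows are hypothetical, and I have not verified here that Plotly then draws every frame correctly.

```python
def pad_missing_combinations(rows, frame_key, category_key, value_keys):
    """Return rows plus one placeholder row for each absent (frame, category) pair."""
    frames = sorted({row[frame_key] for row in rows})
    categories = sorted({row[category_key] for row in rows})
    present = {(row[frame_key], row[category_key]) for row in rows}
    padded = list(rows)
    for frame in frames:
        for category in categories:
            if (frame, category) not in present:
                placeholder = {frame_key: frame, category_key: category}
                # None stands in for "no point to draw" (assumption: the plotting
                # layer treats a missing coordinate as an empty marker).
                placeholder.update({key: None for key in value_keys})
                padded.append(placeholder)
    return padded

# Illustrative rows only (not real mpg values): cyl=5 is absent in 1999.
sample = [
    {"year": 1999, "cyl": 4, "displ": 1.8, "cty": 18},
    {"year": 1999, "cyl": 6, "displ": 2.8, "cty": 16},
    {"year": 2008, "cyl": 4, "displ": 2.0, "cty": 20},
    {"year": 2008, "cyl": 5, "displ": 2.5, "cty": 21},
    {"year": 2008, "cyl": 6, "displ": 3.1, "cty": 17},
]
padded = pad_missing_combinations(sample, "year", "cyl", ["displ", "cty"])
print(padded[-1])  # {'year': 1999, 'cyl': 5, 'displ': None, 'cty': None}
```

The padded rows could then be turned into a data frame and passed to the same plotting call as before.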

Follow-up questions

As a result of the above, I have the following follow-up questions:

  • Are the conclusions above correct?

    • If so, are there any plans to change this limitation in the future?
    • If so, would it make sense to add a similar caveat note to the R animation documentation? If so, I can create an issue for this and likely provide an amendment to that page.
  • Is there an alternate way to create a slider (the animation is less important to me) that would result in a correct plot? For example, I am thinking that the example in the slider documentation, Plotly - R - Custom Controls - Sliders - Sine Wave Slider, can be adapted for this case, although the legend may be different/trickier. If I have success trying this, I’ll post code later, but I’m curious whether there will be any problems with this approach.
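
To make the last idea concrete, here is a minimal sketch of a slider without animation frames: one trace per (slider value, category) pair, with each slider step switching trace visibility, which is the pattern the sine wave slider example uses. The figure is built as a plain dict in Plotly's figure layout so the logic can be checked without plotting anything. The function name build_slider_figure, the palette, and the sample rows are all hypothetical.

```python
def build_slider_figure(rows, frame_key, category_key, x_key, y_key, palette):
    """Build a figure dict with one trace per (frame, category) and a visibility slider."""
    frames = sorted({row[frame_key] for row in rows})
    categories = sorted({row[category_key] for row in rows})
    # Fix one colour per category so it stays the same at every slider position.
    colour = {cat: palette[i % len(palette)] for i, cat in enumerate(categories)}

    traces, trace_frame = [], []
    for frame in frames:
        for cat in categories:
            points = [r for r in rows if r[frame_key] == frame and r[category_key] == cat]
            if not points:
                continue  # no trace needed for an absent combination
            traces.append({
                "type": "scatter", "mode": "markers",
                "name": str(cat),
                "x": [r[x_key] for r in points],
                "y": [r[y_key] for r in points],
                "marker": {"color": colour[cat]},
                "visible": frame == frames[0],  # start on the first slider value
            })
            trace_frame.append(frame)

    # Each step shows only the traces that belong to its slider value.
    steps = [
        {"label": str(frame), "method": "restyle",
         "args": ["visible", [owner == frame for owner in trace_frame]]}
        for frame in frames
    ]
    return {"data": traces, "layout": {"sliders": [{"active": 0, "steps": steps}]}}

# Illustrative rows only (not real mpg values): cyl=5 is absent in 1999.
sample = [
    {"year": 1999, "cyl": 4, "displ": 1.8, "cty": 18},
    {"year": 1999, "cyl": 6, "displ": 2.8, "cty": 16},
    {"year": 1999, "cyl": 8, "displ": 5.2, "cty": 11},
    {"year": 2008, "cyl": 4, "displ": 2.0, "cty": 20},
    {"year": 2008, "cyl": 5, "displ": 2.5, "cty": 21},
    {"year": 2008, "cyl": 6, "displ": 3.1, "cty": 17},
    {"year": 2008, "cyl": 8, "displ": 5.3, "cty": 14},
]
fig = build_slider_figure(sample, "year", "cyl", "displ", "cty",
                          ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728"])
print(len(fig["data"]))  # 7
```

Because hidden traces are left out of the legend, the legend would list only the categories present at the current slider position, which is the legend difference mentioned in the question. If plotly is installed, the dict should be accepted by plotly.graph_objects.Figure(fig); the R equivalent would presumably be a series of add_trace() calls plus layout(sliders = ...).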

Thanks in advance for any advice here.

I noticed that an issue has been logged for this: Trace is lost with animation, not all ids exist in the legend #1696.

I added a comment to that issue referencing this discussion.