R Plotly sankey diagram blank, no error messages

Hello all,

Hoping someone can help me out, as I’m out of things to try. I have seen a similar thread with Python, but haven’t found a fix with R. Basically I run my code and everything seems fine, no error codes, and the title shows in the output, but the sankey plot doesn’t appear.

I have tried:

  • making sure my R and Plotly, as well as other libraries, are up to date
  • double-checking that my variables hold numeric values rather than character strings
  • making sure source, target, and value have equal numbers of rows and that the node IDs progress from zero (0 → 1, etc.)
  • making my data into lists or vectors rather than a dataframe
  • converting my data to json format
  • commenting out different parts of the code to try to isolate the problem
  • successfully plotting with networkD3 so I know that it should work

Apologies that it starts with a large data frame; the full code is below in case someone wants to run the same thing and see if it works for them.

Packages:

library(readODS) #read ods files
library(tidyverse) #Tidy packages
library(dplyr) #lots of functions - data manipulation
library(janitor) #helps with data cleaning
library(plotly) #interactive web graphs

Here’s my code for downloading the data (public GHG data):

#Download the UK 2022 greenhouse gas emissions data from the UK Department for Energy Security and Net Zero
ghg_url <- "https://assets.publishing.service.gov.uk/media/65c0d54663a23d000dc821ca/final-greenhouse-gas-emissions-2022-by-source-dataset.ods" #update this link when new data available
if (!file.exists("UKGHG_2022.ods")) { #only download if the file is not already in the working directory
  download.file(ghg_url, destfile = "UKGHG_2022.ods", mode = "wb") #mode = "wb" keeps the binary .ods file intact on Windows
}

#Read in ods file
GHG_UK2022 <- read_ods(
  path = "UKGHG_2022.ods",
  sheet = 1, #define tab/sheet to read
  col_names = TRUE, #use header row for column names
  col_types = NULL, #guess data types
  na = "", #treat blank cells as NA
  skip = 0, #don't skip rows
  formula_as_formula = FALSE, #values only
  range = NULL, 
  row_names = FALSE, #no row names
  strings_as_factors = TRUE) %>% #use factors
  clean_names()  %>% #clean column names to lowercase, with underscores
  filter(year == "2022") #2022 only
            

Then I formatted data to have proper links and labels:

#Set up Sankey links dataframe - breakdown from GHGs to sectors to subsectors
links <- data.frame(source=c(paste0(GHG_UK2022$ghg_grouped), paste0(GHG_UK2022$tes_sector)), 
                    target=c(paste0(GHG_UK2022$tes_sector), paste0(GHG_UK2022$tes_subsector)),
                    value=as.numeric(paste0(GHG_UK2022$emissions_mt_co2e))) 

links <- links[-c(2317:3786),] #remove instances with repeat variable in source & target

#Create nodes df from names in links df
nodes <- data.frame(
  name=unique(c(as.character(links$source), 
  as.character(links$target))))

#Add ID numbers
links$IDsource <- as.numeric(match(links$source, nodes$name)-1)
links$IDtarget <- as.numeric(match(links$target, nodes$name)-1)
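The checks I listed under "I have tried" can be sketched roughly as below. This uses a tiny made-up `nodes`/`links` pair (not the GHG data) built the same way as above, and the `checks` vector is a hypothetical helper of my own. I'm assuming here that a sankey needs zero-based IDs that all point at existing nodes and numeric, positive values.

```r
# Toy stand-ins for the nodes and links data frames (made-up values, not the GHG data)
nodes <- data.frame(name = c("CO2", "CH4", "Energy", "Agriculture"))
links <- data.frame(
  source = c("CO2", "CO2", "CH4"),
  target = c("Energy", "Agriculture", "Agriculture"),
  value  = c(10, 2, 5)
)

# Zero-based IDs, built the same way as in the real code
links$IDsource <- match(links$source, nodes$name) - 1
links$IDtarget <- match(links$target, nodes$name) - 1

# Valid IDs run from 0 to (number of nodes - 1)
valid_ids <- seq_len(nrow(nodes)) - 1

# Named logical vector: every entry should be TRUE before plotting
checks <- c(
  no_missing_ids  = !anyNA(links$IDsource) && !anyNA(links$IDtarget),
  ids_in_range    = all(c(links$IDsource, links$IDtarget) %in% valid_ids),
  values_numeric  = is.numeric(links$value),
  values_positive = !anyNA(links$value) && all(links$value > 0),
  no_self_links   = all(links$IDsource != links$IDtarget)
)
print(checks)
```

Running the same `checks` against the real `links` and `nodes` should show quickly whether the data or the plotting call is the more likely culprit.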

And finally, here is my Plotly code calling a sankey:

#Plot with Plotly
sankey <- plot_ly(type = "sankey",
                   domain = list(x =  c(0,1),y =  c(0,1)),
                   orientation = "h",
                   arrangement="snap", # can also change this to 'fixed'
                   valueformat = ".0f",
                   valuesuffix = "Mt CO2 eq.",
      node = list(
        label = nodes$name,
        pad = 15,
        thickness = 20,
        line = list(color = "black", width = 0.5),
      link = list(
        source = links$IDsource,
        target = links$IDtarget,
        value = links$value
        ))) %>% 
  layout(
    title = "Greenhouse Gas Emissions Per Sector in the UK",
    font = list(size = 10),
    xaxis = list(showgrid = F, zeroline = F),
    yaxis = list(showgrid = F, zeroline = F))

sankey
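For comparison, here is a minimal sketch with toy data (the labels and values are made up, not taken from the GHG dataset) in which `node` and `link` are passed as two separate arguments to `plot_ly()`. In the call above, the `node = list(` that opens on the `node` line is not closed before `link = list(` starts, so `link` ends up nested inside `node`. That may be why the title shows but no diagram is drawn, though I have not confirmed it is the only cause.

```r
library(plotly)

# Toy data, made up for illustration only
toy_nodes <- c("CO2", "CH4", "Energy", "Agriculture")
toy_links <- data.frame(
  source = c(0, 0, 1), # zero-based index into toy_nodes
  target = c(2, 3, 3),
  value  = c(10, 2, 5)
)

toy_sankey <- plot_ly(
  type = "sankey",
  orientation = "h",
  node = list(
    label = toy_nodes,
    pad = 15,
    thickness = 20,
    line = list(color = "black", width = 0.5)
  ), # the node list is closed here, before link starts
  link = list(
    source = toy_links$source,
    target = toy_links$target,
    value = toy_links$value
  )
) %>%
  layout(title = "Toy sankey", font = list(size = 10))

toy_sankey
```

If this toy version renders, moving the closing parenthesis in the real call so that `link` sits alongside `node` rather than inside it would be the next thing to try.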

Thanks in advance for taking a look!