Figure Friday 2025 - week 6

Did you know that Where the Crawdads Sing was on the New York Times best seller list for 193 weeks?

Welcome to week 6 of Figure Friday :tada:
This week we’ll look at the NYT best sellers list. Data includes book title, author, publisher, published date, and more.

Download data:

  • Go to Joe Hovde’s Google Sheet and download it as a CSV file. Click File → Download → Comma Separated Values
  • Save the CSV file in the same directory as the Python code provided (under the sample figure), and run the code.

Things to consider:

  • can you improve the sample figure below (bar chart)?
  • would you like to tell a different data story using a different graph?
  • can you create a Dash app instead?

Sample figure:

Code for sample figure:
import plotly.express as px
import pandas as pd

def plot_bestsellers_trend_by_quarter(_data: pd.DataFrame):
    data = _data.copy() # Avoids mutating your original data source
    # Convert 'bestsellers_date' to datetime
    data['bestsellers_date'] = pd.to_datetime(data['bestsellers_date'])
    
    # Extract year and quarter and convert to string for serialization
    data['year_quarter'] = data['bestsellers_date'].dt.to_period('Q').astype(str)
    
    # Group by 'year_quarter' and count the number of bestsellers in each quarter
    trend_data = data.groupby('year_quarter').size().reset_index(name='count')
    
    # Create the bar chart
    fig = px.bar(trend_data, x='year_quarter', y='count',
                 labels={'year_quarter': 'Year and Quarter', 'count': 'Number of Bestsellers'},
                 title='Trends in Bestsellers by Quarter')
    
    # Update x-axis to show quarters
    fig.update_layout(xaxis_title="Quarters")

    return fig

df = pd.read_csv('NYT Fiction Bestsellers - Bestsellers.csv')
bestsellers_trend_by_quarter_fig = plot_bestsellers_trend_by_quarter(df)
bestsellers_trend_by_quarter_fig.show()

Participation Instructions:

  • Create - use the weekly data set to build your own Plotly visualization or Dash app. Or, enhance the sample figure provided in this post, using Plotly or Dash.
  • Submit - post your creation to LinkedIn or Twitter with the hashtags #FigureFriday and #plotly by midnight Thursday, your time zone. Please also submit your visualization as a new post in this thread.
  • Celebrate - join the Figure Friday sessions to showcase your creation and receive feedback from the community.

:point_right: If you prefer to collaborate with others on Discord, join the Plotly Discord channel.

Data Source:

Thank you to Joe Hovde for the data. Check out Joe’s analysis post.

2 Likes

I successfully fetched the data from the Google Open Library :rocket: (with the help of ChatGPT), but I still need to verify it since I’m a beginner :cat: I made the bar chart change dynamically on click.

8 Likes

Nice app, @Ester
Does the x axis of the bar chart represent the number of weeks that the book was a bestseller? So The Housemaid was a bestseller 48 times in 2024?

2 Likes

Yes, number of times it appeared on the weekly list.
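
For anyone who wants to reproduce that count, here is a minimal sketch of how "times on the weekly list" could be computed with pandas. It assumes each CSV row is one title on one weekly list and uses the `title` and `bestsellers_date` columns from the sample code above. The helper name `weekly_appearances` and the sample rows are made up for illustration; this is not necessarily how the app does it.

```python
import pandas as pd

def weekly_appearances(data: pd.DataFrame, year: int) -> pd.Series:
    """Count how many weekly lists each title appeared on in a given year."""
    dates = pd.to_datetime(data["bestsellers_date"])
    in_year = data[dates.dt.year == year]
    # Assumption: one row = one appearance on one weekly list,
    # so the number of rows per title is the number of weeks on the list
    return in_year.groupby("title").size().sort_values(ascending=False)

# Hypothetical sample rows, not real list data
sample = pd.DataFrame({
    "title": ["BOOK A", "BOOK A", "BOOK B", "BOOK A"],
    "bestsellers_date": ["2024-01-06", "2024-01-13", "2024-01-13", "2023-12-30"],
})
counts = weekly_appearances(sample, 2024)
print(counts)
```

With the real CSV, `weekly_appearances(df, 2024).head(10)` would give the kind of top 10 that a bar chart like this could plot.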

2 Likes

After all these weeks sharing our work and effort, posting this bar chart looks so simple! :blush:
But I didn’t want to leave my desktop without sharing something, maybe it could serve to inspire somebody… work in progress…

Code
import pandas as pd
import numpy as np
import plotly.express as px


data = pd.read_csv(r"C:\Users\Juan Aguirre\Downloads\NYT Fiction Bestsellers - Bestsellers.csv")

def tweak_books(data):
    cols=['title', 'author', 'publisher', 'bestsellers_date',\
          'published_date', 'rank', 'rank_last_week', 'weeks_on_list']
    return(data
        [cols]
        .assign(
            publisher = data['publisher'].astype('category'),
            bestsellers_date = pd.to_datetime(data['bestsellers_date']),
            published_date = pd.to_datetime(data['published_date']),
            rank = data['rank'].astype('uint8'),
            rank_last_week = data['rank_last_week'].astype('uint8'),
            weeks_on_list = data['weeks_on_list'].astype('uint8'),
        )
        
    )

df = tweak_books(data)
df_title_by_pub = (df
 .groupby(by='publisher', as_index=False, observed=True)['title']
 .nunique()
 .sort_values(by='title')
).nlargest(20, columns='title') # type: ignore
fig = px.bar(
    df_title_by_pub.sort_values(by='title'),
    x = 'title',
    y = 'publisher',
    text = 'title',
    template='simple_white',
    height=600,
    labels={
        'publisher':'',
        'title':''
    },
    title='Top 20 Publishers of Bestsellers by #books'
)
fig.update_traces(marker_color='rgb(204,80,62)')
                #    textfont_size=100) # , textangle=0, textposition="outside"
fig.update_xaxes(showticklabels=False,
                    ticks='')
fig.update_yaxes(tickfont_weight=550,
                    ticks='')
fig.show()  # a bare `fig` only renders in a notebook; show() also works when run as a script
8 Likes

I used this Figure Friday to dive into some technical things instead of analysis:

  • web scraping for the images (the code is in the data/scraping.py file on py.cafe)
  • using Store (because I knew what I was doing was very inefficient)
  • creating a multipage app
  • using an external font

I’ll stay with this dataset for a few weeks to dive deeper into app architecture and the analysis part (see the publisher page, some interesting stuff happening there), and I could end up with something completely different.

Link & code : PyCafe - Dash - Work in progress, multipage dash plotly

10 Likes

Beautiful! This dataset could definitely be a project for a couple of weeks.

2 Likes

Thank you @JuanG, and yes about the weeks. If you create something every week, what I miss is the time to play around and, more important for me, to let it sink in for a few days: decide what makes me curious or what is worthwhile to visualise (lack of customers, so I’m the customer), how I’m going to dive into my idea and check it, and how I’m going to visualise it. For example, my first thought about the publishers and mentions (my publishers page) was that the big ones bought the small ones; it did not help. Second thought: audiobooks. Third thought, this morning, looking at the number of mentions (52*15 = 780): how can the number of mentions be above 780? Either an error in the data processing, and/or the lists were longer in the past. The lists were longer in the past. I haven’t decided yet if I find this important and, if so, what I’m going to do with it. And this is a dataset I’d like to work on longer because of my preference for a) books and b) the images (they cheer me up) :slight_smile:
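
The 52*15 = 780 check above can be sketched in a few lines of pandas. This is only an illustration under assumptions: it treats every CSV row as one mention on one weekly list and uses the `bestsellers_date` column from the sample code. The helper name `list_length_by_year` and the sample rows are hypothetical.

```python
import pandas as pd

def list_length_by_year(data: pd.DataFrame) -> pd.DataFrame:
    """Summarise, per year, how many weekly lists there were and how long they got."""
    dates = pd.to_datetime(data["bestsellers_date"])
    # Rows per weekly list (assumption: one row = one mention on that week's list)
    per_week = data.groupby(dates).size()
    # Roll the weekly list lengths up to calendar years
    return per_week.groupby(per_week.index.year).agg(
        weeks="count", longest_list="max", mentions="sum"
    )

# Hypothetical sample: two 3-book lists in 2010, one 2-book list in 2024
sample = pd.DataFrame({
    "bestsellers_date": ["2010-01-03"] * 3 + ["2010-01-10"] * 3 + ["2024-01-07"] * 2,
    "title": ["A", "B", "C", "A", "B", "D", "E", "F"],
})
summary = list_length_by_year(sample)
print(summary)
```

On the real data, any year whose `longest_list` is above 15 would explain a yearly `mentions` total above 780 without a processing error.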

4 Likes

I love how your Dash apps look, @marieanne. The images and cards you add make your dashboards look more like professional websites.

2 Likes

Thank you @adamschroeder. With my background it’s a sort of automatism (not the cards) to think responsive and to have thoughts like: wouldn’t it be great if the rows were columns that you could swipe through on mobile, from one to the next? Having a framework like Bootstrap makes it easy for me to create something that looks decent; I’ve used Bootstrap since version 2. On a more serious note: on that large business platform I now see a lot of highly liked posts from important data peeps on basic design subjects which were hot in web design 15 years ago, are now assumed common knowledge, and should be common knowledge for dashboard designers too. Maybe I’ll create a video on the layout design of the MTA commuter challenge app: why it was what it was, that it was designed mobile first, why some stuff was omitted, and as an excuse to put an image of my cats in a video :grinning:

4 Likes

As long as your cats are in the video, I fully support the idea :slight_smile:

2 Likes

Hey @Ester,

I’ve been admiring your recent submissions, as they are consistently beautiful and clean! This one is no exception—I love how you’ve displayed the book cover :heart_eyes_cat:

Personally, I would suggest using “Top 10 Bestselling Books for 2020” as the dashboard title, with 2020 being dynamic and responsive to the filter :slight_smile: This would allow you to remove the repeated chart title below, as well as the “The Book details” title, since your charts already convey the necessary information.

Instead, you might consider adding a chart title or subtitle like ":computer_mouse: Click on bar chart to view book details on the right." This can help clarify chart interactions, which are often not visually clear.

3 Likes

I am all in for simple charts! haha You can’t go wrong with this beautifully sorted and clean bar chart! :rocket:

1 Like

Thank you @li.nguyen for your help! I will correct it.

1 Like

This is amazing, @marieanne! I love how you added the “insights” layer with the summarized numbers and KPI cards. It really makes me want to interact with the dashboard and see the results :rocket:

I totally understand the struggle of finding time to dive deep when you’re creating something new every week. I have the same challenge with wanting to go all-out on dashboards. When I’m short on time, I try to focus on just one chart or simply take a break and admire all your awesome work :hugs:

I hope you don’t mind, but I tinkered with the inspect panel on your PyCafe app because I had some design ideas and was curious to see how they would look. I added box shadows to your cards, decreased the background opacity and tried to consolidate some of the spaces. What do you think?

I believe adding box shadows and some opacity to the background color really makes your cards pop and gives them a more modern feel. The dark blue background color was quite prominent, so instead of using it as a background, you could use it to highlight elements, like the arrows you added (which I love, by the way—they give it a cool, artsy touch) or the selected range inside the slider and the refresh buttons as you already have it!

5 Likes



Hi all, I am Waliy. I’m new here and thought it would be interesting to join the Figure Friday challenge.

Here I created a Dash app with a bit of design to summarize the data. Hope you enjoy it.

I also use an AI API to generate book summaries. You can use OpenRouter for free; just put in your own API key.

Github repo : GitHub - walidata48/FF-6-NYC-Book

7 Likes

So while I was working hard on something completely different, you converted my dashboard into Google Material Design, not sure which version :rofl:

Love your tinkering, totally agree it’s an improvement. @li.nguyen

2 Likes

Hello everyone,

As always, great dashboards and ideas this week! This time, I focused on sentiment analysis and developed a Dash web app to explore this area. The app analyzes sentiment and trends in NY Times fiction bestsellers, providing insights into the emotional tone of bestselling fiction. Users can explore the overall sentiment distribution from book descriptions, compare author sentiment, and examine the top title bigrams and description trigrams, all filterable by year.



code here

import dash  # needed for dash.Dash(...) below
from dash import dcc, html, Input, Output
import dash_bootstrap_components as dbc
import pandas as pd
import plotly.express as px
import string
import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.util import ngrams
from collections import Counter
from nltk.stem import WordNetLemmatizer
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download('punkt')
nltk.download('stopwords')
nltk.download('wordnet')
nltk.download('vader_lexicon')

# Prepare the data
df = pd.read_excel("NYT Fiction Bestsellers.xlsx", sheet_name=1)[::-1]
df_subset = df.drop_duplicates(subset=['title']).copy()  # copy so the column assignments below don't hit a view
df_subset['title'] = df_subset['title'].astype(str)
df_subset['desc'] = df_subset['desc'].astype(str)
df_subset['year'] = df_subset.bestsellers_date.dt.year

lemmatizer = WordNetLemmatizer()
stop_words = set(stopwords.words('english')) | set(string.punctuation)

def generate_ngrams(text, n):
    tokens = word_tokenize(text.lower())
    tokens = [word for word in tokens if word.isalpha() and word not in stop_words]
    tokens = [lemmatizer.lemmatize(word) for word in tokens]
    n_grams = ngrams(tokens, n)
    return [' '.join(gram) for gram in n_grams]

def plot_top_ngrams_plotly(freq_counter):
    top_n = freq_counter.most_common(10)
    ngrams, counts = zip(*top_n)
    fig = px.bar(y=list(counts), x=list(ngrams), template="ggplot2", labels={'x':'', 'y':''}, 
                 text_auto=True,
                )
    fig.update_layout(paper_bgcolor='rgb(240, 240, 240)', plot_bgcolor='rgb(240, 240, 240)'
    )
   
    fig.update_yaxes(visible=False)
    return fig

def get_sentiment(text):
    analyzer = SentimentIntensityAnalyzer()
    scores = analyzer.polarity_scores(text)
    return scores['compound']

radio_style = {
    'display': 'flex',
    'flex-direction': 'row',
    'justify-content': 'space-between',
    'padding': '5px',
    'border': '2px solid',
    'border-radius': '5px',
    'boxShadow': '3px 3px 3px rgba(10, 10, 10, 0.3)',
    'font-family': 'Aharoni, sans-serif',
    'font-size': '20px',
}

header_style={'text-align': 'center', 'margin': '10px','padding': '10px'}
    
# Initialize Dash app
app = dash.Dash(__name__, external_stylesheets=[dbc.themes.JOURNAL])

app.title = "NY Times Fiction Best-Sellers"

app.layout = dbc.Container([
    dbc.Row([
        dbc.Col(html.H1("NYT Bestseller Sentiment & Trends", className="title"), width=12)  # CSS class for the title
    ]),
    dbc.Row([
        dbc.Col(html.H5("Sentiment analysis reveals trends in NYT fiction bestsellers: titles, descriptions, sentiment, bigrams/trigrams, and author sentiment."), width=12) 
    ]),
    dbc.Row([
        dbc.Col(dbc.RadioItems(
            id='radio-buttons',
            options=[{'label': str(year), 'value': year} for year in df_subset.year.unique()],
            value=2018,
            inline=True,
            style=radio_style
        ), width=12)
    ]),
    dbc.Row([
       
        dbc.Col([
            html.H5("Top 10 Title Bigrams", style=header_style),html.Hr(),
            dcc.Graph(id='bigrams-chart', className="dash-graph")], width=6),  
        dbc.Col([
            html.H5("Top 10 Description Trigrams", style=header_style),html.Hr(),
            dcc.Graph(id='trigrams-chart', className="dash-graph")], width=6)  
    ]),
    html.Hr(),
    dbc.Row([
            dbc.Col([
            html.H5("Book Description Sentiment by Year", style=header_style),
            html.Hr(),
            dcc.Graph(id='sentiment-chart')], width=5),
        dbc.Col(html.Div([
            html.H5("Comparative Sentiment Distribution of Bestselling Authors", style=header_style),
            html.Hr(),
            html.Button('Update Authors', id='update-authors-button', className="update-button"),
            dcc.Graph(id='sentiment_author-boxchart')
    ]), width=7), 
        dbc.Col(
            html.H5("Sentiment scores range from -1 (VERY NEGATIVE) to +1 (VERY POSITIVE). Scores close to 0 indicate neutral sentiment.", style=header_style), width=12)
]),
   
], fluid=True, style={'backgroundColor': '#f0f0f0'})  

@app.callback(
    Output('bigrams-chart', 'figure'),
    Output('trigrams-chart', 'figure'),
    Output('sentiment-chart', 'figure'),
    Output('sentiment_author-boxchart', 'figure'),
    Input('radio-buttons', 'value'),
    Input('update-authors-button', 'n_clicks') # Button input
)
def update_charts(year, n_clicks):
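    # Work on a copy so adding the 'sentiment' column below doesn't trigger SettingWithCopyWarning
    filtered_df = df_subset[df_subset.year == year].copy()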

    bigrams = [generate_ngrams(title, 2) for title in filtered_df['title']]
    bigram_freq = Counter([gram for sublist in bigrams for gram in sublist])
    bigrams_fig = plot_top_ngrams_plotly(bigram_freq)
    

    trigrams = [generate_ngrams(desc, 3) for desc in filtered_df['desc']]
    trigram_freq = Counter([gram for sublist in trigrams for gram in sublist])
    trigrams_fig = plot_top_ngrams_plotly(trigram_freq)
    

    filtered_df['sentiment'] = filtered_df['desc'].apply(get_sentiment)
    sentiment_fig = px.histogram(
        filtered_df, x="sentiment", nbins=10, histnorm='percent',range_x=[-1,1],
        template='ggplot2', text_auto= '.2f', labels={'sentiment':''}
    )
    sentiment_fig.update_layout(paper_bgcolor='rgb(240, 240, 240)', plot_bgcolor='rgb(240, 240, 240)')

    sentiment_fig.update_yaxes(visible=False)


    # Every callback run draws a fresh random sample of up to five authors,
    # so clicking the button (which changes n_clicks) simply re-triggers the draw
    n_authors = min(5, len(filtered_df))
    author_to_watch = filtered_df.author.sample(n_authors).tolist()
    # 'sentiment' was already computed on filtered_df above, so it carries over here
    author_df = filtered_df[filtered_df['author'].isin(author_to_watch)].copy()
    sentimen_author_fig = px.box(author_df, x="sentiment", y='author',
                                 range_x=[-1,1],
                                 color_discrete_sequence=px.colors.sequential.Bluered_r,
                                 template='ggplot2', labels={'sentiment':'', 'author':''})
    sentimen_author_fig.update_layout(paper_bgcolor='rgb(240, 240, 240)', plot_bgcolor='rgb(240, 240, 240)')

    

    return bigrams_fig, trigrams_fig, sentiment_fig, sentimen_author_fig

if __name__ == '__main__':
    app.run(debug=True)  # app.run replaces the older app.run_server
4 Likes

The App

1 Like

Wow - first submission and already killing it @waliyudin ! :rocket:

Absolutely love this dashboard! The layout is very clear, and the top publishers section looks great. The AI book insights section is a great addition—an excellent example of how to embed AI effectively. I was so intrigued that I immediately checked out your code! :smile:

I noticed you’re using Dash Bootstrap themes and the DataTable. Have you ever tried AgGrid? Since I discovered it, I haven’t gone back to using the standard DataTable. Just last week, I found out how easily you can embed Bootstrap components inside AgGrid, enabling more sophisticated designs like the one below.

Your table immediately reminded me of this. Instead of a hyperlink for the summary, you could use a button or a badge! I actually have an example from last week: li-nguyen/figure-friday-week-5 at main

Anyway, welcome to FigureFriday, and I look forward to seeing more of your work! :heart_eyes_cat:

2 Likes