Hello,
I’m facing an issue that I haven’t been able to solve despite numerous attempts, and I can’t find suitable documentation to guide me through it.
The issue I’m encountering is as follows:
I have a large dataset (over 2000 entries), and I want to create a scatter plot with this data. To differentiate them and ensure each one is unique, I’ve concatenated several elements from my table with the column I want to use for my Y-axis, resulting in a column with unique identifiers that don’t repeat.
To avoid cluttering the graph, I’d like to display only the first 2 characters of each label (Unique ID) on my Y-axis. However, when I use “fig.update_yaxes(tickvals=, ticktext=)” to set the tick positions and display a truncated version of my labels, the Y-axis becomes fixed and shows every label at once (over 2000), making them unreadable. Without this option, only about 50 labels are displayed on the graph, which is much more legible.
Therefore, I’m wondering if it’s possible to customize the text of the Y-axis labels while keeping Plotly’s default behaviour of showing only a subset of the labels and revealing more of them as I zoom in.
Note: Creating a column in my DataFrame with only the truncated values from my concatenated column doesn’t work because all values on the X-axis align with a single Y-axis line, making my values non-unique. This isn’t what I’m aiming for. Instead, I want a separate line for each value on the Y-axis.
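To make the note above concrete, here is a minimal sketch of why the truncated column collapses my rows. The five identifiers are a made-up mini-sample in the same "Fruit_Country_Color" format as my real column, not my actual data.

```python
# Hypothetical mini-sample in the same format as my ID_Fruit column.
ids = [
    "Banane_Brazil_red",
    "Banane_Spain_pink",
    "Pomme_Italy_brown",
    "Pomme_USA_purple",
    "Orange_China_blue",
]

# Keep only the first 2 characters, as I would like for the tick labels.
short = [i[:2] for i in ids]

# Count how many distinct Y-axis rows each version would produce.
print(len(set(ids)), "distinct rows with the full IDs")
print(len(set(short)), "distinct rows with the truncated IDs")
```

Plotting the truncated values directly as Y therefore merges every ID that shares a prefix into one row, which is why I need the full ID for the position and the truncated text only for the label.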
Thank you in advance for your assistance.
My code is the following:
import pandas as pd
import numpy as np
import plotly.express as px
# Define the lists of values
fruit_name = ['Banane', 'Pomme', 'Orange', 'Fraise', 'Pêche', 'Ananas', 'Pomme', 'Banane', 'Pomme', 'Pêche',
              'Raisin', 'Pêche', 'Banane', 'Pomme', 'Banane', 'Orange', 'Prune', 'Myrtille', 'Banane', 'Pomme']
countries = ['Brazil', 'Australia', 'China', 'France', 'USA', 'USA', 'USA', 'Australia',
             'Italy', 'USA', 'China', 'Spain', 'Brazil', 'Australia', 'India', 'France',
             'France', 'Brazil', 'Spain', 'Italy']
colors = ['red', 'orange', 'blue', 'green', 'red', 'green', 'purple', 'pink', 'brown',
          'brown', 'red', 'red', 'orange', 'blue', 'green', 'red', 'green', 'purple', 'pink', 'brown']
# Seed the generator before any random draw so the whole dataset is reproducible
np.random.seed(0)
# Generate random dates within the specified range (Unix timestamps in whole seconds)
start_date = int(pd.to_datetime('2024-01-01').timestamp())
end_date = int(pd.to_datetime('2024-12-31').timestamp())
random_dates = pd.to_datetime(np.random.randint(start_date, end_date, size=2000), unit='s')
# Generate random values for each column
random_fruits = np.random.choice(fruit_name, size=2000)
random_countries = np.random.choice(countries, size=2000)
random_colors = np.random.choice(colors, size=2000)
# Create the DataFrame
df = pd.DataFrame({'Fruit': random_fruits, 'Country': random_countries, 'Color': random_colors, 'Date': random_dates})
# Display the DataFrame
print(df)
# Build the Y-axis identifier and its 2-character prefix
df["ID_Fruit"] = df['Fruit'].astype(str) + '_' + df['Country'].astype(str) + '_' + df['Color'].astype(str)
df['Fruiti'] = df['ID_Fruit'].str[:2]
fig = px.scatter(df, x="Date", y="ID_Fruit", color="Fruiti")
# Dark mode
fig.layout.template = "plotly_dark"
# Y axis: uncommenting the next line truncates the labels but shows all of them at once
# fig.update_yaxes(tickvals=df['ID_Fruit'], ticktext=df['Fruiti'])
fig.show()
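For reference, the only partial workaround I can sketch is to thin the labels myself before passing them to tickvals and ticktext. The helper name thinned_ticks and its parameters max_labels and prefix_len are my own hypothetical choices, not part of Plotly, and this only gives a static subset: it does not reveal more labels when zooming, which is exactly the behaviour I am still missing.

```python
def thinned_ticks(categories, max_labels=50, prefix_len=2):
    """Return (tickvals, ticktext) with at most max_labels entries.

    tickvals are full category names (so positions stay unique);
    ticktext holds only the first prefix_len characters of each.
    """
    unique = sorted(set(categories))
    # Ceiling division: keep every step-th category so the count fits.
    step = max(1, -(-len(unique) // max_labels))
    tickvals = unique[::step]
    ticktext = [value[:prefix_len] for value in tickvals]
    return tickvals, ticktext


# Small demonstration with made-up IDs (not my real data).
demo_ids = [f"{i:04d}_demo" for i in range(120)]
demo_vals, demo_text = thinned_ticks(demo_ids, max_labels=50)
print(len(demo_vals), demo_vals[:2], demo_text[:2])
```

With my DataFrame this would be called as tickvals, ticktext = thinned_ticks(df["ID_Fruit"]), followed by fig.update_yaxes(categoryorder="category ascending", tickvals=tickvals, ticktext=ticktext). I am assuming here that Plotly's ascending category order matches Python's sorted() closely enough for the kept labels to be evenly spaced.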