Hi,
For this week’s data set I decided to add box plots within violins to highlight medians, quartiles, and overall distribution.
For aesthetics and better visual differentiation, I applied a pastel color palette.
I also introduced jittered, semi-transparent points to reduce overlap and show individual data points more clearly.
For the y-axis values, I formatted them to two decimal places for clarity and precision, ensuring that the Max Yield (hl/ha) values are easy to interpret.
I’d love to hear your feedback on the overall design and whether there’s anything else I can improve!
import plotly.express as px
import pandas as pd
df = pd.read_csv("wine_data.csv")
df['Max_yield_hl'] = pd.to_numeric(df['Max_yield_hl'], errors='coerce')
df_cleaned = df.dropna(subset=['Max_yield_hl'])
fig = px.violin(
    df_cleaned,
    x='Color',
    y='Max_yield_hl',
    facet_col='Country',
    color='Color',
    box=True,
    points='all',  # draw every observation; by default only outliers are drawn
    title="Distribution of Max Allowed Yield (Hectoliters per Hectare) by Wine Color and Country",
    labels={"Max_yield_hl": "Max Yield (hl/ha)", "Color": "Wine Color"},
    color_discrete_sequence=px.colors.qualitative.Pastel  # Subtle color palette
)
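fig.update_traces(
    jitter=0.3,   # horizontal spread of the plotted points
    opacity=0.6   # trace-level opacity: affects the violin and box as well as the points
)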
fig.update_layout(
    title_font_size=18,
    legend_title="Wine Color",
    xaxis_title="Wine Color",
    yaxis_title="Max Yield (hl/ha)"
)
# Format y-axis values on every facet, not only the first one
fig.update_yaxes(tickformat=".2f")
fig.show()
