Hi all,
I have a data set made up of multiple rows of the form:
{
    'game': 955,
    'gamedetails.date': '2024-01-30',
    'gamedetails.solution': 'expel',
    'guesses': 5,
    'id': 1,
    'success': 1,
    'user': 1,
    'userdetails.fullname': 'Andrew Hawkins',
    'userdetails.username': 'andy'
}
(The ‘guesses’ field is either a number from 1 to 6, or ‘Fail’.)
I want to generate a plot that looks like this:
Each vertical bar shows, for one user, the percentage of that user’s results that took a given number of guesses. The percentage is relative to the number of results the user actually has, since not all users have results for all games.
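To make the numbers I am after concrete, here is a minimal sketch of the per-user calculation in plain Python. The sample rows are made up and trimmed to the two fields that matter, and the names `guess_percentages`, `GUESS_ORDER` and `sample_rows` are just for illustration; my real data may need extra handling.

```python
from collections import Counter, defaultdict

# Order in which the bars should appear for each user.
GUESS_ORDER = ["1", "2", "3", "4", "5", "6", "Fail"]


def guess_percentages(rows):
    """Return {username: {guess: percent}}, normalised per user."""
    counts = defaultdict(Counter)
    for row in rows:
        # 'guesses' may be an int (1-6) or the string 'Fail', so key on str.
        counts[row["userdetails.username"]][str(row["guesses"])] += 1

    percentages = {}
    for user, counter in counts.items():
        # Divide by the number of results this user actually has.
        total = sum(counter.values())
        percentages[user] = {g: 100 * counter[g] / total for g in GUESS_ORDER}
    return percentages


# Made-up sample rows (not real results).
sample_rows = [
    {"userdetails.username": "andy", "guesses": 5},
    {"userdetails.username": "andy", "guesses": 3},
    {"userdetails.username": "andy", "guesses": 5},
    {"userdetails.username": "andy", "guesses": "Fail"},
    {"userdetails.username": "james", "guesses": 4},
    {"userdetails.username": "james", "guesses": 4},
]

percentages = guess_percentages(sample_rows)
```

With these rows, each user’s percentages sum to 100 regardless of how many games they have played, which is the behaviour I want the bars to reflect.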
I can get close using the following code:
import plotly.express as px

fig = px.histogram(
    results,
    width=1920,
    height=1080,
    x="userdetails.username",
    y="guesses",
    color="guesses",
    barmode="group",
    orientation="v",
    histnorm="percent",
    category_orders={"guesses": ["1", "2", "3", "4", "5", "6", "Fail"]},
)
This gives me the following plot:
There are a couple of issues with this:
- The bars for each user aren’t in the expected ‘1, 2, 3, 4, 5, 6, Fail’ order.
- The percentages don’t add up. For example, for ‘james’ the three tallest bars seem to be around 50%, 35% and 20%, which is already over 100%.
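My working guess on the second point is that the percentages are being normalised within each colour (each ‘guesses’ value) across all users, rather than within each user. This is an assumption I have not confirmed. The sketch below uses made-up (user, guess) pairs to show that this kind of normalisation would give per-user totals well over 100%.

```python
from collections import Counter

# Made-up (user, guess) pairs; not real results.
pairs = [
    ("andy", "3"), ("andy", "4"), ("andy", "4"), ("andy", "5"),
    ("james", "3"), ("james", "3"), ("james", "3"), ("james", "4"),
]

# ASSUMPTION: each guess value is normalised on its own across all users,
# so the bars of one colour sum to 100% rather than the bars of one user.
per_guess_totals = Counter(guess for _, guess in pairs)
pair_counts = Counter(pairs)
per_colour_percent = {
    (user, guess): 100 * n / per_guess_totals[guess]
    for (user, guess), n in pair_counts.items()
}

# Sum the bar heights each user would end up with under that assumption.
james_total = sum(p for (user, _), p in per_colour_percent.items() if user == "james")
andy_total = sum(p for (user, _), p in per_colour_percent.items() if user == "andy")
```

Here both `james_total` and `andy_total` come out above 100, which at least matches the symptom I am seeing.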
Can anyone help me get the plot I’m trying to achieve?
Thanks
Andy