Help required generating chart

adhawkins · February 19, 2024, 12:12pm

Hi all,

I have a data set that is multiple rows of the form:

{
   'game': 955,
   'gamedetails.date': '2024-01-30',
   'gamedetails.solution': 'expel',
   'guesses': 5,
   'id': 1,
   'success': 1,
   'user': 1,
   'userdetails.fullname': 'Andrew Hawkins',
   'userdetails.username': 'andy'
}

(the ‘guesses’ field will be either a number from 1-6, or ‘Fail’)

I want to generate a plot that looks like this:

The vertical bars represent the percentage of times each number of guesses appeared for each user based on the number of results they actually have (not all users have results for all games).

I can get close using the following code:

    fig = px.histogram(
        results,
        width=1920,
        height=1080,
        x="userdetails.username",
        y="guesses",
        color="guesses",
        barmode="group",
        orientation="v",
        histnorm="percent",
        category_orders={"guesses": ["1", "2", "3", "4", "5", "6", "Fail"]},
    )

This gives me this:

There are a couple of issues with this:

The ordering of the bars for each user isn’t in the expected ‘1, 2, 3, 4, 5, 6, Fail’ order
The percentages just don’t add up. For example, for ‘james’ the three tallest bars seem to be around 50, 35 and 20, which is already over 100%

Can anyone help me get the plot I’m trying to achieve?

Thanks

Andy

AIMPED · February 19, 2024, 12:31pm

Hey @adhawkins, welcome to the forums.

Could you provide some data (for copy&paste) to play around with?

adhawkins · February 19, 2024, 2:45pm

Here you go @AIMPED :

https://termbin.com/li1s

Andy

AIMPED · February 27, 2024, 10:59am

Hey @adhawkins,

With the data you provided, I don’t get the same figure as yours. I guess you did some kind of preprocessing. Is that the case?

[Image: newplot(3), the figure produced by the code below]

import requests
import pandas as pd
import plotly.express as px
import json


# function source: https://stackoverflow.com/questions/16573332/jsondecodeerror-expecting-value-line-1-column-1-char-0
def get_json(url):
    response = requests.get(url)
    response.raise_for_status()  # raises on HTTP errors; it returns None, so there is nothing to print
    if response.status_code != 204:
        return response.json()

    
js = get_json('https://termbin.com/li1s')

results = pd.DataFrame(js['results'])

fig = px.histogram(
    results,
    width=600,
    height=400,
    x="userdetails.username",
    y="guesses",
    color="guesses",
    barmode="group",
    orientation="v",
    histnorm="percent",
    category_orders={"guesses": ["1", "2", "3", "4", "5", "6", "Fail"]},
)

fig.show()

adhawkins · February 27, 2024, 11:39am

@AIMPED Yes, sorry. There is some slight modification of the data, but the end result is very similar.

Your results show similar things to mine: the order of the bars isn’t logical, and the sum of the percentages of the bars (for example in the ‘old’ group) is obviously not correct.

This is the full code:

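from pprint import pprint

import plotly.express as px

# fetchAllResults() is my own helper (not shown); it returns the rows posted earlier
results = fetchAllResults()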
if results:
    for result in results:
        if result["success"]:
            result["guesses"] = str(result["guesses"])
        else:
            result["guesses"] = "Fail"

    results = sorted(results, key=lambda x: (x["user"], x["guesses"]))

    pprint(results)

    fig = px.histogram(
        results,
        width=1920,
        height=1080,
        x="userdetails.username",
        y="guesses",
        color="guesses",
        barmode="group",
        orientation="v",
        histnorm="percent",
        category_orders={"guesses": ["1", "2", "3", "4", "5", "6", "Fail"]},
    )

    fig.update_layout(
        legend=dict(
            yanchor="top",
            y=-0.05,
            xanchor="left",
            x=0,
        ),
        title_x=0.5,
    )

    fig.write_image("/home/andy/fig1.png")

The ‘fetchAllResults’ function returns the data I provided in the earlier post.

Thanks for taking the time to look into this.

Andy

AIMPED · February 27, 2024, 12:09pm

Well, this is because plotly.express does some grouping of your DataFrame under the hood. Sorting the DataFrame before creating the figure solves this issue:

results = results.sort_values('guesses', axis=0)

[Image: newplot(7), the figure after sorting]


I imagine the issue with the percentages has the same cause. I’m not sure which values are used to calculate the percentage.

adhawkins · February 27, 2024, 2:07pm

I was already sorting the results, but was doing it based on user first, then number of guesses. I’ve now swapped the order of these around, and I’m getting the bars in a more logical order.
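The swapped sort described above can be sketched roughly as follows. This is only an illustration: the sample rows and the `GUESS_ORDER` and `RANK` names are made up for the example. It ranks the labels explicitly rather than relying on string comparison, which is an extra precaution and not something the thread requires.

```python
# Explicit display order; "Fail" is deliberately placed last.
GUESS_ORDER = ["1", "2", "3", "4", "5", "6", "Fail"]
RANK = {label: index for index, label in enumerate(GUESS_ORDER)}

# Hypothetical sample rows in the same shape as the real results.
results = [
    {"user": 2, "userdetails.username": "james", "guesses": "Fail"},
    {"user": 1, "userdetails.username": "andy", "guesses": "5"},
    {"user": 2, "userdetails.username": "james", "guesses": "3"},
    {"user": 1, "userdetails.username": "andy", "guesses": "3"},
]

# Sort by guess label first and by user second, so that each guess
# value is first encountered in the intended order.
results = sorted(results, key=lambda row: (RANK[row["guesses"]], row["user"]))

print([(row["guesses"], row["userdetails.username"]) for row in results])
```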

Any way I can get it to calculate the percentages correctly?

Failing that, is there any way I can pass in a data set along the lines of the following:

[
	{
		"user": "andy",
		"1": 2,
		"2": 20,
		"3": 30,
		"4": 20,
		"5": 10,
		"6": 5,
		"Fail": 3
	},
	{
		"user": "james",
		"1": 3,
		"2": 21,
		"3": 31,
		"4": 18,
		"5": 8,
		"6": 5,
		"Fail": 14
	}
]

(i.e. I have already calculated the percentages for each guess) and just have it render the bars based on the numbers for each data point?
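For reference, a table of that shape could be derived from the raw rows with something like the sketch below. The sample rows and the `percentages_by_user` helper are hypothetical, and it assumes `guesses` has already been converted to a string label ("1" to "6" or "Fail").

```python
from collections import Counter, defaultdict

GUESS_ORDER = ["1", "2", "3", "4", "5", "6", "Fail"]

# Hypothetical sample rows; only the two fields used here are shown.
results = [
    {"userdetails.username": "andy", "guesses": "3"},
    {"userdetails.username": "andy", "guesses": "3"},
    {"userdetails.username": "andy", "guesses": "4"},
    {"userdetails.username": "andy", "guesses": "Fail"},
    {"userdetails.username": "james", "guesses": "2"},
    {"userdetails.username": "james", "guesses": "5"},
]


def percentages_by_user(rows):
    """Return one record per user with the percentage of each guess label.

    Each percentage is relative to the number of results that user
    actually has, so every record sums to 100.
    """
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["userdetails.username"]][row["guesses"]] += 1

    table = []
    for user, counter in sorted(counts.items()):
        total = sum(counter.values())
        record = {"user": user}
        for label in GUESS_ORDER:
            # Labels the user never produced get 0.0 rather than being omitted.
            record[label] = 100 * counter[label] / total
        table.append(record)
    return table


data = percentages_by_user(results)
print(data)
```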

Thanks again

Andy

AIMPED · February 27, 2024, 9:11pm

Hi @adhawkins, I’m not sure I understood your last post. If you want to create a bar graph from the data you posted, you could do something like this:

import pandas as pd
import plotly.express as px

data = [
	{
		"user": "andy",
		"1": 2,
		"2": 20,
		"3": 30,
		"4": 20,
		"5": 10,
		"6": 5,
		"Fail": 3
	},
	{
		"user": "james",
		"1": 3,
		"2": 21,
		"3": 31,
		"4": 18,
		"5": 8,
		"6": 5,
		"Fail": 14
	}
]

df = pd.DataFrame.from_records(data)
fig = px.bar(df, x='user', y=df.columns[1:], barmode='group')
fig.show()

Concerning the percentages, I tried to figure out how these are calculated, but it took me too long.

I even think it might be a bug. If you comment out the color parameter of your original graph, the percentages sum up correctly. This has also been reported here.
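One possible explanation, offered as an assumption and not confirmed in this thread or checked against the plotly source: `histnorm` may be applied to each trace separately, and `color="guesses"` creates one trace per guess value. If so, each colour is normalised across users instead of each user across guess values. The pandas sketch below, on made-up sample data, shows how the two normalisations differ. It does not call plotly at all.

```python
import pandas as pd

# Hypothetical sample data with the same two columns as the real results.
results = pd.DataFrame(
    {
        "userdetails.username": ["andy", "andy", "andy", "andy", "james", "james"],
        "guesses": ["3", "3", "4", "Fail", "3", "Fail"],
    }
)

users = results["userdetails.username"]
guesses = results["guesses"]

# Intended chart: each user's row sums to 100 (share of that user's own results).
per_user = pd.crosstab(users, guesses, normalize="index") * 100

# Assumed per-trace behaviour: each guess column sums to 100
# (share of that guess value across all users).
per_guess = pd.crosstab(users, guesses, normalize="columns") * 100

print(per_user.round(1))
print(per_guess.round(1))
```

Under the second normalisation a single user's bars can add up to well over 100, which would match the symptom described earlier in the thread.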

adhawkins · February 28, 2024, 11:20am

Thanks, that’s exactly what I need.

Andy
