I have the following dataset with many rows, multiple samples and 3 columns. I need to plot a chart that looks like a heatmap, but the color should fill not just the position itself but the whole range back to the previous position.
I need to modify the above chart to fill the color between positions, as shown below, so that each color extends back to the previous position:
Sample 1 :
1-15500 → Dark blue
15501-16200 → orange red
16201-17700 → yellow
17701-20010 → orange red
20011-22120 → Dark blue
22121-30000 → yellow
I’m not sure px.imshow() is the best way to achieve what you need.
Not impossible, but you would have to construct the whole “grid” of your image at the smallest step of your 'position', which seems to be 10 in the sample you provided.
Meaning if the range is [0, 35000] you will have an array of 3500 values for each “Sample”, with approximately 1500 for your first 'position' alone… whereas you should need only one “rectangle”.
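To put rough numbers on that, here is a back-of-the-envelope sketch. The step of 10, the axis limit of 35000 and the six blocks listed for Sample 1 come from this thread; the variable names are only for illustration.

```python
# Values taken from the example discussed in this thread
step = 10                # smallest step between positions (as it appears in the sample)
axis_max = 35_000        # upper bound of the position axis
first_position = 15_500  # first 'position' of Sample 1

# Cells a dense image grid would need for one Sample
cells_per_sample = axis_max // step
# Cells spent on the first block alone
cells_first_block = first_position // step
# Rectangles needed for Sample 1 (one per listed range)
rectangles_sample_1 = 6

print(cells_per_sample, cells_first_block, rectangles_sample_1)
```

So a grid costs thousands of cells per Sample, where a handful of rectangles carries the same information.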
I propose to construct only the “rectangles” you need, using a horizontal bar figure.
For that you will have to provide:
The "base" which will be the left side position of the rectangles
The "x" will be the width of the rectangle
The "y" will be the sample name, taken from the 'Name' column
The figure should use one trace per 'category', which is easier to handle and lets you provide one color for each trace.
The main work here is to find the right "base" and "x".
Here is the result:
Code
import pandas as pd
import plotly.graph_objects as go

# df is the DataFrame from the question, with columns 'Name', 'position', 'category'

# Left edge of each rectangle: the previous row's 'position'
df['base'] = df['position'].shift()
# Width of each rectangle: the difference between two successive positions
df['width'] = df['position'].diff()

# The first row of each Sample needs fixing. We can detect it because its width is
# NaN (very first row) or negative (the position drops back when a new Sample starts).
# first base = 0
df['base'] = df.apply(lambda x: 0 if (x['width'] < 0 or pd.isna(x['base'])) else x['base'], axis=1)
# first width = first 'position' of the Sample
df['width'] = df.apply(lambda x: x['position'] if (x['width'] < 0 or pd.isna(x['width'])) else x['width'], axis=1)

color = {1: 'blue', 2: 'orange', 3: 'yellow'}

fig = go.Figure()

# Create one trace for each category
for cat in sorted(df['category'].unique()):
    # keep only the rows of this category
    dff = df[df['category'] == cat]
    fig.add_bar(
        orientation='h',
        base=dff['base'],
        x=dff['width'],
        y=dff['Name'],
        marker_color=color[cat],
        marker_line_width=0,
        name=str(cat)
    )

# reverse y to have Sample 1 at the top
fig.update_yaxes(autorange="reversed")
# set bargap=0 to have no gap between horizontal bars, and overlay to not stack the traces
fig.update_layout(legend_title_text='Category', bargap=0, barmode="overlay")
fig.show()
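One caveat: detecting the first row of each Sample through a negative width only works when a new Sample starts at a smaller position than the previous Sample ended. As a possible alternative (not what the code above does), the same columns can be computed per Sample with groupby. This is a minimal sketch on made-up data that reuses the column names from this thread and assumes the rows are sorted by 'position' within each Sample.

```python
import pandas as pd

# Made-up data for illustration; Sample 2 starts at a higher position
# than Sample 1 ended, which the negative-width check would miss
df = pd.DataFrame({
    "Name": ["Sample 1"] * 3 + ["Sample 2"] * 2,
    "position": [15500, 16200, 17700, 20000, 30000],
    "category": [1, 2, 3, 2, 1],
})

# Previous position within the same Sample; the first row of each
# Sample has no previous row, so its rectangle starts at 0
df["base"] = df.groupby("Name")["position"].shift().fillna(0)
# Width is the distance from that left edge to the current position
df["width"] = df["position"] - df["base"]

print(df)
```

The resulting 'base' and 'width' columns can be passed to fig.add_bar() exactly as in the code above.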
Thank you for your solution. I hadn’t thought of representing the data as rectangles. It worked for my dataset. I’m still a beginner with lambda functions and am trying to understand the solution. Thanks for your clear explanation.
Also, I would like to represent the dataset slightly differently, using start and end positions, to compare tool efficiency. Can we slightly modify your code to represent the blocks, or is there an easy way in Plotly to chart this kind of dataset?
I agree, lambda functions are not obvious concepts!
df['base'] = df.apply(lambda x: 0 if (x['width'] < 0 or pd.isna(x['base'])) else x['base'], axis=1)
is equivalent to
def my_function(x):
    if x['width'] < 0 or pd.isna(x['base']):
        return 0
    else:
        return x['base']

df['base'] = df.apply(my_function, axis=1)
and df.apply() with axis=1 will call this function on every row, meaning x will be one row of df.
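If the row-by-row version is hard to follow, the same replacement can also be written on whole columns. The sketch below is only an illustration on made-up positions: it builds the condition once as a boolean column, then uses Series.mask() to put 0 wherever the condition is True, and compares that with the apply() version.

```python
import pandas as pd

# Made-up positions: three rows for one Sample, then two for the next
df = pd.DataFrame({"position": [15500, 16200, 17700, 12000, 30000]})
df["base"] = df["position"].shift()
df["width"] = df["position"].diff()

# True on the rows that start a Sample: NaN width or negative width
first_row = df["width"].isna() | (df["width"] < 0)

# Row-wise version, as in the post: x is one row of df
by_apply = df.apply(
    lambda x: 0 if (x["width"] < 0 or pd.isna(x["base"])) else x["base"],
    axis=1,
)

# Column-wise version: replace 'base' by 0 wherever first_row is True
by_mask = df["base"].mask(first_row, 0)

print(by_apply.tolist())
print(by_mask.tolist())
```

Both give the same values; the column-wise form is usually faster on large DataFrames, while the lambda keeps the rule readable in one place.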
It should be possible to use the same figure type with your new dataset; you only need to adapt the data preparation.
With this dataset, there will be gaps, right? For example, the first row ends at 15500 and the second row starts at 15700, meaning a gap of 200?
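As a rough idea of what the adapted preparation could look like: the sketch below assumes the new dataset has columns named 'start' and 'end' next to 'Name' and 'category'. Those names and all the values are guesses for illustration, since the actual dataset is not shown here.

```python
import pandas as pd

# Hypothetical start/end data; the real column names may differ
df2 = pd.DataFrame({
    "Name": ["Sample 1", "Sample 1", "Sample 1"],
    "start": [1, 15700, 16200],
    "end": [15500, 16200, 17700],
    "category": [1, 2, 3],
})

# With explicit start and end, no shift() or diff() is needed
df2["base"] = df2["start"]
df2["width"] = df2["end"] - df2["start"]

# Distance between a block's start and the previous block's end,
# within each Sample (NaN for the first block of a Sample)
df2["gap_before"] = df2["start"] - df2.groupby("Name")["end"].shift()

print(df2)
```

'base' and 'width' would then go into fig.add_bar() as base= and x= like before. Nothing is drawn where no block exists, so a gap such as 15500 to 15700 would simply show the plot background.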