I have a series of tasks, each with a name, a start time, and an end time. I am trying to build a timeline in which all the tasks are grouped into a single stacked bar, with the y axis reflecting each task's start and end time. I have a feeling there's something obvious I'm missing here.
Take this data for example:
     Task                       Start                         End  Duration
0  Task A  2021-11-13 00:00:00.000000  2021-11-13 00:13:35.452834  0.226515
1  Task B  2021-11-13 00:13:35.452834  2021-11-13 00:36:52.699531  0.388124
2  Task C  2021-11-13 00:36:52.699531  2021-11-13 01:00:00.000000  0.385361
I can use px.timeline to build something very close. It gives me the layout I want, but it puts each task on its own row. I want all the tasks in a single row, so that when I plot several days' worth of records there is one row per day.
fig = px.timeline(df, x_start='Start', x_end='End', y='Task', color='Task')
fig.show()
I can also build an overlapping bar chart with go.Bar, but I can't set where each bar starts within its hour.
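import plotly.graph_objects as go

# task_times, task_count, and task_names come from the sample-data code further down
df = pd.DataFrame(task_times, columns=['Task', 'Start', 'End', 'Duration'])
# Label each record with the hour its start time rounds to, e.g. "00:00:00"
df['Hour'] = df['Start'].apply(lambda x: x.round(freq='H').strftime("%T"))

data = []
bar_width = 1.0 / task_count
for task, rows in df.groupby('Task'):
    # Each later task gets a narrower bar so the overlaid bars stay visible
    width = 1.0 - bar_width * task_names.index(task)
    data.append(go.Bar(
        x=rows['Hour'], y=rows['Duration'],
        width=width, name=task
    ))

fig = go.Figure(data=data, layout={"barmode": "overlay"})
fig.show()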
This is the code I used to generate the sample data, and here’s an nbviewer for the full code.
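from datetime import datetime, timedelta

import numpy as np
import pandas as pd

task_count = 3
total_duration = 12  # number of one-hour blocks to generate

# Each row is a random split of one hour among the tasks (fractions sum to 1)
tasks = np.random.dirichlet(np.ones(task_count), size=total_duration)
task_names = [f"Task {chr(97 + n).title()}" for n in range(task_count)]

task_times = []
s = datetime.combine(datetime.today(), datetime.min.time())  # today at midnight
for task in tasks:
    for x in range(len(task)):
        t, n = task[x], task_names[x]
        e = s + timedelta(hours=t)
        task_times.append((n, s, e, t))
        s = e  # the next task starts where this one ends

df = pd.DataFrame(task_times, columns=['Task', 'Start', 'End', 'Duration'])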