Cannot get times to overlap when using Multiple line graph for daily internet speeds

I am learning Python and trying to use plotly to visualize the daily speed results from my home internet. I am using an Excel spreadsheet to import the data, as I am rounding the time of each speed result to 15-minute increments.

Ultimately I am trying to use a line graph for each day, but I have only been able to create a chart that is not readable, as the timestamps on the x axis seem to repeat. I have made multiple attempts using pandas grouping and plotly express line_group (and others).

Here is my current attempt:
import plotly.graph_objs as go
import plotly.express as px
import pandas as pd
import matplotlib.pyplot as plt


df = pd.read_excel('/home/chesterreggie/log/speedtest1.xlsx', sheet_name='speedtest', usecols="C, K, Q")

df["Date"] = pd.to_datetime(df["Date"])

# sort by timeround (sort_values returns a new frame, so assign it back)
df = df.sort_values(by="timeround", ascending=True)

# Plotly figure

fig = px.line(df, x='timeround', y='DownSpeed',
              line_group="Date", hover_name="Date")
fig.update_layout(title='Download Speeds', showlegend=False)

# show plot
fig.show()

My current data frame (this is coming from a pandas read_excel):
Date DownSpeed timeround
0 2020-03-28 5.76 17:45:00
1 2020-03-28 0.12 18:00:00
2 2020-03-28 0.35 19:00:00
3 2020-03-28 1.37 20:15:00
4 2020-03-28 1.89 21:00:00
… … … …
731 2020-04-09 249.70 09:30:00
732 2020-04-09 244.69 09:45:00
733 2020-04-09 177.70 10:00:00
734 2020-04-09 249.07 10:15:00
735 2020-04-09 236.01 10:30:00

Hi @chesterreggie, welcome to the forum! What you have done should work (you don’t need to give line_group if you already gave color, but it shouldn’t hurt). I gave it a try with a small dataset and it worked; see the screenshot below. If you need more help, please share a data file.

Thanks for the suggestion; looking at a smaller set helped. One inconsistency in my data is that a given day may or may not have readings at the same times as the other days. When I started tracking, I was running my command every 30 minutes; now I run it every 15 minutes. And if the network is down, there will be missing times, as my script will error and not write the new data (keeping it simple).

If I run the graph using only the data for every 30 minutes or every 15 minutes, it works. Once I add a new time, it “appends” the new timestamp at the end (to the right on the x axis). If you can, try running the same data, but add a row to your set after line 7 (2020-03-29, 3.67, 19:30), or some other similar sample; note that the time must be something different from the first five rows.

I also think I need to sort it somehow so the graph time starts at midnight; right now it starts at whatever is in the first row of the df. I can work around this by cleaning my data so that the first row's day/time starts at midnight, and by using only either the 30-minute or the 15-minute data.

I am good with the workaround for just trying to see the data I want, but I am curious why I cannot get the times to work the way I expect. Also, long term I need to start with better data; right now I have to manually import and run macros in an Excel file to create the proper info.
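
The manual Excel step described above could probably be done in pandas instead. The following is only a sketch under assumptions: the raw log is imagined to have a full timestamp per test in a column called `timestamp` (a hypothetical name, not from the thread), and the sample timestamps are made up. It rounds each timestamp to the nearest 15 minutes and splits it into the Date and timeround columns used in this thread.

```python
import pandas as pd

# Hypothetical raw log: one full timestamp per speed test (made-up values).
raw = pd.DataFrame({
    "timestamp": ["2020-04-09 09:31:12", "2020-04-09 09:44:50", "2020-04-09 10:02:03"],
    "DownSpeed": [249.70, 244.69, 177.70],
})

# Parse the text timestamps and round each one to the nearest 15 minutes.
ts = pd.to_datetime(raw["timestamp"]).dt.round("15min")

# Split the rounded timestamp into the two columns used for plotting.
raw["Date"] = ts.dt.strftime("%Y-%m-%d")
raw["timeround"] = ts.dt.strftime("%H:%M:%S")

print(raw[["Date", "DownSpeed", "timeround"]])
```

Note that timeround is still text at this point, so it would need the same datetime conversion as any other string time column before plotting.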

I think the problem comes from the fact that the timeround column is not recognized as a datetime by pandas. I could indeed reproduce your problem, but if I first do

df['timeround'] = pd.to_datetime(df['timeround'], format= '%H:%M:%S' )

then the plot is correct (i.e. times are correctly sorted).

Same problem as you

and with

import plotly.express as px
import pandas as pd
df = pd.read_csv('tmp')
df['timeround'] = pd.to_datetime(df['timeround'], format= '%H:%M:%S' )
px.line(df, x='timeround', y='DownSpeed', color='Date')

it works

That was it! Adding to_datetime on the time variable worked.

That may have been my problem the whole time.

Thanks for the assist. I should have tried playing with the variable options more instead of focusing on the original data.

You’re welcome! Yes, it helps a lot to try to reproduce the problem on a small dataset, inspect the different variables, print the figure, etc.