Displaying timedelta64 in a readable format

Hi,

I’m fairly new to Python and recently discovered Dash.

I’m building a graph with durations on the Y axis and datetimes on the X axis, but the generated graph is just a gradual slope instead of the correct data, and the labels are displayed in what I believe is ISO format. Any suggestions on what I’m doing wrong here? (probably a lot :smiley: )

The data comes from a DataFrame in another file, and the dtype of the “time” column is timedelta64.

My only guess is that timedelta isn’t supported? Any clue about what to convert it to would be really appreciated.

Here’s my code:

app.layout = html.Div(children=[

    html.Div(children='''
        Dash: Testing out Dash.
    '''),

    dcc.Graph(
        id='example-graph2',
        figure={
            'data': [
                {'x': sdc_resolved['month ending'], 'y': sdc_resolved["time"], 'type': 'bar', 'name': 'Testing'},
            ],
            'layout': {
                'title': 'Dash Data Visualization'
            }
        }
    )

])

I was able to fix this myself by changing the source duration data to an integer. Instead of worrying about hours, minutes, seconds, and microseconds, I only pulled the day value in my current data source (Excel, using Power Query).

Luckily all my averages exceed a day, so it does the job!
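
For anyone who would rather do the same conversion in pandas than in the data source, here is a minimal sketch. Only the column names 'month ending' and 'time' come from the post above; the frame and its values are made up for illustration. Dividing a timedelta64 column by pd.Timedelta(days=1) gives the duration as a float number of days, which should plot as an ordinary numeric axis and also keeps durations shorter than a day.

```python
import pandas as pd

# Hypothetical stand-in for sdc_resolved; the values are invented for this sketch.
sdc_resolved = pd.DataFrame({
    'month ending': pd.to_datetime(['2020-01-31', '2020-02-29', '2020-03-31']),
    'time': pd.to_timedelta(['2 days 12:00:00', '1 days 06:00:00', '3 days 00:00:00']),
})

# Dividing a timedelta64 column by a one-day Timedelta yields float days.
sdc_resolved['days'] = sdc_resolved['time'] / pd.Timedelta(days=1)

# Same figure dict as in the question, but with a numeric y column.
figure = {
    'data': [
        {'x': sdc_resolved['month ending'], 'y': sdc_resolved['days'],
         'type': 'bar', 'name': 'Testing'},
    ],
    'layout': {
        'title': 'Dash Data Visualization',
        'yaxis': {'title': 'Duration (days)'},
    },
}
```

The figure dict can then be passed to dcc.Graph exactly as in the original code.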

Here’s what you can do. As an example here’s a DataFrame with two datetime columns:

import pandas as pd

df = pd.DataFrame({'date_1': pd.date_range('2020-01-01', periods=5),
                   'date_2': pd.date_range('2020-02-01', periods=5, freq='H')})
df
	             date_1	             date_2
0	2020-01-01 00:00:00	2020-02-01 00:00:00
1	2020-01-02 00:00:00	2020-02-01 01:00:00
2	2020-01-03 00:00:00	2020-02-01 02:00:00
3	2020-01-04 00:00:00	2020-02-01 03:00:00
4	2020-01-05 00:00:00	2020-02-01 04:00:00

Two simple columns of datetimes. A third column calculates the time difference:

df['diff'] = df['date_2'] - df['date_1']
df

	             date_1	             date_2	                      diff
0	2020-01-01 00:00:00	2020-02-01 00:00:00	31 days 00:00:00.000000000
1	2020-01-02 00:00:00	2020-02-01 01:00:00	30 days 01:00:00.000000000
2	2020-01-03 00:00:00	2020-02-01 02:00:00	29 days 02:00:00.000000000
3	2020-01-04 00:00:00	2020-02-01 03:00:00	28 days 03:00:00.000000000
4	2020-01-05 00:00:00	2020-02-01 04:00:00	27 days 04:00:00.000000000


print(df['diff'].dtype)
timedelta64[ns]

Now you can access days, seconds, or, for full control, components.
Note that pd.Series.dt is the accessor for datetime-like Series. On a timedelta Series it exposes the timedelta properties and methods (provided that the Series is of the correct type, of course):

df['diff'].dt.days

0    31
1    30
2    29
3    28
4    27
Name: diff, dtype: int64

df['diff'].dt.seconds

0        0
1     3600
2     7200
3    10800
4    14400
Name: diff, dtype: int64
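
One thing to watch out for: .dt.seconds is only the seconds part left over once the whole days are removed (0 to 86399), not the full duration. If you want the whole duration as a single number, .dt.total_seconds() is the one to use. A small sketch on the same kind of data (the Series is rebuilt here so the snippet runs on its own):

```python
import pandas as pd

# Two of the differences from the example above, rebuilt as a standalone Series.
diff = pd.Series(pd.to_timedelta(['31 days 00:00:00', '30 days 01:00:00']))

leftover = diff.dt.seconds        # seconds component only, whole days excluded
total = diff.dt.total_seconds()   # full duration in seconds, as floats
hours = total / 3600              # full duration in hours
```

Here leftover holds 0 and 3600, while hours holds 744.0 and 721.0.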


df['diff'].dt.components
   days  hours  minutes  seconds  milliseconds  microseconds  nanoseconds
0    31      0        0        0             0             0            0
1    30      1        0        0             0             0            0
2    29      2        0        0             0             0            0
3    28      3        0        0             0             0            0
4    27      4        0        0             0             0            0

Now it should be easy to format your time differences in whichever way is readable and appropriate for the durations you are dealing with.
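
For example, here is one possible way to turn the components into short labels such as '30d 01h 00m'. Labels like these could be used as bar text or hover text while a numeric column drives the axis. The helper name format_duration is just for illustration, and the sketch assumes non-negative durations.

```python
import pandas as pd

def format_duration(td):
    """Format one pd.Timedelta as e.g. '30d 01h 00m' (non-negative durations assumed)."""
    c = td.components  # named tuple with days, hours, minutes, seconds, ...
    return f"{c.days}d {c.hours:02d}h {c.minutes:02d}m"

# Standalone sample data; iterating a timedelta Series yields pd.Timedelta objects.
diff = pd.Series(pd.to_timedelta(['31 days 00:00:00', '30 days 01:00:00']))
labels = [format_duration(td) for td in diff]
```

With the two sample values, labels comes out as ['31d 00h 00m', '30d 01h 00m'].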

You can get the “isoformat” by calling that method on each element:

[x.isoformat() for x in df['diff']]

['P31DT0H0M0S', 'P30DT1H0M0S', 'P29DT2H0M0S', 'P28DT3H0M0S', 'P27DT4H0M0S']

Hope that helps!
