Kaplan-Meier component

DashKaplanMeier Dash Component

Hello community. Here I present a Dash component to render Kaplan–Meier survival curves with enhanced interactivity and styling options.

The DashKaplanMeier component plots multiple Kaplan-Meier curves with confidence intervals and computes the associated statistics (log-rank p-value, Cox p-value and hazard ratio), all from raw event and time-to-event data passed from the Dash app. You can visualize and compare multiple cohorts with labeled lines and customize the figure’s colors, layout, and title.

Usage

Here’s how to use the DashKaplanMeier component in your Dash app:

import dash
from dash import html, Dash
import dash_kaplan_meier as dkm
from dash_kaplan_meier.survival_stats import compute_survival_stats

# Example data (replace the placeholders with your own lists)
time        = [...]  # time-to-event values (float)
event       = [...]  # event indicators (int: 1 = event occurred, 0 = censored)
group       = [...]  # group labels (str)

# Compute statistics
stats       = compute_survival_stats(time, event, group)

# Dash app
app         = Dash()

# Dash app layout with DashKaplanMeier component
app.layout = html.Div([
    dkm.DashKaplanMeier(
        id              = 'km-example',
        time            = time,
        event           = event,
        group           = group,
        showCIs         = True,
        colors          = ['blue', 'green', 'red'],
        showStatistics  = True,
        logrankP        = stats["logrank_p"],
        coxP            = stats["cox_p"],
        hazardRatio     = stats["hazard_ratio"],
        layout          ={  'title': 'Kaplan-Meier Survival Curve Example',
                            'xaxis': {'title': 'Time'},
                            'yaxis': {'title': 'Survival Probability'}},
        title           = "Kaplan-Meier curves",
        config          = {'responsive': True}
    )
])

if __name__ == '__main__':
    app.run(debug=True)

Plot Example

[Image: Survival Example]

I’ll be very happy to receive your feedback, comments or any improvements that can be made.

GitHub link
PyPI link

Very cool, @Xavi.LL.
What are Kaplan–Meier survival curves used for?
Did you build this as part of your work or as a hobby?

Hi @adamschroeder, I work in cancer research, where Kaplan-Meier survival curves are widely used by bioinformaticians and clinicians to compare groups of patients, samples, treatments and more.

I have been using Plotly and Dash for interactive visualizations for some years, but I didn’t find a Kaplan-Meier component, so I decided to fill this gap as a hobby.

As a summary of purposes and applications:

Kaplan-Meier survival curves are used to estimate and visualize the survival function over time for a group of subjects. They are especially useful in medical research and clinical trials, but can also be applied in other fields like engineering (for failure analysis) or economics (for time-to-event data).

Main Purposes of Kaplan-Meier Survival Curves:

  1. Estimate survival probabilities over time:
  • Shows the probability that a subject will survive (or remain event-free) past a certain time point.
  2. Compare survival between groups:
  • Helps assess differences in survival between two or more groups (e.g., treatment vs. control group).
  • Often used alongside statistical tests like the log-rank test to evaluate significance.
  3. Handle censored data:
  • Accommodates right-censored data, where the event (e.g., death, failure, relapse) hasn’t occurred by the end of the study or the subject is lost to follow-up.
  4. Visualize time-to-event data:
  • Provides a stepwise curve that drops at each event time, making it easy to see when most events occur.
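
To make the points above concrete, here is a minimal pure-Python sketch of the product-limit (Kaplan-Meier) estimate. It is only an illustration with made-up toy data, not the code that DashKaplanMeier runs; the function name kaplan_meier is hypothetical. It assumes events are coded 1 (event observed) and 0 (right-censored).

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimate)."""
    # Number of observed events at each time point
    observed = Counter(t for t, e in zip(times, events) if e == 1)
    # Events and censorings both remove subjects from the risk set
    leaving = Counter(times)

    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = observed.get(t, 0)
        if d:
            # The curve only drops at times where an event was observed
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without causing a drop
        at_risk -= leaving[t]
    return curve


# Toy data (illustrative only): five subjects, two of them censored
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t}: S(t)={s:.2f}")
```

Censored subjects still count in the denominator until they drop out, which is how the estimator uses their partial follow-up.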

Common Applications:

  • Medicine: Comparing survival of cancer patients on different treatments.
  • Epidemiology: Time until disease recurrence or infection.
  • Engineering: Time until machine failure or product defects.
  • Business: Customer churn analysis (time until customer leaves).

Very interesting, and it sounds quite useful to people in the field.
Thank you for creating this component, @Xavi.LL :folded_hands:

Very interesting. The scikit-learn ecosystem also has a Kaplan-Meier estimator, in scikit-survival. It is not part of the scikit-learn core; it is built on top of scikit-learn specifically for survival analysis.

Hi @Avacsiglo21, you’re absolutely right. The DashKaplanMeier component uses the Python library lifelines to compute the survival curves. The advantage is that you don’t need to call these analysis libraries yourself: you pass raw event and time-to-event data, and DashKaplanMeier computes the survival curves for you, along with the statistics for the significance of the survival differences between groups.

Hi @Xavi.LL, thank you for this component! I’m an oncologist at the N. N. Alexandrov National Cancer Centre of Belarus and I’ve been looking for something like this. However, showStatistics=True does not work for me. I’m using Dash 3.0.4.

Hi @YaKuzuri, thanks for your comment. I’m very happy that this component can simplify your analysis. I’ll look into showStatistics to find where the mistake is.

I assume you compute the statistics (stats = compute_survival_stats(time, event, group)) before declaring the Dash app, right? Could you provide a code example where it doesn’t work?

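from dash import Dash, html, dcc, dash_table, callback, Output, Input
import plotly.express as px

import dash_kaplan_meier as dkm
from dash_kaplan_meier.survival_stats import compute_survival_stats

# get_response_check() is the poster's own data-loading helper (not shown)
app = Dash()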

app.layout = [
    html.H1(children='Title of Dash App', style={'textAlign':'center'}),
    html.A(href='http://127.0.0.1:8000', children='To the home page', style={'textAlign':'center'}),
    html.Button('Update database', id='reload_btn', n_clicks=0),
    html.Div(id='reload_div')
]

@callback(
    Output('reload_div', 'children'),
    Input('reload_btn', 'n_clicks')
)
def reload_base(value):
    df, status, cr, error = get_response_check('/api/get_all', 'patients', get=True, post_json=None)

    time_observed = df['time_observed']
    time_to_recid = df['time_to_recid']
    event_overal = df['general_mortality']
    event_specific = df['specific_mortality']
    event_recid = df['recid']
    group = df['dendritic'].astype('str')

    # Compute statistics
    stats_overal = compute_survival_stats(time_observed, event_overal, group)
    stats_specific = compute_survival_stats(time_observed, event_specific, group)
    stats_recid = compute_survival_stats(time_to_recid, event_recid, group)

    reload_div = html.Div(
        [
            dash_table.DataTable(
                id='table',
                columns=[
                    {
                        "name": i,
                        "id": i,
                        "deletable": True,
                        "selectable": True} for i in df.columns
                ],
                # hidden_columns=[i for i in df.columns],
                data=df.to_dict('records'),
                editable=True,
                filter_action="native",
                sort_action="native",
                sort_mode="multi",
                column_selectable="multi",
                row_selectable="multi",
                row_deletable=True,
                selected_columns=[],
                selected_rows=[],
                page_action="native",
                page_current=0,
                page_size=10,
                export_format='xlsx',

                style_data={
                    'whiteSpace': 'normal',
                    'height': 'auto',
                },
                style_header={
                    'whiteSpace': 'normal',
                    'height': 'auto',
                },
                filter_options={
                    'placeholder_text': 'Filter...',
                },
                style_table={'overflowX': 'auto'},
            ),

            dcc.Graph(
                figure=px.pie(
                    df.groupby('dendritic')['id'].count().reset_index(),
                    values='id',
                    names='dendritic')
            ),

            html.Div([

                dkm.DashKaplanMeier(
                    id='km-overal',
                    time=time_observed,
                    event=event_overal,
                    group=group,
                    showCIs=True,
                    colors=['blue', 'green', 'red'],
                    showStatistics=True,
                    logrankP=stats_overal["logrank_p"],
                    coxP=stats_overal["cox_p"],
                    hazardRatio=stats_overal["hazard_ratio"],
                    layout={'title': 'Overall survival',
                            'xaxis': {'title': 'Time'},
                            'yaxis': {'title': 'Survival Probability'}},
                    title="Overall survival",
                    config={'responsive': True},
                    style={'display': 'inline-block'}
                ),

                dkm.DashKaplanMeier(
                    id='km-specific',
                    time=time_observed,
                    event=event_specific,
                    group=group,
                    showCIs=True,
                    colors=['blue', 'green', 'red'],
                    showStatistics=True,
                    logrankP=stats_specific["logrank_p"],
                    coxP=stats_specific["cox_p"],
                    hazardRatio=stats_specific["hazard_ratio"],
                    layout={'title': 'Disease-specific survival',
                            'xaxis': {'title': 'Time'},
                            'yaxis': {'title': 'Survival Probability'}},
                    title="Disease-specific survival",
                    config={'responsive': True},
                    style = {'display': 'inline-block'}
                ),

                dkm.DashKaplanMeier(
                    id='km-recid',
                    time=time_to_recid,
                    event=event_recid,
                    group=group,
                    showCIs=True,
                    colors=['blue', 'green', 'red'],
                    showStatistics=True,
                    logrankP=stats_recid["logrank_p"],
                    coxP=stats_recid["cox_p"],
                    hazardRatio=stats_recid["hazard_ratio"],
                    layout={'title': 'Recurrence-free survival',
                            'xaxis': {'title': 'Time'},
                            'yaxis': {'title': 'Survival Probability'}},
                    title="Recurrence-free survival",
                    config={'responsive': True},
                    style={'display': 'inline-block'}
                )
                ], style={'display': 'inline-block'}),
            ]
    )
    return reload_div

if __name__ == '__main__':
    app.run(debug=True, host='0.0.0.0', port=8050)

Thanks @YaKuzuri for the example. It worked for me.

I’m using:

  • dash==3.0.4
  • pandas==2.3.0
  • lifelines==0.30.0
  • numpy==2.2.6

Also check that:

  • Time values are float type.
  • Event values are int type (0 or 1).
  • Group values are str.

Ensure that all required props are passed and have correct numeric values:

  • logrankP: float
  • coxP: float
  • hazardRatio: float or string
  • If any of them is missing or is None, the component may silently skip the statistics section.
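
The checks above can be sketched as a small pre-flight step before building the component. This is only a sketch: prepare_km_inputs and stats_are_displayable are hypothetical helpers, not part of dash_kaplan_meier, and treating NaN as "missing" is my assumption. Converting to plain lists may also help when the values come from pandas Series.

```python
import math


def prepare_km_inputs(time, event, group):
    """Coerce inputs to float / int (0 or 1) / str lists (hypothetical helper)."""
    time = [float(t) for t in time]
    # Validate before casting so that values such as 0.5 are not silently truncated
    if any(float(e) not in (0.0, 1.0) for e in event):
        raise ValueError("event values must be 0 or 1")
    event = [int(e) for e in event]
    group = [str(g) for g in group]
    if not (len(time) == len(event) == len(group)):
        raise ValueError("time, event and group must have the same length")
    return time, event, group


def stats_are_displayable(stats):
    """Return True when all three statistics are present and usable."""
    for key in ("logrank_p", "cox_p", "hazard_ratio"):
        value = stats.get(key)
        if value is None:
            return False
        # Assumption: a NaN statistic is treated like a missing one
        if isinstance(value, float) and math.isnan(value):
            return False
    return True


# Illustrative values only
time, event, group = prepare_km_inputs([1, 2, 3], [True, 0, 1], [0, 1, 1])
print(time, event, group)
print(stats_are_displayable({"logrank_p": 0.03, "cox_p": None, "hazard_ratio": 1.8}))
```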

Let me know whether there was a mistake and whether it works for you.

You’re welcome @adamschroeder. When I started using Plotly and Dash, I was a bit surprised that a Kaplan-Meier component was not included in the Dash Bio components. In my field, Kaplan-Meier plots are used more often than some of the plots in Dash Bio, which are also useful and super interesting. Maybe you can include this one :wink:

Thank you for your reply. And what about the layout titles? They don’t seem to work.

Hi @YaKuzuri, thank you very much for your feedback. I hadn’t noticed that.

To make the axis titles appear, specify them like this:

layout = {  "title": 'Kaplan-Meier Survival Curve Example',
            "xaxis": {"title": {"text": "Time (months)"}},
            "yaxis": {"title": {"text": "Survival Probability"}}},

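If you already have layout dicts written with plain-string axis titles, a small conversion step can save editing each one by hand. This is a minimal sketch; normalize_axis_titles is a hypothetical helper, not part of the component, and it only touches the xaxis and yaxis titles, as in the snippet above.

```python
import copy


def normalize_axis_titles(layout):
    """Return a copy of layout with string axis titles turned into {'text': ...}."""
    fixed = copy.deepcopy(layout)  # leave the caller's dict untouched
    for axis in ("xaxis", "yaxis"):
        title = fixed.get(axis, {}).get("title")
        if isinstance(title, str):
            fixed[axis]["title"] = {"text": title}
    return fixed


old_layout = {
    "title": "Kaplan-Meier Survival Curve Example",
    "xaxis": {"title": "Time (months)"},
    "yaxis": {"title": "Survival Probability"},
}
new_layout = normalize_axis_titles(old_layout)
print(new_layout["xaxis"])
```

The result can then be passed as the layout prop of DashKaplanMeier.
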
Hi! I noticed that when there are several charts on a page, they are displayed only on the first callback. On the second callback, only one chart is displayed. On the third callback, all charts are displayed again. On the fourth callback, only one again, and so on.

Hi @YaKuzuri! I have a question about your last comment. Are you trying to update the same chart or charts from different callbacks? If so, here are some alternatives.

If you are doing this, it will raise an error (DuplicateCallbackOutput). You should not have multiple callbacks updating the same Output. If multiple triggers need to update the same chart, combine them into a single callback with multiple Inputs. If the logic must stay separate, use a dcc.Store or an intermediary component to pass computed state between callbacks.

Let me know if that solves your problem about charts update.

@Xavi.LL

This is my example of the problem:

from dash import Dash, html, dcc, callback, Output, Input
import random


import dash_kaplan_meier as dkm
from dash_kaplan_meier.survival_stats import compute_survival_stats

n = 50  # Number of random numbers
groups = ['Group 1', 'Group 2']

time_observed = random.choices(range(1, 36), k=n)
time_to_recid = random.choices(range(1, 36), k=n)
event_overal = random.choices(range(2), k=n)
event_specific = random.choices(range(2), k=n)
event_recid = random.choices(range(2), k=n)
group = random.choices(groups, k=n)
print(group)

# Compute statistics
stats_overal = compute_survival_stats(time_observed, event_overal, group)
print(stats_overal)
stats_specific = compute_survival_stats(time_observed, event_specific, group)
stats_recid = compute_survival_stats(time_to_recid, event_recid, group)
app = Dash()

# Requires Dash 2.17.0 or later
app.layout = [
    html.Button('Reload', id='reload_btn'),
    html.Div([
    ], id='reload_div')

]



@callback(
    Output('reload_div', 'children'),
    Input('reload_btn', "n_clicks"),
    # prevent_initial_call=True,
)
def update_plots(n_clicks):
    dkm_overal = dkm.DashKaplanMeier(
                    id='km-overal',
                    time=time_observed,
                    event=event_overal,
                    group=group,
                    showCIs=True,
                    colors=['blue', 'green', 'red'],
                    showStatistics=True,
                    logrankP=stats_overal["logrank_p"],
                    coxP=stats_overal["cox_p"],
                    hazardRatio=stats_overal["hazard_ratio"],
                    layout={'title': 'Overall survival',
                            'xaxis': {'title': 'Time'},
                            'yaxis': {'title': 'Survival Probability'}},
                    title="Overall survival",
                    config={'responsive': True},
                    style={'display': 'inline-block', 'high': 500, 'width': 500}
                )

    dkm_specific = dkm.DashKaplanMeier(
                    id='km-specific',
                    time=time_observed,
                    event=event_specific,
                    group=group,
                    showCIs=True,
                    colors=['blue', 'green', 'red'],
                    showStatistics=True,
                    logrankP=stats_specific["logrank_p"],
                    coxP=stats_specific["cox_p"],
                    hazardRatio=stats_specific["hazard_ratio"],
                    layout={'title': 'Disease-specific survival',
                            'xaxis': {'title': 'Time'},
                            'yaxis': {'title': 'Survival Probability'}},
                    title="Disease-specific survival",
                    config={'responsive': True},
                    style={'display': 'inline-block', 'high': 500, 'width': 500}
                )

    dkm_recid = dkm.DashKaplanMeier(
                    id='km-recid',
                    time=time_to_recid,
                    event=event_recid,
                    group=group,
                    showCIs=True,
                    colors=['blue', 'green', 'red'],
                    showStatistics=True,
                    logrankP=stats_recid["logrank_p"],
                    coxP=stats_recid["cox_p"],
                    hazardRatio=stats_recid["hazard_ratio"],
                    layout={'title': 'Recurrence-free survival',
                            'xaxis': {'title': {'text': 'Time'}},
                            'yaxis': {'title': {'text': 'Survival Probability'}}},
                    title="Recurrence-free survival",
                    config={'responsive': True},
                    style={'display': 'inline-block', 'high': 500, 'width': 500}
                )


    reload_div = html.Div(
        [
            html.Div([
                html.Div([
                    html.H1('Overall survival'),
                    dkm_overal
                ], id='overal'),

                html.Div([
                    html.H1('Disease-specific survival'),
                    dkm_specific,
                ], id='specific'),

                html.Div([
                    html.H1('Recurrence-free survival'),
                    dkm_recid
                ], id='recid'),


            ]),
        ]
    )

    return reload_div

if __name__ == '__main__':
    app.run(debug=True, host='0.0.0.0', port=8050)

@Xavi.LL

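While I personally encourage this (keeping all updates to an Output in a single callback), the statement that multiple callbacks targeting the same Output raise an error is outdated.

Since Dash 2.9 you can allow multiple callbacks to target the same output by passing allow_duplicate=True to the Output.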

Thanks for the contribution @AIMPED, I wasn’t aware of that. I’m not sure whether @YaKuzuri is on a Dash version that predates this feature, going by the comment at his app.layout declaration. Either way, I’m going to check what happens. Thanks to both!

Hi @YaKuzuri, thanks for the code example. As you can see in the following screenshot, after the first reload it looks as if the first two charts are not displayed, but they are: you can see the stats and the very beginning of the Kaplan-Meier curves (if you hover over a chart and click on it, it appears in full). Still, this behavior is useless and confusing.

If I’m right, when new DashKaplanMeier components are injected via a callback, they may render before the browser has fully recalculated the layout, especially if they rely on canvas or SVG sizing computed from their container’s visible dimensions.

Wrapping each DashKaplanMeier component in a dcc.Loading solves the problem.
Alternatively, add 'minHeight': 500 to the style of each DashKaplanMeier. This also avoids the rendering problem.
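
For the minHeight alternative, a tiny helper can apply the same rule to every chart style. This is only a sketch: km_container_style is a hypothetical name, and the default of 500 simply mirrors the value suggested above. It also maps a misspelt 'high' key to 'height', since 'high' is not a CSS property.

```python
def km_container_style(style=None, min_height=500):
    """Return a copy of style that always carries a minHeight (hypothetical helper)."""
    fixed = dict(style or {})
    # 'high' is not a CSS property; treat it as a misspelt 'height'
    if "high" in fixed:
        fixed.setdefault("height", fixed.pop("high"))
    # Keep an explicit minHeight if the caller already set one
    fixed.setdefault("minHeight", min_height)
    return fixed


style = km_container_style({"display": "inline-block", "high": 500, "width": 500})
print(style)
```

The returned dict can be passed as the style prop of each DashKaplanMeier.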

Other things I would change in your code example (unrelated to your problem) are the following:

  • There is a typo in the DashKaplanMeier style: change high to height.
  • When you declare app.layout, you wrap everything in a list. I don’t think this is good practice, and depending on what you use in your Dash app it can sometimes cause poor performance. Wrap everything inside an html.Div instead.
  • If the children of your reload_div are empty when you declare app.layout, there is no need to pass an empty list.

As I said, these things are unrelated to your problem and are my personal opinion. Here is the final code with my modifications, which works fine. I hope it works for you too :wink:

from dash import Dash, html, dcc, callback, Output, Input
import random


import dash_kaplan_meier as dkm
from dash_kaplan_meier.survival_stats import compute_survival_stats

n = 50  # Number of random numbers
groups = ['Group 1', 'Group 2']

time_observed = random.choices(range(1, 36), k=n)
time_to_recid = random.choices(range(1, 36), k=n)
event_overal = random.choices(range(2), k=n)
event_specific = random.choices(range(2), k=n)
event_recid = random.choices(range(2), k=n)
group = random.choices(groups, k=n)
print(group)

# Compute statistics
stats_overal = compute_survival_stats(time_observed, event_overal, group)
print(stats_overal)
stats_specific = compute_survival_stats(time_observed, event_specific, group)
stats_recid = compute_survival_stats(time_to_recid, event_recid, group)
app = Dash()

# Layout wrapped in a single html.Div instead of a list
app.layout = html.Div(children=[
    html.Button('Reload', id='reload_btn'),
    html.Div(id='reload_div')

])



@callback(
    Output('reload_div', 'children'),
    Input('reload_btn', "n_clicks"),
    # prevent_initial_call=True,
)
def update_plots(n_clicks):
    dkm_overal = dkm.DashKaplanMeier(
                    id='km-overal',
                    time=time_observed,
                    event=event_overal,
                    group=group,
                    showCIs=True,
                    colors=['blue', 'green', 'red'],
                    showStatistics=True,
                    logrankP=stats_overal["logrank_p"],
                    coxP=stats_overal["cox_p"],
                    hazardRatio=stats_overal["hazard_ratio"],
                    layout={'title': 'Overall survival',
                            'xaxis': {'title': {'text': 'Time'}},
                            'yaxis': {'title': {'text': 'Survival Probability'}}},
                    title="Overall survival",
                    config={'responsive': False},
                    style={'display': 'inline-block', 'height': 500, 'width': 500}
                )

    dkm_specific = dkm.DashKaplanMeier(
                    id='km-specific',
                    time=time_observed,
                    event=event_specific,
                    group=group,
                    showCIs=True,
                    colors=['blue', 'green', 'red'],
                    showStatistics=True,
                    logrankP=stats_specific["logrank_p"],
                    coxP=stats_specific["cox_p"],
                    hazardRatio=stats_specific["hazard_ratio"],
                    layout={'title': 'Disease-specific survival',
                            'xaxis': {'title': {'text': 'Time'}},
                            'yaxis': {'title': {'text': 'Survival Probability'}}},
                    title="Disease-specific survival",
                    config={'responsive': False},
                    style={'display': 'inline-block', 'height': 500, 'width': 500}
                )

    dkm_recid = dkm.DashKaplanMeier(
                    id='km-recid',
                    time=time_to_recid,
                    event=event_recid,
                    group=group,
                    showCIs=True,
                    colors=['blue', 'green', 'red'],
                    showStatistics=True,
                    logrankP=stats_recid["logrank_p"],
                    coxP=stats_recid["cox_p"],
                    hazardRatio=stats_recid["hazard_ratio"],
                    layout={'title': 'Recurrence-free survival',
                            'xaxis': {'title': {'text': 'Time'}},
                            'yaxis': {'title': {'text': 'Survival Probability'}}},
                    title="Recurrence-free survival",
                    config={'responsive': False},
                    style={'display': 'inline-block', 'height': 500, 'width': 500}
                )


    reload_div = html.Div(
        [
            html.Div([
                html.Div([
                    html.H1('Overall survival'),
                    dcc.Loading([dkm_overal]),
                ], id='overal'),

                html.Div([
                    html.H1('Disease-specific survival'),
                    dcc.Loading([dkm_specific]),
                ], id='specific'),

                html.Div([
                    html.H1('Recurrence-free survival'),
                    dcc.Loading([dkm_recid]),
                ], id='recid'),


            ])
        ]
    )

    return reload_div

if __name__ == '__main__':
    app.run(debug=True)