{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Survival analysis in practice"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Some data\n",
"\n",
"We use the data published on *data.gouv.fr*: [Données hospitalières relatives à l'épidémie de COVID-19](https://www.data.gouv.fr/fr/datasets/donnees-hospitalieres-relatives-a-lepidemie-de-covid-19/). These data are not sufficient to build a [Kaplan-Meier](https://fr.wikipedia.org/wiki/Estimateur_de_Kaplan-Meier) curve directly: we know how many people are admitted and discharged each day, but we do not know when a person discharged on April 1st was admitted. The notebook therefore generates artificial individual stays from the daily counts."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" jour rad dc\n",
"0 2020-03-18 NaN NaN\n",
"1 2020-03-19 695.0 207.0\n",
"2 2020-03-20 806.0 248.0\n",
"3 2020-03-21 452.0 151.0\n",
"4 2020-03-22 608.0 210.0"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
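"import numpy.random as rnd\n",
"import pandas\n",
"\n",
"# Daily hospital data; the file is semicolon-separated.\n",
"df = pandas.read_csv(\n",
"    \"https://www.data.gouv.fr/fr/datasets/r/e3d83ab3-dc52-4c99-abaf-8a38050cc68c\",\n",
"    sep=\";\",\n",
")\n",
"# rad (discharged alive) and dc (deceased) are treated as cumulative totals: sum them per day...\n",
"gr = df[[\"jour\", \"rad\", \"dc\"]].groupby([\"jour\"]).sum()\n",
"# ... then take first differences to get daily counts (the first row becomes NaN).\n",
"diff = gr.diff().reset_index(drop=False)\n",
"diff.head()"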
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" entree sortie issue\n",
"518488 -200 19 0\n",
"541408 -192 27 0\n",
"476735 -187 2 0\n",
"587013 -185 42 0\n",
"476057 -180 1 0"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
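"def donnees_artificielles(hosp, mu=14, nu=21):\n",
"    # Build one artificial stay per patient from the daily counts.\n",
"    # hosp columns: jour, rad (discharged alive), dc (deceased).\n",
"    dt = pandas.to_datetime(hosp[\"jour\"])\n",
"    res = []\n",
"    for i in range(hosp.shape[0]):\n",
"        # Day of the year: it restarts at 1 every January.\n",
"        date = dt[i].dayofyear\n",
"        # Patients discharged alive: length of stay ~ exponential with mean mu.\n",
"        h = hosp.iloc[i, 1]\n",
"        delay = rnd.exponential(mu, int(h))\n",
"        for j in range(delay.shape[0]):\n",
"            res.append([date - int(delay[j]), date, 1])\n",
"        # Deceased patients: length of stay ~ exponential with mean nu.\n",
"        h = hosp.iloc[i, 2]\n",
"        delay = rnd.exponential(nu, int(h))\n",
"        for j in range(delay.shape[0]):\n",
"            res.append([date - int(delay[j]), date, 0])\n",
"    return pandas.DataFrame(res, columns=[\"entree\", \"sortie\", \"issue\"])\n",
"\n",
"\n",
"# Drop the first row, whose differences are NaN.\n",
"data = donnees_artificielles(diff[1:].reset_index(drop=True)).sort_values(\"entree\")\n",
"data.head()"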
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Each row is a person: `entree` is the day of admission to hospital, `sortie` the day of discharge, and `issue` the outcome, 0 for death and 1 for alive."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" entree sortie issue\n",
"count 624130.000000 624130.000000 624130.000000\n",
"mean 169.704510 184.532815 0.806729\n",
"std 125.420957 124.343186 0.394864\n",
"min -200.000000 1.000000 0.000000\n",
"25% 53.000000 84.000000 1.000000\n",
"50% 133.000000 144.000000 1.000000\n",
"75% 301.000000 315.000000 1.000000\n",
"max 366.000000 366.000000 1.000000"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data.describe()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"About 80% of the patients survive in this data set (the mean of `issue` is 0.81)."
]
},
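{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough sanity check of the simulation, here is a sketch assuming the lengths of stay are exactly the exponential draws made in `donnees_artificielles`, with mean $\\mu = 14$ days for patients discharged alive and $\\nu = 21$ days for deaths. The symbols $T$ and $p$ are notations introduced here: $T$ is the simulated length of stay and $p \\approx 0.807$ is the share of survivors reported by `describe()`.\n",
"\n",
"$$\\mathbb{E}[T] = p\\,\\mu + (1 - p)\\,\\nu \\approx 0.807 \\times 14 + 0.193 \\times 21 \\approx 15.35$$\n",
"\n",
"The code truncates every delay to a whole number of days, which removes roughly half a day on average, so the expected value is close to 14.85. The observed mean, $184.53 - 169.70 \\approx 14.83$ days, is consistent with this."
]
},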
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"import numpy\n",
"\n",
"# Length of stay in days, and event indicator (1 = death, 0 = discharged alive).\n",
"duree = data.sortie - data.entree\n",
"deces = (data.issue == 0).astype(numpy.int32)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"import matplotlib.pyplot as plt\n",
"from lifelines import KaplanMeierFitter\n",
"\n",
"fig, ax = plt.subplots(1, 1, figsize=(10, 4))\n",
"kmf = KaplanMeierFitter()\n",
"# duree: observed durations, deces: 1 if the death was observed.\n",
"kmf.fit(duree, deces)\n",
"kmf.plot(ax=ax)\n",
"ax.legend();"
]
},
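{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, here is a sketch of what the Kaplan-Meier curve estimates, written in the usual textbook notation. The symbols are introduced here and are not defined elsewhere in this notebook: $t_i$ are the distinct durations at which at least one death is observed, $d_i$ is the number of deaths at $t_i$, and $n_i$ is the number of patients still at risk just before $t_i$.\n",
"\n",
"$$\\hat{S}(t) = \\prod_{i \\,:\\, t_i \\le t} \\left(1 - \\frac{d_i}{n_i}\\right)$$\n",
"\n",
"Patients discharged alive are counted in $n_i$ until their discharge, then leave the risk set without contributing a death: they are treated as censored observations."
]
},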
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Cox regression\n",
"\n",
"We reuse the artificially generated data and add a variable proportional to the duration plus a small noise. The variable is almost zero for patients who did not die."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" duree deces X1 X2\n",
"518488 219 1 125.653230 -125.666662\n",
"541408 219 1 126.006024 -125.327549\n",
"476735 189 1 107.920779 -108.358230\n",
"587013 227 1 129.788930 -130.045019\n",
"476057 181 1 103.642440 -103.793008"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import pandas\n",
"\n",
"# X1 (resp. X2) is 0.57 (resp. -0.57) times the duration for deaths,\n",
"# and zero for the other patients, plus standard Gaussian noise.\n",
"data_simple = pandas.DataFrame(\n",
"    {\n",
"        \"duree\": duree,\n",
"        \"deces\": deces,\n",
"        \"X1\": duree * 0.57 * deces + numpy.random.randn(duree.shape[0]),\n",
"        \"X2\": duree * (-0.57) * deces + numpy.random.randn(duree.shape[0]),\n",
"    }\n",
")\n",
"data_simple.head()"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.model_selection import train_test_split\n",
"\n",
"# test_size=0.8 keeps only 20% of the rows for training.\n",
"data_train, data_test = train_test_split(data_simple, test_size=0.8)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r\n",
"Iteration 1: norm_delta = 0.13943, step_size = 0.9000, log_lik = -250658.36250, newton_decrement = 889.93933, seconds_since_start = 0.0\n",
"\r\n",
"Iteration 2: norm_delta = 0.00660, step_size = 0.9000, log_lik = -249862.37270, newton_decrement = 2.81312, seconds_since_start = 0.0\n",
"\r\n",
"Iteration 3: norm_delta = 0.00073, step_size = 0.9000, log_lik = -249859.57376, newton_decrement = 0.03357, seconds_since_start = 0.1\n",
"\r\n",
"Iteration 4: norm_delta = 0.00000, step_size = 1.0000, log_lik = -249859.54017, newton_decrement = 0.00000, seconds_since_start = 0.1\n",
"Convergence success after 4 iterations.\n"
]
},
{
"data": {
"text/plain": [
""
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from lifelines import CoxPHFitter\n",
"\n",
"cox = CoxPHFitter()\n",
"cox.fit(\n",
" data_train[[\"duree\", \"deces\", \"X1\"]],\n",
" duration_col=\"duree\",\n",
" event_col=\"deces\",\n",
" show_progress=True,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/latex": [
"\\begin{tabular}{lrrrrrrrrrr}\n",
"\\toprule\n",
"{} & coef & exp(coef) & se(coef) & coef lower 95\\% & coef upper 95\\% & exp(coef) lower 95\\% & exp(coef) upper 95\\% & z & p & -log2(p) \\\\\n",
"covariate & & & & & & & & & & \\\\\n",
"\\midrule\n",
"X1 & 0.02 & 1.02 & 0.00 & 0.02 & 0.02 & 1.02 & 1.02 & 42.23 & 0.00 & inf \\\\\n",
"\\bottomrule\n",
"\\end{tabular}\n"
],
"text/plain": [
"\n",
" duration col = 'duree'\n",
" event col = 'deces'\n",
" baseline estimation = breslow\n",
" number of observations = 124826\n",
"number of events observed = 24072\n",
" partial log-likelihood = -249859.54\n",
" time fit was run = 2021-02-24 23:48:57 UTC\n",
"\n",
"---\n",
" coef exp(coef) se(coef) coef lower 95% coef upper 95% exp(coef) lower 95% exp(coef) upper 95%\n",
"covariate \n",
"X1 0.02 1.02 0.00 0.02 0.02 1.02 1.02\n",
"\n",
" z p -log2(p)\n",
"covariate \n",
"X1 42.23 <0.005 inf\n",
"---\n",
"Concordance = 0.69\n",
"Partial AIC = 499721.08\n",
"log-likelihood ratio test = 1597.64 on 1 df\n",
"-log2(p) of ll-ratio test = inf"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"cox.print_summary()"
]
},
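{
"cell_type": "markdown",
"metadata": {},
"source": [
"A reading aid for the summary above, assuming the standard proportional hazards form of a Cox model. The symbols are introduced here: $h_0$ is the baseline hazard and $\\beta$ is the coefficient reported in the column `coef`.\n",
"\n",
"$$h(t \\mid X_1) = h_0(t)\\,\\exp(\\beta X_1)$$\n",
"\n",
"Increasing $X_1$ by one unit multiplies the hazard by $\\exp(\\beta)$. With the rounded value $\\beta \\approx 0.02$, $\\exp(0.02) \\approx 1.02$, which matches the column `exp(coef)`."
]
},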
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\r\n",
"Iteration 1: norm_delta = 0.13946, step_size = 0.9000, log_lik = -250658.36250, newton_decrement = 888.92089, seconds_since_start = 0.0\n",
"\r\n",
"Iteration 2: norm_delta = 0.00667, step_size = 0.9000, log_lik = -249863.61089, newton_decrement = 2.86434, seconds_since_start = 0.0\n",
"\r\n",
"Iteration 3: norm_delta = 0.00074, step_size = 0.9000, log_lik = -249860.76079, newton_decrement = 0.03426, seconds_since_start = 0.1\n",
"\r\n",
"Iteration 4: norm_delta = 0.00000, step_size = 1.0000, log_lik = -249860.72650, newton_decrement = 0.00000, seconds_since_start = 0.1\n",
"Convergence success after 4 iterations.\n"
]
},
{
"data": {
"text/latex": [
"\\begin{tabular}{lrrrrrrrrrr}\n",
"\\toprule\n",
"{} & coef & exp(coef) & se(coef) & coef lower 95\\% & coef upper 95\\% & exp(coef) lower 95\\% & exp(coef) upper 95\\% & z & p & -log2(p) \\\\\n",
"covariate & & & & & & & & & & \\\\\n",
"\\midrule\n",
"X2 & -0.02 & 0.98 & 0.00 & -0.02 & -0.02 & 0.98 & 0.98 & -42.21 & 0.00 & inf \\\\\n",
"\\bottomrule\n",
"\\end{tabular}\n"
],
"text/plain": [
"\n",
" duration col = 'duree'\n",
" event col = 'deces'\n",
" baseline estimation = breslow\n",
" number of observations = 124826\n",
"number of events observed = 24072\n",
" partial log-likelihood = -249860.73\n",
" time fit was run = 2021-02-24 23:48:59 UTC\n",
"\n",
"---\n",
" coef exp(coef) se(coef) coef lower 95% coef upper 95% exp(coef) lower 95% exp(coef) upper 95%\n",
"covariate \n",
"X2 -0.02 0.98 0.00 -0.02 -0.02 0.98 0.98\n",
"\n",
" z p -log2(p)\n",
"covariate \n",
"X2 -42.21 <0.005 inf\n",
"---\n",
"Concordance = 0.69\n",
"Partial AIC = 499723.45\n",
"log-likelihood ratio test = 1595.27 on 1 df\n",
"-log2(p) of ll-ratio test = inf"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"cox2 = CoxPHFitter()\n",
"cox2.fit(\n",
" data_train[[\"duree\", \"deces\", \"X2\"]],\n",
" duration_col=\"duree\",\n",
" event_col=\"deces\",\n",
" show_progress=True,\n",
")\n",
"cox2.print_summary()"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" 621725 110352 139986 72623 248121\n",
"0.0 0.008806 0.008673 0.008543 0.008717 0.008661\n",
"1.0 0.018218 0.017942 0.017673 0.018035 0.017918\n",
"2.0 0.026735 0.026330 0.025936 0.026466 0.026295\n",
"3.0 0.036002 0.035457 0.034926 0.035640 0.035409\n",
"4.0 0.045543 0.044854 0.044182 0.045085 0.044793\n",
"... ... ... ... ... ...\n",
"184.0 2.055736 2.024623 1.994277 2.035070 2.021872\n",
"189.0 2.092002 2.060340 2.029459 2.070971 2.057541\n",
"197.0 2.138687 2.106318 2.074747 2.117186 2.103457\n",
"201.0 2.205513 2.172133 2.139576 2.183341 2.169182\n",
"217.0 2.330629 2.295356 2.260952 2.307199 2.292237\n",
"\n",
"[165 rows x 5 columns]"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cox.predict_cumulative_hazard(data_test[:5])"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" 621725 110352 139986 72623 248121\n",
"0.0 0.991233 0.991365 0.991494 0.991321 0.991377\n",
"1.0 0.981947 0.982218 0.982482 0.982127 0.982242\n",
"2.0 0.973619 0.974013 0.974398 0.973881 0.974048\n",
"3.0 0.964638 0.965164 0.965677 0.964988 0.965211\n",
"4.0 0.955478 0.956137 0.956780 0.955916 0.956196\n",
"... ... ... ... ... ...\n",
"184.0 0.127999 0.132044 0.136112 0.130671 0.132407\n",
"189.0 0.123440 0.127411 0.131407 0.126063 0.127768\n",
"197.0 0.117809 0.121685 0.125588 0.120370 0.122034\n",
"201.0 0.110194 0.113934 0.117705 0.112664 0.114271\n",
"217.0 0.097235 0.100726 0.104251 0.099540 0.101040\n",
"\n",
"[165 rows x 5 columns]"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cox.predict_survival_function(data_test[:5])"
]
},
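{
"cell_type": "markdown",
"metadata": {},
"source": [
"The two tables above are linked. Assuming the usual relation between the survival function $S$ and the cumulative hazard $H$ (notations introduced here), each value returned by `predict_survival_function` is the exponential of minus the matching value returned by `predict_cumulative_hazard`.\n",
"\n",
"$$S(t \\mid x) = \\exp\\left(-H(t \\mid x)\\right)$$\n",
"\n",
"For the first observation, $\\exp(-0.008806) \\approx 0.99123$ at $t = 0$ and $\\exp(-2.330629) \\approx 0.0972$ at $t = 217$, in agreement with the displayed values."
]
},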
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.7"
}
},
"nbformat": 4,
"nbformat_minor": 4
}