{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Survival analysis in practice"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Some data\n",
"\n",
"We use the dataset [Données hospitalières relatives à l'épidémie de COVID-19](https://www.data.gouv.fr/fr/datasets/donnees-hospitalieres-relatives-a-lepidemie-de-covid-19/), published on *data.gouv.fr*. On their own, these data do not allow us to build a [Kaplan-Meier](https://fr.wikipedia.org/wiki/Estimateur_de_Kaplan-Meier) curve: we know how many people are admitted and discharged each day, but not when a person discharged on, say, April 1st was admitted. The cells below therefore simulate individual stays from the daily counts."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" jour rad dc\n",
"0 2020-03-18 NaN NaN\n",
"1 2020-03-19 695.0 207.0\n",
"2 2020-03-20 806.0 248.0\n",
"3 2020-03-21 452.0 151.0\n",
"4 2020-03-22 608.0 210.0"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
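],
"source": [
"import numpy.random as rnd\n",
"\n",
"import pandas\n",
"\n",
"# Daily hospital data: `rad` counts patients returned home, `dc` counts deaths.\n",
"df = pandas.read_csv(\n",
"    \"https://www.data.gouv.fr/fr/datasets/r/63352e38-d353-4b54-bfd1-f1b3ee1cabd7\",\n",
"    sep=\";\",\n",
")\n",
"# Sum both columns per day, then take day-to-day differences.\n",
"# The first row has no predecessor, so its difference is NaN.\n",
"gr = df[[\"jour\", \"rad\", \"dc\"]].groupby([\"jour\"]).sum()\n",
"diff = gr.diff().reset_index(drop=False)\n",
"diff.head()"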
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" entree sortie issue\n",
"1905678 -148 3 1\n",
"577877 -147 40 1\n",
"1126578 -140 6 1\n",
"1140232 -140 11 1\n",
"1205621 -131 26 1"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
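],
"source": [
"def donnees_artificielles(hosp, mu=14, nu=21):\n",
"    # Build one artificial row per patient from the daily counts in `hosp`\n",
"    # (columns: jour, rad, dc). Stay lengths are drawn from exponential laws.\n",
"    dt = pandas.to_datetime(hosp[\"jour\"])\n",
"    res = []\n",
"    for i in range(hosp.shape[0]):\n",
"        date = dt[i].dayofyear\n",
"        h1 = hosp.iloc[i, 1]  # patients returned home on that day\n",
"        h2 = hosp.iloc[i, 2]  # deaths on that day\n",
"        if h1 < 0 or h2 < 0:\n",
"            continue\n",
"        # Patients discharged alive: mean stay of mu days, issue = 1.\n",
"        delay1 = rnd.exponential(mu, int(h1))\n",
"        for j in range(delay1.shape[0]):\n",
"            res.append([date - int(delay1[j]), date, 1])\n",
"        # Patients who died: mean stay of nu days, issue = 0.\n",
"        delay2 = rnd.exponential(nu, int(h2))\n",
"        for j in range(delay2.shape[0]):\n",
"            res.append([date - int(delay2[j]), date, 0])\n",
"    return pandas.DataFrame(res, columns=[\"entree\", \"sortie\", \"issue\"])\n",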
"\n",
"\n",
"data = donnees_artificielles(diff[1:].reset_index(drop=True)).sort_values(\"entree\")\n",
"data.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Each row is one person: `entree` is the day of admission to hospital, `sortie` is the day of discharge, and `issue` is the outcome, 0 for death and 1 for alive."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" entree sortie issue\n",
"count 1.993886e+06 1.993886e+06 1.993886e+06\n",
"mean 1.481621e+02 1.616597e+02 8.642781e-01\n",
"std 1.152239e+02 1.143726e+02 3.424931e-01\n",
"min -1.480000e+02 1.000000e+00 0.000000e+00\n",
"25% 5.100000e+01 6.400000e+01 1.000000e+00\n",
"50% 1.130000e+02 1.250000e+02 1.000000e+00\n",
"75% 2.600000e+02 2.750000e+02 1.000000e+00\n",
"max 3.660000e+02 3.660000e+02 1.000000e+00"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data.describe()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The survival rate in these data is about 86% (the mean of `issue` is 0.864)."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"import numpy\n",
"\n",
"duree = data.sortie - data.entree\n",
"deces = (data.issue == 0).astype(numpy.int32)"
]
},
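{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a reminder of what the next cell estimates, here is the usual definition of the Kaplan-Meier estimator. The notation is introduced here for illustration and is not part of the original notebook: $t_1 < t_2 < \\dots$ are the distinct observed durations, $d_i$ is the number of deaths at $t_i$, and $n_i$ is the number of patients still at risk just before $t_i$.\n",
"\n",
"$$\\hat{S}(t) = \\prod_{i \\,:\\, t_i \\le t} \\left(1 - \\frac{d_i}{n_i}\\right)$$\n",
"\n",
"Here `duree` supplies the $t_i$, and `deces` tells whether a stay ends with a death ($1$) or is treated as censored ($0$). This is why individual admission dates are needed and daily totals alone are not enough."
]
},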
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import numpy\n",
"import matplotlib.pyplot as plt\n",
"from lifelines import KaplanMeierFitter\n",
"\n",
"fig, ax = plt.subplots(1, 1, figsize=(10, 4))\n",
"kmf = KaplanMeierFitter()\n",
"kmf.fit(duree, deces)\n",
"kmf.plot(ax=ax)\n",
"ax.legend();"
]
},
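{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Cox regression\n",
"\n",
"We reuse the artificially generated data and add two variables, `X1` and `X2`. For patients who died, each is proportional to the duration (with opposite signs); for the others it is zero. Standard Gaussian noise is added in both cases."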
]
},
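{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, the Cox proportional hazards model fitted below is usually written as follows. The notation is introduced here for illustration: $h_0$ is the baseline hazard, $x$ is the covariate vector (here a single variable, `X1` or `X2`), and $\\beta$ is the coefficient reported as `coef` in the summary.\n",
"\n",
"$$h(t \\mid x) = h_0(t) \\, \\exp(\\beta^\\top x)$$\n",
"\n",
"Under this model, increasing a single covariate by one unit multiplies the hazard by $\\exp(\\beta)$, the quantity reported as `exp(coef)`:\n",
"\n",
"$$\\frac{h(t \\mid x + 1)}{h(t \\mid x)} = \\exp(\\beta)$$"
]
},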
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" duree deces X1 X2\n",
"1905678 151 0 0.650961 -1.128843\n",
"577877 187 0 -1.956525 0.108041\n",
"1126578 146 0 0.026987 -0.130392\n",
"1140232 151 0 1.149385 0.280224\n",
"1205621 157 0 -0.032398 0.400499"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import pandas\n",
"\n",
"data_simple = pandas.DataFrame(\n",
" {\n",
" \"duree\": duree,\n",
" \"deces\": deces,\n",
" \"X1\": duree * 0.57 * deces + numpy.random.randn(duree.shape[0]),\n",
" \"X2\": duree * (-0.57) * deces + numpy.random.randn(duree.shape[0]),\n",
" }\n",
")\n",
"data_simple.head()"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.model_selection import train_test_split\n",
"\n",
"data_train, data_test = train_test_split(data_simple, test_size=0.8)"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Iteration 1: norm_delta = 5.01e-01, step_size = 0.9500, log_lik = -647954.57157, newton_decrement = 1.86e+04, seconds_since_start = 2.0\n",
"Iteration 2: norm_delta = 1.29e-01, step_size = 0.9500, log_lik = -668348.17665, newton_decrement = 2.19e+04, seconds_since_start = 4.2\n",
"Iteration 3: norm_delta = 8.30e-02, step_size = 0.9500, log_lik = -642917.75741, newton_decrement = 4.33e+03, seconds_since_start = 6.2\n",
"Iteration 4: norm_delta = 3.36e-02, step_size = 1.0000, log_lik = -637879.13496, newton_decrement = 4.09e+02, seconds_since_start = 8.5\n",
"Iteration 5: norm_delta = 3.94e-03, step_size = 1.0000, log_lik = -637443.76633, newton_decrement = 4.61e+00, seconds_since_start = 10.7\n",
"Iteration 6: norm_delta = 4.65e-05, step_size = 1.0000, log_lik = -637439.12353, newton_decrement = 6.25e-04, seconds_since_start = 13.0\n",
"Iteration 7: norm_delta = 6.33e-09, step_size = 1.0000, log_lik = -637439.12291, newton_decrement = 1.16e-11, seconds_since_start = 15.1\n",
"Convergence success after 7 iterations.\n"
]
},
{
"data": {
"text/plain": [
""
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from lifelines import CoxPHFitter\n",
"\n",
"cox = CoxPHFitter()\n",
"cox.fit(\n",
" data_train[[\"duree\", \"deces\", \"X1\"]],\n",
" duration_col=\"duree\",\n",
" event_col=\"deces\",\n",
" show_progress=True,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/latex": [
"\\begin{tabular}{lrrrrrrrrrrr}\n",
" & coef & exp(coef) & se(coef) & coef lower 95% & coef upper 95% & exp(coef) lower 95% & exp(coef) upper 95% & cmp to & z & p & -log2(p) \\\\\n",
"covariate & & & & & & & & & & & \\\\\n",
"X1 & 0.06 & 1.06 & 0.00 & 0.06 & 0.06 & 1.06 & 1.06 & 0.00 & 176.66 & 0.00 & inf \\\\\n",
"\\end{tabular}\n"
],
"text/plain": [
"\n",
" duration col = 'duree'\n",
" event col = 'deces'\n",
" baseline estimation = breslow\n",
" number of observations = 398777\n",
"number of events observed = 54338\n",
" partial log-likelihood = -637439.12\n",
" time fit was run = 2024-10-07 10:42:15 UTC\n",
"\n",
"---\n",
" coef exp(coef) se(coef) coef lower 95% coef upper 95% exp(coef) lower 95% exp(coef) upper 95%\n",
"covariate \n",
"X1 0.06 1.06 0.00 0.06 0.06 1.06 1.06\n",
"\n",
" cmp to z p -log2(p)\n",
"covariate \n",
"X1 0.00 176.66 <0.005 inf\n",
"---\n",
"Concordance = 0.75\n",
"Partial AIC = 1274880.25\n",
"log-likelihood ratio test = 21030.90 on 1 df\n",
"-log2(p) of ll-ratio test = inf"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"cox.print_summary()"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Iteration 1: norm_delta = 5.01e-01, step_size = 0.9500, log_lik = -647954.57157, newton_decrement = 1.86e+04, seconds_since_start = 2.4\n",
"Iteration 2: norm_delta = 1.31e-01, step_size = 0.9500, log_lik = -668036.09368, newton_decrement = 2.18e+04, seconds_since_start = 5.0\n",
"Iteration 3: norm_delta = 8.29e-02, step_size = 0.9500, log_lik = -642745.26291, newton_decrement = 4.23e+03, seconds_since_start = 7.2\n",
"Iteration 4: norm_delta = 3.27e-02, step_size = 1.0000, log_lik = -637838.96866, newton_decrement = 3.84e+02, seconds_since_start = 9.4\n",
"Iteration 5: norm_delta = 3.70e-03, step_size = 1.0000, log_lik = -637430.64477, newton_decrement = 4.03e+00, seconds_since_start = 11.5\n",
"Iteration 6: norm_delta = 4.05e-05, step_size = 1.0000, log_lik = -637426.59011, newton_decrement = 4.72e-04, seconds_since_start = 13.6\n",
"Iteration 7: norm_delta = 4.77e-09, step_size = 1.0000, log_lik = -637426.58963, newton_decrement = 6.55e-12, seconds_since_start = 15.7\n",
"Convergence success after 7 iterations.\n"
]
},
{
"data": {
"text/latex": [
"\\begin{tabular}{lrrrrrrrrrrr}\n",
" & coef & exp(coef) & se(coef) & coef lower 95% & coef upper 95% & exp(coef) lower 95% & exp(coef) upper 95% & cmp to & z & p & -log2(p) \\\\\n",
"covariate & & & & & & & & & & & \\\\\n",
"X2 & -0.06 & 0.94 & 0.00 & -0.06 & -0.06 & 0.94 & 0.95 & 0.00 & -176.68 & 0.00 & inf \\\\\n",
"\\end{tabular}\n"
],
"text/plain": [
"\n",
" duration col = 'duree'\n",
" event col = 'deces'\n",
" baseline estimation = breslow\n",
" number of observations = 398777\n",
"number of events observed = 54338\n",
" partial log-likelihood = -637426.59\n",
" time fit was run = 2024-10-07 10:42:35 UTC\n",
"\n",
"---\n",
" coef exp(coef) se(coef) coef lower 95% coef upper 95% exp(coef) lower 95% exp(coef) upper 95%\n",
"covariate \n",
"X2 -0.06 0.94 0.00 -0.06 -0.06 0.94 0.95\n",
"\n",
" cmp to z p -log2(p)\n",
"covariate \n",
"X2 0.00 -176.68 <0.005 inf\n",
"---\n",
"Concordance = 0.75\n",
"Partial AIC = 1274855.18\n",
"log-likelihood ratio test = 21055.96 on 1 df\n",
"-log2(p) of ll-ratio test = inf"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"cox2 = CoxPHFitter()\n",
"cox2.fit(\n",
" data_train[[\"duree\", \"deces\", \"X2\"]],\n",
" duration_col=\"duree\",\n",
" event_col=\"deces\",\n",
" show_progress=True,\n",
")\n",
"cox2.print_summary()"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" 1369970 834048 1217055 1119706 1444869\n",
"0.0 0.008909 0.008111 0.020118 0.008601 0.008080\n",
"1.0 0.017592 0.016017 0.039727 0.016985 0.015956\n",
"2.0 0.026273 0.023921 0.059330 0.025366 0.023830\n",
"3.0 0.035119 0.031975 0.079308 0.033908 0.031854\n",
"4.0 0.043795 0.039875 0.098901 0.042284 0.039724\n",
"... ... ... ... ... ...\n",
"156.0 0.532402 0.484742 1.202293 0.514034 0.482903\n",
"158.0 0.538375 0.490181 1.215783 0.519801 0.488322\n",
"163.0 0.538375 0.490181 1.215783 0.519801 0.488322\n",
"170.0 0.538375 0.490181 1.215783 0.519801 0.488322\n",
"186.0 0.538375 0.490181 1.215783 0.519801 0.488322\n",
"\n",
"[151 rows x 5 columns]"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cox.predict_cumulative_hazard(data_test[:5])"
]
},
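{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cumulative hazard $H(t \\mid x)$ printed above and the survival function are linked by a standard identity, stated here as a reminder rather than as a description of the library internals:\n",
"\n",
"$$S(t \\mid x) = \\exp\\left(-H(t \\mid x)\\right)$$\n",
"\n",
"As a check on the first displayed value, patient `1369970` has $H(0 \\mid x) = 0.008909$, and $\\exp(-0.008909) \\approx 0.99113$, which is the survival value shown for the same patient at $t = 0$."
]
},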
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" 1369970 834048 1217055 1119706 1444869\n",
"0.0 0.991131 0.991922 0.980083 0.991435 0.991952\n",
"1.0 0.982562 0.984111 0.961052 0.983159 0.984170\n",
"2.0 0.974069 0.976363 0.942396 0.974953 0.976452\n",
"3.0 0.965490 0.968530 0.923755 0.966661 0.968648\n",
"4.0 0.957150 0.960910 0.905833 0.958597 0.961055\n",
"... ... ... ... ... ...\n",
"156.0 0.587193 0.615856 0.300504 0.598078 0.616989\n",
"158.0 0.583696 0.612516 0.296478 0.594639 0.613656\n",
"163.0 0.583696 0.612516 0.296478 0.594639 0.613656\n",
"170.0 0.583696 0.612516 0.296478 0.594639 0.613656\n",
"186.0 0.583696 0.612516 0.296478 0.594639 0.613656\n",
"\n",
"[151 rows x 5 columns]"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"cox.predict_survival_function(data_test[:5])"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 4
}