teachcompute.datasets
mortality_table
- teachcompute.datasets.mortality_table(to: str = '.', stop_at: int | None = None, verbose: bool = False) → DataFrame
Retrieves a mortality table from Eurostat or INSEE. A copy of the data is provided because the download link changes over time.
- Parameters:
to – folder where the downloaded data is stored
stop_at – the whole process is quite long; if not None, only the first rows are kept
- Returns:
a DataFrame
The function first checks whether the file final_name exists. If it does, the data is not downloaded again. The header of the raw file has an unusual format, since the first column packs several keys separated by commas:
indic_de,sex,age,geo\time 2013 2012 2011 2010 2009
The data must be preprocessed to split this information into separate columns. The whole process takes 4-5 minutes: about 10 seconds to download the file (< 10 MB) and 4-5 minutes to preprocess the data (this step could be improved). The processed data contains the following columns:
['annee', 'valeur', 'age', 'age_num', 'indicateur', 'genre', 'pays']
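The splitting step can be illustrated on a toy sample. This is only a minimal sketch with pandas, not the function's actual implementation: the sample values are made up, the raw file is assumed to be tab-separated, and the real file may need extra cleaning (missing values, flags) that is not shown here. The age_num column is left out.

```python
import io
import pandas as pd

# Toy sample imitating the raw layout (assumed tab-separated): the first
# column packs four keys separated by commas. Values are made up.
raw = (
    "indic_de,sex,age,geo\\time\t2013\t2012\n"
    "DEATHRATE,F,Y1,FR\t0.0003\t0.0004\n"
    "LIFEXP,M,Y_GE85,FR\t5.9\t5.8\n"
)
df = pd.read_csv(io.StringIO(raw), sep="\t")

key = df.columns[0]  # the packed header 'indic_de,sex,age,geo\time'

# Split the packed key into four separate columns.
keys = df[key].str.split(",", expand=True)
keys.columns = ["indicateur", "genre", "age", "pays"]

# Go from one column per year to one row per (keys, year) pair.
wide = pd.concat([keys, df.drop(columns=[key])], axis=1)
tidy = wide.melt(
    id_vars=["indicateur", "genre", "age", "pays"],
    var_name="annee",
    value_name="valeur",
)
tidy["annee"] = tidy["annee"].astype(int)
print(tidy)
```

Splitting the whole column at once with str.split and reshaping with melt avoids a Python loop over rows, which is usually what makes this kind of preprocessing slow.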
Columns age and age_num look alike. age_num is numeric and equal to age, except for the value 85, which groups everybody aged 85 and above into a single category. The table contains many indicators:
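As an illustration of how a numeric age could be derived, here is a hedged sketch. It assumes Eurostat-style age labels such as Y_LT1, Y23 and Y_GE85; the label format and the helper age_to_num are assumptions made for this example, not taken from the function's code.

```python
import pandas as pd


def age_to_num(age: str) -> int:
    """Convert an age label into an integer (the label format is assumed)."""
    if age == "Y_LT1":   # assumed label for "less than one year old"
        return 0
    if age == "Y_GE85":  # assumed label for the open-ended class, 85 and above
        return 85
    return int(age.lstrip("Y"))  # e.g. 'Y23' -> 23


ages = pd.Series(["Y_LT1", "Y1", "Y84", "Y_GE85"], name="age")
age_num = ages.map(age_to_num).rename("age_num")
print(pd.concat([ages, age_num], axis=1))
```

With such a mapping, the value 85 stands for the whole open-ended class rather than for the exact age 85.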
PROBSURV: Probability of surviving between two exact ages (px)
LIFEXP: Life expectancy at exact age (ex)
SURVIVORS: Number of survivors at exact age (lx)
PYLIVED: Number of person-years lived between two exact ages (Lx)
DEATHRATE: Death rate at age x (Mx)
PROBDEATH: Probability of dying between two exact ages (qx)
TOTPYLIVED: Total number of person-years lived after exact age (Tx)
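These indicators are linked by the standard life-table relations. The sketch below recomputes them from made-up survivor counts. It uses a trapezoidal approximation for Lx, which is a common textbook choice and not necessarily the one used by the data provider. Since the toy table stops at age 4, Tx and ex only cover the ages shown.

```python
import numpy as np

# Hypothetical survivors l_x at exact ages 0..4 (made-up numbers).
lx = np.array([100000.0, 99600.0, 99570.0, 99550.0, 99535.0])

px = lx[1:] / lx[:-1]          # PROBSURV: probability of surviving from x to x+1
qx = 1.0 - px                  # PROBDEATH: probability of dying between x and x+1
dx = lx[:-1] - lx[1:]          # number of deaths between x and x+1
Lx = (lx[:-1] + lx[1:]) / 2.0  # PYLIVED: trapezoidal approximation (assumption)
Mx = dx / Lx                   # DEATHRATE: deaths per person-year lived
Tx = Lx[::-1].cumsum()[::-1]   # TOTPYLIVED, truncated to the ages shown
ex = Tx / lx[:-1]              # LIFEXP, truncated as well (not a full expectancy)

for age in range(len(px)):
    print(age, round(px[age], 5), round(qx[age], 5), round(Mx[age], 6))
```

Comparing such recomputed values with the columns of the table is a simple way to check that the indicators were read and reshaped correctly.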