{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# PCA - projection\n", "\n", "We project the initial dataset onto the first axes of a [principal component analysis (PCA)](https://fr.wikipedia.org/wiki/Analyse_en_composantes_principales)." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "scrolled": true }, "outputs": [], "source": [ "from teachpyx.datasets import load_wines_dataset\n", "\n", "df = load_wines_dataset()\n", "X = df.drop([\"quality\", \"color\"], axis=1)\n", "y = df[\"quality\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We use the [PCA](https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html) class." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
PCA(n_components=5)
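
To illustrate what fitting a `PCA(n_components=5)` and projecting onto its axes looks like, here is a minimal sketch. It assumes a synthetic array `X_demo` (a hypothetical stand-in, not the wine data) so that it runs without the `teachpyx` dataset; only `fit_transform` and `explained_variance_ratio_` from scikit-learn are used.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical stand-in for the feature matrix X (assumption: 200 rows, 11 columns).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 11))

# Keep the first five principal axes, as in the notebook.
pca = PCA(n_components=5)

# fit_transform learns the axes and returns the coordinates of each row on them.
X_proj = pca.fit_transform(X_demo)

# Share of the total variance carried by each retained axis, in decreasing order.
ratios = pca.explained_variance_ratio_
print(X_proj.shape)
print(ratios)
```

With the real data, `X_demo` would simply be replaced by `X`; the first two columns of the projection are the ones usually plotted.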
| | residual_sugar | chlorides | free_sulfur_dioxide | total_sulfur_dioxide | density |
|---|---|---|---|---|---|
| count | 6497.000000 | 6497.000000 | 6497.000000 | 6497.000000 | 6497.000000 |
| mean | 5.443235 | 0.056034 | 30.525319 | 115.744574 | 0.994697 |
| std | 4.757804 | 0.035034 | 17.749400 | 56.521855 | 0.002999 |
| min | 0.600000 | 0.009000 | 1.000000 | 6.000000 | 0.987110 |
| 25% | 1.800000 | 0.038000 | 17.000000 | 77.000000 | 0.992340 |
| 50% | 3.000000 | 0.047000 | 29.000000 | 118.000000 | 0.994890 |
| 75% | 8.100000 | 0.065000 | 41.000000 | 156.000000 | 0.996990 |
| max | 65.800000 | 0.611000 | 289.000000 | 440.000000 | 1.038980 |
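
The summary statistics above show very different scales: `total_sulfur_dioxide` has a standard deviation near 56.5, while `density` is near 0.003. Since PCA maximizes variance, an unscaled fit tends to be dominated by the widest column. The sketch below illustrates this on synthetic, independent columns whose standard deviations are rounded from the table (an assumption made for the demonstration, not the wine data itself), and compares it with a `StandardScaler` + `PCA` pipeline.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic columns (assumption): independent Gaussians with standard deviations
# rounded from the table: residual_sugar, chlorides, free_sulfur_dioxide,
# total_sulfur_dioxide, density.
rng = np.random.default_rng(0)
stds = np.array([4.76, 0.035, 17.7, 56.5, 0.003])
X_demo = rng.normal(size=(500, 5)) * stds

# PCA on raw values: the first axis mostly follows the highest-variance column.
raw = PCA(n_components=2).fit(X_demo)
raw_ratio = raw.explained_variance_ratio_
dominant = int(np.argmax(np.abs(raw.components_[0])))

# PCA after standardization: every column contributes unit variance.
scaled = make_pipeline(StandardScaler(), PCA(n_components=2)).fit(X_demo)
scaled_ratio = scaled.named_steps["pca"].explained_variance_ratio_

print("raw:", raw_ratio, "dominant column index:", dominant)
print("scaled:", scaled_ratio)
```

Whether to standardize depends on the goal: on raw values the projection mostly reflects the sulfur dioxide columns, whereas after scaling all features weigh equally.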
PCA(n_components=5)