Predictable t-SNE

t-SNE is not a transformer that can produce outputs for inputs other than the ones used to fit it. The proposed solution is to train a predictor afterwards which approximates the t-SNE outputs, so that the mapping can be applied to data the model never saw.
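
The mechanism can be sketched with plain scikit-learn: fit t-SNE on a training set, train a regressor to reproduce the t-SNE coordinates from the original features, then use that regressor to project points t-SNE never saw. The snippet below is only an illustration of the idea, not the exact implementation of PredictableTSNE; the estimator and its parameters are arbitrary choices.

from sklearn.datasets import load_digits
from sklearn.manifold import TSNE
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

Xs, _ = load_digits(n_class=6, return_X_y=True)
Xs_train, Xs_test = train_test_split(Xs, random_state=0)

# 1. t-SNE only produces coordinates for the points it was fitted on
coords = TSNE(n_components=2, init="pca", random_state=0).fit_transform(Xs_train)

# 2. a regressor learns to map the original features to those coordinates
reg = MLPRegressor(max_iter=500).fit(Xs_train, coords)

# 3. unlike t-SNE, the regressor can project unseen points
coords_test = reg.predict(Xs_test)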

t-SNE on MNIST

Let’s reuse part of the scikit-learn example Manifold learning on handwritten digits: Locally Linear Embedding, Isomap.

import numpy
import matplotlib.pyplot as plt
from matplotlib import offsetbox
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.manifold import TSNE
from sklearn.neighbors import KNeighborsRegressor
from sklearn.preprocessing import StandardScaler
from mlinsights.mlmodel import PredictableTSNE


digits = datasets.load_digits(n_class=6)
Xd = digits.data
yd = digits.target
imgs = digits.images
n_samples, n_features = Xd.shape
n_samples, n_features
(1083, 64)

Let’s split the data into train and test sets.

X_train, X_test, y_train, y_test, imgs_train, imgs_test = train_test_split(
    Xd, yd, imgs
)

tsne = TSNE(n_components=2, init="pca", random_state=0)

X_train_tsne = tsne.fit_transform(X_train, y_train)
X_train_tsne.shape
(812, 2)
def plot_embedding(Xp, y, imgs, title=None, figsize=(12, 4)):
    x_min, x_max = numpy.min(Xp, 0), numpy.max(Xp, 0)
    X = (Xp - x_min) / (x_max - x_min)

    fig, ax = plt.subplots(1, 2, figsize=figsize)
    for i in range(X.shape[0]):
        ax[0].text(
            X[i, 0],
            X[i, 1],
            str(y[i]),
            color=plt.cm.Set1(y[i] / 10.0),
            fontdict={"weight": "bold", "size": 9},
        )

    if hasattr(offsetbox, "AnnotationBbox"):
        # only print thumbnails with matplotlib > 1.0
        shown_images = numpy.array([[1.0, 1.0]])  # just something big
        for i in range(X.shape[0]):
            dist = numpy.sum((X[i] - shown_images) ** 2, 1)
            if numpy.min(dist) < 4e-3:
                # don't show points that are too close
                continue
            shown_images = numpy.r_[shown_images, [X[i]]]
            imagebox = offsetbox.AnnotationBbox(
                offsetbox.OffsetImage(imgs[i], cmap=plt.cm.gray_r), X[i]
            )
            ax[0].add_artist(imagebox)
    ax[0].set_xticks([]), ax[0].set_yticks([])
    ax[1].plot(Xp[:, 0], Xp[:, 1], ".")
    if title is not None:
        ax[0].set_title(title)
    return ax


plot_embedding(X_train_tsne, y_train, imgs_train, "t-SNE embedding of the digits")
[figure: t-SNE embedding of the digits]

Repeatable t-SNE

We use the class PredictableTSNE, but it works with any other trainable transform too.

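For instance, a minimal sketch, assuming the transformer argument accepts any transform exposing fit_transform (the LocallyLinearEmbedding and KNeighborsRegressor choices here are arbitrary):

from sklearn.manifold import LocallyLinearEmbedding

# hypothetical variant: approximate a locally linear embedding instead of t-SNE
plle = PredictableTSNE(
    transformer=LocallyLinearEmbedding(n_components=2),
    estimator=KNeighborsRegressor(),
)
plle.fit(X_train, y_train)
X_test_lle = plle.transform(X_test)

The rest of the example keeps the default TSNE transformer and MLPRegressor estimator.
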
ptsne = PredictableTSNE()
ptsne.fit(X_train, y_train)
/home/xadupre/github/scikit-learn/sklearn/neural_network/_multilayer_perceptron.py:688: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (200) reached and the optimization hasn't converged yet.
  warnings.warn(
PredictableTSNE(estimator=MLPRegressor(), transformer=TSNE())
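
The ConvergenceWarning only says that the default MLPRegressor stopped after its 200 iterations without converging. If that matters, the iteration budget can be raised when building the wrapper (500 is an arbitrary value):

from sklearn.neural_network import MLPRegressor

ptsne_more = PredictableTSNE(estimator=MLPRegressor(max_iter=500))
ptsne_more.fit(X_train, y_train)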


X_train_tsne2 = ptsne.transform(X_train)
plot_embedding(X_train_tsne2, y_train, imgs_train, "Predictable t-SNE of the digits")
[figure: Predictable t-SNE of the digits]

The difference now is that the mapping can be applied to new data.

X_test_tsne2 = ptsne.transform(X_test)
plot_embedding(
    X_test_tsne2, y_test, imgs_test, "Predictable t-SNE on new digits on test database"
)
[figure: Predictable t-SNE on new digits on test database]

By default, the output data is normalized so that results are comparable across runs. The loss below is computed between the normalized output of t-SNE and its approximation by the predictor.

ptsne.loss_
0.01733241511855887
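
What this loss measures can be reproduced approximately with plain scikit-learn: normalize the t-SNE coordinates computed above, fit a regressor on them and take the mean squared error of its predictions on the same points. This is only an illustration, not the internals of PredictableTSNE; the scaler, regressor and its parameters are arbitrary choices.

from sklearn.metrics import mean_squared_error
from sklearn.neural_network import MLPRegressor

# normalize the t-SNE coordinates of the training set computed earlier
target = StandardScaler().fit_transform(X_train_tsne)
# approximate them with a regressor and measure the error on the same data
approx = MLPRegressor(max_iter=500).fit(X_train, target).predict(X_train)
mean_squared_error(target, approx)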

Repeatable t-SNE with another predictor

The predictor is a MLPRegressor (https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPRegressor.html).


ptsne.estimator_
MLPRegressor()


Let’s replace it with a KNeighborsRegressor and use a StandardScaler as normalizer.

ptsne_knn = PredictableTSNE(
    normalizer=StandardScaler(), estimator=KNeighborsRegressor()
)
ptsne_knn.fit(X_train, y_train)
PredictableTSNE(estimator=KNeighborsRegressor(), normalizer=StandardScaler(),
                transformer=TSNE())


X_train_tsne2 = ptsne_knn.transform(X_train)
plot_embedding(
    X_train_tsne2,
    y_train,
    imgs_train,
    "Predictable t-SNE of the digits\nStandardScaler+KNeighborsRegressor",
)
[figure: Predictable t-SNE of the digits, StandardScaler + KNeighborsRegressor]

X_test_tsne2 = ptsne_knn.transform(X_test)
plot_embedding(
    X_test_tsne2,
    y_test,
    imgs_test,
    "Predictable t-SNE on new digits\nStandardScaler+KNeighborsRegressor",
)
[figure: Predictable t-SNE on new digits, StandardScaler + KNeighborsRegressor]

The loss is smaller, so this model seems to work better, but since the loss is evaluated on the training dataset, it is only a check that the approximation error is not too large.

ptsne_knn.loss_
0.005313852336257696
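
Since t-SNE coordinates are only available for the training points, a less optimistic estimate of the approximation error can be sketched by splitting those points again: fit the predictor on one part and measure the error on the held-out part. The scaler, predictor and split below are arbitrary choices, not what PredictableTSNE does internally.

from sklearn.metrics import mean_squared_error

target = StandardScaler().fit_transform(X_train_tsne)
X_a, X_b, t_a, t_b = train_test_split(X_train, target, random_state=0)
knn = KNeighborsRegressor().fit(X_a, t_a)
mean_squared_error(t_b, knn.predict(X_b))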

Total running time of the script: (0 minutes 6.580 seconds)
