mlstatpy.ml.neural_tree#

class mlstatpy.ml.neural_tree.BaseNeuralTreeNet(estimator, optimizer=None, max_iter=100, early_th=None, verbose=False, lr=None, lr_schedule=None, l1=0.0, l2=0.0, momentum=0.9)[source]#

Classifier or regressor following the scikit-learn API.

Parameters:
  • estimator – instance of NeuralTreeNet.

  • optimizer – optimizer, SGDOptimizer by default

  • max_iter – maximum number of iterations

  • early_th – early stopping threshold

  • verbose – more verbose output

  • lr – overwrites learning_rate_init if optimizer is None (unused otherwise)

  • lr_schedule – overwrites lr_schedule if optimizer is None (unused otherwise)

  • l1 – L1 regularization if optimizer is None (unused otherwise)

  • l2 – L2 regularization if optimizer is None (unused otherwise)

  • momentum – used if optimizer is None

decision_function(X)[source]#

Returns the classification probabilities.

Parameters:

X – inputs

Returns:

probabilities

fit(X, y, sample_weights=None)[source]#

Trains the estimator.

Parameters:
  • X – input features

  • y – expected classes (binary)

  • sample_weights – sample weights

Returns:

self

static onnx_converter()[source]#

Converts this model into ONNX.

static onnx_shape_calculator()[source]#

Shape calculator when converting this model into ONNX. See sklearn-onnx.

set_fit_request(*, sample_weights: bool | None | str = '$UNCHANGED$') → BaseNeuralTreeNet[source]#

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weights (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weights parameter in fit.

Returns:

self – The updated object.

Return type:

object

class mlstatpy.ml.neural_tree.NeuralTreeNet(dim, empty=True)[source]#

Node ensemble.

Parameters:
  • dim – space dimension

  • empty – if True, the network starts empty; otherwise an identity node is added

<<<

import numpy
from mlstatpy.ml.neural_tree import NeuralTreeNode, NeuralTreeNet

w1 = numpy.array([-0.5, 0.8, -0.6])

neu = NeuralTreeNode(w1[1:], bias=w1[0], activation="sigmoid")
net = NeuralTreeNet(2, empty=True)
net.append(neu, numpy.arange(2))

ide = NeuralTreeNode(numpy.array([1.0]), bias=0.0, activation="identity")

net.append(ide, numpy.arange(2, 3))

X = numpy.abs(numpy.random.randn(10, 2))
pred = net.predict(X)
print(pred)

>>>

    [[0.426 3.156 0.114 0.114]
     [0.879 2.457 0.219 0.219]
     [1.075 0.11  0.573 0.573]
     [0.671 0.462 0.44  0.44 ]
     [0.692 0.394 0.454 0.454]
     [0.433 0.287 0.419 0.419]
     [2.085 0.288 0.73  0.73 ]
     [0.612 0.677 0.397 0.397]
     [1.282 0.26  0.591 0.591]
     [0.232 1.788 0.2   0.2  ]]
append(node, inputs)[source]#

Appends a node into the graph.

Parameters:
  • node – node to add

  • inputs – indices of the input nodes

clear()[source]#

Clears all nodes.

static create_from_tree(tree, k=1.0, arch='one')[source]#

Creates a NeuralTreeNet instance from a DecisionTreeClassifier.

Parameters:
  • tree – decision tree (DecisionTreeClassifier)

  • k – coefficient scaling the node weights

  • arch – architecture, 'one' or 'compact' (see below)

Returns:

NeuralTreeNet

The function only works for binary problems. Available architectures:

  • "one": the method adds nodes with one output; there is no specific definition of layers,

  • "compact": the method adds two nodes, the first computes the thresholds, the second computes the leaves' outputs, and a final node merges all outputs into one.

See notebook Un arbre de décision en réseaux de neurones for examples, and the sketch below.
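A minimal sketch of the conversion (the dataset and the tree settings are illustrative assumptions, not part of the API):

import numpy
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from mlstatpy.ml.neural_tree import NeuralTreeNet

# a binary problem, as required by create_from_tree
X, y = make_classification(n_samples=100, n_features=4, random_state=0)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

net = NeuralTreeNet.create_from_tree(tree, k=1.0, arch="compact")
# the network approximates the tree's decision function
print(net.predict(X[:3]))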

dlossds(X, y, cache=None)[source]#

Computes the loss derivative against the inputs.

fill_cache(X)[source]#

Creates a cache with intermediate results.

gradient_backward(graddx, X, inputs=False, cache=None)[source]#

Computes the gradient in X.

Parameters:
  • graddx – existing gradient against the inputs

  • X – computes the gradient in X

  • inputs – if False, derivative against the coefficients, otherwise against the inputs

  • cache – cache of intermediate results to avoid recomputation

Returns:

gradient

loss(X, y, cache=None)[source]#

Computes the loss due to prediction error. Returns a float.

property shape#

Returns the shape of the coefficients.

to_dot(X=None)[source]#

Exports the neural network into dot format, as shown below.

Parameters:

X – an example input
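For instance, a small sketch reusing a hand-built network similar to the example above:

import numpy
from mlstatpy.ml.neural_tree import NeuralTreeNode, NeuralTreeNet

net = NeuralTreeNet(2, empty=True)
net.append(
    NeuralTreeNode(numpy.array([0.8, -0.6]), bias=-0.5, activation="sigmoid"),
    numpy.arange(2),
)
# Graphviz source, renderable with `dot -Tpng`
print(net.to_dot())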

property training_weights#

Returns the weights.

update_training_weights(X, add=True)[source]#

Updates the weights.

Parameters:
  • X – vector to add to the weights, such as a gradient

  • add – if True, adds the vector to the weights, otherwise replaces them

class mlstatpy.ml.neural_tree.NeuralTreeNetClassifier(estimator, optimizer=None, max_iter=100, early_th=None, verbose=False, lr=None, lr_schedule=None, l1=0.0, l2=0.0, momentum=0.9)[source]#

Classifier following the scikit-learn API.

Parameters:
  • estimator – instance of NeuralTreeNet.

  • optimizer – optimizer, SGDOptimizer by default

  • max_iter – maximum number of iterations

  • early_th – early stopping threshold

  • verbose – more verbose output

  • lr – overwrites learning_rate_init if optimizer is None (unused otherwise)

  • lr_schedule – overwrites lr_schedule if optimizer is None (unused otherwise)

  • l1 – L1 regularization if optimizer is None (unused otherwise)

  • l2 – L2 regularization if optimizer is None (unused otherwise)

  • momentum – used if optimizer is None
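A minimal usage sketch, assuming the underlying network comes from NeuralTreeNet.create_from_tree (the dataset and hyperparameters are illustrative assumptions):

import numpy
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from mlstatpy.ml.neural_tree import NeuralTreeNet, NeuralTreeNetClassifier

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

net = NeuralTreeNet.create_from_tree(tree, arch="compact")
clf = NeuralTreeNetClassifier(net, max_iter=20)
clf.fit(X, y)                     # continues training the network with SGD
proba = clf.predict_proba(X[:5])  # see predict_proba below
pred = clf.predict(X[:5])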

predict(X)[source]#

Returns the predicted classes.

Parameters:

X – inputs

Returns:

classes

predict_proba(X)[source]#

Returns the classification probabilities.

Parameters:

X – inputs

Returns:

probabilities

set_fit_request(*, sample_weights: bool | None | str = '$UNCHANGED$') → NeuralTreeNetClassifier[source]#

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weights (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weights parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → NeuralTreeNetClassifier[source]#

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

class mlstatpy.ml.neural_tree.NeuralTreeNetRegressor(estimator, optimizer=None, max_iter=100, early_th=None, verbose=False, lr=None, lr_schedule=None, l1=0.0, l2=0.0, momentum=0.9)[source]#

Regressor following the scikit-learn API.

Parameters:
  • estimator – instance of NeuralTreeNet.

  • optimizer – optimizer, SGDOptimizer by default

  • max_iter – maximum number of iterations

  • early_th – early stopping threshold

  • verbose – more verbose output

  • lr – overwrites learning_rate_init if optimizer is None (unused otherwise)

  • lr_schedule – overwrites lr_schedule if optimizer is None (unused otherwise)

  • l1 – L1 regularization if optimizer is None (unused otherwise)

  • l2 – L2 regularization if optimizer is None (unused otherwise)

  • momentum – used if optimizer is None
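A minimal sketch with a small hand-built network (the data, target, and learning settings are illustrative assumptions; the column shape expected for y is also an assumption):

import numpy
from mlstatpy.ml.neural_tree import (
    NeuralTreeNode, NeuralTreeNet, NeuralTreeNetRegressor)

net = NeuralTreeNet(2, empty=True)
net.append(
    NeuralTreeNode(numpy.array([0.8, -0.6]), bias=-0.5, activation="identity"),
    numpy.arange(2),
)
reg = NeuralTreeNetRegressor(net, max_iter=20, lr=0.01)
X = numpy.random.randn(50, 2)
y = X @ numpy.array([0.8, -0.6]) - 0.5   # a target the network can represent
reg.fit(X, y.reshape((-1, 1)))           # y as a column vector (assumed)
print(reg.predict(X[:3]))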

predict(X)[source]#

Returns the predictions.

Parameters:

X – inputs

Returns:

predictions

set_fit_request(*, sample_weights: bool | None | str = '$UNCHANGED$') → NeuralTreeNetRegressor[source]#

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weights (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weights parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → NeuralTreeNetRegressor[source]#

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

mlstatpy.ml.neural_tree.label_class_to_softmax_output(y_label)[source]#

Converts a binary class label into a matrix with two columns of probabilities.

<<<

import numpy
from mlstatpy.ml.neural_tree import label_class_to_softmax_output

y_label = numpy.array([0, 1, 0, 0])
soft_y = label_class_to_softmax_output(y_label)
print(soft_y)

>>>

    [[1. 0.]
     [0. 1.]
     [1. 0.]
     [1. 0.]]
class mlstatpy.ml._neural_tree_node.NeuralTreeNode(weights, bias=None, activation='sigmoid', nodeid=-1, tag=None)[source]#

One node in a neural network.

Parameters:
  • weights – weights

  • bias – bias; if None, a random number is drawn

  • activation – activation function

  • nodeid – node id

  • tag – unused, adds information about how this node was created
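A single neuron computes activation(w · x + b); a small sketch:

import numpy
from mlstatpy.ml._neural_tree_node import NeuralTreeNode

# sigmoid(0.8 * x1 - 0.6 * x2 - 0.5)
node = NeuralTreeNode(numpy.array([0.8, -0.6]), bias=-0.5, activation="sigmoid")
X = numpy.array([[0.1, 0.2], [1.0, -1.0]])
print(node.predict(X))            # one output per row
print(node.ndim, node.ndim_out)   # input and output dimensions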

property bias#

Returns the bias.

dlossds(X, y, cache=None)[source]#

Computes the loss derivative due to prediction error.

fill_cache(X)[source]#

Creates a cache with intermediate results: lX contains the results before the activation function, aX the results after the activation function (the prediction).

static get_activation_dloss_function(activation)[source]#

Returns the derivative of the default loss function based on the activation function, as a function computing dloss(x, y)/dw where w are the weights.

static get_activation_function(activation)[source]#

Returns the activation function. It returns a function y = f(x).

static get_activation_gradient_function(activation)[source]#

Returns the derivative of the activation function, as a function y = f'(x). For the sigmoid:

\[\begin{array}{rcl}
f(x) &=& \frac{1}{1 + e^{-x}} \\
f'(x) &=& \frac{e^{-x}}{(1 + e^{-x})^2} = f(x)(1 - f(x))
\end{array}\]
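The identity f'(x) = f(x)(1 - f(x)) can be checked numerically, for instance:

import numpy

def f(x):
    # sigmoid
    return 1.0 / (1.0 + numpy.exp(-x))

x = numpy.linspace(-3, 3, 7)
analytic = f(x) * (1 - f(x))
numeric = (f(x + 1e-6) - f(x - 1e-6)) / 2e-6   # central difference
assert numpy.allclose(analytic, numeric, atol=1e-8)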
static get_activation_loss_function(activation)[source]#

Returns a default loss function based on the activation function, as a function g = loss(x, y).

gradient_backward(graddx, X, inputs=False, cache=None)[source]#

Computes the gradients at point X.

Parameters:
  • graddx – existing gradient against the inputs

  • X – computes the gradient in X

  • inputs – if False, derivative against the coefficients, otherwise against the inputs

  • cache – cache of intermediate results

Returns:

gradient

property input_weights#

Returns the weights.

loss(X, y, cache=None)[source]#

Computes the loss. Returns a float.

property ndim#

Returns the input dimension.

property ndim_out#

Returns the output dimension.

predict(X)[source]#

Computes neuron outputs.

property training_weights#

Returns the weights stored in the neuron.

update_training_weights(X, add=True)[source]#

Updates the weights.

Parameters:
  • X – vector to add to the weights, such as a gradient

  • add – if True, adds the vector to the weights, otherwise replaces them

class mlstatpy.ml._neural_tree_api._TrainingAPI[source]#

Declaration of the functions needed to train a model.

dlossds(X, y, cache=None)[source]#

Computes the loss derivative due to prediction error.

fill_cache(X)[source]#

Creates a cache with intermediate results.

fit(X, y, optimizer=None, max_iter=100, early_th=None, verbose=False, lr=None, lr_schedule=None, l1=0.0, l2=0.0, momentum=0.9)[source]#

Fits a neuron.

Parameters:
  • X – training set

  • y – training labels

  • optimizer – optimizer, SGDOptimizer by default

  • max_iter – maximum number of iterations

  • early_th – early stopping threshold

  • verbose – more verbose output

  • lr – overwrites learning_rate_init if optimizer is None (unused otherwise)

  • lr_schedule – overwrites lr_schedule if optimizer is None (unused otherwise)

  • l1 – L1 regularization if optimizer is None (unused otherwise)

  • l2 – L2 regularization if optimizer is None (unused otherwise)

  • momentum – used if optimizer is None

Returns:

self

gradient(X, y, inputs=False)[source]#

Computes the gradient in X knowing the expected value y.

Parameters:
  • X – computes the gradient in X

  • y – expected values

  • inputs – if False, derivative against the coefficients, otherwise against the inputs

Returns:

gradient
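For instance, gradient can drive a plain gradient-descent step together with update_training_weights (a sketch on a single neuron; the single-observation calling convention is an assumption):

import numpy
from mlstatpy.ml._neural_tree_node import NeuralTreeNode

node = NeuralTreeNode(numpy.array([0.8, -0.6]), bias=-0.5, activation="sigmoid")
x = numpy.array([0.3, 0.7])   # one observation
y = numpy.array([1.0])        # expected value
lr = 0.1
for _ in range(10):
    grad = node.gradient(x, y)                # derivative against the coefficients
    node.update_training_weights(-lr * grad)  # add=True adds the step to the weights
print(node.loss(x, y))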

gradient_backward(graddx, X, inputs=False, cache=None)[source]#

Computes the gradient in X.

Parameters:
  • graddx – existing gradient against the outputs

  • X – computes the gradient in X

  • inputs – if False, derivative against the coefficients, otherwise against the inputs

  • cache – cache of intermediate results to avoid recomputation

Returns:

gradient

loss(X, y, cache=None)[source]#

Computes the loss. Returns a float.

property training_weights#

Returns the weights.

update_training_weights(grad, add=True)[source]#

Updates the weights.

Parameters:
  • grad – vector to add to the weights, such as a gradient

  • add – if True, adds the vector to the weights, otherwise replaces them