mlinsights.timeseries

Datasets

mlinsights.timeseries.datasets.artificial_data(dt1, dt2, minutes=1)

Generates artificial data with one observation every few minutes (every minute by default).

Parameters:
  • dt1 – first date

  • dt2 – second date

  • minutes – interval between two observations

Returns:

dataframe

<<<

import datetime
from mlinsights.timeseries.datasets import artificial_data

now = datetime.datetime.now()
data = artificial_data(now - datetime.timedelta(40), now)
print(data.head())

>>>

                            time         y
    0 2025-06-09 12:55:21.805257  1.684128
    1 2025-06-09 12:56:21.805257  1.547256
    2 2025-06-09 12:57:21.805257  1.729528
    3 2025-06-09 12:58:21.805257  1.814975
    4 2025-06-09 12:59:21.805257  1.769485

Experimentation

mlinsights.timeseries.patterns.find_ts_group_pattern(ttime, values, names, name_subset=None, per='week', unit='half-hour', agg='sum', estimator=None, verbose=0)

Clusters time series to find similar patterns.

Parameters:
  • ttime – time column

  • values – features to use to cluster

  • names – column which holds group name

  • name_subset – subset of groups to study, None for all

  • per – aggregation per week

  • unit – unit

  • agg – aggregation function

  • estimator – estimator used to find pattern, sklearn.cluster.KMeans and 10 groups

  • verbose – verbosity

Returns:

found clusters, distances

Manipulation

mlinsights.timeseries.agg.aggregate_timeseries(df, index='time', values='y', unit='half-hour', agg='sum', per=None)

Aggregates timeseries assuming the data is in a dataframe.

Parameters:
  • df – dataframe

  • index – time column

  • values – value or values column

  • unit – aggregate over a specific period

  • agg – kind of aggregation

  • per – second aggregation, per week…

Returns:

aggregated values
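
The idea behind unit='half-hour' and agg='sum' can be illustrated with a minimal sketch in plain Python. It is not the library's implementation: the helpers floor_to_half_hour and aggregate_half_hour are hypothetical names, and the sketch assumes observations are (time, y) pairs rather than a dataframe.

```python
import datetime
from collections import defaultdict


def floor_to_half_hour(ts):
    """Rounds a timestamp down to the start of its half-hour slot."""
    return ts.replace(minute=0 if ts.minute < 30 else 30, second=0, microsecond=0)


def aggregate_half_hour(rows):
    """Sums the values of (time, y) pairs falling in the same half-hour slot."""
    totals = defaultdict(float)
    for ts, y in rows:
        totals[floor_to_half_hour(ts)] += y
    return sorted(totals.items())


start = datetime.datetime(2025, 6, 9, 12, 0)
# one observation per minute during 90 minutes, every value equal to 1.0
rows = [(start + datetime.timedelta(minutes=i), 1.0) for i in range(90)]
aggregated = aggregate_half_hour(rows)
for slot, total in aggregated:
    print(slot, total)
```

Each of the three half-hour slots receives 30 observations, so every total equals 30.0.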

Plotting

mlinsights.timeseries.plotting.plot_week_timeseries(time, value, normalise=True, label=None, h=0.85, value2=None, label2=None, daynames=None, xfmt='%1.0f', ax=None)

Shows a timeseries dispatched by days as bars.

Parameters:
  • time – dates

  • value – values to display as bars.

  • normalise – normalise data before showing it

  • label – label of the series

  • h – scale factor

  • value2 – second series to show as a line

  • label2 – label of the second series

  • daynames – names to use for week day names (default is English)

  • xfmt – format number of the X axis

  • ax – existing axis

Returns:

axis


Prediction

BaseReciprocalTimeSeriesTransformer

The following function builds a regular dataset from a timeseries so that it can be used by machine learning models.

class mlinsights.timeseries.base.BaseReciprocalTimeSeriesTransformer(context_length=0)

Base for all timeseries preprocessing automatically applied within a predictor.

fit(X, y, sample_weight=None)

Stores the first values.

get_fct_inv()

Returns the reverse transform.

set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → BaseReciprocalTimeSeriesTransformer

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – metadata routing for the sample_weight parameter in fit

Returns:

self (object) – the updated object

set_transform_request(*, context: bool | None | str = '$UNCHANGED$', sample_weight: bool | None | str = '$UNCHANGED$') → BaseReciprocalTimeSeriesTransformer

Configure whether metadata should be requested to be passed to the transform method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to transform.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • context (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – metadata routing for the context parameter in transform

  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – metadata routing for the sample_weight parameter in transform

Returns:

self (object) – the updated object

transform(X, y, sample_weight=None, context=None)

Transforms both X and y. Returns X and y, and sample_weight as well if it is not None. The context is used when the y series stored in the predictor is not related to the y series given to the transform method.
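
A reciprocal transformation is one that can be undone after the prediction. A minimal sketch, assuming the transformation is a first-order difference: the first value is stored so that the original series can be rebuilt. The functions difference and inverse_difference are hypothetical illustrations, not the methods of the class above.

```python
def difference(y):
    """Returns the first value and the successive differences of y."""
    first = y[0]
    diffs = [b - a for a, b in zip(y[:-1], y[1:])]
    return first, diffs


def inverse_difference(first, diffs):
    """Rebuilds the original series from its first value and its differences."""
    series = [first]
    for d in diffs:
        series.append(series[-1] + d)
    return series


y = [10, 12, 15, 15, 11]
first, diffs = difference(y)
restored = inverse_difference(first, diffs)
print(diffs)     # differences, one element shorter than y
print(restored)  # identical to y
```

Without the stored first value, the differences alone would only determine the series up to a constant.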

build_ts_X_y

mlinsights.timeseries.utils.build_ts_X_y(model, X, y, weights=None, same_rows=False)

Builds a standard X, y dataset from the given ones.

Parameters:
  • model – a timeseries model (BaseTimeSeries)

  • X – time series used as features, [n_obs, n_features], X may be empty (None)

  • y – timeseries (one single vector), [n_obs]

  • weights – weights None or array [n_obs]

  • same_rows – keeps the same number of rows as the original datasets, use nan when no value is available

Returns:

(X, y, weights): X is an array of features [nrows, n_features + past] where nrows = n_obs - model.delay2 - model.past + 2, y is an array of targets [nrows], weights is None or an array [nrows]

<<<

import numpy
from mlinsights.timeseries import build_ts_X_y
from mlinsights.timeseries.base import BaseTimeSeries

X = numpy.arange(10).reshape(5, 2)
y = numpy.arange(5) * 100
weights = numpy.arange(5) * 1000
bs = BaseTimeSeries(past=2)
nx, ny, nw = build_ts_X_y(bs, X, y, weights)
print("X=", X)
print("y=", y)
print("nx=", nx)
print("ny=", ny)

>>>

    X= [[0 1]
     [2 3]
     [4 5]
     [6 7]
     [8 9]]
    y= [  0 100 200 300 400]
    nx= [[  2   3   0 100]
     [  4   5 100 200]
     [  6   7 200 300]]
    ny= [[200]
     [300]
     [400]]

With use_all_past=True:

<<<

import numpy
from mlinsights.timeseries.base import BaseTimeSeries
from mlinsights.timeseries import build_ts_X_y

X = numpy.arange(10).reshape(5, 2)
y = numpy.arange(5) * 100
weights = numpy.arange(5) * 1000
bs = BaseTimeSeries(past=2, use_all_past=True)
nx, ny, nw = build_ts_X_y(bs, X, y, weights)
print("X=", X)
print("y=", y)
print("nx=", nx)
print("ny=", ny)

>>>

    X= [[0 1]
     [2 3]
     [4 5]
     [6 7]
     [8 9]]
    y= [  0 100 200 300 400]
    nx= [[  0   1   2   3   0 100]
     [  2   3   4   5 100 200]
     [  4   5   6   7 200 300]]
    ny= [[200]
     [300]
     [400]]
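
The two outputs above can be reproduced with a short sketch in plain Python. It only covers the default delay1=1 and delay2=2 (one target per row) and ignores weights and same_rows; build_lagged_rows is a hypothetical helper written for illustration, not the library's code.

```python
def build_lagged_rows(X, y, past=1, use_all_past=False):
    """Builds (features, targets) rows from X and y.

    Assumes delay1=1 and delay2=2, i.e. a single one-step-ahead target per row.
    """
    new_X, new_y = [], []
    for t in range(past, len(y)):
        if use_all_past:
            # features of the `past` previous observations, flattened
            feats = [v for row in X[t - past:t] for v in row]
        else:
            # features of the most recent observation only
            feats = list(X[t - 1])
        # the `past` previous values of the series complete the row
        new_X.append(feats + list(y[t - past:t]))
        new_y.append([y[t]])
    return new_X, new_y


X = [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
y = [0, 100, 200, 300, 400]
nx, ny = build_lagged_rows(X, y, past=2)
nx_all, _ = build_lagged_rows(X, y, past=2, use_all_past=True)
print(nx)
print(ny)
print(nx_all)
```

With 5 observations and past=2, three rows remain because the first two observations have no complete history.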

BaseTimeSeries

The first class defines the template for all timeseries estimators. It deals with a timeseries in one dimension and additional features.

class mlinsights.timeseries.base.BaseTimeSeries(past=1, delay1=1, delay2=2, use_all_past=False, preprocessing=None)

Base class to build a predictor on timeseries. The class computes one or several predictions at each time, between delay1 and delay2. It computes: \hat{Y}_{t+d} = f(Y_{t-1}, \ldots, Y_{t-p}) with d \in [delay1, delay2[ and 1 \leqslant p \leqslant past.

Parameters:
  • past – values to use to predict

  • delay1 – the model computes the first prediction for time=t + delay1

  • delay2 – the model computes the last prediction for time=t + delay2 excluded

  • use_all_past – use all past features, not only the timeseries

  • preprocessing – preprocessing to apply before predicting, applied to the timeseries itself only; it can be a difference, and it must be of type BaseReciprocalTimeSeriesTransformer

has_preprocessing()

Tells whether a preprocessing is defined.
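
The interval [delay1, delay2[ is half-open: delay2 is excluded. A minimal sketch of the horizons this implies, where prediction_horizons is a hypothetical helper and not part of the library:

```python
def prediction_horizons(delay1=1, delay2=2):
    """Lists the horizons d such that delay1 <= d < delay2."""
    return list(range(delay1, delay2))


default_horizons = prediction_horizons()    # default parameters: one horizon
wider_horizons = prediction_horizons(1, 4)  # three horizons: 1, 2 and 3
print(default_horizons)
print(wider_horizons)
```

The number of predictions computed at each time is therefore delay2 - delay1.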

DummyTimeSeriesRegressor

The first predictor is a dummy one: it uses the current value to predict the future.

class mlinsights.timeseries.dummies.DummyTimeSeriesRegressor(estimator='dummy', past=1, delay1=1, delay2=2, use_all_past=False, preprocessing=None)

Dummy regressor for time series. Uses past values as predictions.

Parameters:
  • estimator – estimator to use for regression, sklearn.linear_model.LinearRegression implements a linear auto-regressor, 'dummy' uses the past value as the prediction

  • past – values to use to predict

  • delay1 – the model computes the first prediction for time=t + delay1

  • delay2 – the model computes the last prediction for time=t + delay2 excluded

  • use_all_past – use all past features, not only the timeseries

  • preprocessing – preprocessing to apply before predicting, applied to the timeseries itself only; it can be a difference, and it must be of type BaseReciprocalTimeSeriesTransformer

fit(X, y, sample_weight=None)

Trains the model.

Parameters:
  • X – features, may be empty (None)

  • y – timeseries (one single vector), array [n_obs]

  • sample_weight – weights None or array [n_obs]

Returns:

self

predict(X, y)

Returns the predictions.
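
The principle of a dummy predictor, using the last known value as the forecast, can be sketched in a few lines. This is an illustration in plain Python under the assumption of a one-step horizon; persistence_forecast is a hypothetical name and the sketch does not reproduce the output format of predict.

```python
def persistence_forecast(y):
    """Predicts y[t] with y[t - 1]; no prediction exists for the first value."""
    return y[:-1]


y = [3.0, 5.0, 4.0, 6.0]
predictions = persistence_forecast(y)  # forecasts for y[1], y[2], y[3]
targets = y[1:]
errors = [abs(p - t) for p, t in zip(predictions, targets)]
print(predictions)
print(errors)
```

Such a forecast is a useful baseline: a trained model is only worth keeping if its errors are smaller than these.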

set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → DummyTimeSeriesRegressor

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – metadata routing for the sample_weight parameter in fit

Returns:

self (object) – the updated object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → DummyTimeSeriesRegressor

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – metadata routing for the sample_weight parameter in score

Returns:

self (object) – the updated object

ARTimeSeriesRegressor

The first regressor is an auto-regressor. It can be estimated with any regressor implemented in scikit-learn.

class mlinsights.timeseries.ar.ARTimeSeriesRegressor(estimator='dummy', past=1, delay1=1, delay2=2, use_all_past=False, preprocessing=None)

Base class to build a regressor on timeseries. The class computes one or several predictions at each time, between delay1 and delay2. It computes: \hat{Y}_{t+d} = f(Y_{t-1}, \ldots, Y_{t-p}) with d \in [delay1, delay2[ and 1 \leqslant p \leqslant past.

Parameters:
  • estimator – estimator to use for regression, sklearn.linear_model.LinearRegression implements a linear auto-regressor, 'dummy' uses the past value as the prediction

  • past – values to use to predict

  • delay1 – the model computes the first prediction for time=t + delay1

  • delay2 – the model computes the last prediction for time=t + delay2 excluded

  • use_all_past – use all past features, not only the timeseries

  • preprocessing – preprocessing to apply before predicting, applied to the timeseries itself only; it can be a difference, and it must be of type BaseReciprocalTimeSeriesTransformer

fit(X, y, sample_weight=None)

Trains the model.

Parameters:
  • X – features, may be empty (None)

  • y – timeseries (one single vector), array [n_obs]

  • sample_weight – weights None or array [n_obs]

Returns:

self

predict(X, y)

Returns the predictions.
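
What a linear auto-regressor estimates can be shown with a minimal sketch restricted to one lag (past=1) and no extra features. It fits y[t] = a * y[t - 1] + b by ordinary least squares in plain Python; fit_ar1 is a hypothetical helper, whereas the class above delegates the estimation to the scikit-learn estimator it receives.

```python
def fit_ar1(y):
    """Fits y[t] = a * y[t - 1] + b by ordinary least squares."""
    prev, cur = y[:-1], y[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_cur = sum(cur) / n
    # slope = covariance(prev, cur) / variance(prev)
    cov = sum((p - mean_prev) * (c - mean_cur) for p, c in zip(prev, cur))
    var = sum((p - mean_prev) ** 2 for p in prev)
    a = cov / var
    b = mean_cur - a * mean_prev
    return a, b


y = [1.0, 2.0, 4.0, 8.0, 16.0]  # each value doubles the previous one
a, b = fit_ar1(y)
next_value = a * y[-1] + b       # one-step-ahead forecast
print(a, b, next_value)
```

On this doubling series the fit recovers a slope of 2 and an intercept of 0, so the forecast for the next value is 32.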

set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → ARTimeSeriesRegressor

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – metadata routing for the sample_weight parameter in fit

Returns:

self (object) – the updated object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → ARTimeSeriesRegressor

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – metadata routing for the sample_weight parameter in score

Returns:

self (object) – the updated object

ts_mape

The library implements one scoring function which compares the prediction to what a dummy predictor would do by using the previous day as a prediction.

mlinsights.timeseries.metrics.ts_mape(expected_y, predicted_y, sample_weight=None)

Computes \frac{\sum_t | \hat{Y}_t - Y_t |}{\sum_t | Y_t - Y_{t-1} |}. It compares the prediction to what a dummy predictor would do by using the previous day as a prediction.

Parameters:
  • expected_y – expected values

  • predicted_y – predictions

  • sample_weight – sample weight

Returns:

metrics
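
The ratio can be sketched in plain Python as follows. The sketch ignores sample_weight and assumes predicted_y[t] is the prediction for expected_y[t]; the library may align both series differently, and ts_mape_sketch is a hypothetical name.

```python
def ts_mape_sketch(expected_y, predicted_y):
    """Ratio of the model's absolute error to the error of a persistence forecast.

    Assumes predicted_y[t] is the prediction for expected_y[t]; the first
    observation is skipped because it has no previous value.
    """
    model_error = sum(abs(p - e) for p, e in zip(predicted_y[1:], expected_y[1:]))
    dummy_error = sum(abs(e - prev) for prev, e in zip(expected_y[:-1], expected_y[1:]))
    return model_error / dummy_error


expected = [10.0, 12.0, 11.0, 15.0]
predicted = [10.0, 11.0, 11.5, 14.0]
score = ts_mape_sketch(expected, predicted)
print(score)  # 2.5 / 7, roughly 0.357
```

A value below 1 means the model does better than repeating the previous value, a value above 1 means it does worse.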