mlinsights.mlbatch

This was written for an older version of scikit-learn and has not been revisited since. It may not bring much value.

MLCache

class mlinsights.mlbatch.cache_model.MLCache(name)[source]

Implements a cache to reduce the number of trainings a grid search has to do.

static as_key(params)[source]

Converts a dictionary of parameters into a key.

@param params dictionary
@return key as a string

cache(params, value)[source]

Caches one object.

@param params dictionary of parameters
@param value value to cache

count(params)[source]

Retrieves the number of times an element was retrieved from the cache.

@param params dictionary of parameters
@return int

static create_cache(name)[source]

Creates a new cache.

@param name name
@return created cache

get(params, default=None)[source]

Retrieves an element from the cache.

@param params dictionary of parameters
@param default value returned if not found
@return value or None if it does not exist
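The interplay of as_key, cache, get and count can be illustrated with a small stand-in. This is a sketch only, not the mlinsights implementation: the class name TinyCache, the key format, and the rule that only successful retrievals increment the counter are assumptions made for illustration.

```python
class TinyCache:
    """Hypothetical stand-in mimicking the documented MLCache interface."""

    def __init__(self, name):
        self.name = name
        self._values = {}  # key -> cached object
        self._hits = {}    # key -> number of successful retrievals

    @staticmethod
    def as_key(params):
        # Assumption: items are sorted so that key order does not matter.
        return ",".join(f"{k}={params[k]!r}" for k in sorted(params))

    def cache(self, params, value):
        key = self.as_key(params)
        self._values[key] = value
        self._hits.setdefault(key, 0)

    def get(self, params, default=None):
        key = self.as_key(params)
        if key not in self._values:
            return default
        self._hits[key] += 1
        return self._values[key]

    def count(self, params):
        return self._hits.get(self.as_key(params), 0)


cache = TinyCache("demo")
cache.cache({"alpha": 0.1, "fit_intercept": True}, "fitted-model")
# The same parameters in a different order map to the same key.
hit = cache.get({"fit_intercept": True, "alpha": 0.1})
miss = cache.get({"alpha": 0.5}, default="not cached")
```

In a grid search, the cached value would typically be a fitted model, so that a parameter combination seen before does not trigger a new training.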

static get_cache(name)[source]

Gets a cache with a given name.

@param name name
@return created cache

static has_cache(name)[source]

Tells if cache name is present.

@param name name
@return boolean

items()[source]

Enumerates all cached items.

keys()[source]

Enumerates all cached keys.

static remove_cache(name)[source]

Removes a cache with a given name.

@param name name
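The four static methods create_cache, get_cache, has_cache and remove_cache suggest a registry of caches indexed by name. A minimal sketch of such a registry follows. The module-level dictionary _REGISTRY, the use of plain dictionaries as caches, and the error raised on a duplicate name are assumptions, not the library's actual code.

```python
_REGISTRY = {}  # hypothetical module-level store: cache name -> cache object


def create_cache(name):
    # Assumption: creating the same name twice is treated as an error.
    if name in _REGISTRY:
        raise RuntimeError(f"cache {name!r} already exists")
    _REGISTRY[name] = {}
    return _REGISTRY[name]


def has_cache(name):
    return name in _REGISTRY


def get_cache(name):
    return _REGISTRY[name]


def remove_cache(name):
    del _REGISTRY[name]


store = create_cache("grid-search")
store["alpha=0.1"] = "fitted-model"
same_store = get_cache("grid-search")    # the same object, looked up by name
present_before = has_cache("grid-search")
remove_cache("grid-search")
present_after = has_cache("grid-search")
```

Looking caches up by name lets separate objects share one cache as long as they agree on the name.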

PipelineCache

class mlinsights.mlbatch.pipeline_cache.PipelineCache(steps, cache_name=None, verbose=False)[source]

Same as sklearn.pipeline.Pipeline, but it can skip training a step if it detects that the step was already trained, even in a different pipeline.

Parameters:
  • steps – list of (name, transform) tuples (implementing fit/transform) that are chained, in the order in which they are chained, with the last object an estimator.

  • cache_name – name of the cache, if None, a new name is created

  • verbose – boolean, optional. If True, the time elapsed while fitting each step will be printed as it is completed.

The attribute named_steps is a bunch object, a dictionary with attribute access. It is a read-only attribute that gives access to any step by its user-given name. Keys are step names and values are the steps.
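The idea of skipping a training that was already done can be sketched without scikit-learn. The toy below is an illustration under stated assumptions, not the way PipelineCache is implemented: the names CountingScaler, fit_step and _fitted are hypothetical, and a step is assumed to be identified by its class, its parameters and the data it receives.

```python
class CountingScaler:
    """Toy transformer that counts how many times it is trained."""

    n_fits = 0

    def __init__(self, factor):
        self.factor = factor

    def fit(self, X):
        CountingScaler.n_fits += 1
        self.scale_ = self.factor / max(abs(x) for x in X)
        return self

    def transform(self, X):
        return [x * self.scale_ for x in X]


_fitted = {}  # hypothetical shared cache: key -> fitted step


def fit_step(step, X):
    # Assumption: class name, parameters and training data identify a step.
    key = (type(step).__name__, tuple(sorted(vars(step).items())), tuple(X))
    if key not in _fitted:
        _fitted[key] = step.fit(X)
    return _fitted[key]


data = [1.0, -4.0, 2.0]
first = fit_step(CountingScaler(factor=2.0), data)   # trains
second = fit_step(CountingScaler(factor=2.0), data)  # reused, no training
third = fit_step(CountingScaler(factor=3.0), data)   # new parameters: trains
```

Because the cache is shared, a second pipeline containing an identical step on the same data would receive the already fitted object instead of training again.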

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → PipelineCache

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the scikit-learn User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters

sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for sample_weight parameter in score.

Returns

self : object

The updated object.