mlinsights.mlbatch

This was written for an older version of scikit-learn and has not been revisited since. It may not bring much value.

MLCache

class mlinsights.mlbatch.cache_model.MLCache(name)

Implements a cache to reduce the number of trainings a grid search has to do.

static as_key(params)

Converts a dictionary of parameters into a key.

Parameters:
  • params – dictionary

Returns: key as a string
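As a rough illustration of how a dictionary of parameters can be turned into a string key, the sketch below sorts the items so that insertion order does not matter. The function name params_to_key is hypothetical, and this is not the implementation of MLCache.as_key, whose exact key format is not documented here.

```python
def params_to_key(params):
    """Builds a deterministic string key from a dictionary of parameters.

    Hypothetical helper for illustration, not MLCache.as_key itself.
    """
    # Sorting by parameter name makes the key independent of insertion order.
    items = sorted((name, repr(value)) for name, value in params.items())
    return str(items)


# Two dictionaries with the same content but a different insertion order.
key_a = params_to_key({"alpha": 0.1, "fit_intercept": True})
key_b = params_to_key({"fit_intercept": True, "alpha": 0.1})
```

Both calls produce the same key, which is the property a parameter cache needs in order to recognize a configuration it has already seen.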

cache(params, value)

Caches one object.

Parameters:
  • params – dictionary of parameters

  • value – value to cache

count(params)

Retrieves the number of times an element was retrieved from the cache.

Parameters:
  • params – dictionary of parameters

Returns: int

static create_cache(name)

Creates a new cache.

Parameters:
  • name – name of the cache

Returns: created cache

get(params, default=None)

Retrieves an element from the cache.

Parameters:
  • params – dictionary of parameters

  • default – value returned if the element is not found

Returns: the cached value, or default (None unless specified) if it does not exist

static get_cache(name)

Gets a cache with a given name.

Parameters:
  • name – name of the cache

Returns: the cache with that name

static has_cache(name)

Tells whether a cache with the given name exists.

Parameters:
  • name – name of the cache

Returns: boolean

items()

Enumerates all cached items.

keys()

Enumerates all cached keys.

static remove_cache(name)

Removes a cache with a given name.

Parameters:
  • name – name of the cache to remove
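The interface described above (a registry of named caches, each mapping a dictionary of parameters to a value and counting retrievals) can be sketched with plain dictionaries. NamedCache is a hypothetical stand-in written for this example, not the mlinsights implementation; details such as how an already cached key is handled are assumptions.

```python
class NamedCache:
    """Dictionary-backed stand-in mirroring the interface described above.

    Hypothetical class for illustration, not the mlinsights implementation.
    """

    _registry = {}  # name -> NamedCache, shared by the static methods

    def __init__(self, name):
        self.name = name
        self._values = {}
        self._hits = {}

    @staticmethod
    def as_key(params):
        # Sorted items give an order-independent string key (assumed format).
        return str(sorted((k, repr(v)) for k, v in params.items()))

    def cache(self, params, value):
        key = self.as_key(params)
        self._values[key] = value
        self._hits[key] = 0

    def get(self, params, default=None):
        key = self.as_key(params)
        if key not in self._values:
            return default
        self._hits[key] += 1  # only successful retrievals are counted
        return self._values[key]

    def count(self, params):
        return self._hits.get(self.as_key(params), 0)

    @staticmethod
    def create_cache(name):
        cache = NamedCache(name)
        NamedCache._registry[name] = cache
        return cache

    @staticmethod
    def has_cache(name):
        return name in NamedCache._registry

    @staticmethod
    def get_cache(name):
        return NamedCache._registry[name]

    @staticmethod
    def remove_cache(name):
        del NamedCache._registry[name]


# Grid-search-like usage: store a result once, then reuse it.
cache = NamedCache.create_cache("grid")
params = {"C": 1.0, "penalty": "l2"}
if cache.get(params) is None:
    cache.cache(params, "fitted-model-placeholder")
first = cache.get(params)
second = cache.get(params)
hits = cache.count(params)
NamedCache.remove_cache("grid")
```

The string placeholder stands for a fitted model. In a grid search, the expensive training call would sit where the placeholder is stored, so it runs only for parameter combinations that have not been seen yet.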

PipelineCache

class mlinsights.mlbatch.pipeline_cache.PipelineCache(steps, cache_name=None, verbose=False)

Same as sklearn.pipeline.Pipeline, but it can skip training a step if it detects that the step was already trained, even in a different pipeline.

Parameters:
  • steps – list of (name, transform) tuples (implementing fit/transform), in the order in which they are chained; the last object is an estimator.

  • cache_name – name of the cache; if None, a new name is created.

  • verbose – boolean, optional. If True, the time elapsed while fitting each step is printed as it is completed.

The attribute named_steps is a bunch object, a dictionary with attribute access. It is a read-only attribute used to access any step parameter by its user-given name. Keys are step names and values are step parameters.
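The idea of skipping a training that was already done, even from a different pipeline, can be sketched as follows. The key PipelineCache actually uses is not described here; this sketch assumes a key built from the step type, its parameters, and a hash of the training data. Scaler, fit_step_cached and _fitted_steps are hypothetical names introduced for the example.

```python
import hashlib
import pickle

_fitted_steps = {}  # shared across pipelines: key -> fitted step


class Scaler:
    """Toy transformer: divides each value by the maximum seen during fit."""

    def __init__(self, floor=1.0):
        self.floor = floor
        self.fit_calls = 0

    def fit(self, X):
        self.fit_calls += 1
        self.max_ = max(max(X), self.floor)
        return self

    def transform(self, X):
        return [x / self.max_ for x in X]


def fit_step_cached(step, params, X):
    """Returns a fitted step, reusing a cached one when the step type,
    its parameters and the training data are all identical."""
    data_hash = hashlib.sha256(pickle.dumps(X)).hexdigest()
    key = (type(step).__name__, str(sorted(params.items())), data_hash)
    if key not in _fitted_steps:
        # First time this combination is seen: train and remember the result.
        _fitted_steps[key] = step.fit(X)
    return _fitted_steps[key]


X = [1.0, 2.0, 4.0]
params = {"floor": 1.0}
step_a = fit_step_cached(Scaler(**params), params, X)  # trained here
step_b = fit_step_cached(Scaler(**params), params, X)  # reused, no new fit
```

The second call returns the object fitted by the first one, so fit runs only once. Hashing the input data is one possible way to make sure a cached step is reused only when it would have been trained on the same data.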

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → PipelineCache

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight – str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED. Metadata routing for the sample_weight parameter in score.

Returns: self – object. The updated object.