.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples_sklearn/plot_sklearn_custom_converter_options.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_sklearn_plot_sklearn_custom_converter_options.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_sklearn_plot_sklearn_custom_converter_options.py:


.. _l-plot-sklearn-custom-converter-options:

Custom converter with convert options
======================================

This example shows how to write a **custom sklearn converter** whose behaviour
is controlled by a user-supplied :class:`~yobx.typing.ConvertOptionsProtocol`
object.

The idea mirrors the built-in :class:`~yobx.sklearn.ConvertOptions`
(``decision_leaf``, ``decision_path``), but applied to a fully custom
estimator.  The workflow has three steps:

1. **Define the estimator** — a plain scikit-learn transformer.
2. **Define a custom options class** — a lightweight object that implements
   the :class:`~yobx.typing.ConvertOptionsProtocol` protocol (``available_options``
   and ``has``).
3. **Write the converter** — a function that checks
   ``g.convert_options.has("option_name", estimator)`` to decide whether to
   emit the optional extra output.

The custom estimator used here is ``ClipTransformer``: it clips every feature
to a ``[clip_min, clip_max]`` range (equivalent to ``np.clip``).  The optional
extra output, activated by ``ClipOptions(clip_mask=True)``, is a **boolean mask
tensor** indicating which values were actually clipped.

.. GENERATED FROM PYTHON SOURCE LINES 28-38

.. code-block:: Python


    import numpy as np
    import onnxruntime
    from sklearn.base import BaseEstimator, TransformerMixin

    from yobx.doc import plot_dot
    from yobx.helpers.onnx_helper import tensor_dtype_to_np_dtype
    from yobx.sklearn import to_onnx
    from yobx.typing import ConvertOptionsProtocol, GraphBuilderExtendedProtocol


.. GENERATED FROM PYTHON SOURCE LINES 39-45

1. Custom estimator: ``ClipTransformer``
-----------------------------------------

A minimal transformer that clips all features into ``[clip_min, clip_max]``.
The helper method ``get_clip_mask`` returns a boolean array marking values
that were changed — it is used later to validate the ONNX output.

.. GENERATED FROM PYTHON SOURCE LINES 45-65

.. code-block:: Python


    class ClipTransformer(TransformerMixin, BaseEstimator):
        """Clips every feature value to ``[clip_min, clip_max]``."""

        def __init__(self, clip_min: float = 0.0, clip_max: float = 1.0):
            self.clip_min = clip_min
            self.clip_max = clip_max

        def fit(self, X, y=None):
            return self

        def transform(self, X):
            return np.clip(X, self.clip_min, self.clip_max)

        def get_clip_mask(self, X):
            """Boolean mask: ``True`` where a value was clipped."""
            return (X < self.clip_min) | (X > self.clip_max)


.. GENERATED FROM PYTHON SOURCE LINES 66-78

2. Custom convert options: ``ClipOptions``
------------------------------------------

The options class must implement two methods:

* ``available_options()`` — returns the list of option names the class
  recognises.  The framework iterates this list to decide how many extra
  output slots to pre-allocate for each estimator.
* ``has(option_name, piece, name=None)`` — returns ``True`` when the option
  should be active for the given fitted estimator *piece*.  The optional
  *name* is the pipeline step name (useful for enabling an option only for a
  specific named step in a :class:`~sklearn.pipeline.Pipeline`).

.. GENERATED FROM PYTHON SOURCE LINES 78-103

.. code-block:: Python


    class ClipOptions(ConvertOptionsProtocol):
        """Convert options for :class:`ClipTransformer`.

        :param clip_mask: when ``True``, adds a second boolean output tensor whose
            value is ``True`` at each position where the input was clipped.
        """

        def __init__(self, clip_mask: bool = False):
            self.clip_mask = clip_mask

        def available_options(self):
            """Returns the list of option names this class supports."""
            return ["clip_mask"]

        def has(self, option_name: str, piece: object, name=None) -> bool:
            """Returns ``True`` when *option_name* is active for *piece*."""
            if option_name == "clip_mask":
                # Only activate for estimators that have a clip_min attribute
                # (i.e. ClipTransformer instances).
                return bool(self.clip_mask) and hasattr(piece, "clip_min")
            return False


.. GENERATED FROM PYTHON SOURCE LINES 104-111

3. Converter function
----------------------

The converter always emits the primary ``Clip`` output (``outputs[0]``).
When ``g.convert_options.has("clip_mask", estimator)`` returns ``True`` the
framework has pre-allocated ``outputs[1]`` and the converter emits
``Less + Greater + Or`` to fill it.

.. GENERATED FROM PYTHON SOURCE LINES 111-150

.. code-block:: Python


    def convert_clip_transformer(
        g: GraphBuilderExtendedProtocol,
        sts: dict,
        outputs: list,
        estimator: ClipTransformer,
        X: str,
        name: str = "clip",
    ) -> str:
        """Convert :class:`ClipTransformer` to ONNX.

        Primary output
            ``outputs[0]`` — clipped values, same dtype and shape as *X*.

        Optional extra output (when ``ClipOptions(clip_mask=True)`` is used)
            ``outputs[1]`` — boolean mask, ``True`` where a value was clipped.
        """
        itype = g.get_type(X)
        dtype = tensor_dtype_to_np_dtype(itype)

        low = np.array(estimator.clip_min, dtype=dtype)
        high = np.array(estimator.clip_max, dtype=dtype)

        # ── Primary output: Clip ──────────────────────────────────────────────────
        _clipped = g.op.Clip(X, low, high, name=name, outputs=outputs[:1])

        # ── Optional extra output: clip mask ─────────────────────────────────────
        if g.convert_options.has("clip_mask", estimator, name):
            assert (
                len(outputs) > 1
            ), f"Expected at least 2 outputs when clip_mask is active, got {len(outputs)}"
            below = g.op.Less(X, low, name=f"{name}_below")
            above = g.op.Greater(X, high, name=f"{name}_above")
            g.op.Or(below, above, name=f"{name}_mask", outputs=outputs[1:2])

        return outputs[0] if len(outputs) == 1 else tuple(outputs)


.. GENERATED FROM PYTHON SOURCE LINES 151-153

4. Training data
-----------------

.. GENERATED FROM PYTHON SOURCE LINES 153-160

.. code-block:: Python


    rng = np.random.default_rng(0)
    X_train = rng.standard_normal((80, 4)).astype(np.float32)
    X_test = rng.standard_normal((20, 4)).astype(np.float32)

    transformer = ClipTransformer(clip_min=-0.5, clip_max=0.5).fit(X_train)


.. GENERATED FROM PYTHON SOURCE LINES 161-165

5. Baseline conversion — no extra output
-----------------------------------------

Without any ``convert_options`` the model produces a single output.

.. GENERATED FROM PYTHON SOURCE LINES 165-182

.. code-block:: Python


    onx_plain = to_onnx(
        transformer, (X_train,), extra_converters={ClipTransformer: convert_clip_transformer}
    )

    print("=== Plain conversion (no clip_mask) ===")
    print(f"Outputs: {[o.name for o in onx_plain.graph.output]}")

    sess_plain = onnxruntime.InferenceSession(
        onx_plain.SerializeToString(), providers=["CPUExecutionProvider"]
    )
    (clipped_onnx,) = sess_plain.run(None, {"X": X_test})
    clipped_sklearn = transformer.transform(X_test)

    assert np.allclose(clipped_sklearn, clipped_onnx, atol=1e-6), "Clipped values differ!"
    print("Clipped values match sklearn ✓")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    === Plain conversion (no clip_mask) ===
    Outputs: ['Y']
    Clipped values match sklearn ✓


.. GENERATED FROM PYTHON SOURCE LINES 183-189

6. Conversion with ``clip_mask=True``
--------------------------------------

Passing ``ClipOptions(clip_mask=True)`` instructs the framework to allocate
a second output slot.  The converter detects this via
``g.convert_options.has("clip_mask", estimator)`` and emits the boolean mask.

.. GENERATED FROM PYTHON SOURCE LINES 189-218

.. code-block:: Python


    clip_opts = ClipOptions(clip_mask=True)
    onx_with_mask = to_onnx(
        transformer,
        (X_train,),
        extra_converters={ClipTransformer: convert_clip_transformer},
        convert_options=clip_opts,
    )

    print("\n=== Conversion with clip_mask=True ===")
    print(f"Outputs: {[o.name for o in onx_with_mask.graph.output]}")

    sess_mask = onnxruntime.InferenceSession(
        onx_with_mask.SerializeToString(), providers=["CPUExecutionProvider"]
    )
    clipped_onnx2, mask_onnx = sess_mask.run(None, {"X": X_test})

    # Verify clipped values
    assert np.allclose(clipped_sklearn, clipped_onnx2, atol=1e-6), "Clipped values differ!"
    print("Clipped values match sklearn ✓")

    # Verify boolean mask
    expected_mask = transformer.get_clip_mask(X_test)
    assert np.array_equal(expected_mask, mask_onnx), "Clip mask differs!"
    print("Clip mask matches sklearn ✓")

    print(f"\nmask_onnx shape  : {mask_onnx.shape}")
    print(f"fraction clipped : {mask_onnx.mean():.2%}")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    === Conversion with clip_mask=True ===
    Outputs: ['Y', 'clip_mask']
    Clipped values match sklearn ✓
    Clip mask matches sklearn ✓

    mask_onnx shape  : (20, 4)
    fraction clipped : 57.50%


.. GENERATED FROM PYTHON SOURCE LINES 219-224

7. Visualize the ONNX graph
----------------------------

The graph with ``clip_mask=True`` contains the ``Clip`` node for the primary
output plus ``Less``, ``Greater``, and ``Or`` nodes for the mask.

.. GENERATED FROM PYTHON SOURCE LINES 224-226

.. code-block:: Python


    plot_dot(onx_with_mask)


.. image-sg:: /auto_examples_sklearn/images/sphx_glr_plot_sklearn_custom_converter_options_001.png
   :alt: plot sklearn custom converter options
   :srcset: /auto_examples_sklearn/images/sphx_glr_plot_sklearn_custom_converter_options_001.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.330 seconds)


.. _sphx_glr_download_auto_examples_sklearn_plot_sklearn_custom_converter_options.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_sklearn_custom_converter_options.ipynb <plot_sklearn_custom_converter_options.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_sklearn_custom_converter_options.py <plot_sklearn_custom_converter_options.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_sklearn_custom_converter_options.zip <plot_sklearn_custom_converter_options.zip>`


.. include:: plot_sklearn_custom_converter_options.recommendations


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_