yobx.sklearn.neighbors.local_outlier_factor

yobx.sklearn.neighbors.local_outlier_factor.sklearn_local_outlier_factor(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator: LocalOutlierFactor, X: str, name: str = 'lof') → str | Tuple[str, str]

Converts a fitted sklearn.neighbors.LocalOutlierFactor into an ONNX graph.

Only novelty=True (novelty-detection mode) is supported, as it is the mode that enables the predict() and score_samples() methods on data not seen at fit time.
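For context, a minimal scikit-learn usage sketch of the mode this converter targets (data and parameter values are illustrative):

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(42)
X_train = rng.normal(size=(100, 2))

# novelty=True is required by the converter: it enables predict()
# and score_samples() on points not seen at fit time.
lof = LocalOutlierFactor(n_neighbors=20, novelty=True).fit(X_train)

X_new = np.array([[0.0, 0.0], [8.0, 8.0]])  # one central point, one far outlier
labels = lof.predict(X_new)        # 1 = inlier, -1 = outlier
scores = lof.score_samples(X_new)  # more negative = more anomalous
```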

Algorithm overview

For each query point x, the converter implements the exact LOF formula:

  1. Compute pairwise distances from each of the N query points to all M training points: dists (N, M).

  2. Find the k nearest training neighbours: topk_dists, topk_idx (both of shape (N, k)).

  3. Compute the reachability distance from x to each neighbour x_i:

    reach_dist(x, x_i) = max(dist(x, x_i), k_distance(x_i))
    

    where k_distance(x_i) is the distance from training point x_i to its own k-th nearest neighbour, precomputed as estimator._distances_fit_X_[:, n_neighbors_ - 1].

  4. Local Reachability Density of x:

    LRD(x) = 1 / (mean(reach_dist(x, x_i)) + 1e-10)
    
  5. LOF score:

    LOF(x) = mean(LRD(x_i)) / LRD(x)
    
  6. Anomaly score (score_samples):

    score_samples(x) = -LOF(x)
    
  7. Decision function and label:

    decision_function(x) = score_samples(x) - offset_
    predict(x) = 1  if decision_function(x) >= 0 else -1
    

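The seven steps above can be replayed in NumPy and checked against scikit-learn's own score_samples(). This is a sketch under two assumptions: the default Euclidean metric, and that the private attributes _distances_fit_X_ and _lrd (the one step 3 references) hold the training k-distances and precomputed LRDs:

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 2))   # M = 50 training points
X_query = rng.normal(size=(5, 2))    # N = 5 query points

k = 10
est = LocalOutlierFactor(n_neighbors=k, novelty=True).fit(X_train)

# Constants baked into the graph at conversion time (private attributes):
k_distances_train = est._distances_fit_X_[:, est.n_neighbors_ - 1]  # (M,)
lrd_train = est._lrd                                                # (M,)

# 1. pairwise distances, query -> training: (N, M)
dists = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=-1)
# 2. k nearest training neighbours: (N, k)
topk_idx = np.argsort(dists, axis=1)[:, :k]
topk_dists = np.take_along_axis(dists, topk_idx, axis=1)
# 3. reachability distances: max(dist(x, x_i), k_distance(x_i))
reach = np.maximum(topk_dists, k_distances_train[topk_idx])
# 4. local reachability density of the queries
lrd_query = 1.0 / (reach.mean(axis=1) + 1e-10)
# 5./6. LOF and score_samples = -LOF
scores = -(lrd_train[topk_idx].mean(axis=1) / lrd_query)
# 7. decision function and label
decision = scores - est.offset_
labels = np.where(decision >= 0, 1, -1)
```

Up to floating-point error, scores matches est.score_samples(X_query) and labels matches est.predict(X_query).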
ONNX graph structure

X (N, F)
  │
  └─── pairwise distances ────────────────────────────────► dists (N, M)
                                                                  │
               TopK(k, axis=1, largest=0) ──────────────► topk_dists (N,k), topk_idx (N,k)
                                                                  │
    Gather(k_distances_train, topk_idx) ───────────────► k_dists_nbrs (N, k)
                                                                  │
    Max(topk_dists, k_dists_nbrs) ─────────────────────► reach_dists (N, k)
                                                                  │
    ReduceMean(axis=1) + 1e-10 ─────────────────────────► mean_reach (N,)
                                                                  │
    Div(1, mean_reach) ─────────────────────────────────► lrd_query (N,)
                                                                  │
    Gather(lrd_train, topk_idx) ────────────────────────► lrd_nbrs (N, k)
                                                                  │
    Div(lrd_nbrs, Unsqueeze(lrd_query,1)) ──────────────► lrd_ratios (N, k)
                                                                  │
    Neg(ReduceMean(lrd_ratios, axis=1)) ────────────────► score_samples (N,)
                                                                  │
    Sub(score_samples, offset_) ────────────────────────► decision (N,)
                                                                  │
    Where(decision >= 0, 1, -1) ────────────────────────► label (N,)
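The per-training-point constants appearing in the graph (k_distances_train, lrd_train, and offset_) are read from the fitted estimator at conversion time. A sketch of where they come from, noting that the first two are private scikit-learn attributes:

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

X_train = np.random.default_rng(1).normal(size=(60, 3))  # M = 60
est = LocalOutlierFactor(n_neighbors=15, novelty=True).fit(X_train)

# (M,) k-distance of each training point to its own k-th neighbour
k_distances_train = est._distances_fit_X_[:, est.n_neighbors_ - 1]
# (M,) precomputed local reachability density of each training point
lrd_train = est._lrd
# scalar threshold subtracted to form decision_function;
# -1.5 under the default contamination="auto"
offset = est.offset_
```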

Parameters:
  • g – the graph builder to add nodes to

  • sts – shapes and types defined by scikit-learn

  • outputs – desired output tensor names; two entries (label, scores) or one entry (label,)

  • estimator – a fitted LocalOutlierFactor with novelty=True

  • X – name of the input tensor

  • name – prefix used for names of nodes added by this converter

Returns:

the label tensor name when one output was requested, or the tuple (label, scores) when two were

Raises:
  • ValueError – if estimator.novelty is False

  • NotImplementedError – if the opset is below 18 (required for ReduceMean with axes provided as an input) or the distance metric is unsupported