yobx.sklearn.neighbors.kernel_density
- yobx.sklearn.neighbors.kernel_density.sklearn_kernel_density(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator: KernelDensity, X: str, name: str = 'kde') → str
Converts a sklearn.neighbors.KernelDensity into ONNX.
The converter implements score_samples(), which returns the log-density at each query point:
log_density(x) = log( (1/N) · Σᵢ k( ‖x − xᵢ‖ / h ) ) − log(norm_h · h^D)
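where k is the unnormalized kernel, h the bandwidth, N the number of training samples, D the number of features, and norm_h the kernel-specific normalization factor, which does not depend on the query x. (The graph-structure diagram writes the training-sample count as M and the feature count as F, and uses N for the number of query points.)
Rearranging, the output equals:
output(x) = log(Σᵢ k( ‖x − xᵢ‖ / h )) − log(N · norm_h · h^D)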
All normalization constants are precomputed at conversion time and stored as ONNX scalar initializers, so the resulting graph contains only arithmetic and reduction operations.
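The rearranged formula can be sketched in plain Python for the Gaussian kernel. This is a minimal illustration rather than the converter's own code: the function name gaussian_log_density and the sample data are hypothetical, and norm_h = (2π)^(D/2), the standard Gaussian normalization, is assumed here.

```python
import math


def gaussian_log_density(queries, train, h):
    """Log-density of a Gaussian KDE: log(sum_i k(t_i)) - log(N * norm_h * h^D)."""
    n_train = len(train)
    dim = len(train[0])
    # Constant term: depends only on N, h and D, so it can be computed once.
    # Assumption for this sketch: norm_h = (2*pi)^(D/2) for the Gaussian kernel.
    log_norm_const = (
        math.log(n_train) + 0.5 * dim * math.log(2.0 * math.pi) + dim * math.log(h)
    )
    out = []
    for x in queries:
        # log k(t_i) = -||x - x_i||^2 / (2 h^2) for every training point x_i.
        exponents = [
            -0.5 * sum((a - b) ** 2 for a, b in zip(x, xi)) / (h * h)
            for xi in train
        ]
        # Numerically stable log-sum-exp over the training points.
        m = max(exponents)
        log_sum = m + math.log(sum(math.exp(e - m) for e in exponents))
        out.append(log_sum - log_norm_const)
    return out


# Hypothetical data: three training points and two query points in 2-D.
train_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
query_points = [[0.5, 0.5], [3.0, 3.0]]
log_dens = gaussian_log_density(query_points, train_points, h=0.8)
```

Because log_norm_const does not involve the query points, only the distance computation and the log-sum-exp reduction have to run per query.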
Supported kernels
    'gaussian'        k(t) = exp(−t²/2)
    'exponential'     k(t) = exp(−t)
    'tophat'          k(t) = 1 for t ≤ 1, else 0
    'epanechnikov'    k(t) = 1 − t² for t ≤ 1, else 0
    'linear'          k(t) = 1 − t for t ≤ 1, else 0
    'cosine'          k(t) = (π/4)·cos(πt/2) for t ≤ 1, else 0
where t = ‖x − xᵢ‖ / h.
Graph structure (gaussian kernel, standard-ONNX path)
    X (N, F)      X_train (M, F)
       │                 │
       └──sq_euclidean───┘  →  sq_dists (N, M)
                │
          Mul(−0.5/h²)
                │
     ReduceLogSumExp(axis=1)  →  log_sum (N,)
                │
       Sub(log_norm_const)  →  log_density (N,)
For compact kernels (tophat, epanechnikov, linear, cosine) the same squared-distance matrix is used, but kernel values are summed directly and then the log is taken. When no training sample falls within the bandwidth, the score is −∞ (matching sklearn behaviour for degenerate cases).
Computation paths
With com.microsoft opset (CDist path): Squared distances are delegated to com.microsoft.CDist(metric="sqeuclidean"), which is hardware-accelerated by ONNX Runtime.
Without com.microsoft opset (standard ONNX path): Squared distances are computed using the expansion identity ‖x − c‖² = ‖x‖² − 2·x·cᵀ + ‖c‖², which requires only MatMul and element-wise ops available since opset 13.
- Parameters:
g – the graph builder to add nodes to
sts – shapes defined by scikit-learn
outputs – desired output names; outputs[0] receives the log-density vector of shape (N,)
estimator – a fitted KernelDensity
X – input tensor name
name – prefix for added node names
- Returns:
output tensor name for the log-density (shape (N,))
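As a rough illustration of the standard ONNX path and the compact-kernel handling described above, the following pure-Python sketch computes squared distances with the expansion identity and then sums Epanechnikov kernel values before taking the log. The function names are hypothetical, the clamp at zero is a choice made for this sketch, and the normalization constant is left out, so the result is only the log(Σᵢ k(t)) term of the output.

```python
import math


def sq_dists_expanded(X, C):
    """Pairwise squared distances via ||x||^2 - 2 x.c + ||c||^2."""
    x_norms = [sum(v * v for v in x) for x in X]
    c_norms = [sum(v * v for v in c) for c in C]
    return [
        # Clamp at 0 (sketch-only choice): rounding in the expansion can
        # produce tiny negative values for nearly identical points.
        [
            max(xn - 2.0 * sum(a * b for a, b in zip(x, c)) + cn, 0.0)
            for c, cn in zip(C, c_norms)
        ]
        for x, xn in zip(X, x_norms)
    ]


def epanechnikov_log_sum(X, C, h):
    """log(sum_i k(t_i)) with k(t) = 1 - t^2 for t <= 1, else 0 (unnormalized)."""
    out = []
    for row in sq_dists_expanded(X, C):
        # t^2 = d^2 / h^2, so max(1 - t^2, 0) is the kernel value.
        total = sum(max(1.0 - d2 / (h * h), 0.0) for d2 in row)
        # No training sample within the bandwidth: the log of zero is -inf.
        out.append(math.log(total) if total > 0.0 else -math.inf)
    return out


# Hypothetical 1-D data: the second query is far from every training point.
scores = epanechnikov_log_sum([[0.0], [10.0]], [[0.0], [0.5]], h=1.0)
```

Here the first query lies within the bandwidth of both training points, while the second lies within none, so its score is −∞.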