yobx.sklearn.mixture.bayesian_gaussian_mixture#

yobx.sklearn.mixture.bayesian_gaussian_mixture.sklearn_bayesian_gaussian_mixture(g: GraphBuilderExtendedProtocol, sts: Dict, outputs: List[str], estimator: BayesianGaussianMixture, X: str, name: str = 'bayesian_gaussian_mixture') → Tuple[str, str]

Converts a sklearn.mixture.BayesianGaussianMixture into ONNX.

The converter supports all four covariance types supported by BayesianGaussianMixture: 'full', 'tied', 'diag', and 'spherical'.

At inference time a BayesianGaussianMixture uses a variational Bayes approximation. The weighted log-probability for sample n under component k is:

log_p[n, k] = log_weight_k + log_det_k
              - 0.5 * n_features * log(2π)
              - 0.5 * quad[n, k]
              - 0.5 * n_features * log(degrees_of_freedom_[k])
              + 0.5 * (log_lambda_k - n_features / mean_precision_[k])

where

log_weight_k = _estimate_log_weights()[k]   (digamma-based)
log_lambda_k = n_features * log(2)
               + Σ_f digamma(0.5 * (dof_k - f))  for f in [0, n_features)

The last two lines are constant per component and are folded into the c_k offset together with log_weight_k, so at run-time only the same MatMul / ReduceSum operations as for GaussianMixture are required.
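The digamma-based part of that offset can be reproduced directly from a fitted model's attributes. A minimal NumPy/SciPy sketch, where random values stand in for the fitted `degrees_of_freedom_`, `mean_precision_`, and the digamma-based log-weights (the log-det and 2π terms, also constant, are omitted here):

```python
import numpy as np
from scipy.special import digamma

K, F = 3, 2  # components, features
rng = np.random.default_rng(0)
dof = rng.uniform(F + 1.0, F + 10.0, size=K)    # stands in for degrees_of_freedom_
mean_precision = rng.uniform(0.5, 5.0, size=K)  # stands in for mean_precision_
log_weight = rng.normal(size=K)                 # stands in for _estimate_log_weights()

# log_lambda_k = F * log(2) + sum_f digamma(0.5 * (dof_k - f))
log_lambda = F * np.log(2.0) + digamma(
    0.5 * (dof[:, None] - np.arange(F))).sum(axis=1)

# per-component offset folded into the graph as a constant
c = log_weight - 0.5 * F * np.log(dof) + 0.5 * (log_lambda - F / mean_precision)
assert c.shape == (K,)
```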

‘full’ — per-component Cholesky of the precision matrix L_k (shape (K, F, F)):

L_2d  = L.transpose(1,0,2).reshape(F, K*F)          # (F, K*F) constant
b     = einsum('ki,kij->kj', means_, L)              # (K, F)   constant
b_2d  = Reshape(b, [1, K*F])                         # flattened for broadcast
XL    = MatMul(X, L_2d)                              # (N, K*F)
Y     = Reshape(XL - b_2d, [-1, K, F])               # (N, K, F)
quad  = ReduceSum(Y * Y, axis=2)                     # (N, K)
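Since the 'full' recipe only rearranges MatMuls, it can be checked in NumPy against the per-component definition quad[n, k] = ||(x_n − means_k) @ L_k||². A sketch, with a random matrix standing in for the fitted precisions_cholesky_:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, F = 4, 3, 2
X = rng.normal(size=(N, F))
means = rng.normal(size=(K, F))
L = rng.normal(size=(K, F, F))  # stands in for precisions_cholesky_ ('full')

# reference: quad[n, k] = ||(x_n - means_k) @ L_k||^2
ref = np.stack(
    [((X @ L[k] - means[k] @ L[k]) ** 2).sum(axis=1) for k in range(K)], axis=1)

# the flattened MatMul / Reshape / ReduceSum recipe described above
L_2d = L.transpose(1, 0, 2).reshape(F, K * F)            # (F, K*F) constant
b = np.einsum('ki,kij->kj', means, L).reshape(1, K * F)  # flattened offset
Y = (X @ L_2d - b).reshape(-1, K, F)                     # (N, K, F)
quad = (Y * Y).sum(axis=2)                               # (N, K)
assert np.allclose(quad, ref)
```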

‘tied’ — single shared Cholesky L (shape (F, F)):

means_L = means_ @ L                                 # (K, F)  constant
XL      = MatMul(X, L)                               # (N, F)
Y       = Reshape(XL, [-1, 1, F]) - means_L          # (N, K, F)
quad    = ReduceSum(Y * Y, axis=2)                   # (N, K)
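The 'tied' variant can be checked the same way against quad[n, k] = ||(x_n − means_k) @ L||², with a random matrix standing in for the shared Cholesky factor:

```python
import numpy as np

rng = np.random.default_rng(1)
N, K, F = 4, 3, 2
X = rng.normal(size=(N, F))
means = rng.normal(size=(K, F))
L = rng.normal(size=(F, F))  # stands in for the shared precisions_cholesky_

# reference: quad[n, k] = ||(x_n - means_k) @ L||^2
ref = np.stack([(((X - means[k]) @ L) ** 2).sum(axis=1) for k in range(K)], axis=1)

means_L = means @ L                # (K, F) constant
Y = (X @ L)[:, None, :] - means_L  # (N, K, F) via broadcasting
quad = (Y * Y).sum(axis=2)         # (N, K)
assert np.allclose(quad, ref)
```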

‘diag’ — per-component diagonal precision A = prec_chol**2 (shape (K, F)):

B     = means_ * A                                   # (K, F)  constant
log_p = -0.5 * MatMul(X², Aᵀ) + MatMul(X, Bᵀ) + c  # (N, K)
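For 'diag', the two MatMuls reproduce the quadratic term −0.5 Σ_f A[k, f] (x[n, f] − means[k, f])² once the means-dependent part of the offset is included. A NumPy sketch (here `c` carries only that part of the folded constant, not the digamma terms):

```python
import numpy as np

rng = np.random.default_rng(2)
N, K, F = 4, 3, 2
X = rng.normal(size=(N, F))
means = rng.normal(size=(K, F))
A = rng.uniform(0.5, 2.0, size=(K, F)) ** 2  # stands in for prec_chol**2

# reference: -0.5 * sum_f A[k, f] * (x[n, f] - means[k, f])^2
ref = np.stack(
    [-0.5 * ((X - means[k]) ** 2 * A[k]).sum(axis=1) for k in range(K)], axis=1)

B = means * A                                # (K, F) constant
c = -0.5 * (means ** 2 * A).sum(axis=1)      # means-dependent part of the offset
log_p = -0.5 * (X ** 2) @ A.T + X @ B.T + c  # (N, K)
assert np.allclose(log_p, ref)
```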

‘spherical’ — scalar precision prec = prec_chol**2 per component (shape (K,)):

x_sq  = ReduceSum(X * X, axis=1, keepdims=1)        # (N, 1)
cross = MatMul(X, means_ᵀ)                          # (N, K)
log_p = prec * cross - 0.5 * prec * x_sq + c        # (N, K)
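The 'spherical' recipe expands −0.5 prec_k ||x_n − means_k||² into a cross term, a squared-norm term, and a constant. A NumPy sketch (again `c` carries only the means-dependent part of the folded offset):

```python
import numpy as np

rng = np.random.default_rng(3)
N, K, F = 4, 3, 2
X = rng.normal(size=(N, F))
means = rng.normal(size=(K, F))
prec = rng.uniform(0.5, 2.0, size=K) ** 2  # stands in for prec_chol**2

# reference: -0.5 * prec_k * ||x_n - means_k||^2
ref = np.stack(
    [-0.5 * prec[k] * ((X - means[k]) ** 2).sum(axis=1) for k in range(K)], axis=1)

x_sq = (X * X).sum(axis=1, keepdims=True)    # (N, 1)
cross = X @ means.T                          # (N, K)
c = -0.5 * prec * (means ** 2).sum(axis=1)   # means-dependent part of the offset
log_p = prec * cross - 0.5 * prec * x_sq + c  # (N, K)
assert np.allclose(log_p, ref)
```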
Parameters:
  • g – the graph builder to add nodes to

  • sts – shapes defined by scikit-learn

  • outputs – desired output names; outputs[0] receives the predicted component labels and outputs[1] receives the posterior probabilities

  • estimator – a fitted BayesianGaussianMixture

  • X – input tensor name

  • name – prefix for added node names

Returns:

tuple (label_result_name, proba_result_name)

Raises:

NotImplementedError – for unsupported covariance_type values