Linear Regression and export to ONNX¶
This example trains a linear regression with scikit-learn and with torch, then exports the torch model to ONNX.
data¶
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, SGDRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
import torch
from onnxruntime import InferenceSession
from experimental_experiment.helpers import pretty_onnx
from onnx_array_api.plotting.graphviz_helper import plot_dot
X, y = make_regression(1000, n_features=5, noise=10.0, n_informative=2)
print(X.shape, y.shape)
X_train, X_test, y_train, y_test = train_test_split(X, y)
(1000, 5) (1000,)
scikit-learn: the simple regression¶
clr = LinearRegression()
clr.fit(X_train, y_train)
print(f"coefficients: {clr.coef_}, {clr.intercept_}")
coefficients: [ 0.3459669 0.45336692 0.65710949 26.14819898 25.6202578 ], -0.15630463825218383
Evaluation¶
LinearRegression: l2=102.12270199862125, r2=0.9443646784589387
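The code that produced the two metrics above is not shown. A minimal sketch of how such numbers could be computed is given here, assuming they are the mean squared error and the R² score on the held-out split. The fixed `random_state` values are an addition for reproducibility, so the figures will differ from the ones printed above.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Same kind of data as above; random_state is an assumption added for reproducibility.
X, y = make_regression(1000, n_features=5, noise=10.0, n_informative=2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clr = LinearRegression()
clr.fit(X_train, y_train)

# Score the fitted model on the held-out split.
pred = clr.predict(X_test)
l2 = mean_squared_error(y_test, pred)
r2 = r2_score(y_test, pred)
print(f"LinearRegression: l2={l2}, r2={r2}")
```

Since the data is generated with a noise standard deviation of 10, a mean squared error close to 100 is what a well-fitted linear model should reach.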
scikit-learn: SGD algorithm¶
SGD = Stochastic Gradient Descent
clr = SGDRegressor(max_iter=5, verbose=1)
clr.fit(X_train, y_train)
print(f"coefficients: {clr.coef_}, {clr.intercept_}")
-- Epoch 1
Norm: 31.18, NNZs: 5, Bias: -0.284870, T: 750, Avg. loss: 186.075143
Total training time: 0.00 seconds.
-- Epoch 2
Norm: 35.18, NNZs: 5, Bias: -0.270052, T: 1500, Avg. loss: 58.010925
Total training time: 0.00 seconds.
-- Epoch 3
Norm: 36.24, NNZs: 5, Bias: -0.194536, T: 2250, Avg. loss: 53.378042
Total training time: 0.00 seconds.
-- Epoch 4
Norm: 36.57, NNZs: 5, Bias: -0.231132, T: 3000, Avg. loss: 52.981006
Total training time: 0.00 seconds.
-- Epoch 5
Norm: 36.50, NNZs: 5, Bias: -0.166303, T: 3750, Avg. loss: 52.930445
Total training time: 0.00 seconds.
~/vv/this312/lib/python3.12/site-packages/sklearn/linear_model/_stochastic_gradient.py:1608: ConvergenceWarning: Maximum number of iteration reached before convergence. Consider increasing max_iter to improve the fit.
warnings.warn(
coefficients: [ 0.24442643 0.51140703 0.67514459 26.11876261 25.47591706], [-0.16630295]
Evaluation
SGDRegressor: sl2=102.6936035240574, sr2=0.9440536576054526
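To make the update rule behind stochastic gradient descent concrete, here is a minimal NumPy sketch for a linear model with a squared loss. It is not scikit-learn's implementation: with its default settings, SGDRegressor also applies an L2 penalty and a decaying learning rate, both left out here. The toy data, the constant learning rate `lr` and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: y = X @ w_true + b_true + small noise.
w_true = np.array([2.0, -3.0])
b_true = 0.5
X = rng.normal(size=(500, 2))
y = X @ w_true + b_true + 0.1 * rng.normal(size=500)

w = np.zeros(2)
b = 0.0
lr = 0.01  # constant learning rate (assumption)

for epoch in range(5):
    # Visit the samples one at a time, in a random order.
    for i in rng.permutation(len(X)):
        err = X[i] @ w + b - y[i]  # residual on a single sample
        w -= lr * err * X[i]       # gradient of 0.5 * err**2 with respect to w
        b -= lr * err              # gradient of 0.5 * err**2 with respect to b

print(w, b)
```

Each update uses a single sample, which is the same regime as the torch training loop of the next section with `batch_size=1`.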
Linear Regression with pytorch¶
class TorchLinearRegression(torch.nn.Module):
    def __init__(self, n_dims: int, n_targets: int):
        super().__init__()
        self.linear = torch.nn.Linear(n_dims, n_targets)

    def forward(self, x):
        return self.linear(x)


def train_loop(dataloader, model, loss_fn, optimizer):
    total_loss = 0.0
    # Set the model to training mode - important for batch normalization and dropout layers
    # Unnecessary in this situation but added for best practices
    model.train()
    for X, y in dataloader:
        # Compute prediction and loss
        pred = model(X)
        loss = loss_fn(pred.ravel(), y)
        # Backpropagation
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        # training loss, accumulated as a Python float so no autograd graph is kept
        total_loss += loss.item()
    return total_loss
model = TorchLinearRegression(X_train.shape[1], 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = torch.nn.MSELoss()
device = "cpu"
model = model.to(device)
dataset = torch.utils.data.TensorDataset(
    torch.Tensor(X_train).to(device), torch.Tensor(y_train).to(device)
)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=1)
for i in range(5):
    loss = train_loop(dataloader, model, loss_fn, optimizer)
    print(f"iteration {i}, loss={loss}")
iteration 0, loss=405055.0
iteration 1, loss=96568.84375
iteration 2, loss=80600.0
iteration 3, loss=79749.5546875
iteration 4, loss=79699.8359375
Let’s check the error.
TorchLinearRegression: tl2=102.52056525913498, tr2=0.9441479269434102
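The code behind these two metrics is not shown either. The sketch below is one way to compute them for a torch model, assuming they are the mean squared error and the R² score on the test split. To stay short and self-contained it uses a bare `torch.nn.Linear` as a stand-in for the trained model and fits it with a few full-batch gradient steps, which is not the training loop used above. The seeds, the learning rate and the number of steps are assumptions.

```python
import torch
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

X, y = make_regression(1000, n_features=5, noise=10.0, n_informative=2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Stand-in for the trained model: one Linear layer fitted with
# full-batch gradient descent (assumption, chosen to keep the sketch short).
torch.manual_seed(0)
model = torch.nn.Linear(X_train.shape[1], 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss()
Xt = torch.tensor(X_train, dtype=torch.float32)
yt = torch.tensor(y_train, dtype=torch.float32)
for _ in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(Xt).ravel(), yt)
    loss.backward()
    optimizer.step()

# Evaluation: switch to eval mode and disable gradient tracking.
model.eval()
with torch.no_grad():
    pred = model(torch.tensor(X_test, dtype=torch.float32)).ravel().numpy()

tl2 = mean_squared_error(y_test, pred)
tr2 = r2_score(y_test, pred)
print(f"TorchLinearRegression: tl2={tl2}, tr2={tr2}")
```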
And the coefficients.
print("coefficients:")
for p in model.parameters():
    print(p)
coefficients:
Parameter containing:
tensor([[ 0.3743, 0.3008, 0.7140, 26.3154, 25.4603]], requires_grad=True)
Parameter containing:
tensor([-0.2682], requires_grad=True)
Conversion to ONNX¶
Let’s convert it to ONNX.
ep = torch.onnx.export(model, (torch.Tensor(X_test[:2]),), dynamo=True)
onx = ep.model_proto
~/github/onnxscript/onnxscript/converter.py:816: FutureWarning: 'onnxscript.values.Op.param_schemas' is deprecated in version 0.1 and will be removed in the future. Please use '.op_signature' instead.
param_schemas = callee.param_schemas()
~/github/onnxscript/onnxscript/converter.py:816: FutureWarning: 'onnxscript.values.OnnxFunction.param_schemas' is deprecated in version 0.1 and will be removed in the future. Please use '.op_signature' instead.
param_schemas = callee.param_schemas()
[torch.onnx] Obtain model graph for `TorchLinearRegression([...]` with `torch.export.export(..., strict=False)`...
[torch.onnx] Obtain model graph for `TorchLinearRegression([...]` with `torch.export.export(..., strict=False)`... ✅
[torch.onnx] Run decomposition...
[torch.onnx] Run decomposition... ✅
[torch.onnx] Translate the graph into ONNX...
[torch.onnx] Translate the graph into ONNX... ✅
Let’s check that it works.
sess = InferenceSession(onx.SerializeToString(), providers=["CPUExecutionProvider"])
res = sess.run(None, {"x": X_test.astype(np.float32)[:2]})
print(res)
[array([[ 69.933266],
[-15.684591]], dtype=float32)]
And the model.

Optimization¶
By default, the exported model is not optimized and keeps many local functions. Calling the optimize method on the object returned by torch.onnx.export inlines these functions and optimizes the model.

With dynamic shapes¶
The dynamic shapes are used by torch.export.export() and must follow the convention described there.
ep = torch.onnx.export(
    model,
    (torch.Tensor(X_test[:2]),),
    dynamic_shapes={"x": {0: torch.export.Dim("batch")}},
    dynamo=True,
)
ep.optimize()
onx = ep.model_proto
print(pretty_onnx(onx))
[torch.onnx] Obtain model graph for `TorchLinearRegression([...]` with `torch.export.export(..., strict=False)`...
[torch.onnx] Obtain model graph for `TorchLinearRegression([...]` with `torch.export.export(..., strict=False)`... ✅
[torch.onnx] Run decomposition...
[torch.onnx] Run decomposition... ✅
[torch.onnx] Translate the graph into ONNX...
[torch.onnx] Translate the graph into ONNX... ✅
opset: domain='' version=18
input: name='x' type=dtype('float32') shape=['batch', 5]
init: name='linear.weight' type=float32 shape=(1, 5)
init: name='linear.bias' type=float32 shape=(1,) -- array([-0.2681867], dtype=float32)
Gemm(x, linear.weight, linear.bias, beta=1.00, transB=1, alpha=1.00, transA=0) -> linear
output: name='linear' type=dtype('float32') shape=['batch', 1]
For simplicity, it is possible to use torch.export.Dim.DYNAMIC or torch.export.Dim.AUTO.
ep = torch.onnx.export(
    model,
    (torch.Tensor(X_test[:2]),),
    dynamic_shapes={"x": {0: torch.export.Dim.AUTO}},
    dynamo=True,
)
ep.optimize()
onx = ep.model_proto
print(pretty_onnx(onx))
[torch.onnx] Obtain model graph for `TorchLinearRegression([...]` with `torch.export.export(..., strict=False)`...
[torch.onnx] Obtain model graph for `TorchLinearRegression([...]` with `torch.export.export(..., strict=False)`... ✅
[torch.onnx] Run decomposition...
[torch.onnx] Run decomposition... ✅
[torch.onnx] Translate the graph into ONNX...
[torch.onnx] Translate the graph into ONNX... ✅
opset: domain='' version=18
input: name='x' type=dtype('float32') shape=['s35', 5]
init: name='linear.weight' type=float32 shape=(1, 5)
init: name='linear.bias' type=float32 shape=(1,) -- array([-0.2681867], dtype=float32)
Gemm(x, linear.weight, linear.bias, beta=1.00, transB=1, alpha=1.00, transA=0) -> linear
output: name='linear' type=dtype('float32') shape=['s35', 1]