Change Logs
0.3.0
#181: add MaskedScatterNDOfShape custom operator
#175: adds custom operator MulSub and SubMul on CUDA
#173: adds custom operator AddSharedInput, MulSharedInput on CUDA
#170: adds custom operator TriMatrix on CUDA
#169: adds custom operator ReplaceZero on CUDA
#168: adds custom operator MulSigmoid on CUDA
#167: adds custom operator Rotary on CUDA
#165: adds custom operators AddAddAdd, MulMulMul on CUDA
#163: use onnxruntime==1.17.3 as default
#162: add ScatterNDOfShape implementation on CUDA without atomics
#159: add AddAdd custom operator on CUDA
#158: add MulMul custom operator on CUDA
#157: add ScatterNDOfShape custom operator
#155: add a function to draw a timeline from a profile
#154: improves plotting legend for profiling
#151: refactors TreeEnsemble code to make it faster
0.2.4
#120: use onnxruntime==1.16.3 as default
#115, #116, #118: adds C implementation of SVMRegressor and SVMClassifier, reference operators based on it, and custom kernels for onnxruntime
#111, #117, #119: adds C implementation of TfIdfVectorizer + python implementation of Tokenizer + custom kernel for onnxruntime
#110: allows LEQ as an alias for BRANCH_LEQ for nodes_modes in TreeEnsemble* operators
#108: improves command line documentation, fixes an issue in command line stats
#103: add methods to compute statistics on TreeEnsemble and initializers
0.2.3
#99: use onnxruntime==1.16.1 as default
#96: implements a function to convert a ModelProto into a string (not bytes), adds a function to multiply the number of trees in a TreeEnsemble
#75: add an implementation of murmurhash3 to validate some options
#93: validates the wheels in CI
#89: add a function to merge models and update them if both have different opsets
0.2.2
0.2.1
#79: update to onnxruntime v1.16.0
#77: helpers to benchmark a model
#74: add a function to enumerate all intermediate results with onnxruntime
#71, #72, #73: add functions to analyse a profile produced by onnxruntime
#67: add a function to extract a subgraph of a model
#59, #60, #61, #62, #63, #65, #66, #68, #69, #70: add local functions to quantize into float 8, float 16
#57: add C implementation for DynamicQuantizeLinear (for experimentation)
#56: add C implementation to cast a float into float 8
#55, #58: add basic functionality to transform a graph, starts with basic quantization
#51: fixes optimized TreeEnsembleRegressor and adds TreeEnsembleClassifier as custom ops
#50: add command line store to store intermediate outputs
#49: add option to save intermediate results in CReferenceEvaluator
#45: add option cuda-link to setup.py to specify how to link with CUDA library
#41: implements a custom kernel for RandomForestRegressor easier to optimize
#34: update to onnxruntime v1.15.1
#31: implement a custom CUDA kernel (gemm)
#32: update to onnxruntime v1.15.0
#27: add a custom kernel with parameters to onnxruntime
#26: add a custom kernel to onnxruntime
#24: use Eigen to implement Conv operator
#23: make pip wheel . work
#22: rename cmake into _cmake to avoid warnings related to cmake package
#19: minimal settings to use onnxruntime
#14: minimal settings to use CUDA
#8: support for C++ unit test