Releases · microsoft/onnxruntime · GitHub

August 12, 2025 at 12:00 AM · ai_discovery · info

Product

ONNX Runtime

ai, machine-learning, automation

Update Details

Comprehensive information about this update

Full Content

Release Notes
What's new? This release adds an optimized CPU/MLAS implementation of DequantizeLinear (8-bit) and introduces the build option client_package_build, which enables default options better suited to client/on-device workloads (e.g., thread spinning is disabled by default).

Build System & Packages
- Add --client_package_build option (#25351) - @jywu-msft
- Remove the Python installation steps from win-qnn-arm64-ci-pipeline.yml (#25552) - @snnn

CPU EP
- Add a multithreaded/vectorized implementation of DequantizeLinear for int8 and uint8 inputs (SSE2, NEON) (#24818) - @adrianlizarraga (see the reference sketch after these notes)

QNN EP
- Add support for the Upsample, Einsum, LSTM, and CumSum operators (#24265, #24616, #24646, #24820) - @quic-zhaoxul, @1duo, @chenweng-quic, @Akupadhye
- Fuse scale into Softmax (#24809) - @qti-yuduo
- Enable DSP queue polling when performance is set to "burst" mode (#25361) - @quic-calvnguy (see the Python example after these notes)
- Update QNN SDK to version 2.36.1 (#25388) - @qti-jkilpatrick
- Include the license file from the QNN SDK in the Microsoft.ML.OnnxRuntime.QNN NuGet package (#25158) - @HectorSVC
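The new CPU kernel implements the standard ONNX DequantizeLinear semantics, y = (x - zero_point) * scale. As a point of reference only (this is the operator's math, not the optimized SSE2/NEON kernel shipped in this release), a minimal NumPy sketch for uint8/int8 inputs could look like the following; the tensor values are illustrative:

```python
import numpy as np

def dequantize_linear(x_q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    # ONNX DequantizeLinear: y = (x - zero_point) * scale, computed in float32.
    # Subtract in int32 first to avoid int8/uint8 overflow.
    return (x_q.astype(np.int32) - np.int32(zero_point)).astype(np.float32) * np.float32(scale)

# Example: dequantize a small uint8 tensor (values chosen for illustration).
x = np.array([0, 128, 255], dtype=np.uint8)
print(dequantize_linear(x, scale=0.05, zero_point=128))  # approx. [-6.4, 0.0, 6.35]
```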
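For the QNN EP change that enables DSP queue polling under burst performance, the performance mode is selected through the QNN provider options when a session is created. A minimal Python sketch, assuming a QNN-capable onnxruntime build; the model path and backend library name are placeholders:

```python
import onnxruntime as ort

# Create a session on the QNN execution provider and request "burst" HTP performance mode,
# which is the setting this release pairs with DSP queue polling.
session = ort.InferenceSession(
    "model.onnx",  # placeholder model path
    providers=["QNNExecutionProvider"],
    provider_options=[{
        "backend_path": "QnnHtp.dll",       # HTP backend library; platform-specific name
        "htp_performance_mode": "burst",    # performance mode discussed in #25361
    }],
)
```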

Published At

Tuesday, August 12, 2025

12:00:00 AM

Discovered At

Monday, August 25, 2025

10:25:34 PM

Confidence

1