v0.12.0
This release highlights an optimized Wgpu backend, clearer examples and documentation, and numerous bug fixes.
Notably, breaking changes in device management mandate explicit device specification to prevent potential bugs.
Additionally, the new PyTorch recorder simplifies model porting by enabling automatic import of PyTorch weights.
We also put a lot of effort into improving our CI infrastructure for enhanced reliability, efficiency, and scalability.
Changes
Tensor & Module API
- Added support for generic modules #1147 @nathanielsimard
- Added support for tuple modules #1186 @varonroy
- Enabled loading PyTorch .pt (weights/states) files directly into a module's record, currently available on Linux & macOS #1085 @antimora (see the second sketch after this list)
- Added mish and softplus activation functions #1071 @pacowong
- Improved chunk performance in backends #1032 @Kelvinyu1117
- [Breaking] Added the device as an argument for tensor operations that require it, replacing the previous optional device usage #1081 #518 #1110 @kpot
  - Updating your code involves either using `Default::default()` for the same behavior or specifying the desired device (see the first sketch after this list)
- Allowed raw tensors to be serialized/deserialized directly with serde #1041 @jmacglashan
- [Breaking] Forced the choice of the device for deserialization #1160 #1165 @nathanielsimard
- Added element-wise pow operation #1133 @skewballfox
- Refactored the tensor backend API names #1174 @skewballfox
- [Breaking] Changed the default recorder to `NamedMpkFileRecorder` #1161 #1151 @laggui (demonstrated in the second sketch after this list)
  - After a bit of exploration, we removed any form of compression because it adds too much overhead
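To make the device change concrete, here is a minimal sketch of the new convention (the Wgpu backend is only an assumption for illustration; any backend works the same way):

```rust
use burn::backend::Wgpu;
use burn::tensor::Tensor;

fn main() {
    // `Default::default()` resolves to the backend's default device,
    // matching the old implicit behavior.
    let device = Default::default();

    // Creation ops now take the device explicitly instead of an optional argument.
    let zeros = Tensor::<Wgpu, 2>::zeros([2, 3], &device);
    let ones = Tensor::<Wgpu, 2>::ones([2, 3], &device);

    println!("{}", zeros + ones);
}
```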
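And a minimal sketch of the new PyTorch import combined with the new default recorder; `Net`, its generated `NetRecord`, and the `net.pt` path are placeholders, and exact recorder signatures may differ slightly from this sketch:

```rust
use burn::module::Module;
use burn::nn::Linear;
use burn::record::{FullPrecisionSettings, NamedMpkFileRecorder, Recorder};
use burn::tensor::backend::Backend;
use burn_import::pytorch::PyTorchFileRecorder;

// Stand-in model; `#[derive(Module)]` generates the `NetRecord` type used below.
#[derive(Module, Debug)]
struct Net<B: Backend> {
    fc: Linear<B>,
}

fn import_weights<B: Backend>(device: &B::Device) {
    // Load the PyTorch weights into the module's record type. Note the
    // explicit device, which deserialization now requires.
    let record: NetRecord<B> = PyTorchFileRecorder::<FullPrecisionSettings>::default()
        .load("net.pt".into(), device)
        .expect("PyTorch weights should load");

    // Re-save the imported weights in Burn's new default format.
    NamedMpkFileRecorder::<FullPrecisionSettings>::new()
        .record(record, "net".into())
        .expect("record should save");
}
```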
Examples & Documentation
- Updated the text-classification example #1044 @nathanielsimard
- Fixed import and type redefinitions in `mnist-web-inference` #1100 @syl20bnr
- Fixed documentation of `Tensor::stack` #1105 @PonasKovas
- Fixed some typos in links in the burn-book #1127 @laggui
- Added an example for a custom CSV dataset #1129 #1082 @laggui
- Fixed missing ticks in Burn book and removed unused example dependency #1144 @laggui
- Added a new example for regression problems #1150 #1148 @ashdtu
- Added model saving and loading examples in the book #1164 #1156 @laggui
- Added Rust concept notes and explanations to the Burn Book #1169 #1155 @laggui
- Fixed jupyter notebook and ONNX IR example #1170 @unrenormalizable
- Added a custom mnist dataset, removing the Python dependency for running the guide and the mnist example #1176 #1157 @laggui
- Updated documentation and book sections on PyTorch import #1180 @antimora
- Updated burn-book with improved tensor documentation #1183 #1103 @ashdtu
- Updated burn-book with a new dataset transforms section #1183 #1154 @ashdtu
- Updated CONTRIBUTING.md with code guidelines #1134 @syl20bnr
- Fixed documentation of `MultiHeadAttention` #1205 @ashdtu
Wgpu Backend
- Optimized the repeat operation with a new kernel #1068 @louisfd
- Improved reduce autotune by adding the stride to the autotune key #1070 @louisfd
- Refactored binary operations to use the new JIT compiler IR #1078 @nathanielsimard
- Added persistent cache for autotune #1087 @syl20bnr
Fusion
- Refactored burn-fusion, making it possible to eventually save the JIT state #1104 @nathanielsimard
- Improved fusion in the Wgpu backend with caching #1069 @nathanielsimard
- Supported fusing int operations with burn-fusion #1093 @nathanielsimard
- Supported automatic vectorization of operations fused with burn-fusion in WGPU #1123 #1111 @nathanielsimard
- Supported automatically executing in-place operations fused with burn-fusion in WGPU #1128 #1124 @nathanielsimard
- Heavily refactored burn-fusion to better reflect the stream optimization process #1135 @nathanielsimard
- Heavily refactored burn-fusion to save all execution plans for any trigger #1143 @nathanielsimard
- Supported multiple concurrent optimization streams #1149 #1117 @nathanielsimard
- Supported overlapping optimization builders #1162 @nathanielsimard
- Supported fusing `ones`, `zeros`, and `full` operations #1159 @nathanielsimard
- Supported autotuning fused element-wise kernels #1188 #1112 @nathanielsimard (see the sketch after this list)
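A minimal sketch of how fusion surfaces to users, assuming `burn-fusion` as a direct dependency (with Burn's `fusion` feature enabled, the `Wgpu` type alias may already include this wrapper):

```rust
use burn::backend::wgpu::{Wgpu, WgpuDevice};
use burn::tensor::Tensor;
use burn_fusion::Fusion;

// Wrapping a backend in `Fusion` enables the optimization streams described above.
type Backend = Fusion<Wgpu>;

fn main() {
    let device = WgpuDevice::default();
    let x = Tensor::<Backend, 1>::ones([1024], &device);

    // These element-wise operations are candidates for fusion into a single
    // kernel, which itself becomes a candidate for autotuning.
    let y = (x.clone() * 2.0 + x.exp()).log();
    println!("{}", y);
}
```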
Infra
- Supported testing Accelerate (macOS) on the burn-ndarray backend #1050 @dcvz
- Improved CI output by introducing groups #1024 @dcvz
- Updated scheduled CI tasks #1028 @Luni-4
- Added support for a Windows CI pipeline #925 @Luni-4
- Fixed CI for testing the wgpu backend by pinning versions #1120 @syl20bnr
- Fixed burn-compute build command with no-std #1109 @syl20bnr
- Temporarily disabled unnecessary steps on Windows runners to save CI time #1107 @syl20bnr
- Refactored serialization of backend comparison benchmarks #1131 @syl20bnr
- Fixed doc build on docs.rs #1168 @syl20bnr
- Added cargo xtask commands for dependency and vulnerability checks #1181 #965 @syl20bnr
- Added cargo xtask command to manage books #1192 @syl20bnr
Chore
- Shared some properties across the cargo workspace #1039 @dcvz
- Formatted the codebase with nightly where the stable version falls short #1017 @AlexErrant
- Improved panic messages on the web #1051 @dcvz
- Used web-time in wasm #1060 @sigma-andex
- Refactored some tensor tests #1089 @nathanielsimard
- Made Embedding weights public #1094 @unrenormalizable
- Updated candle version and added support for slice_assign #1095 @louisfd
- Records no longer require Debug and Clone #1137 @nathanielsimard
- Removed cargo warning #1108 @syl20bnr
- Updated wgpu version to 0.19.0 #1166 @nathanielsimard
- Added tests for Slice assign vs Cat in LSTM backward #1146 @louisfd
- Updated xtask publish task #1189 @Luni-4
- Enabled daily Dependabot checks #1195 @Luni-4
- Updated Ratatui version #1204 @nathanielsimard
- Updated tch version #1206 @laggui
Bug Fixes
- Fixed a slice issue in the LibTorch backend that could corrupt tensors' data #1064 #1055 @nathanielsimard
- Fixed issues with tensor stack and reshape on ndarray #1053 #1058 @AuruTus
- Fixed multithread progress aggregation in dataloader #1083 #1063 @louisfd
- Resolved a numerical bug with tanh on macOS with Wgpu #1086 #1090 @louisfd
- Fixed burn-fusion, where only operations followed by a sync were fused #1093 @nathanielsimard
- Removed the requirement for users to add `serde` as a dependency for Burn #1091 @nathanielsimard
- Fixed transformer prenorm on the residual path #1054 @Philonoist
- Fixed conv2d initialization by supporting fan_out #1138 @laggui
- Resolved the problem of sigmoid gradient generating NaN #1140 #1139 @wcshds
- Fixed the `FullPrecisionSettings` type for integers #1163 @laggui
- Fixed batchnorm not working properly when training on multiple devices #1167 @wcshds
- Fixed powf function in WGPU, with new tests #1193 #1207 @skewballfox @louisfd
- Fixed regex in PyTorch Recorder #1196 @antimora