Releases: fastmachinelearning/hls4ml
foxglove 1.0.0
What's Changed
hls4ml v1.0.0 "foxglove" introduces several significant improvements:
- A new QONNX frontend by @jmitrevs introduced in #979
- The ability for hls4ml to automatically infer the precision of data types by @vloncar introduced in #855
- The addition of an experimental backend for Intel oneAPI by @jmitrevs introduced in #955
- The addition of a backend for Siemens Catapult by @dgburnette in #956
- Added support for HGQ proxy models by @calad0i in #914
- An API for hardware-aware optimization by @bo3z in #768 and #809
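The automatic precision inference from #855 is driven by leaving precisions set to `auto` in the layer configuration, letting hls4ml choose fixed-point widths during conversion. A minimal sketch of what such a config dict looks like; `make_auto_config` is an illustrative helper (not part of the hls4ml API), and the field names only mirror the usual hls4ml config layout:

```python
def make_auto_config(layer_names, reuse_factor=1):
    """Build an hls4ml-style config dict where per-layer precisions
    are deferred to 'auto' for automatic precision inference."""
    config = {
        "Model": {
            "Precision": "fixed<16,6>",  # fallback when inference cannot decide
            "ReuseFactor": reuse_factor,
            "Strategy": "Latency",
        },
        # One entry per layer; 'auto' defers the result type to inference.
        "LayerName": {
            name: {"Precision": {"result": "auto"}} for name in layer_names
        },
    }
    return config

cfg = make_auto_config(["dense_1", "relu_1"])
print(cfg["LayerName"]["dense_1"]["Precision"]["result"])  # prints: auto
```

In practice, a dict of this shape would be passed as the `hls_config` when converting a model; consult the hls4ml documentation for the exact fields your version supports.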
The full list of other improvements and fixes is:
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #949
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #953
- hls4ml Optimization API [Part 1] by @bo3z in #768
- QKeras support for RNN layers by @laurilaatu in #856
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #962
- Try to fix sphinx problem by restricting tensorflow-model-optimization by @jmitrevs in #967
- Bump pre-commit/action from 3.0.0 to 3.0.1 by @dependabot in #968
- Change fractional (and others) to be a property, move quantizers by @jmitrevs in #964
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #969
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #971
- vitis backend tarball fix by @calad0i in #972
- remove special vitis version of nnet_dense_resource.h by @jmitrevs in #975
- Allow Vitis synthesis tests by @jmduarte in #927
- Fix cleanup of synthesis tests (leftover from 927) by @vloncar in #989
- Fix sphinx by pinning tensorflow<=2.15 by @jmduarte in #992
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #984
- add clock uncertainty configuration option by @jmitrevs in #870
- Stage initial set of changes for the Catapult backend by @dgburnette in #956
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #999
- fix unwanted tested file change in #956 by @calad0i in #1000
- Fix SR backend synth missing variables by @bo3z in #993
- Upsampling support for PyTorch models by @vloncar in #977
- Split fpga_types into separate files by @vloncar in #998
- Support negative_slope in quantized_relu by @vloncar in #987
- Group more tests per YAML to reduce the number of envs created by @vloncar in #996
- Automatic precision inference by @vloncar in #855
- Remove unnecessary transposes related to conversion to channels_last format by @vloncar in #976
- Update pytest docker image to 0.5.4 by @jmitrevs in #1005
- Fix pre-commit warning and change '.h5' to '.keras' for written output by @jmitrevs in #1006
- Fix extension test for Keras v3 by @vloncar in #1009
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #1007
- updated pytest docker image by @jmitrevs in #1017
- SepConv1d/2d for io_parallel with Latency strategy by @vloncar in #1012
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #1021
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #1023
- Latency Pooling Header Updates by @calad0i in #973
- Make im2col default option for quartus by @calad0i in #1010
- add protection for when kernel_quantizer is None by @jmitrevs in #997
- prevent test directory overwrites for activation by @jmitrevs in #1031
- Update Jenkinsfile to use new Docker image and Python 3.10 environment by @vloncar in #1033
- clean-up test ci yaml generator by @calad0i in #1036
- Add View to layer name map for pytorch parser by @JanFSchulte in #1039
- Add RNN support for Pytorch by @JanFSchulte in #850
- Add Vitis to pytorch API tests by @JanFSchulte in #1040
- clean up multi-dimensional dense by @jmitrevs in #1042
- Add namespaces and optional writer config by @vloncar in #986
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #1044
- Add support for HGQ proxy model by @calad0i in #914
- Bug Fix for Operand Shape Mismatch in BatchNorm Fusion (PyTorch) by @sei-rquartiano in #1045
- remove precision settings that make pytest for batchnorm in pytorch fail by @JanFSchulte in #1053
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #1047
- rm slow mnist training in test by @calad0i in #1018
- Add an optimizer to replace SeparableConv by Depthwise + Conv (pointwise) by @jmitrevs in #1022
- Add functionality to use granularity option also for pytorch models by @JanFSchulte in #1051
- Update pooling logic for Vivado, Vitis, and Catapult backends by @jmitrevs in #1056
- remove padding attribute by @jmitrevs in #1061
- Run long-running pytests out of the batch by @vloncar in #1062
- Fix tanh activation in pytorch parser by @JanFSchulte in #1055
- make auto the default for layer config by @jmitrevs in #1016
- remove checks on 'padding' that were missed in previous PR by @jmitrevs in #1064
- Remove extras flow by @vloncar in #1067
- Expose alpha and theta type for parametrized activations by @jmitrevs in #1069
- Raise exception on compile errors by @vloncar in #1068
- update qkeras in Jenkinsfile by @jmitrevs in #1072
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #1075
- hls4ml Optimization API [Part 2] by @bo3z in #809
- Hardcode weight txt path by @vloncar in #1089
- quote the ${WEIGHT_DIR} to handle special characters by @jmitrevs in #1091
- Beginnings of the oneAPI backend by @jmitrevs in #955
- update keras activation parsing, especially leaky relu by @jmitrevs in #1085
- Fix softmax parsing in pytorch and add test by @JanFSchulte in #1086
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #1098
- Change indexing in filling result for io_parallel convolutions, Vitis by @jmitrevs in #1102
- Update QONNX parsing for 1.0 by @jmitrevs in #979
- remove incorrect input from Constant nodes by @jmitrevs in #1119
- add max_precision to onnx parser by @jmitrevs in #1113
- Add RF to config templates for "Merge"...
edelweiss 0.8.1
What's Changed
- Fix for #905 by @calad0i in #906
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #921
- Fix logos in README.md by @vloncar in #930
- Fix writer precision when fp bits >= 14 by @calad0i in #909
- Let repack_stream optimizer inherit original precision by @calad0i in #907
- Update A3D3 grant no. by @schsu in #941
- Add precision inheritance when generating stream clone by @calad0i in #911
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #942
- Quartus multi out with stream fix by @calad0i in #908
- Fix profiling for Keras LSTM layers. by @Landay7 in #940
- Fix for multiple inputs that may get out of order by @jmduarte in #937
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #944
- Bump actions/upload-artifact from 3 to 4 by @dependabot in #943
- better replace_node fn by @calad0i in #934
- bump to 0.8.1 by @jmitrevs in #945
New Contributors
Full Changelog: v0.8.0...v0.8.1
edelweiss 0.8.0
What's Changed
- Decouple pipeline style from strategy by @vloncar in #781
- Don't use reader in ModelGraph and layers by @vloncar in #770
- Remove tf_to_hls by @vloncar in #795
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #796
- Fix parsing of QConv2DBatchnorm weights by @vloncar in #802
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #801
- Discussion - Inlined Conv slows down latency significantly (up to x15 - x20) by @bo3z in #800
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #807
- Fix over-allocation of bits for quantised po2 by @bo3z in #806
- Propagate zeros from Conv layers to multiplication config by @bo3z in #797
- Fix Vitis Conv1D/2D latency strategy by @vloncar in #815
- Improved parsing of pytorch models using torch.FX - Clean by @JanFSchulte in #799
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #816
- Support for parsing nested models by @vloncar in #794
- Fix loading weights in n-dim dense -> 1x1 conv by @vloncar in #821
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #828
- Fix loading weights in GarNetStacked and GarNet internal array precisions by @joshlerner in #827
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #830
- Fix profiling for GRU/LSTM by @drankincms in #833
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #835
- remove obsolete and unused docker directory by @jmitrevs in #836
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #842
- Remove obsolete parameter mapping between pytorch and keras by @JanFSchulte in #847
- Make binary CNN match between Keras and hls4ml by @jmitrevs in #804
- No longer make ExponentPrecisionType and XnorPrecisionType inherit from IntegerPrecisionType by @jmitrevs in #845
- Add support for flattening to the pytorch parser by @JanFSchulte in #852
- Add option to configure IP version by @AdrianAlan in #851
- Bug fix for named nn.Sequential in pytorch parser by @JanFSchulte in #848
- Add QDepthwiseConv2D, DepthwiseConv2D, DepthwiseConv1D support by @jmitrevs in #834
- Symbolic expressions in hls4ml by @vloncar in #660
- Update dependencies, add testing extras by @jmitrevs in #837
- Bump actions/checkout from 3 to 4 by @dependabot in #866
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #869
- try to use new runners for gitlab CI by @jmitrevs in #879
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #880
- Fix weight precision format string by @vloncar in #877
- add acknowledgments by @jmduarte in #862
- Support for quantized SeparableConv1D/2D by @vloncar in #861
- Speed up Keras profiling by @AdrianAlan in #863
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #882
- Fix profiling SeparableConv1D and SeparableConv2D by @qberthet in #891
- Add support for filt_height==1 for streaming quartus conv2d by @jmitrevs in #886
- Fix config structure name in pragma for SeparableConv1D by @qberthet in #884
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #895
- Fix bit overflow with softmax by @calad0i in #887
- bump 0.8.0rc1 by @jmitrevs in #915
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #902
- Add funding acknowledgements by @jmduarte in #918
- Fix fetching models from example-models repo by @vloncar in #919
- add blank line to make rst format correct by @jmitrevs in #923
- Update default FPGA part number from KU115 to VU13P by @jmduarte in #924
- update to 0.8.0 by @jmitrevs in #925
New Contributors
- @pre-commit-ci made their first contribution in #796
- @joshlerner made their first contribution in #827
- @qberthet made their first contribution in #891
Full Changelog: v0.7.1...v0.8.0
edelweiss 0.8.0rc1
What's Changed
Change list and new contributors identical to the v0.8.0 release above.
Full Changelog: v0.7.1...v0.8.0rc1
delphinium 0.7.1
What's Changed
- bump version to v0.7.0 by @jmduarte in #778
- Fix for 2D conv layers in the special case of io_parallel with full parallelization by @drankincms in #760
- Fix RNN layers when strategy=resource by @vloncar in #780
- Update Jenkins test environment to avoid dependency hell by @vloncar in #786
- Explicitly set strategy for pointwise conv by @vloncar in #785
- Minor docs fixes for 0.7.1 by @vloncar in #788
- bump 0.7.1 by @jmitrevs in #791
Full Changelog: v0.7.0...v0.7.1
delphinium
What's Changed
- fix conv1d io_parallel resource by @jmitrevs in #403
- Speed up CI tests by @thesps in #407
- Fix GlobalPooling1D Layers by @jmduarte in #399
- Fix batched multiple inputs by @jmduarte in #414
- Fixed 'qkeras_mnist_dense' example build problem #423 by @siorpaes in #424
- Update for pyyaml 6.0 by @thesps in #435
- `axi_stream_driver` update by @nicologhielmetti in #420
- Reshape fixes: don't repack stream for flatten; remove final reshape by @jmduarte in #443
- Fix Conv2D with `io_type = io_parallel` & `Strategy: Resource` by @thesps in #448
- Support applying Softmax over multidimensional tensors by @vloncar in #384
- Disable some unsupported layers by @thesps in #447
- Fixes: quantized_relu & unsigned profiling part II by @thesps in #441
- GarNet and GarNetStack in config.py by @yiiyama in #344
- support ZeroPadding layers by @jmduarte in #480
- New backend development framework by @vloncar in #395
- Register `ApplyAlpha` layer templates by @thesps in #499
- Parsing extended by @nicologhielmetti in #501
- Remove intermediate casting in product by @jmitrevs in #490
- Add QKeras as a package dependency by @vloncar in #511
- Copy flows from config by @thesps in #510
- VivadoAccelerator backend updates by @thesps in #508
- Optimized look-up table by @nemerchiedde in #527
- Upsampling2D test case by @ChiRuiChen in #520
- Support UpSampling1D by @vloncar in #475
- RNN support (part 1) by @vloncar in #521
- Quartus Custom Matrix Multiplication & Quantization by @bo3z in #523
- Vivado-equivalent implementation of Softmax on Quartus by @bo3z in #540
- Ensure 2 bits for scale in po2 quantizers by @vloncar in #531
- Link update by @bkmgit in #519
- Fix removal of nodes ingested by multiple downstream nodes by @jmduarte in #544
- Enable SeparableConv2d by @jmduarte in #547
- Extension API by @vloncar in #528
- change string ReuseFactor to int by @jmitrevs in #416
- Make the size of bn scale and bias what they really are by @jmitrevs in #532
- Raise runtime error when a layer is named `input` by @jmduarte in #482
- fix insertion before a node with multiple inputs + support additional broadcasting by @jmduarte in #551
- Pointwise conv1d/2d resource by @jmduarte in #471
- Quartus Embedding Layer by @bo3z in #548
- Fix for QActivations passed as an argument by @AdrianAlan in #553
- Don't override precision directly in the QKeras optimizer by @vloncar in #567
- Remove the in/out size from top function by @vloncar in #559
- Transpose2d, Concatenate2d, and up to 3 Clones for io_stream by @jmduarte in #402
- Remove io_serial as io_stream and add some more info in docs. by @Duchstf in #334
- Update docs for v0.6.0 by @thesps in #453
- Use correct number of args for multiple outputs by @apfusco in #487
- Fixed a few typos in the documentation by @pitmonticone in #467
- returning integer from _compute_n_samples by @JochiSt in #537
- Providing support for Alveo boards by @selwyn96 in #552
- Make layer names case sensitive in config. by @jmitrevs in #577
- Add issue and PR templates by @jmduarte in #582
- Vivado Backend GRU/LSTM support by @drankincms in #560
- Update CI template syntax by @thesps in #593
- Update flow dependencies by @vloncar in #588
- Fix parsing of ZeroPadding layers by @vloncar in #595
- remove cppname by @jmitrevs in #562
- Remove email helpline from the docs by @vloncar in #601
- Fixes for GRU/LSTM in Vivado backend by @drankincms in #598
- Remove io_serial by @vloncar in #609
- Fix test_graph by @vloncar in #611
- Override parent backend optimizer passes with derived backend passes by @thesps in #597
- Enforce function pipelining when using io_parallel with Resource strategy by @vloncar in #605
- FIFO depth optimization by @nicologhielmetti in #509
- Add tracing support for the quartus backend by @jmitrevs in #583
- Quartus streaming support for Activations, Dense & Batch Normalization by @bo3z in #557
- QConv alpha != 1 bug fix by @bo3z in #612
- Quartus Stream Embedding by @bo3z in #625
- change master to main by @jmitrevs in #602
- Edit order of the optimizers in the flow so that BramFactor is followed by @jmitrevs in #621
- Softmax LUT Optimization by @bo3z in #570
- Quartus Synthesis Flow Improvement by @bo3z in #618
- Quartus Extensions by @bo3z in #628
- Quartus GRU by @bo3z in #596
- Quartus Merge layers by @bo3z in #634
- fix nondefault project name handling by @jmitrevs in #626
- Fix parsing of logic synthesis reports by @vloncar in #639
- Fix conv1d stream implementation hls directives by @Jonathan-Shoemaker in #635
- Implementation and optimizations linked to Simple-RNN and LSTM for qu… by @nemerchiedde in #575
- Softsign optimization by @nemerchiedde in #585
- Parallel CNNs, Pooling & Image Layers for Quartus Backend by @bo3z in #561
- Quartus Streaming Softsign (PR #585 contd.) by @bo3z in #655
- Remove final reshapes even for Quartus by @jmitrevs in #661
- Unrolled CNN implementation by @vloncar in #600
- the strategy was not propagated in the pytest by @jmitrevs in #663
- Fix keras model loading issue with loading model with KerasH5 by @calad0i in #664
- append applied_flows container before filling instead of after by @jmitrevs in #641
- set version using `setuptools_scm` by @jmduarte in #479
- Argmax Softmax by @bo3z in #627
- Fix version extraction in Sphinx config by @vloncar in #669
- Add requested citations to README by @jmduarte in #615
- skip BatchNorm fusion when input/output is used multiple times by @jmduart...
delphinium rc1
What's Changed
Change list identical to the delphinium release above.
coris
What's Changed
- `VivadoAccelerator` backend: target `pynq-z2` and `zcu102` boards directly from hls4ml by @nicologhielmetti
- Updated `PyTorch` and `ONNX` converters by @Duchstf
- `line_buffer` Conv2D implementation for `io_stream`: reduced resource usage and latency by @Keb-L, @violatingcp, @vloncar
- Support `QConv2DBatchnorm` layer from `QKeras` by @nicologhielmetti
- Improved profiling plots - easier to compare original vs `hls4ml` converted models by @maksgraczyk
- Better derivation of data types for `QKeras` models by @jmduarte, @thesps
- Improved CI by @thesps
- More support for models with branches, skip connections, `Merge` and `Concatenate` layers by @jmduarte, @vloncar
- Support for `Dense` layers over multi-dimensional tensors by @vloncar
- Overall improvements by @vloncar, @jmduarte, @thesps, @jmitrevs & others
New Contributors
- @siorpaes made their first contribution in #424
- @jmitrevs made their first contribution in #403
- @anders-wind made their first contribution in #302
- @KOVI89alipes made their first contribution in #318
- @maksgraczyk made their first contribution in #323
- @Keb-L made their first contribution in #332
- @ConsVin made their first contribution in #307
- @nicologhielmetti made their first contribution in #298
Full Changelog: v0.5.0...v0.6.0
bartsia
What's new:
- Streaming IO layer implementations, especially of Convolutional layers, accessed through the config with `IOType: io_stream`. Scales CNN support to much larger models than previously possible (see arXiv:2101.05108)
- New documentation and API reference
- Further optimizations for QKeras / quantization-aware training. A 'shift' operation is now used for `po2` quantizers
- Allow redefinition of weights directory for standalone project compilation
- `profiling` for PyTorch models
Deprecated:
- `IOType: io_serial` is deprecated, superseded by the new `IOType: io_stream`
Bugfixes:
- Fix to Initiation Interval and different min/max latency for `Strategy: Resource`
- Fix warnings in `hls4ml` command line script flow
- Write yml config from Python API - for mixed API / command line flow
v0.5.0-beta
Pre-release of hls4ml version v0.5.0.
What's new:
- Streaming IO layer implementations, especially of Convolutional layers, accessed through the config with `io_type: io_stream`. Scales CNN support to much larger models than previously possible (see paper)
- New documentation and API reference
- Further optimizations for QKeras / quantization-aware training. A 'shift' operation is now used for `po2` quantizers
- Allow redefinition of weights directory for standalone project compilation