Releases: ROCm/MIOpen

MIOpen v2.3.0

31 Mar 21:45

Notes:

  • This release contains new implementations of the implicitGEMM and Winograd algorithms, performance improvements for convolutions, further support for 3D convolutional networks, and various bug fixes.

Changes:

  • Added 3D Pooling layers
  • Added backwards data algorithm for implicitGEMM
  • Added GEMM performance improvements via relaxed constraints in rocBLAS-Tensile
  • Added full CO v3 support for all kernels in MIOpen
  • Added new Winograd group convolution kernels
  • Added an API to query MIOpen's version
  • Added parallel compilation in initial convolutional algorithm search; partial solution to #130
  • Added SQLite binary program cache
  • Improved logging across all layers
  • Improved MIOpen's internal design for calling convolutional solvers
  • Fixed various bugs for the implicitGEMM algorithm

MIOpen v2.2.1

26 Feb 20:39

Notes:

  • This release contains bug fixes, documentation updates, and further code object version 3 support

Changes:

  • Added support for multiple ROCm installations
  • Added additional support for code object v3
  • Fixed issue with incorrect LRN calculation #127
  • Fixed incorrect performance database documentation
  • Fixed issue with incorrect workspace calculation in group convolutions
  • Fixed issue with unsupported hardware instructions used with inline assembly

MIOpen v2.2.0

19 Dec 23:13

Notes:

  • This release contains bug fixes, performance improvements, and expanded applicability for specific convolutional algorithms.
  • MIOpen has posted a citable paper on arXiv.
  • An SQLite database has been added to replace the text-based performance database. The text file still exists, but SQLite is now used by default; see the documentation for more details.

Changes:

  • Added a per-solution algorithm-filtering environment variable for debugging
  • Added SQLite3 database and build dependency. The text-based performance database support is deprecated and will be removed in the next release.
  • Added citation page to documentation pointing to MIOpen's paper
  • Added to the overall documentation
  • Fixed fusion compilation check issue
  • Fixed fusion group convolution warning
  • Improved performance of forward pooling
  • Improved performance of convolutions
  • Improved performance of spatial training batch normalization for some large batch size input configurations
  • Improved applicability of implicit GEMM convolution algorithm
  • Improved performance of calls to miopenConvolutionXXXGetWorkSpaceSize() functions
  • Improved conformance to code object version 3
  • Disabled SCGEMM convolution algorithm by default; this algorithm is deprecated and will be removed in future releases
  • Changed "hip_hcc" to "hip-hcc" for the MIOpen package requirements in CMakeLists.txt

MIOpen v2.1.0

25 Sep 16:45

Notes:

  • This release contains new layers, bug fixes, and a new convolution algorithm.

Changes:

  • Added a dropout layer API for training
  • Added a new SCGEMM algorithm for convolutions
  • Added further support for bfp16 in convolutions
  • Added a docker hub link for MIOpen docker images.
  • Fixed issue with NaN appearing on batch normalization backwards pass in fp16
  • Fixed softmax kernel bug in log mode #112
  • Fixed gfx803 support issue #869
  • Fixed gfx803 kernel issue #117
  • Fixed issue with disabled GEMM #119
  • Improved performance of batch normalization fp16 forward training layers
  • Improved performance of convolution layers
  • Removed MIOpenGEMM as a requirement for the HIP backend. It is now optional.

MIOpen v2.0.1

13 Aug 16:19

Notes:

  • This release contains bug fixes and performance improvements.
  • Additionally, the Implicit GEMM convolution algorithm is now enabled by default.
  • Known issues:
    • Backward propagation for batch normalization in fp16 mode may trigger NaN in some cases
    • Softmax Log mode may produce an incorrect result in back propagation

Changes:

  • Added Winograd multi-pass convolution kernel
  • Fixed issue with hip compiler paths
  • Fixed immediate mode behavior with auto-tuning environment variable
  • Fixed issue with the System Find-Db in-memory cache; the fix enables the cache by default
  • Improved logging
  • Improved how symbols are hidden in the library
  • Updated default behavior to enable implicit GEMM

MIOpen v2.0.0

08 Jul 17:30

Notes:

  • This release contains several new features including an immediate mode for selecting convolutions, bfloat16 support, new layers, modes, and algorithms.
  • MIOpenDriver, a tool for benchmarking and developing kernels, is now shipped with MIOpen.
  • BFloat16 is now supported in HIP and requires an updated rocBLAS as the GEMM backend.
  • Immediate mode API now provides the ability to quickly obtain a convolution kernel.
  • MIOpen now contains HIP source kernels and implements the ImplicitGEMM kernels. This is a new feature and is currently disabled by default. Set the environment variable "MIOPEN_DEBUG_CONV_IMPLICIT_GEMM=1" to activate this feature. ImplicitGEMM requires an up-to-date HIP version of at least 1.5.9211.
  • A new "loss" category of layers has been added, of which CTC loss is the first. See the API reference for more details.
  • 2.0 is the last release of active support for gfx803 architectures. In future releases, MIOpen will not actively debug and develop new features specifically for gfx803.
  • System Find-Db in memory cache is disabled by default. Please see build instructions to enable this feature.
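
The ImplicitGEMM note above amounts to setting one environment variable before the application starts. As a minimal sketch, assuming the application is launched from a Python wrapper, the variable can be scoped to a single child process rather than exported globally. The helper names `implicit_gemm_env` and `run_with_implicit_gemm` are hypothetical and not part of MIOpen; only the variable name and value come from the note.

```python
import os
import subprocess

# Variable name and value are taken from the release note above.
IMPLICIT_GEMM_VAR = "MIOPEN_DEBUG_CONV_IMPLICIT_GEMM"


def implicit_gemm_env(base_env=None):
    """Return a copy of the environment with ImplicitGEMM switched on.

    Hypothetical helper: the base mapping is copied, never modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env[IMPLICIT_GEMM_VAR] = "1"
    return env


def run_with_implicit_gemm(command):
    """Run `command` (a list of arguments) with the variable set
    for that child process only."""
    return subprocess.run(command, env=implicit_gemm_env(), check=True)
```

A call such as `run_with_implicit_gemm(["./my_miopen_app"])` would launch the program with the feature enabled, where `./my_miopen_app` is a placeholder for any binary linked against MIOpen. From v2.0.1 onward the notes state that Implicit GEMM is enabled by default, so this is only relevant to v2.0.0.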

Changes:

  • Added support for bfloat16 datatype in convolutions
  • Added softmax channel mode and new softmax version 2 API
  • Added fast / accurate / log softmax algorithms
  • Added new implicit GEMM convolution algorithm for forward and backwards data passes, disabled by default
  • Added int32 datatype support for output tensors in int8 convolutions
  • Added immediate mode for finding the best convolution kernel for a given configuration
  • Added a Find-Db infrastructure which stashes results of find on a user's system
  • Added a shipped System Find-Db containing offline run Find() results
  • Added an additional, faster batch norm assembly kernel for fp16
  • Added CTC loss layer
  • Added MIOpenDriver as a default component in MIOpen's build #34
  • Fixed C compatibility for boolean types in C API #103
  • Fixed incorrect calculation in per-activation batch norm backwards pass #104
  • Fixed bug #95 with asm batch norm ISA
  • Fixed IsApplicable bug in Conv3x3Asm for group convolutions
  • Improved performance of 1x1 stride 2 fp32 convolutions in the forward and backwards data passes
  • Improved 3-D convolution stability
  • Improved applicability of direct convolution backwards weights for 2x2, 5x10, and 5x20 filter sizes
  • Improved maintainability in kernels and cpp code
  • Updated rocBLAS minimum version to branch master-rocm-2.6

MIOpen v1.8.1

03 May 22:01

Notes:

  • This release contains minor bug fixes and additional performance database improvements.

Changes:

  • Fixed accuracy issue with backwards weights
  • Fixed issue with name parsing for newer architectures
  • Added narrow workaround for 5x10 and 5x20 filter performance regression
  • Improved support in performance database for Radeon VII

MIOpen v1.8.0

12 Apr 04:33
917304e

Notes:

  • This release contains full 3-D convolution support and int8 support for inference.
  • Additionally, there are major updates in the performance database for major models including those found in Torchvision.
  • An assortment of bugs have been resolved in this release.

Changes:

  • Fixed various issues in assembly kernels
  • Fixed issues #92 and #79 for miopenOpTensor
  • Fixed issue #88 for bzip2
  • Fixed issue #77 algorithm mismatch
  • Added Winograd support for fp32 backwards weights
  • Added pooling inclusive mode
  • Added tuning for direct group convolution algorithms
  • Added additional kernel support for group convolutions
  • Added API for 3-D convolutions
  • Added support for int8 inference convolutions
  • Added integer selection for pooling indexing
  • Added minimum dependencies support
  • Added RNN fp16 support on the MIOpen-HIP backend
  • Added 1x1 convolution + bias + activation fusions
  • Added workaround for issue #84 GPU memory access fault
  • Added performance tuning for direct backwards weights
  • Improved performance database coverage
  • Improved internal quality by reducing redundant code
  • Improved build instructions in README.md
  • Improved performance database coverage for fusions
  • Updated Docker components and requirements

Known Issues:

  • RNNs do not support fp16 on the MIOpen-OpenCL backend
  • OpenCL backend does not support GEMM convolutions in fp16

MIOpen v1.7.1

06 Feb 16:01

Notes:

  • This release contains minor bug fixes and performance improvements.

Changes:

  • Fixed corrupt and obsolete performance database entries
  • Fixed issue #70
  • Fixed issue #72
  • Fixed issue #77
  • Removed default dependency of RNNs on rocBLAS
  • Added a workaround for softmax fp16 correctness issue
  • Added a check to build MIOpen only with static Boost libraries
  • Improved performance database coverage

Known Issues:

  • RNNs do not support fp16
  • OpenCL backend does not support GEMM convolutions in fp16
  • Layer fusions for convolution 1x1 fp16 are not supported
  • Layer fusions for large image 1x1 convolutions may cause an exception instead of a warning during compile phase if plan is not supported

MIOpen v1.7.0

19 Dec 18:24

Notes:

  • This release contains general bug fixes and an updated performance database
  • Group convolutions backwards weights performance has been improved
  • Logging across the library has been improved
  • Performance database has been updated

Changes:

  • Fixed logging issues with group convolution and pooling
  • Fixed sphinx version issue in document generation
  • Fixed issues with corrupt entries in performance database
  • Removed external dependency on libSSL and libCrypto
  • Added support for large image backwards weights in direct convolution
  • Added fp16 support for RNNs on the HIP backend
  • Improved performance database coverage

Known Issues:

  • RNNs do not support fp16
  • OpenCL backend does not support GEMM convolutions in fp16
  • Layer fusions for convolution 1x1 fp16 are not supported
  • Layer fusions for large image 1x1 convolutions may cause an exception instead of a warning during compile phase if plan is not supported