Releases · ROCm/MIOpen

31 Mar 21:45

daniellowell

2.3.0

ef17912

MIOpen v2.3.0

Notes:

This release contains new implementations of the implicitGEMM and Winograd algorithms, performance improvements for convolutions, further support for 3D convolutional networks, and various bug fixes.

Changes:

Added 3D Pooling layers
Added backwards data algorithm for implicitGEMM
Added GEMM performance improvements via relaxed constraints in rocBLAS-Tensile
Added full CO v3 support for all kernels in MIOpen
Added new Winograd group convolution kernels
Added an API to query MIOpen's version
Added parallel compilation in initial convolutional algorithm search; partial solution to #130
Added SQLite binary program cache
Improved logging across all layers
Improved MIOpen's internal design for calling convolutional solvers
Fixed various bugs for the implicitGEMM algorithm

26 Feb 20:39

daniellowell

2.2.1

9218683

MIOpen v2.2.1

Notes:

This release contains bug fixes, documentation updates, and further code object version 3 support.

Changes:

Added support for multiple ROCm installations
Added additional support for code object v3
Fixed issue with incorrect LRN calculation #127
Fixed incorrect performance database documentation
Fixed issue with incorrect workspace calculation in group convolutions
Fixed issue with unsupported hardware instructions used with inline assembly

19 Dec 23:13

daniellowell

2.2.0

9fd3e57

MIOpen v2.2.0

Notes:

This release contains bug fixes, performance improvements, and expanded applicability for specific convolutional algorithms.
A citable paper describing MIOpen has been posted on arXiv; see the citation page in the documentation.
A SQLite database has been added to replace the text-based performance database. The text file still exists, but SQLite is now used by default; see the documentation for more details.
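
A quick way to check what a SQLite-backed database contains is to list its tables without assuming a schema. The sketch below is illustrative only: these notes do not state where MIOpen stores its database file or how it is laid out, so the path passed to the hypothetical helper `list_tables` is whatever file exists on your system.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_tables(db_path):
    """Return the names of the tables in a SQLite file, sorted by name.

    Hypothetical helper: it makes no assumption about MIOpen's schema.
    """
    # Open read-only so that inspecting the file cannot modify it.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]
```

Opening the file in read-only mode is a deliberate choice here, since the database is also used by the library at run time.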

Changes:

Added a per-solution algorithm filtering environment variable for debugging
Added SQLite3 database and build dependency. The text-based performance database support is deprecated and will be removed in the next release.
Added citation page to documentation pointing to MIOpen's paper
Expanded the overall documentation
Fixed fusion compilation check issue
Fixed fusion group convolution warning
Improved performance of forward pooling
Improved performance of convolutions
Improved performance of spatial training batch normalization for some large batch size input configurations
Improved applicability of implicit GEMM convolution algorithm
Improved performance of calls to miopenConvolutionXXXGetWorkSpaceSize() functions
Improved conformance to code object version 3
Disabled SCGEMM convolution algorithm by default; this algorithm is deprecated and will be removed in future releases
Changed "hip_hcc" to "hip-hcc" for the MIOpen package requirements in CMakeLists.txt

25 Sep 16:45

daniellowell

2.1.0

a5a4129

MIOpen v2.1.0

Notes:

This release contains new layers, bug fixes, and a new convolution algorithm.

Changes:

Added a dropout layer API for training
Added a new SCGEMM algorithm for convolutions
Added further support for bfp16 in convolutions
Added a docker hub link for MIOpen docker images.
Fixed issue with NaN appearing on batch normalization backwards pass in fp16
Fixed softmax kernel bug in log mode #112
Fixed gfx803 support issue #869
Fixed gfx803 kernel issue #117
Fixed issue with disabled GEMM #119
Improved performance of batch normalization fp16 forward training layers
Improved performance of convolution layers
Removed MIOpenGEMM as a requirement for the HIP backend. It is now optional.

13 Aug 16:19

daniellowell

2.0.1

93074ff

MIOpen v2.0.1

Notes:

This release contains bug fixes and performance improvements.
Additionally, the Implicit GEMM convolution algorithm is now enabled by default.
Known issues:
- Backward propagation for batch normalization in fp16 mode may trigger NaN in some cases
- Softmax Log mode may produce an incorrect result in back propagation

Changes:

Added Winograd multi-pass convolution kernel
Fixed issue with hip compiler paths
Fixed immediate mode behavior with auto-tuning environment variable
Fixed issue with the System Find-Db in-memory cache; the fix enables the cache by default
Improved logging
Improved how symbols are hidden in the library
Updated default behavior to enable implicit GEMM

08 Jul 17:30

daniellowell

2.0.0

326bf22

MIOpen v2.0.0

Notes:

This release contains several new features including an immediate mode for selecting convolutions, bfloat16 support, new layers, modes, and algorithms.
MIOpenDriver, a tool for benchmarking and developing kernels, is now shipped with MIOpen.
BFloat16 is now supported in HIP and requires an updated rocBLAS as a GEMM backend.
The immediate mode API now provides the ability to quickly obtain a convolution kernel.
MIOpen now contains HIP source kernels and implements the ImplicitGEMM kernels. This is a new feature and is currently disabled by default. Set the environment variable "MIOPEN_DEBUG_CONV_IMPLICIT_GEMM=1" to activate this feature. ImplicitGEMM requires HIP version 1.5.9211 or later.
A new "loss" category of layers has been added, of which CTC loss is the first. See the API reference for more details.
2.0 is the last release with active support for gfx803 architectures. In future releases, MIOpen will not actively debug and develop new features specifically for gfx803.
The System Find-Db in-memory cache is disabled by default. Please see the build instructions to enable this feature.
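
The ImplicitGEMM opt-in described above can be scripted when launching an application from Python. The following is a minimal sketch rather than an official recipe: the helper names `parse_version`, `hip_is_new_enough`, and `implicit_gemm_env` are hypothetical, and how you obtain the installed HIP version string is left to your environment. Only the variable name and the 1.5.9211 minimum come from these notes.

```python
import os
import re


def parse_version(text):
    """Turn a dotted version such as '1.5.9211' into a tuple of ints."""
    parts = re.findall(r"\d+", text)
    return tuple(int(p) for p in parts[:3])


def hip_is_new_enough(hip_version, minimum="1.5.9211"):
    """Compare versions component by component using tuple ordering."""
    return parse_version(hip_version) >= parse_version(minimum)


def implicit_gemm_env(base_env=None):
    """Return a copy of the environment with ImplicitGEMM switched on.

    The caller's mapping (or os.environ) is left unmodified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MIOPEN_DEBUG_CONV_IMPLICIT_GEMM"] = "1"
    return env
```

The returned mapping can be passed as the `env` argument of `subprocess.run` when starting your own MIOpen-based program, so the setting applies to that process only and does not leak into the parent shell.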

Changes:

Added support for bfloat16 datatype in convolutions
Added softmax channel mode and new softmax version 2 API
Added fast / accurate / log softmax algorithms
Added new implicit GEMM convolution algorithm for forward and backwards data passes, disabled by default
Added int32 datatype support for output tensors in int8 convolutions
Added immediate mode for finding the best convolution kernel for a given configuration
Added a Find-Db infrastructure which stashes results of find on a user's system
Added a shipped System Find-Db containing offline run Find() results
Added an additional, faster batch norm assembly kernel for fp16
Added CTC loss layer
Added MIOpenDriver as a default component in MIOpen's build #34
Fixed C compatibility for boolean types in C API #103
Fixed incorrect calculation in per-activation batch norm backwards pass #104
Fixed bug #95 with asm batch norm ISA
Fixed IsApplicable bug in Conv3x3Asm for group convolutions
Improved performance of 1x1 stride 2 fp32 convolutions in the forward and backwards data passes
Improved 3-D convolution stability
Improved applicability of direct convolution backwards weights for 2x2, 5x10, and 5x20 filter sizes
Improved maintainability in kernels and cpp code
Updated rocBLAS minimum version to branch master-rocm-2.6

03 May 22:01

daniellowell

1.8.1

0bce818

MIOpen v1.8.1

Notes:

This release contains minor bug fixes and additional performance database improvements.

Changes:

Fixed accuracy issue with backwards weights
Fixed issue with name parsing for newer architectures
Added narrow workaround for 5x10 and 5x20 filter performance regression
Improved support in performance database for Radeon VII

12 Apr 04:33

daniellowell

1.8.0

917304e

MIOpen v1.8.0

Notes:

This release contains full 3-D convolution support and int8 support for inference.
Additionally, there are significant updates to the performance database for major models, including those found in Torchvision.
An assortment of bugs has been resolved in this release.

Changes:

Fixed various issues in assembly kernels
Fixed issues #92 and #79 for miopenOpTensor
Fixed issue #88 for bzip2
Fixed issue #77 algorithm mismatch
Added Winograd support for fp32 backwards weights
Added pooling inclusive mode
Added tuning for direct group convolution algorithms
Added additional kernel support for group convolutions
Added API for 3-D convolutions
Added support for int8 inference convolutions
Added integer selection for pooling indexing
Added minimum dependencies support
Added RNN fp16 support on the MIOpen-HIP backend
Added 1x1 convolution + bias + activation fusions
Added workaround for issue #84 GPU memory access fault
Added performance tuning for direct backwards weights
Improved performance database coverage
Improved internal quality by reducing redundant code
Improved build instructions in README.md
Improved performance database coverage for fusions
Updated Docker components and requirements

Known Issues:

RNNs do not support fp16 on the MIOpen-OpenCL backend
OpenCL backend does not support GEMM convolutions in fp16

06 Feb 16:01

daniellowell

1.7.1

6054829

MIOpen v1.7.1

Notes:

This release contains minor bug fixes and performance improvements.

Changes:

Fixed corrupt and obsolete performance database entries
Fixed issue #70
Fixed issue #72
Fixed issue #77
Removed default dependency of RNNs on rocBLAS
Added a workaround for softmax fp16 correctness issue
Added a check to build MIOpen only with static Boost libraries
Improved performance database coverage

Known Issues:

RNNs do not support fp16
OpenCL backend does not support GEMM convolutions in fp16
Layer fusions for convolution 1x1 fp16 are not supported
Layer fusions for large image 1x1 convolutions may cause an exception instead of a warning during compile phase if plan is not supported

19 Dec 18:24

daniellowell

1.7.0

7cb5f5f

MIOpen v1.7.0

Notes:

This release contains general bug fixes and an updated performance database.
Group convolutions backwards weights performance has been improved.
Logging across the library has been improved.
Performance database has been updated.

Changes:

Fixed logging issues with group convolution and pooling
Fixed sphinx version issue in document generation
Fixed issues with corrupt entries in performance database
Removed external dependency on libSSL and libCrypto
Added support for large image backwards weights in direct convolution
Added fp16 support for RNNs on the HIP backend
Improved performance database coverage

Known Issues:

RNNs do not support fp16
OpenCL backend does not support GEMM convolutions in fp16
Layer fusions for convolution 1x1 fp16 are not supported
Layer fusions for large image 1x1 convolutions may cause an exception instead of a warning during compile phase if plan is not supported
