Releases: ROCm/rocBLAS

rocBLAS-0.12.0.0 release for ROCM 1.7.0

05 Mar 21:40

Changelist:

  • add hgemm
  • additional fix for multi-process and multi-threading
  • new solution selection logic

rocBLAS-0.10.4.0 release for ROCM 1.7.0

08 Feb 16:30

Changelist:

  • fix race condition for multi-process and multi-thread
  • hipLaunchKernelGGL replaces hipLaunchKernel
  • add logging

rocBLAS-0.10.3.0 release for ROCM 1.6.4

05 Dec 15:35

Changelist:

  • add dgemm assembly from Tensile v3.4.0
  • fix packaging install path
  • integrate clang-format

rocBLAS-0.10.2.0 release for ROCM 1.6.4

30 Nov 23:39

Changelist:

  • ported to CentOS
  • updated to use Tensile v3.3.7 with v_add_i32->u32 fix and fix for M<4
  • refactored code and tests for rocblas_pointer_mode

rocBLAS-0.10.1.0 release for ROCM 1.6.4

15 Nov 00:18
Pre-release

Changelist:

  • add MI25 tuning for Tensile 3.3.4
  • fix sgemm assembly kernels for thread safety
  • correct iXamax to 1 based indexing
  • refactor tests
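
The 1-based indexing change follows the reference BLAS convention, where iXamax returns the position of the first element with the largest absolute value, counting from 1. A minimal host-side sketch of that convention is below. The function name ref_isamax is hypothetical and is not part of rocBLAS; the real rocblas_isamax also takes a handle and writes its result through a pointer.

```c
#include <stddef.h>

/* Hypothetical host reference, not a rocBLAS function.
   Returns the 1-based index of the first element of x with the largest
   absolute value, or 0 when n <= 0 or incx <= 0 (reference BLAS behavior). */
static int ref_isamax(int n, const float *x, int incx)
{
    if (n <= 0 || incx <= 0)
        return 0;

    int best = 0;
    float best_abs = x[0] < 0.0f ? -x[0] : x[0];

    for (int i = 1; i < n; ++i) {
        float v = x[(size_t)i * (size_t)incx];
        float a = v < 0.0f ? -v : v;
        /* Strict comparison keeps the first occurrence on ties. */
        if (a > best_abs) {
            best_abs = a;
            best = i;
        }
    }
    return best + 1; /* convert the 0-based C index to a 1-based BLAS index */
}
```

For x = {1, -7, 3, 7}, this sketch returns 2: the elements at positions 2 and 4 tie on absolute value, and the first one wins.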

Release for ROCM 1.6.4

17 Oct 15:04
Pre-release

NOTE: API breaking changes introduced in this release related to: rocblas_iXamax, rocblas_iXamin, complex functions, and half functions.

Changelist:

  • correct API: rocblas_samax -> rocblas_isamax, rocblas_damax -> rocblas_idamax
  • remove from the API functions for complex and half that have not been implemented
  • update to Tensile v3.2.0. This uses sgemm assembly kernels for gfx803 and gfx900
  • add rocblas_sgeam and rocblas_dgeam functions
  • improve repeatability of rocblas_Xgemm performance tests
  • update perf script

release for ROCM 1.6.3

16 Oct 22:09
Pre-release

NOTE: API breaking changes introduced in this release, primarily related to library NAME and SONAME.

Changelist:

  • Library file name no longer carries the platform suffix (it is now librocblas.so)
  • so-name link renamed to reflect the MAJOR version number (currently 0, changed from 1)
  • Build system entirely rewritten to simplify the build/install process. A convenience bash script, install.sh, has been added to the repository root to automate builds on Ubuntu
  • Tensile updated to v3.0.4, which includes fixes for NaN propagating on GEMM calls with beta == 0
  • 2 new samples added in samples directory (gemm & strided gemm)
  • haxpy implementation added
  • extra unit tests added and benchmarking capabilities for axpy, dot, scal
  • Improved stability of TRSM unit tests
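
The beta == 0 item refers to a standard BLAS convention: when beta is zero, the existing contents of C must not influence the result, so a NaN or Inf already in C should not propagate. The sketch below illustrates that convention on the host. It is not Tensile's actual kernel code, and the function name gemm_epilogue is hypothetical.

```c
#include <stddef.h>

/* Hypothetical GEMM epilogue, for illustration only.
   acc[i] holds the accumulated dot product for output element i.
   When beta == 0, c[i] is overwritten without being read, because
   0 * NaN is NaN and would otherwise poison the result. */
static void gemm_epilogue(size_t count, float alpha, const float *acc,
                          float beta, float *c)
{
    for (size_t i = 0; i < count; ++i) {
        if (beta == 0.0f)
            c[i] = alpha * acc[i];
        else
            c[i] = alpha * acc[i] + beta * c[i];
    }
}
```

A naive epilogue that always evaluates alpha * acc[i] + beta * c[i] would return NaN for any output element whose previous value was NaN, even with beta == 0.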

rocBLAS-0.4.3.0 release for ROCM 1.6

25 Jul 21:34
Pre-release

Library release associated with the ROCm v1.6 platform release.

Library tuned for Fiji family hardware.

rocBLAS-0.4.2.3 release for ROCM 1.5

23 Jun 15:14
Pre-release

Library release associated with the ROCm v1.5 platform release.

Library tuned for Fiji family hardware.

API Change: The order parameter has been removed from the gemm functions, which now support only column-major ordering. If your matrices are row-major, swap the following pairs of arguments: transa and transb, m and n, A and B, lda and ldb.

Below is the rocblas_sgemm function prototype.

rocblas_status rocblas_sgemm(
    rocblas_handle handle,
    rocblas_operation transa, rocblas_operation transb,
    rocblas_int m, rocblas_int n, rocblas_int k,
    const float *alpha,
    const float *A, rocblas_int lda,
    const float *B, rocblas_int ldb,
    const float *beta,
    float *C, rocblas_int ldc);
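
The row-major advice above works because a row-major m x n buffer has the same memory layout as the column-major n x m transpose, so computing C^T = B^T * A^T with a column-major routine yields row-major C. A minimal host-side sketch follows, restricted to the no-transpose case. The names ref_sgemm_colmajor and sgemm_rowmajor are hypothetical stand-ins and are not part of rocBLAS; with the real library the same argument swap would be applied to the rocblas_sgemm call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical column-major reference GEMM without transposes:
   C(m x n) = alpha * A(m x k) * B(k x n) + beta * C. */
static void ref_sgemm_colmajor(int m, int n, int k, float alpha,
                               const float *A, int lda,
                               const float *B, int ldb,
                               float beta, float *C, int ldc)
{
    assert(m >= 0 && n >= 0 && k >= 0);
    assert(lda >= m && ldb >= k && ldc >= m);

    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p)
                acc += A[i + (size_t)p * lda] * B[p + (size_t)j * ldb];
            C[i + (size_t)j * ldc] = alpha * acc + beta * C[i + (size_t)j * ldc];
        }
    }
}

/* Row-major C(m x n) = alpha * A(m x k) * B(k x n) + beta * C,
   obtained by swapping A and B, m and n, lda and ldb in the
   column-major call. With transposes, transa and transb swap too. */
static void sgemm_rowmajor(int m, int n, int k, float alpha,
                           const float *A, int lda,
                           const float *B, int ldb,
                           float beta, float *C, int ldc)
{
    ref_sgemm_colmajor(n, m, k, alpha, B, ldb, A, lda, beta, C, ldc);
}
```

For example, with row-major A = {1, 2, 3, 4, 5, 6} (2 x 3, lda = 3) and B = {7, 8, 9, 10, 11, 12} (3 x 2, ldb = 2), calling sgemm_rowmajor(2, 2, 3, 1.0f, A, 3, B, 2, 0.0f, C, 2) on a zero-initialized C fills it with {58, 64, 139, 154} in row-major order.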

rocBLAS-0.4.2.0 release for ROCM 1.6

05 Jul 20:25
Pre-release

Library release associated with the ROCm v1.6 platform release.

Library tuned for Fiji family hardware.