OpenBLAS 0.3.13 version
common:
- Added a generic bfloat16 SBGEMV kernel
- Fixed a potentially severe memory leak after fork in OpenMP builds
that was introduced in 0.3.12
- Added detection of the Fujitsu Fortran compiler
- Added detection of the (e)gfortran compiler on OpenBSD
- Added support for overriding the default name of the library independently
from symbol suffixing in the gmake builds (already supported in cmake)
RISC V:
- Added a RISC V port optimized for C910V
POWER:
- Added optimized POWER10 kernels for SAXPY, CAXPY, SDOT, DDOT and DGEMV_N
- Improved DGEMM performance on POWER10
- Improved STRSM and DTRSM performance on POWER9 and POWER10
- Fixed segmentation faults in DYNAMIC_ARCH builds
- Fixed compilation with the PGI compiler
x86:
- Fixed compilation of kernels that require SSE2 intrinsics since 0.3.12
x86_64:
- Added an optimized bfloat16 SBGEMV kernel for SkylakeX and Cooperlake
- Improved the performance of SASUM and DASUM kernels through parallelization
- Improved the performance of SROT and DROT kernels
- Improved the performance of multithreaded xSYRK
- Fixed OpenMP builds that use the LLVM Clang compiler together with GNU gfortran
(where linking of both the LLVM libomp and GNU libgomp could lead to lockups or
wrong results)
- Fixed miscompilations by old gcc 4.6
- Fixed misdetection of AVX2 capability in some Sandybridge CPUs
- Fixed lockups in builds combining DYNAMIC_ARCH with TARGET=GENERIC on OpenBSD
ARM64:
- Fixed segmentation faults in DYNAMIC_ARCH builds
MIPS:
- Improved kernels for Loongson 3R3 ("3A") and 3R4 ("3B") models, including MSA
- Fixed bugs in the MSA kernels for CGEMM, CTRMM, CGEMV and ZGEMV
- Added handling of zero increments in the MSA kernels for SSWAP and DSWAP
- Added DYNAMIC_ARCH support for MIPS64 (currently Loongson3R3/3R4 only)
SPARC:
- Fixed building 32 and 64 bit SPARC kernels with the SolarisStudio compilers
md5sum:
2ca05b9cee97f0d1a8ab15bd6ea2b747 OpenBLAS-0.3.13.tar.gz
ab433ae7ed37ad282a67c2cfcc7c4301 OpenBLAS-0.3.13.zip
855469f768c6e32cf68f9cdb6f5fa69e OpenBLAS-0.3.13-x64.zip
467463847f57f54b94242fb6393a0bf9 OpenBLAS-0.3.13-x86.zip
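
The checksums above can be compared against a downloaded archive before unpacking it. The following is a minimal sketch in Python, using only the standard library; the helper names `md5_of_file` and `verify` are illustrative and not part of OpenBLAS, and the script assumes the archives sit in the current directory under the file names listed above.

```python
import hashlib
import os

# Checksums copied from the md5sum list in these release notes.
EXPECTED = {
    "OpenBLAS-0.3.13.tar.gz": "2ca05b9cee97f0d1a8ab15bd6ea2b747",
    "OpenBLAS-0.3.13.zip": "ab433ae7ed37ad282a67c2cfcc7c4301",
    "OpenBLAS-0.3.13-x64.zip": "855469f768c6e32cf68f9cdb6f5fa69e",
    "OpenBLAS-0.3.13-x86.zip": "467463847f57f54b94242fb6393a0bf9",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Return True if the file's MD5 digest matches expected_hex (case-insensitive)."""
    return md5_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Hypothetical usage: check whichever of the listed archives are present.
    for name, expected in EXPECTED.items():
        if not os.path.exists(name):
            print(f"{name}: not found, skipped")
        elif verify(name, expected):
            print(f"{name}: OK")
        else:
            print(f"{name}: MISMATCH")
```

An MD5 match only guards against a corrupted or truncated download; it is not a defense against deliberate tampering.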