Added
Install script option to set the number of jobs used for rocBLAS and Tensile compilation (-j, --jobs)
Install script option to build clients without using any Fortran (--clients_no_fortran)
rocblas_client_initialize function, which performs rocBLAS initialization for clients (benchmark/test) and reports the execution time.
Tests for the output of reduction functions when given bad input
User-specified initialization (rand_int/trig_float/hpl) for matrices and vectors in rocblas-bench
Optimizations
Improved performance of trsm with side == left and n == 1
Improved performance of trsm with side == left and m <= 32, and with side == right and n <= 32
Changed
For the syrkx and trmm internal API, use the rocblas_stride datatype for offset
For non-batched and batched gemm_ex functions, if the C matrix pointer equals the D matrix pointer (aliased), their respective type and leading dimension arguments must now match
Test client dependencies updated to GTest 1.11
Moved non-global false positives reported by cppcheck from file-based suppression to inline suppression. File-based suppression will only be used for global false positives.
Help menu messages in install.sh
For the ger function, typecast the 'lda' (offset) datatype to size_t during offset calculation to avoid overflow, and remove duplicate template functions.
Default initialization of matrices and vectors in rocblas-bench changed from rand_int to hpl
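
The gemm_ex aliasing rule in this list can be illustrated with a small check. This is only a sketch of the stated condition, not rocBLAS code: DataType, MatrixArg, and aliasing_ok are hypothetical names invented for this example and do not appear in the rocBLAS API.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration only; these are NOT rocBLAS types.
enum class DataType { f16, f32, f64 };

struct MatrixArg
{
    const void*  ptr;  // device or host pointer to the matrix
    DataType     type; // element type argument
    std::int64_t ld;   // leading dimension argument
};

// Returns true when the C and D arguments satisfy the rule described above:
// if the pointers are equal (aliased), type and leading dimension must match.
bool aliasing_ok(const MatrixArg& c, const MatrixArg& d)
{
    if(c.ptr != d.ptr)
        return true; // not aliased: the rule imposes no extra constraint
    return c.type == d.type && c.ld == d.ld;
}
```

A caller that writes the result in place (C and D sharing a pointer) would therefore pass the same type and leading dimension for both matrices.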
Fixed
For the trmv function (non-transposed cases), avoid overflow in offset calculation
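
The offset overflow addressed in the ger and trmv entries can be sketched as follows. This is an illustrative example, not the rocBLAS implementation: column_offset and would_overflow_int32 are hypothetical helpers, and the sketch assumes non-negative 32-bit indices, a column-major layout, and a platform where size_t is 64 bits wide.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Assumption for this sketch: size_t is at least 64 bits wide.
static_assert(sizeof(std::size_t) >= 8, "sketch assumes a 64-bit size_t");

// Hypothetical helper: element offset of column `col` in a column-major
// matrix with leading dimension `lda`. Both inputs are assumed non-negative.
// Casting to size_t BEFORE multiplying keeps the product out of 32-bit range.
inline std::size_t column_offset(std::int32_t col, std::int32_t lda)
{
    return static_cast<std::size_t>(col) * static_cast<std::size_t>(lda);
}

// Hypothetical helper: reports whether col * lda would exceed the range of a
// 32-bit signed integer, which is the case the widening cast protects against.
inline bool would_overflow_int32(std::int32_t col, std::int32_t lda)
{
    const std::int64_t wide = static_cast<std::int64_t>(col) * lda;
    return wide > std::numeric_limits<std::int32_t>::max();
}
```

For example, with col = 50000 and lda = 50000 the product is 2,500,000,000, which does not fit in a 32-bit signed integer but is represented exactly once an operand is widened to size_t.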