HALA v1.0

mkstoyanov released this 22 Oct 22:59
b162f98
  • C++ template wrappers for BLAS implementations using CPU, CUDA or ROCm frameworks
  • support for real and complex numbers in single and double precision
  • sparse matrix support for matrix multiply and triangular solve methods
  • implementations of several iterative Krylov solvers (CPU and GPU versions)
  • support for some LAPACK methods (CPU only)
  • support for extended registers (SSE3, AVX, AVX-512) using templates wrapping CPU intrinsics
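
The first two items describe a single templated entry point that covers all four BLAS scalar types. The block below is a minimal sketch of that general pattern, not HALA's actual API: the `sketch` namespace, the `axpy` and `axpy_backend` names, and the loop-based stand-ins for `saxpy`, `daxpy`, `caxpy` and `zaxpy` are all hypothetical and chosen only for illustration. A real wrapper would forward to the linked BLAS, cuBLAS or rocBLAS routine instead of looping.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

namespace sketch { // hypothetical namespace, not part of HALA

// Shared loop used by the stand-in backends below: y <- a * x + y.
template <typename T>
void axpy_loop(std::size_t n, T a, const T* x, T* y) {
    for (std::size_t i = 0; i < n; ++i) y[i] += a * x[i];
}

// Stand-ins for the four type-specific BLAS routines.
// Assumption: a real wrapper would call the vendor library here.
inline void saxpy(std::size_t n, float a, const float* x, float* y) {
    axpy_loop(n, a, x, y);
}
inline void daxpy(std::size_t n, double a, const double* x, double* y) {
    axpy_loop(n, a, x, y);
}
inline void caxpy(std::size_t n, std::complex<float> a,
                  const std::complex<float>* x, std::complex<float>* y) {
    axpy_loop(n, a, x, y);
}
inline void zaxpy(std::size_t n, std::complex<double> a,
                  const std::complex<double>* x, std::complex<double>* y) {
    axpy_loop(n, a, x, y);
}

// Overload set mapping each scalar type to its backend routine.
inline void axpy_backend(std::size_t n, float a, const float* x, float* y) {
    saxpy(n, a, x, y);
}
inline void axpy_backend(std::size_t n, double a, const double* x, double* y) {
    daxpy(n, a, x, y);
}
inline void axpy_backend(std::size_t n, std::complex<float> a,
                         const std::complex<float>* x, std::complex<float>* y) {
    caxpy(n, a, x, y);
}
inline void axpy_backend(std::size_t n, std::complex<double> a,
                         const std::complex<double>* x, std::complex<double>* y) {
    zaxpy(n, a, x, y);
}

// Generic entry point: the scalar type is deduced from the arguments,
// so the caller never picks the s/d/c/z prefix by hand.
template <typename T>
void axpy(T alpha, const std::vector<T>& x, std::vector<T>& y) {
    assert(x.size() == y.size());
    axpy_backend(x.size(), alpha, x.data(), y.data());
}

} // namespace sketch
```

With this sketch, a call such as `sketch::axpy(2.0f, x, y)` on two `std::vector<float>` objects resolves to the single-precision backend, and the same call on `std::vector<std::complex<double>>` resolves to the double-precision complex one. Mixing scalar types in one call fails at compile time, which is one reason to prefer overload-based dispatch over runtime type checks.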