rocBLAS-2.0.0 for ROCm 2.0
Changelist:
- improve performance of fp16/fp32 rocblas_gemm_ex on gfx906
- add support for i8/i32 rocblas_gemm_ex
- update vega-10 resnet50 tuning
- refactor testing to be data driven
- change gemm_ex API solution index from uint32_t to int32_t
- disable gemm and gemm_ex chunking
- fix gemv argument checking
- add performance script for p1b1 benchmark sizes
- refactor gemm code to reduce use of macros
- fix trsm performance regression