aiter

AITER is AMD's centralized repository of high-performance AI operators for accelerating AI workloads. It serves as a single, unified place for operator-level requests from customers with different needs. Developers can focus on the operators, while customers integrate this op collection into their own frameworks, whether private or public.

Feature summary:

  • C++ level API
  • Python level API
  • The underlying kernels can come from Triton, CK, or asm
  • Not just inference kernels, but also training kernels and GEMM+communication kernels, which makes it possible to work around architecture limitations in any kernel-framework combination.

Installation

git clone --recursive https://github.com/ROCm/aiter.git

or, if the repository was cloned without --recursive, fetch the submodules from inside the checkout:

git submodule sync ; git submodule update --init --recursive

Then install the package in development mode:

cd aiter
python3 setup.py develop
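
As a quick sanity check after installation, the sketch below tests whether a module can be located on the current Python path without importing it. It assumes the package is exposed under the import name aiter, which this README does not state explicitly; the helper name is_installed is hypothetical.

```python
import importlib.util


def is_installed(module_name: str = "aiter") -> bool:
    """Return True if a top-level module can be located on the Python path.

    The default name "aiter" is an assumption about the installed import name.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("aiter importable:", is_installed())
```

If this prints False, re-running the install step from inside the aiter directory is a reasonable first thing to try.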

Run operators supported by aiter

There are a number of op tests. You can run one with:

python3 op_tests/test_layernorm2d.py
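
To run several op tests in one go, a small driver script can help. The sketch below assumes the other tests follow the same op_tests/test_*.py naming as test_layernorm2d.py and can each be run as a standalone script; only that one file name is confirmed here. The helper names find_op_tests and run_op_tests are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def find_op_tests(test_dir="op_tests", pattern="test_*.py"):
    """Return the test scripts in test_dir, sorted by name.

    The test_*.py pattern is an assumption based on test_layernorm2d.py.
    """
    return sorted(Path(test_dir).glob(pattern))


def run_op_tests(test_dir="op_tests"):
    """Run each test script with the current interpreter.

    Returns a dict mapping script name to its process exit code.
    """
    results = {}
    for script in find_op_tests(test_dir):
        proc = subprocess.run([sys.executable, str(script)])
        results[script.name] = proc.returncode
    return results


if __name__ == "__main__":
    results = run_op_tests()
    failed = [name for name, code in results.items() if code != 0]
    print(f"{len(results) - len(failed)} passed, {len(failed)} failed")
    for name in failed:
        print("FAILED:", name)
```

Run it from the repository root so that the relative op_tests path resolves. A non-zero exit code is treated as a failure, which is a convention of this sketch rather than something the test scripts are documented to follow.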

Ops            Description
ELEMENT WISE   Element-wise ops: + - * /
SIGMOID        sigmoid(x) = 1 / (1 + e^-x)
ALLREDUCE      Reduce + Broadcast
KVCACHE        W_K W_V
MHA            Multi-Head Attention
MLA            Multi-head Latent Attention with KV-Cache layout
PA             Paged Attention
FusedMoe       Mixture of Experts
QUANT          BF16/FP16 -> FP8/INT4
RMSNORM        Root mean square normalization
LAYERNORM      y = (x - μ) / (σ² + ϵ)^0.5
ROPE           Rotary Position Embedding
GEMM           D = αAB + βC
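
For readers who want to check numerical results against a known baseline, the sketch below gives plain-Python reference versions of four of the ops listed above, using their conventional textbook definitions (GEMM in the usual BLAS form D = αAB + βC). These are not AITER's APIs: the function names, the eps defaults, and the omission of learnable scale and bias in the norms are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    """sigmoid(x) = 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))


def rmsnorm(x, eps=1e-6):
    """Divide each element by the root mean square of the vector.

    eps is an assumed default; no learnable weight is applied.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale for v in x]


def layernorm(x, eps=1e-5):
    """y = (x - mean) / sqrt(variance + eps), without scale or bias."""
    mu = sum(x) / len(x)
    var = sum((v - mu) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mu) * inv_std for v in x]


def gemm(a, b, c, alpha=1.0, beta=1.0):
    """D = alpha * (A @ B) + beta * C for matrices given as lists of rows."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [
            alpha * sum(a[i][k] * b[k][j] for k in range(inner)) + beta * c[i][j]
            for j in range(cols)
        ]
        for i in range(rows)
    ]
```

Comparing an accelerated kernel's output against functions like these, within a tolerance suited to the data type, is a common way to validate low-precision implementations.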