We're doing some work over at https://github.com/huggingface/candle to improve our Metal backend. I've been collecting gputraces from various frameworks, and was wondering whether there is a documented or known way to generate one for llama.cpp during model inference.
Specifically, I'm talking about the kind of output produced by the Xcode Metal debugger: https://developer.apple.com/documentation/xcode/metal-debugger