Pull requests: ggml-org/llama.cpp
convert : BailingMoE : fix qkv split when head_dim is 0
python
python script changes
#12687
opened Apr 1, 2025 by
CISC
Fix clang warning in gguf_check_reserved_keys
ggml
changes relating to the ggml tensor library for machine learning
#12686
opened Apr 1, 2025 by
yeahdongcn
vulkan: fix build when glslc doesn't support coopmat
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#12683
opened Apr 1, 2025 by
wbruna
WIP: Add support for CogAgent
examples
python
python script changes
server
#12679
opened Mar 31, 2025 by
Tianyue-Zhao
•
Draft
vocab : BailingMoE : change possessive quantifiers to greedy
#12677
opened Mar 31, 2025 by
CISC
[CANN]get_rows and dup optimization
Ascend NPU
issues specific to Ascend NPUs
ggml
changes relating to the ggml tensor library for machine learning
#12671
opened Mar 31, 2025 by
noemotiovon
update rope_multi:
ggml
changes relating to the ggml tensor library for machine learning
#12665
opened Mar 31, 2025 by
foldl
opencl : fix memory allocation size
ggml
changes relating to the ggml tensor library for machine learning
#12649
opened Mar 30, 2025 by
sparkleholic
tts : implement sesame CSM + Mimi decoder
examples
python
python script changes
#12648
opened Mar 29, 2025 by
ngxson
llama-server : implement universal assisted decoding
examples
server
#12635
opened Mar 28, 2025 by
g2mt
vulkan: Hybrid waitForFences/getFenceStatus to reduce fence latency
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#12630
opened Mar 28, 2025 by
jeffbolznv
vulkan: Implement split_k for coopmat2 flash attention.
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
Vulkan
Issues specific to the Vulkan backend
#12627
opened Mar 28, 2025 by
jeffbolznv
opencl: remove a self-referential macro
ggml
changes relating to the ggml tensor library for machine learning
#12626
opened Mar 28, 2025 by
linehill
sycl: allow ggml-sycl configuration and compilation using Visual Studio project/solution
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#12625
opened Mar 28, 2025 by
s-Nick
opencl: Add support for multiple devices
ggml
changes relating to the ggml tensor library for machine learning
Enable MMA for BF16 data types on Powerpc
ggml
changes relating to the ggml tensor library for machine learning
#12565
opened Mar 25, 2025 by
shalinib-ibm
•
Draft
vulkan: Implement grouped query attention in the coopmat2 FA shader
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#12559
opened Mar 25, 2025 by
jeffbolznv
ggml-quants : weighted rounding algorithms with cumulative search
generation quality
Quality of model output
ggml
changes relating to the ggml tensor library for machine learning
Less than 4 bits
Efforts related to viable quantized models using <4 bits
research 🔬
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
Tensor Encoding Scheme
https://github.com/ggerganov/llama.cpp/wiki/Tensor-Encoding-Schemes
Draft: vulkan: Add bfloat16 support
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#12554
opened Mar 24, 2025 by
jeffbolznv
llama-map to support hugepage feature of pagesize 2M or 1G which can …
#12552
opened Mar 24, 2025 by
nickhuang99