@kvcache-ai

kvcache.ai

KVCache.AI is a joint research project between MADSys and top industry collaborators, focusing on efficient LLM serving.

Pinned

  1. Mooncake Public

    Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

    C++ · 3.7k stars · 343 forks

  2. ktransformers Public

    A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

    Python · 14.8k stars · 1.1k forks

Repositories

Showing 6 of 6 repositories
  • Mooncake Public

    Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

    C++ · 3,723 stars · Apache-2.0 · 343 forks · 120 open issues (2 issues need help) · 41 pull requests · Updated Aug 8, 2025
  • sglang Public Forked from sgl-project/sglang

    SGLang is a fast serving framework for large language models and vision language models.

    Python · 2 stars · Apache-2.0 · 2,558 forks · 0 open issues · 1 pull request · Updated Aug 8, 2025
  • ktransformers Public

    A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

    Python · 14,810 stars · Apache-2.0 · 1,058 forks · 593 open issues · 16 pull requests · Updated Aug 2, 2025
  • DeepEP_fault_tolerance Public Forked from deepseek-ai/DeepEP

    DeepEP: an efficient expert-parallel communication library that supports fault tolerance

    Cuda · 1 star · MIT · 890 forks · 0 open issues · 0 pull requests · Updated Jul 31, 2025
  • custom_flashinfer Public Forked from flashinfer-ai/flashinfer

    FlashInfer: Kernel Library for LLM Serving

    Cuda · 6 stars · Apache-2.0 · 421 forks · 0 open issues · 0 pull requests · Updated Jul 24, 2025
  • vllm Public Forked from vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python · 14 stars · Apache-2.0 · 9,326 forks · 0 open issues · 0 pull requests · Updated Mar 27, 2025
