DOC: Add doc for fish speech and cogvlm2 video (#2149)

xorbitsai · Aug 23, 2024 · 16d1193
1 parent e672767

Showing 6 changed files with 69 additions and 0 deletions.
diff --git a/doc/source/models/builtin/audio/fishspeech-1.2-sft.rst b/doc/source/models/builtin/audio/fishspeech-1.2-sft.rst
@@ -0,0 +1,19 @@
+.. _models_builtin_fishspeech-1.2-sft:
+
+==================
+FishSpeech-1.2-SFT
+==================
+
+- **Model Name:** FishSpeech-1.2-SFT
+- **Model Family:** FishAudio
+- **Abilities:** text-to-audio
+- **Multilingual:** True
+
+Specifications
+^^^^^^^^^^^^^^
+
+- **Model ID:** fishaudio/fish-speech-1.2-sft
+
+Execute the following command to launch the model::
+
+   xinference launch --model-name FishSpeech-1.2-SFT --model-type audio
diff --git a/doc/source/models/builtin/audio/index.rst b/doc/source/models/builtin/audio/index.rst
@@ -25,6 +25,8 @@ The following is a list of built-in audio models in Xinference:
 
    cosyvoice-300m-sft
 
+   fishspeech-1.2-sft
+
    sensevoicesmall
 
    whisper-base

diff --git a/doc/source/models/builtin/llm/cogvlm2-video-llama3-chat.rst b/doc/source/models/builtin/llm/cogvlm2-video-llama3-chat.rst
@@ -0,0 +1,31 @@
+.. _models_llm_cogvlm2-video-llama3-chat:
+
+========================================
+cogvlm2-video-llama3-chat
+========================================
+
+- **Context Length:** 8192
+- **Model Name:** cogvlm2-video-llama3-chat
+- **Languages:** en, zh
+- **Abilities:** chat, vision
+- **Description:** CogVLM2-Video achieves state-of-the-art performance on multiple video question answering tasks.
+
+Specifications
+^^^^^^^^^^^^^^
+
+
+Model Spec 1 (pytorch, 12 Billion)
+++++++++++++++++++++++++++++++++++++++++
+
+- **Model Format:** pytorch
+- **Model Size (in billions):** 12
+- **Quantizations:** 4-bit, 8-bit, none
+- **Engines**: Transformers
+- **Model ID:** THUDM/cogvlm2-video-llama3-chat
+- **Model Hubs**:  `Hugging Face <https://huggingface.co/THUDM/cogvlm2-video-llama3-chat>`__, `ModelScope <https://modelscope.cn/models/ZhipuAI/cogvlm2-video-llama3-chat>`__
+
+Execute the following command to launch the model. Replace ``${engine}`` with the engine listed above and
+``${quantization}`` with your chosen quantization method from the options listed above::
+
+   xinference launch --model-engine ${engine} --model-name cogvlm2-video-llama3-chat --size-in-billions 12 --model-format pytorch --quantization ${quantization}
+
diff --git a/doc/source/models/builtin/llm/index.rst b/doc/source/models/builtin/llm/index.rst
@@ -111,6 +111,11 @@ The following is a list of built-in LLM in Xinference:
      - 8192
      - CogVLM2 have achieved good results in many lists compared to the previous generation of CogVLM open source models. Its excellent performance can compete with some non-open source models.
 
+   * - :ref:`cogvlm2-video-llama3-chat <models_llm_cogvlm2-video-llama3-chat>`
+     - chat, vision
+     - 8192
+     - CogVLM2-Video achieves state-of-the-art performance on multiple video question answering tasks.
+
    * - :ref:`csg-wukong-chat-v0.1 <models_llm_csg-wukong-chat-v0.1>`
      - chat
      - 32768
@@ -534,6 +539,8 @@ The following is a list of built-in LLM in Xinference:
 
    cogvlm2
 
+   cogvlm2-video-llama3-chat
+
    csg-wukong-chat-v0.1
 
    deepseek

diff --git a/xinference/deploy/docker/requirements.txt b/xinference/deploy/docker/requirements.txt
@@ -61,6 +61,11 @@ tensorizer~=2.9.0
 imageio-ffmpeg  # For video
 eva-decord  # For video in VL
 jj-pytorchvideo # For CogVLM2-video
+loguru  # For Fish Speech
+natsort  # For Fish Speech
+loralib  # For Fish Speech
+opencc==1.1.6  # For Fish Speech
+faster_whisper  # For Fish Speech
 
 # sglang
 outlines>=0.0.44

diff --git a/xinference/deploy/docker/requirements_cpu.txt b/xinference/deploy/docker/requirements_cpu.txt
@@ -56,3 +56,8 @@ openai-whisper  # For CosyVoice
 imageio-ffmpeg  # For video
 eva-decord  # For video in VL
 jj-pytorchvideo # For CogVLM2-video
+loguru  # For Fish Speech
+natsort  # For Fish Speech
+loralib  # For Fish Speech
+opencc==1.1.6  # For Fish Speech
+faster_whisper  # For Fish Speech
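
The launch command documented for cogvlm2-video-llama3-chat leaves two placeholders, ``${engine}`` and ``${quantization}``, to be filled in by hand. As a minimal sketch, the substitution could be scripted as below. The helper name `build_launch_command` is hypothetical and not part of Xinference. The quantization options and the engine name are copied from the spec in this commit, and the exact engine spelling the CLI accepts is an assumption.

```python
import shlex

# Options copied from "Model Spec 1 (pytorch, 12 Billion)" in this commit.
QUANTIZATIONS = ("4-bit", "8-bit", "none")
# Assumption: the CLI accepts the engine name as spelled in the doc.
DEFAULT_ENGINE = "Transformers"


def build_launch_command(quantization: str, engine: str = DEFAULT_ENGINE) -> str:
    """Hypothetical helper: fill in the documented launch command template."""
    if quantization not in QUANTIZATIONS:
        raise ValueError(
            f"unsupported quantization {quantization!r}; "
            f"expected one of {', '.join(QUANTIZATIONS)}"
        )
    args = [
        "xinference", "launch",
        "--model-engine", engine,
        "--model-name", "cogvlm2-video-llama3-chat",
        "--size-in-billions", "12",
        "--model-format", "pytorch",
        "--quantization", quantization,
    ]
    # shlex.join quotes any argument that would need it in a POSIX shell.
    return shlex.join(args)


if __name__ == "__main__":
    print(build_launch_command("4-bit"))
```

Rejecting values outside the documented list catches a mistyped quantization before the command reaches the CLI. The sketch only builds the command string and does not run it.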