DOC: update readme & add tips for large image models (#2056)
qinxuye authored Aug 10, 2024
1 parent 3e7ed86 commit c4cbd38
Showing 6 changed files with 160 additions and 70 deletions.
10 changes: 5 additions & 5 deletions README.md
@@ -34,14 +34,14 @@ potential of cutting-edge AI models.
- Support speech recognition model: [#929](https://github.com/xorbitsai/inference/pull/929)
- Metrics support: [#906](https://github.com/xorbitsai/inference/pull/906)
### New Models
- Built-in support for [CogVideoX](https://github.com/THUDM/CogVideo): [#2049](https://github.com/xorbitsai/inference/pull/2049)
- Built-in support for [flux.1-schnell & flux.1-dev](https://www.basedlabs.ai/tools/flux1): [#2007](https://github.com/xorbitsai/inference/pull/2007)
- Built-in support for [MiniCPM-V 2.6](https://github.com/OpenBMB/MiniCPM-V): [#2031](https://github.com/xorbitsai/inference/pull/2031)
- Built-in support for [Kolors](https://huggingface.co/Kwai-Kolors/Kolors): [#2028](https://github.com/xorbitsai/inference/pull/2028)
- Built-in support for [SenseVoice](https://github.com/FunAudioLLM/SenseVoice): [#2008](https://github.com/xorbitsai/inference/pull/2008)
- Built-in support for [Mistral Large 2](https://mistral.ai/news/mistral-large-2407/): [#1944](https://github.com/xorbitsai/inference/pull/1944)
- Built-in support for [llama3.1](https://ai.meta.com/blog/meta-llama-3-1/): [#1932](https://github.com/xorbitsai/inference/pull/1932)
- Built-in support for [Mistral Nemo](https://mistral.ai/news/mistral-nemo/): [#1936](https://github.com/xorbitsai/inference/pull/1936)
- Built-in support for [CosyVoice](https://github.com/FunAudioLLM/CosyVoice): [#1881](https://github.com/xorbitsai/inference/pull/1881)
- Built-in support for [codegeex4](https://github.com/THUDM/CodeGeeX4): [#1888](https://github.com/xorbitsai/inference/pull/1888)
- Built-in support for [Gemma-2-it](https://huggingface.co/blog/gemma2): [#1774](https://github.com/xorbitsai/inference/pull/1774)
- Built-in support for [jina-reranker-v2](https://huggingface.co/jinaai/jina-reranker-v2-base-multilingual): [#1733](https://github.com/xorbitsai/inference/pull/1733)
- Built-in support for [Qwen2](https://github.com/QwenLM/Qwen2): [#1597](https://github.com/xorbitsai/inference/pull/1597)
### Integrations
- [Dify](https://docs.dify.ai/advanced/model-configuration/xinference): an LLMOps platform that enables developers (and even non-developers) to quickly build useful applications based on large language models, ensuring they are visual, operable, and improvable.
- [FastGPT](https://github.com/labring/FastGPT): a knowledge-based platform built on LLMs that offers out-of-the-box data processing and model invocation capabilities, and allows workflow orchestration through Flow visualization.
10 changes: 5 additions & 5 deletions README_zh_CN.md
@@ -31,14 +31,14 @@ Xorbits Inference(Xinference)是一个性能强大且功能全面的分布
- 支持语音识别模型: [#929](https://github.com/xorbitsai/inference/pull/929)
- 增加 Metrics 统计信息: [#906](https://github.com/xorbitsai/inference/pull/906)
### 新模型
- 内置 [CogVideoX](https://github.com/THUDM/CogVideo): [#2049](https://github.com/xorbitsai/inference/pull/2049)
- 内置 [flux.1-schnell & flux.1-dev](https://www.basedlabs.ai/tools/flux1): [#2007](https://github.com/xorbitsai/inference/pull/2007)
- 内置 [MiniCPM-V 2.6](https://github.com/OpenBMB/MiniCPM-V): [#2031](https://github.com/xorbitsai/inference/pull/2031)
- 内置 [Kolors](https://huggingface.co/Kwai-Kolors/Kolors): [#2028](https://github.com/xorbitsai/inference/pull/2028)
- 内置 [SenseVoice](https://github.com/FunAudioLLM/SenseVoice): [#2008](https://github.com/xorbitsai/inference/pull/2008)
- 内置 [Mistral Large 2](https://mistral.ai/news/mistral-large-2407/): [#1944](https://github.com/xorbitsai/inference/pull/1944)
- 内置 [llama3.1](https://ai.meta.com/blog/meta-llama-3-1/): [#1932](https://github.com/xorbitsai/inference/pull/1932)
- 内置 [Mistral Nemo](https://mistral.ai/news/mistral-nemo/): [#1936](https://github.com/xorbitsai/inference/pull/1936)
- 内置 [CosyVoice](https://github.com/FunAudioLLM/CosyVoice): [#1881](https://github.com/xorbitsai/inference/pull/1881)
- 内置 [codegeex4](https://github.com/THUDM/CodeGeeX4): [#1888](https://github.com/xorbitsai/inference/pull/1888)
- 内置 [Gemma-2-it](https://huggingface.co/blog/gemma2): [#1774](https://github.com/xorbitsai/inference/pull/1774)
- 内置 [jina-reranker-v2](https://huggingface.co/jinaai/jina-reranker-v2-base-multilingual): [#1733](https://github.com/xorbitsai/inference/pull/1733)
- 内置 [Qwen2](https://github.com/QwenLM/Qwen2): [#1597](https://github.com/xorbitsai/inference/pull/1597)
### 集成
- [FastGPT](https://doc.fastai.site/docs/development/custom-models/xinference/):一个基于 LLM 大模型的开源 AI 知识库构建平台。提供了开箱即用的数据处理、模型调用、RAG 检索、可视化 AI 工作流编排等能力,帮助您轻松实现复杂的问答场景。
- [Dify](https://docs.dify.ai/advanced/model-configuration/xinference): 一个涵盖了大型语言模型开发、部署、维护和优化的 LLMOps 平台。
51 changes: 29 additions & 22 deletions doc/source/locale/zh_CN/LC_MESSAGES/models/model_abilities/audio.po
@@ -8,7 +8,7 @@ msgid ""
msgstr ""
"Project-Id-Version: Xinference \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2024-07-30 21:20+0800\n"
"POT-Creation-Date: 2024-08-09 19:13+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
@@ -131,27 +131,31 @@ msgstr ""
msgid "Belle-whisper-large-v3-zh"
msgstr ""

#: ../../source/models/model_abilities/audio.rst:60
#: ../../source/models/model_abilities/audio.rst:57
msgid "SenseVoiceSmall"
msgstr ""

#: ../../source/models/model_abilities/audio.rst:61
msgid "Text to audio"
msgstr "文本转语音"

#: ../../source/models/model_abilities/audio.rst:62
#: ../../source/models/model_abilities/audio.rst:63
msgid "ChatTTS"
msgstr ""

#: ../../source/models/model_abilities/audio.rst:63
#: ../../source/models/model_abilities/audio.rst:64
msgid "CosyVoice"
msgstr ""

#: ../../source/models/model_abilities/audio.rst:66
#: ../../source/models/model_abilities/audio.rst:67
msgid "Quickstart"
msgstr "快速入门"

#: ../../source/models/model_abilities/audio.rst:69
#: ../../source/models/model_abilities/audio.rst:70
msgid "Transcription"
msgstr "转录"

#: ../../source/models/model_abilities/audio.rst:71
#: ../../source/models/model_abilities/audio.rst:72
msgid ""
"The Transcription API mimics OpenAI's `create transcriptions API "
"<https://platform.openai.com/docs/api-"
@@ -163,11 +167,11 @@
"可以通过 cURL、OpenAI Client 或者 Xinference 的 Python 客户端来尝试 "
"Transcription API:"

#: ../../source/models/model_abilities/audio.rst:122
#: ../../source/models/model_abilities/audio.rst:123
msgid "Translation"
msgstr "翻译"

#: ../../source/models/model_abilities/audio.rst:124
#: ../../source/models/model_abilities/audio.rst:125
msgid ""
"The Translation API mimics OpenAI's `create translations API "
"<https://platform.openai.com/docs/api-"
@@ -179,11 +183,11 @@
"通过 cURL、OpenAI Client 或 Xinference 的 Python 客户端来尝试使用 "
"Translation API:"

#: ../../source/models/model_abilities/audio.rst:174
#: ../../source/models/model_abilities/audio.rst:175
msgid "Speech"
msgstr "语音"

#: ../../source/models/model_abilities/audio.rst:176
#: ../../source/models/model_abilities/audio.rst:177
msgid ""
"The Speech API mimics OpenAI's `create speech API "
"<https://platform.openai.com/docs/api-reference/audio/createSpeech>`_. We"
@@ -194,44 +198,47 @@
"openai.com/docs/api-reference/audio/createSpeech>`_。你可以通过 cURL、"
"OpenAI Client 或者 Xinference 的 Python 客户端来尝试 Speech API:"

#: ../../source/models/model_abilities/audio.rst:179
#: ../../source/models/model_abilities/audio.rst:180
msgid "Speech API use non-stream by default as"
msgstr "Speech API 默认使用非流式"

#: ../../source/models/model_abilities/audio.rst:181
#: ../../source/models/model_abilities/audio.rst:182
msgid ""
"The stream output of ChatTTS is not as good as the non-stream output, "
"please refer to: https://github.com/2noise/ChatTTS/pull/564"
msgstr ""
"ChatTTS 的流式输出不如非流式的效果好,参考:https://github.com/2noise/ChatTTS/pull/564"
"ChatTTS 的流式输出不如非流式的效果好,参考:https://github.com/2noise/"
"ChatTTS/pull/564"

#: ../../source/models/model_abilities/audio.rst:182
#: ../../source/models/model_abilities/audio.rst:183
msgid ""
"The stream requires ffmpeg<7: "
"https://pytorch.org/audio/stable/installation.html#optional-dependencies"
msgstr "流式要求 ffmpeg<7:https://pytorch.org/audio/stable/installation.html#optional-dependencies"
msgstr ""
"流式要求 ffmpeg<7:https://pytorch.org/audio/stable/installation.html#"
"optional-dependencies"

#: ../../source/models/model_abilities/audio.rst:234
#: ../../source/models/model_abilities/audio.rst:235
msgid "CosyVoice Usage"
msgstr "CosyVoice 模型使用"

#: ../../source/models/model_abilities/audio.rst:236
#: ../../source/models/model_abilities/audio.rst:237
msgid "Basic usage, launch model ``CosyVoice-300M-SFT``."
msgstr "基本使用,加载模型 ``CosyVoice-300M-SFT``。"

#: ../../source/models/model_abilities/audio.rst:285
#: ../../source/models/model_abilities/audio.rst:286
msgid "Clone voice, launch model ``CosyVoice-300M``."
msgstr "克隆声音,加载模型 ``CosyVoice-300M``。"

#: ../../source/models/model_abilities/audio.rst:308
#: ../../source/models/model_abilities/audio.rst:309
msgid "Cross lingual usage, launch model ``CosyVoice-300M``."
msgstr "跨语言使用,加载模型 ``CosyVoice-300M``。"

#: ../../source/models/model_abilities/audio.rst:327
#: ../../source/models/model_abilities/audio.rst:328
msgid "Instruction based, launch model ``CosyVoice-300M-Instruct``."
msgstr "基于指令的声音合成,加载模型 ``CosyVoice-300M-Instruct``。"

#: ../../source/models/model_abilities/audio.rst:344
#: ../../source/models/model_abilities/audio.rst:345
msgid ""
"More instructions and examples, could be found at https://fun-audio-"
"llm.github.io/ ."
93 changes: 70 additions & 23 deletions doc/source/locale/zh_CN/LC_MESSAGES/models/model_abilities/image.po
@@ -8,7 +8,7 @@ msgid ""
msgstr ""
"Project-Id-Version: Xinference \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2024-06-26 12:25+0000\n"
"POT-Creation-Date: 2024-08-09 19:13+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
@@ -20,8 +20,8 @@ msgstr ""
"Generated-By: Babel 2.14.0\n"

#: ../../source/models/model_abilities/image.rst:5
msgid "Images (Experimental)"
msgstr "图像(实验性质)"
msgid "Images"
msgstr "图像"

#: ../../source/models/model_abilities/image.rst:7
msgid "Learn how to generate images with Xinference."
@@ -101,15 +101,23 @@ msgstr ""
msgid "sd3-medium"
msgstr ""

#: ../../source/models/model_abilities/image.rst:47
#: ../../source/models/model_abilities/image.rst:44
msgid "FLUX.1-schnell"
msgstr ""

#: ../../source/models/model_abilities/image.rst:45
msgid "FLUX.1-dev"
msgstr ""

#: ../../source/models/model_abilities/image.rst:49
msgid "Quickstart"
msgstr "快速入门"

#: ../../source/models/model_abilities/image.rst:50
#: ../../source/models/model_abilities/image.rst:52
msgid "Text-to-image"
msgstr "文生图"

#: ../../source/models/model_abilities/image.rst:52
#: ../../source/models/model_abilities/image.rst:54
msgid ""
"The Text-to-image API mimics OpenAI's `create images API "
"<https://platform.openai.com/docs/api-reference/images/create>`_. We can "
@@ -119,38 +127,77 @@
"可以通过 cURL、OpenAI Client 或 Xinference 的方式尝试使用 Text-to-image "
"API。"

#: ../../source/models/model_abilities/image.rst:108
#: ../../source/models/model_abilities/image.rst:109
msgid "Tips for Large Image models including sd3-medium, FLUX.1"
msgstr "大型图像模型部署(sd3-medium、FLUX.1 系列)贴士"

#: ../../source/models/model_abilities/image.rst:111
msgid "Useful extra parameters can be passed to launch including:"
msgstr "有用的传递给加载模型的额外参数包括:"

#: ../../source/models/model_abilities/image.rst:113
msgid ""
"If you are running ``sd3-medium`` on a GPU less than 24GB and "
"encountering out of memory, consider to add an extra param for launching "
"according to `this article "
"<https://huggingface.co/docs/diffusers/v0.29.1/en/api/pipelines/stable_diffusion/stable_diffusion_3"
"#dropping-the-t5-text-encoder-during-inference>`_."
"``--cpu_offload True``: specifying ``True`` will offload the components "
"of the model to CPU during inference in order to save memory, while "
"seeing a slight increase in inference latency. Model offloading will only"
" move a model component onto the GPU when it needs to be executed, while "
"keeping the remaining components on the CPU."
msgstr ""
"如果你在小于 24GB 的显卡上运行 ``sd3-medium`` 碰到内存不足的问题时,根据 "
"`这篇文章 <https://huggingface.co/docs/diffusers/v0.29.1/en/api/"
"pipelines/stable_diffusion/stable_diffusion_3#dropping-the-t5-text-"
"encoder-during-inference>`_ 考虑在加载模型时增加额外选项。"
"``--cpu_offload True``:指定 ``True`` 会在推理过程中将模型的组件卸载到 CPU 上以节省内存,"
"这会导致推理延迟略有增加。模型卸载仅会在需要执行时将模型组件移动到 GPU 上,同时保持其余组件在 CPU 上"

#: ../../source/models/model_abilities/image.rst:111
#: ../../source/models/model_abilities/image.rst:117
msgid ""
"xinference launch --model-name sd3-medium --model-type image "
"--text_encoder_3 None"
"``--quantize_text_encoder <text encoder layer>``: We leveraged the "
"``bitsandbytes`` library to load and quantize the T5-XXL text encoder to "
"8-bit precision. This allows you to keep using all text encoders "
"while only slightly impacting performance."
msgstr "``--quantize_text_encoder <text encoder layer>``:我们利用 ``bitsandbytes`` 库"
"加载并量化 T5-XXL 文本编码器至8位精度。这使得你能够在仅轻微影响性能的情况下继续使用全部文本编码器。"

#: ../../source/models/model_abilities/image.rst:120
msgid ""
"``--text_encoder_3 None``, for sd3-medium, removing the memory-intensive "
"4.7B parameter T5-XXL text encoder during inference can significantly "
"decrease the memory requirements with only a slight loss in performance."
msgstr ""
"``--text_encoder_3 None``,对于 sd3-medium,"
"移除在推理过程中内存密集型的47亿参数T5-XXL文本编码器可以显著降低内存需求,而仅造成性能上的轻微损失。"

#: ../../source/models/model_abilities/image.rst:124
msgid ""
"If you are trying to run large image models liek sd3-medium or FLUX.1 "
"series on GPU card that has less memory than 24GB, you may encounter OOM "
"when launching or inference. Try below solutions."
msgstr "如果你试图在显存小于24GB的GPU上运行像sd3-medium或FLUX.1系列这样的大型图像模型,"
"你在启动或推理过程中可能会遇到显存溢出(OOM)的问题。尝试以下解决方案。"

#: ../../source/models/model_abilities/image.rst:114
#: ../../source/models/model_abilities/image.rst:128
msgid "For FLUX.1 series, try to apply quantization."
msgstr "对于 FLUX.1 系列,尝试应用量化。"

#: ../../source/models/model_abilities/image.rst:134
msgid "For sd3-medium, apply quantization to ``text_encoder_3``."
msgstr "对于 sd3-medium 模型,对 ``text_encoder_3`` 应用量化。"

#: ../../source/models/model_abilities/image.rst:141
msgid "Or removing memory-intensive T5-XXL text encoder for sd3-medium."
msgstr "或者,移除 sd3-medium 模型中内存密集型的 T5-XXL 文本编码器。"

#: ../../source/models/model_abilities/image.rst:148
msgid "Image-to-image"
msgstr "图生图"

#: ../../source/models/model_abilities/image.rst:116
#: ../../source/models/model_abilities/image.rst:150
msgid "You can find more examples of Images API in the tutorial notebook:"
msgstr "你可以在教程笔记本中找到更多 Images API 的示例。"

#: ../../source/models/model_abilities/image.rst:120
#: ../../source/models/model_abilities/image.rst:154
msgid "Stable Diffusion ControlNet"
msgstr ""

#: ../../source/models/model_abilities/image.rst:123
#: ../../source/models/model_abilities/image.rst:157
msgid "Learn from a Stable Diffusion ControlNet example"
msgstr "学习一个 Stable Diffusion 控制网络的示例"


doc/source/locale/zh_CN/LC_MESSAGES/models/model_abilities/vision.po
@@ -8,7 +8,7 @@ msgid ""
msgstr ""
"Project-Id-Version: Xinference \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2024-06-05 12:48+0800\n"
"POT-Creation-Date: 2024-07-28 22:01+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
@@ -79,11 +79,15 @@ msgstr ""
msgid ":ref:`MiniCPM-Llama3-V 2.5 <models_llm_minicpm-llama3-v-2_5>`"
msgstr ""

#: ../../source/models/model_abilities/vision.rst:33
#: ../../source/models/model_abilities/vision.rst:30
msgid ":ref:`GLM-4V <models_llm_glm-4v>`"
msgstr ""

#: ../../source/models/model_abilities/vision.rst:34
msgid "Quickstart"
msgstr "快速入门"

#: ../../source/models/model_abilities/vision.rst:35
#: ../../source/models/model_abilities/vision.rst:36
msgid ""
"Images are made available to the model in two main ways: by passing a "
"link to the image or by passing the base64 encoded image directly in the "
@@ -92,23 +96,23 @@ msgstr ""
"模型可以通过两种主要方式获取图像:通过传递图像的链接或直接在请求中传递 "
"base64 编码的图像。"

#: ../../source/models/model_abilities/vision.rst:39
#: ../../source/models/model_abilities/vision.rst:40
msgid "Example using OpenAI Client"
msgstr "使用 OpenAI 客户端的示例"

#: ../../source/models/model_abilities/vision.rst:70
#: ../../source/models/model_abilities/vision.rst:71
msgid "Uploading base 64 encoded images"
msgstr "上传 Base64 编码的图片"

#: ../../source/models/model_abilities/vision.rst:112
#: ../../source/models/model_abilities/vision.rst:113
msgid "You can find more examples of ``vision`` ability in the tutorial notebook:"
msgstr "你可以在教程笔记本中找到更多关于 ``vision`` 能力的示例。"

#: ../../source/models/model_abilities/vision.rst:116
#: ../../source/models/model_abilities/vision.rst:117
msgid "Qwen VL Chat"
msgstr ""

#: ../../source/models/model_abilities/vision.rst:119
#: ../../source/models/model_abilities/vision.rst:120
msgid "Learn vision ability from a example using qwen-vl-chat"
msgstr "通过使用 qwen-vl-chat 的示例来学习使用 LLM 的视觉能力"
