# Models used in Generative Assistants

Here you can find the list of models currently available for use in Generative Assistants.

| model name | model link | open-source? | size (billion parameters) | GPU usage | max tokens (prompt + response) | description |
|---|---|---|---|---|---|---|
| BLOOMZ 7B | link | yes | 7.1B | 33 GB | 2,048 tokens | An open-source multilingual task-oriented large language model. BLOOMZ 7B1 comes from the BLOOMZ model family (featuring 560M, 1.1B, 1.7B, 3B, 7.1B, and 176B parameter versions). Each model is a BLOOM model of the corresponding size, fine-tuned on a cross-lingual task-instruction dataset (46 languages, 16 NLP tasks). For more details about BLOOM, refer to this paper. For more details about BLOOMZ and its dataset, refer to this paper. |
| GPT-J 6B | link | yes | 6B | 25 GB | 2,048 tokens | An open-source large language model. English-only, not fine-tuned for instruction following, and not capable of code generation. For more details, refer to this GitHub repo. |
| GPT-3.5 | link | no (paid access via API) | supposedly 175B | - (cannot be run locally) | 4,097 tokens | Based on text-davinci-003 -- the largest and most capable model of the GPT-3/GPT-3.5 family (featuring davinci, curie, babbage, and ada models). Unlike earlier GPT-3 models, it is able to understand and generate code. Unlike GPT-3.5 Turbo, it is not optimized for chat. For more details, refer to the OpenAI website. |
| ChatGPT | link | no (paid access via API) | supposedly 175B | - (cannot be run locally) | 4,096 tokens | Based on gpt-3.5-turbo -- the most capable model of the entire GPT-3/GPT-3.5 family. Optimized for chat. Able to understand and generate code. For more details, refer to the OpenAI website. |
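
As an illustration, the open-source models in the table can be loaded locally with the Hugging Face `transformers` library. Below is a minimal sketch for BLOOMZ 7B1; the model identifier `bigscience/bloomz-7b1`, the half-precision setting, and the generation parameters are assumptions chosen for illustration and are not prescribed by this document.

```python
# Minimal sketch: load one of the open-source models from the table (BLOOMZ 7B1)
# and generate a response locally. Assumes the Hugging Face model id
# "bigscience/bloomz-7b1" and a GPU with enough memory (~33 GB per the table).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloomz-7b1"  # assumed model id, not taken from the table
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # half precision to reduce GPU memory
    device_map="auto",          # requires the `accelerate` package
)

prompt = "Translate to English: Je t'aime."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Keep prompt + response within the 2,048-token limit listed in the table.
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The hosted models (GPT-3.5, ChatGPT) cannot be run locally and are instead accessed through the paid OpenAI API, as noted in the table.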