Frequently Asked Questions
ThiloteE edited this page Aug 27, 2024
There are many "best" models for many situations. Which model is best for you depends on the following:
- How much effort you want to put into setting it up:
  - If you want it all done for you "asap": scroll through our "Add Models" list without using the keyword search function. These models are pre-configured and ready to use.
  - If you want a custom one and want to configure it yourself (these are NOT pre-configured; we have a wiki explaining how to do this), either:
    - Download one using the keyword search function on our "Add Models" page to find models from Hugging Face, or
    - Sideload one from some other website.
Hardware requirements
- As a general rule of thumb:
  - Smaller models require less memory (RAM or VRAM) and will run faster.
  - Larger models require more memory and will run slower, but outperform in terms of capabilities and produce better output.
  - Newer models tend to outperform older models, sometimes to such a degree that a smaller newer model outperforms a larger older model.
- What you need the model to do:
  - The models that work with GPT4All are made for generating text.
    - Multilingual models are better at certain languages.
    - Coding models are better at understanding code.
    - Agentic or function/tool-calling models will use tools made available to them.
    - Instruct models are better at being directed to perform tasks.
    - Chat models are good for conversational purposes.
    - Uncensored models are good for roleplaying or story writing. These come in various forms and are derived from the chat or instruct variants.
  - Look at benchmarks to get an idea of which model does what better.
- Windows:
  - Settings directory:
    C:\Users\%USERNAME%\AppData\Roaming\nomic.ai
  - Models directory:
    C:\Users\%USERNAME%\AppData\Local\nomic.ai\GPT4All
- macOS:
  - Settings directory:
    /Users/{username}/.config/gpt4all.io
  - Models directory:
    /Users/{username}/Library/Application Support/nomic.ai/GPT4All
- Linux:
  - Settings directory:
    /home/{username}/.config/nomic.ai
  - Models directory:
    /home/{username}/.local/share/nomic.ai/GPT4All
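The per-OS paths above can be resolved programmatically. Below is a minimal Python sketch; the helper name `gpt4all_dirs` is ours for illustration and is not part of GPT4All itself:

```python
import os
import sys
from pathlib import Path

def gpt4all_dirs():
    """Return (settings_dir, models_dir) for the current OS,
    following the paths listed above. Illustrative sketch only;
    the GPT4All application resolves these itself."""
    home = Path.home()
    if sys.platform == "win32":
        roaming = Path(os.environ["APPDATA"])      # ...\AppData\Roaming
        local = Path(os.environ["LOCALAPPDATA"])   # ...\AppData\Local
        return roaming / "nomic.ai", local / "nomic.ai" / "GPT4All"
    if sys.platform == "darwin":
        return (home / ".config" / "gpt4all.io",
                home / "Library" / "Application Support" / "nomic.ai" / "GPT4All")
    # Linux and other POSIX systems
    return (home / ".config" / "nomic.ai",
            home / ".local" / "share" / "nomic.ai" / "GPT4All")
```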
- Temperature: This controls the randomness of predictions; lower values make the model more deterministic, while higher values increase randomness.
- Top K: This limits the sampling pool to the most probable tokens. For example, if K=50, only the 50 most likely tokens are considered for the next word prediction.
- Top P: The model looks at all possible next tokens and keeps the smallest group of tokens whose combined probability is at least this value. For instance, a setting of "1" includes 100% of all probable tokens; if P=0.9, it keeps the fewest tokens with a combined probability of at least 90%. The closer this is set to 0, the fewer tokens the model will consider next.
- Min P: This sets a minimum probability threshold for individual tokens; the remaining selected tokens are renormalized to a combined probability of 100%. A setting of "1" will include only a token with a probability of 100%. A much lower setting like P=0.05 includes the smallest number of tokens, each with a probability greater than 5%.
Experience how settings like Temperature, Top K, Top P, and Min P change model behavior in this live example.
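The four settings above can also be illustrated in code. The sketch below is a toy version of the filtering pipeline, not GPT4All's actual sampler; Min P is applied here as an absolute probability threshold, as described above:

```python
import math

def sample_distribution(logits, temperature=1.0, top_k=0, top_p=1.0, min_p=0.0):
    """Toy illustration of Temperature, Top K, Top P, and Min P.
    Returns the renormalized distribution the model would sample from."""
    # Temperature: scale logits before softmax; lower values sharpen the
    # distribution (more deterministic), higher values flatten it.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    # (probability, token_id) pairs, most probable first
    ranked = sorted(((e / total, i) for i, e in enumerate(exps)), reverse=True)

    # Top K: keep only the K most probable tokens (0 disables the filter).
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top P: keep the smallest prefix whose cumulative probability >= top_p.
    kept, cum = [], 0.0
    for p, tok in ranked:
        kept.append((p, tok))
        cum += p
        if cum >= top_p:
            break

    # Min P: drop tokens below the threshold (keep at least the top token).
    kept = [(p, tok) for p, tok in kept if p >= min_p] or kept[:1]

    # Renormalize so the surviving probabilities sum to 1.
    z = sum(p for p, _ in kept)
    return {tok: p / z for p, tok in kept}
```

For example, with three candidate tokens, `top_k=2` keeps only the two most probable, while a low temperature concentrates almost all probability on the single most likely token.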
Find the right number of GPU layers in the model settings. If you have a small amount of GPU memory, start low and increase until the model won't load, then use the last known good setting. Make sure the model has GPU support.
- Vulkan supports F16, Q4_0, and Q4_1 models on the GPU (some models won't have any GPU support).
- CUDA supports all GGUF formats (some models won't have any GPU support).
Ensure you are using the GPU if you have one: see "Settings > Application > Device" and make sure it is set to use either Vulkan or CUDA.
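The trial-and-error procedure above (raise the layer count until loading fails, then keep the last setting that worked) can be sketched as a simple loop. `try_load` is a hypothetical callback standing in for an attempt to load the model with a given number of layers offloaded to the GPU:

```python
def find_max_gpu_layers(try_load, max_layers):
    """Return the highest GPU-layer count that still loads.
    `try_load(n)` is a hypothetical callback returning True if the
    model loads with n layers offloaded to the GPU."""
    best = 0
    for n in range(1, max_layers + 1):
        if not try_load(n):
            break          # out of VRAM: stop searching
        best = n           # last known good setting
    return best
```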