Albert in GGUF format #10

scenaristeur · 2024-05-29T11:39:42Z

Hello,
Do you plan to publish the various Albert models in GGUF format on https://huggingface.co/AgentPublic, as TheBloke does?

This would make them easier to use with tools such as node-llama-cpp or llama-cpp-python when developing applications.

Conversion documentation: ggerganov/llama.cpp#2948

bolinocroustibat · 2024-05-29T15:46:44Z

Hello, and thank you for your interest. Yes, this is planned, although it is not among our priorities.

scenaristeur · 2024-05-29T21:22:54Z

I tried the conversion but got an error message:

python llama.cpp/convert.py AgentPublic-albertlight-7b  --outfile AgentPublic-albertlight-7b.gguf --outtype q8_0
INFO:convert:Loading model file AgentPublic-albertlight-7b/model-00001-of-00003.safetensors
INFO:convert:Loading model file AgentPublic-albertlight-7b/model-00001-of-00003.safetensors
INFO:convert:Loading model file AgentPublic-albertlight-7b/model-00002-of-00003.safetensors
INFO:convert:Loading model file AgentPublic-albertlight-7b/model-00003-of-00003.safetensors
INFO:convert:model parameters count : 13015864320 (13B)
INFO:convert:params = Params(n_vocab=32000, n_embd=5120, n_layer=40, n_ctx=4096, n_ff=13824, n_head=40, n_head_kv=40, n_experts=None, n_experts_used=None, f_norm_eps=1e-05, rope_scaling_type=None, f_rope_freq_base=10000.0, f_rope_scale=None, n_orig_ctx=None, rope_finetuned=None, ftype=<GGMLFileType.MostlyQ8_0: 7>, path_model=PosixPath('AgentPublic-albertlight-7b'))
INFO:convert:Loaded vocab file PosixPath('AgentPublic-albertlight-7b/tokenizer.model'), type 'spm'
INFO:convert:Vocab info: <SentencePieceVocab with 32000 base tokens and 1 added tokens>
INFO:convert:Special vocab info: <SpecialVocab with 0 merges, special tokens {'bos': 1, 'eos': 2, 'unk': 0, 'pad': 0}, add special tokens {'bos': False, 'eos': False}>
INFO:convert:Writing AgentPublic-albertlight-7b.gguf, format 7
Traceback (most recent call last):
  File "/home/smag/dev/albert_conversion/llama.cpp/convert.py", line 1714, in <module>
    main()
  File "/home/smag/dev/albert_conversion/llama.cpp/convert.py", line 1708, in main
    OutputFile.write_all(outfile, ftype, params, model, vocab, special_vocab,
  File "/home/smag/dev/albert_conversion/llama.cpp/convert.py", line 1280, in write_all
    check_vocab_size(params, vocab, pad_vocab=pad_vocab)
  File "/home/smag/dev/albert_conversion/llama.cpp/convert.py", line 1099, in check_vocab_size
    raise ValueError(msg)
ValueError: Vocab size mismatch (model has 32000, but AgentPublic-albertlight-7b/tokenizer.model has 32001).

I am changing the last line of config.json to "vocab_size": 32001 and will keep you posted.

scenaristeur · 2024-05-29T21:34:32Z

But when I try to load it in llama-cpp-python, the 32000 resurfaces:

llama_model_load: error loading model: check_tensor_dims: tensor 'token_embd.weight' has wrong shape; expected 5120, 32001, got 5120, 32000, 1, 1

So the model has a problem with its "vocab_size", or with how it is defined in the config files.

llm_load_print_meta: LF token         = 13 '<0x0A>'
llm_load_tensors: ggml ctx size =    0.14 MiB
llama_model_load: error loading model: check_tensor_dims: tensor 'token_embd.weight' has wrong shape; expected  5120, 32001, got  5120, 32000,     1,     1
llama_load_model_from_file: failed to load model
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
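
The loader error above says the embedding tensor stored in the file has 32000 rows, whatever config.json declares. One way to confirm this on the original checkpoint without loading any weights is to read the JSON header of a .safetensors shard. The following is a minimal sketch, not part of llama.cpp: the helper names are hypothetical, the tensor name `model.embed_tokens.weight` is an assumption based on the usual Hugging Face Llama naming, and the demo writes a header-only stand-in file rather than reading a real Albert shard.

```python
import json
import struct
import tempfile
from pathlib import Path


def read_safetensors_shapes(path):
    """Return {tensor_name: shape} from the header of a .safetensors file."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the JSON header length (little-endian uint64).
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional entry that does not describe a tensor.
    return {
        name: info["shape"]
        for name, info in header.items()
        if name != "__metadata__"
    }


def write_fake_shard(path, shapes):
    """Write a header-only stand-in file (no tensor data) for the demo."""
    header = {
        name: {"dtype": "F16", "shape": shape, "data_offsets": [0, 0]}
        for name, shape in shapes.items()
    }
    blob = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(blob)))
        f.write(blob)


with tempfile.TemporaryDirectory() as tmp:
    shard = Path(tmp) / "fake-shard.safetensors"  # hypothetical file
    write_fake_shard(shard, {"model.embed_tokens.weight": [32000, 5120]})
    shapes = read_safetensors_shapes(shard)

# Assumed tensor name; rows correspond to vocabulary entries.
embedding_rows = shapes["model.embed_tokens.weight"][0]
print(embedding_rows)
```

On a real checkpoint, one would call `read_safetensors_shapes` on each shard and compare the row count with "vocab_size" in config.json; if the tensor has 32000 rows, editing the config alone cannot make the loader find 32001.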

scenaristeur · 2024-05-29T21:38:06Z

If we consider this:

"added_tokens": [
    {
      "id": 0,
      "content": "<unk>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    },
    {
      "id": 1,
      "content": "<s>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": true,
      "special": true
    },
    {
      "id": 2,
      "content": "</s>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": true,
      "special": true
    },
    {
      "id": 32000,
      "content": "<pad>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": true,
      "special": false
    }
  ],

Counting from ID 0 up to ID 32000 gives 32001 tokens, right?
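
The arithmetic can be checked against the converter log, which reported "32000 base tokens and 1 added tokens". A small sketch of that count follows, assuming the base vocabulary occupies IDs 0 to 31999 and that only added tokens with an ID outside this range extend the vocabulary; `count_vocab` is a hypothetical helper, not a llama.cpp function, and the token entries are trimmed to the two fields that matter here.

```python
def count_vocab(n_base, added_tokens):
    """Return (total, new_ids): vocabulary size and added IDs beyond the base range."""
    # IDs below n_base already exist in the base vocabulary and add nothing.
    new_ids = sorted({tok["id"] for tok in added_tokens if tok["id"] >= n_base})
    return n_base + len(new_ids), new_ids


# Entries from the added_tokens listing above, reduced to id and content.
added_tokens = [
    {"id": 0, "content": "<unk>"},
    {"id": 1, "content": "<s>"},
    {"id": 2, "content": "</s>"},
    {"id": 32000, "content": "<pad>"},
]

total, new_ids = count_vocab(32000, added_tokens)
print(total, new_ids)  # prints: 32001 [32000]
```

Under these assumptions, `<unk>`, `<s>` and `</s>` overlap with base IDs, and `<pad>` at ID 32000 is the single entry that pushes the tokenizer to 32001 while the model itself reports 32000.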

scenaristeur · 2024-05-29T21:49:38Z

The solution, according to ggerganov/llama.cpp#3583 (comment), would be to remove the entry from added_tokens.

But it is a bit strange: we end up with two tokens at ID 0, unk and pad. I am usually just a user and have never built or converted a model; what does this correspond to?

INFO:gguf.vocab:Setting special token type bos to 1
INFO:gguf.vocab:Setting special token type eos to 2
INFO:gguf.vocab:Setting special token type unk to 0
INFO:gguf.vocab:Setting special token type pad to 0
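
The workaround mentioned above, removing the extra entry from added_tokens, could be scripted roughly as follows. This is only a sketch under assumptions: `drop_out_of_range_tokens` is a hypothetical helper, the dictionary stands in for whichever tokenizer file holds the `added_tokens` list, and it says nothing about how the model behaves once `<pad>` is gone.

```python
import copy


def drop_out_of_range_tokens(tokenizer_cfg, n_vocab):
    """Return (cleaned_cfg, dropped) without added tokens whose id >= n_vocab."""
    cleaned = copy.deepcopy(tokenizer_cfg)  # leave the caller's data untouched
    kept, dropped = [], []
    for tok in cleaned.get("added_tokens", []):
        (kept if tok["id"] < n_vocab else dropped).append(tok)
    cleaned["added_tokens"] = kept
    return cleaned, dropped


# Stand-in for the parsed tokenizer file, trimmed to the relevant fields.
tokenizer_cfg = {
    "added_tokens": [
        {"id": 0, "content": "<unk>", "special": True},
        {"id": 1, "content": "<s>", "special": True},
        {"id": 2, "content": "</s>", "special": True},
        {"id": 32000, "content": "<pad>", "special": False},
    ]
}

cleaned, dropped = drop_out_of_range_tokens(tokenizer_cfg, n_vocab=32000)
print([tok["content"] for tok in dropped])  # prints: ['<pad>']
```

In practice one would load the file with `json.load`, apply the helper, and write the result to a copy of the model directory, keeping the original file for reference.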

scenaristeur · 2024-05-29T22:21:29Z

More information on the method: https://qwen.readthedocs.io/en/latest/quantization/gguf.html

scenaristeur · 2024-05-29T22:37:00Z

And here are the GGUF and Q8_0 files: https://huggingface.co/spoggy/AgentPublic-albertlight/tree/main

pedevineau · 2024-05-30T09:29:50Z

Thank you @scenaristeur for this bug report. We are trying to reproduce the bug.

pedevineau · 2024-05-30T15:59:54Z

We managed to produce a GGUF last week, but we have not yet been able to reproduce the error in your bug report.
We consider GGUF an important topic, but not a priority.
I am leaving the issue open so that we can share a solution to your error once we find the time.
Feel free to keep us posted if anything changes.
Thank you very much.

bolinocroustibat · 2024-06-04T13:04:24Z

FYI @scenaristeur, several people in the HuggingFace community have produced GGUF versions of Albert light 7B. The ones we know of so far:

bolinocroustibat assigned pedevineau May 29, 2024

bolinocroustibat closed this as completed May 29, 2024

pedevineau reopened this May 30, 2024
