clip_model_load: model name: openai/clip-vit-large-patch14-336
clip_model_load: description: image encoder for LLaVA
clip_model_load: GGUF version: 3
clip_model_load: alignment: 32
clip_model_load: n_tensors: 377
clip_model_load: n_kv: 19
clip_model_load: ftype: q4_0
clip_model_load: loaded meta data with 19 key-value pairs and 377 tensors from llava-v1.5-7b-mmproj-Q4_0.gguf
clip_model_load: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
clip_model_load: - kv 0: general.architecture str = clip
clip_model_load: - kv 1: clip.has_text_encoder bool = false
clip_model_load: - kv 2: clip.has_vision_encoder bool = true
clip_model_load: - kv 3: clip.has_llava_projector bool = true
clip_model_load: - kv 4: general.file_type u32 = 2
clip_model_load: - kv 5: general.name str = openai/clip-vit-large-patch14-336
clip_model_load: - kv 6: general.description str = image encoder for LLaVA
clip_model_load: - kv 7: clip.vision.image_size u32 = 336
clip_model_load: - kv 8: clip.vision.patch_size u32 = 14
clip_model_load: - kv 9: clip.vision.embedding_length u32 = 1024
clip_model_load: - kv 10: clip.vision.feed_forward_length u32 = 4096
clip_model_load: - kv 11: clip.vision.projection_dim u32 = 768
clip_model_load: - kv 12: clip.vision.attention.head_count u32 = 16
clip_model_load: - kv 13: clip.vision.attention.layer_norm_epsilon f32 = 0.000010
clip_model_load: - kv 14: clip.vision.block_count u32 = 23
clip_model_load: - kv 15: clip.vision.image_mean arr[f32,3] = [0.481455, 0.457828, 0.408211]
clip_model_load: - kv 16: clip.vision.image_std arr[f32,3] = [0.268630, 0.261303, 0.275777]
clip_model_load: - kv 17: clip.use_gelu bool = false
clip_model_load: - kv 18: general.quantization_version u32 = 2
clip_model_load: - type f32: 235 tensors
clip_model_load: - type f16: 1 tensors
clip_model_load: - type q4_0: 141 tensors
clip_model_load: CLIP using CPU backend
clip_model_load: text_encoder: 0
clip_model_load: vision_encoder: 1
clip_model_load: llava_projector: 1
clip_model_load: model size: 169.18 MB
clip_model_load: metadata size: 0.17 MB
clip_model_load: params backend buffer size = 169.18 MB (377 tensors)
get_key_idx: note: key clip.vision.image_grid_pinpoints not found in file
get_key_idx: note: key clip.vision.mm_patch_merge_type not found in file
get_key_idx: note: key clip.vision.image_crop_resolution not found in file
clip_model_load: compute allocated memory: 32.89 MB
warming up the model with an empty run
llama server listening at http://127.0.0.1:8080
encode_image_with_clip: image embedding created: 576 tokens
encode_image_with_clip: image encoded in 698.12 ms by CLIP ( 1.21 ms per image patch)
evaluated 610 image tokens in 5332135 us at 114.401 tok/sec
encode_image_with_clip: image embedding created: 576 tokens
encode_image_with_clip: image encoded in 658.55 ms by CLIP ( 1.14 ms per image patch)
evaluated 610 image tokens in 6122562 us at 99.6315 tok/sec
encode_image_with_clip: image embedding created: 576 tokens
encode_image_with_clip: image encoded in 641.54 ms by CLIP ( 1.11 ms per image patch)
evaluated 610 image tokens in 6103907 us at 99.936 tok/sec
encode_image_with_clip: image embedding created: 576 tokens
encode_image_with_clip: image encoded in 648.21 ms by CLIP ( 1.13 ms per image patch)
evaluated 610 image tokens in 6093163 us at 100.112 tok/sec
encode_image_with_clip: image embedding created: 576 tokens
encode_image_with_clip: image encoded in 636.02 ms by CLIP ( 1.10 ms per image patch)
evaluated 610 image tokens in 6136775 us at 99.4007 tok/sec
encode_image_with_clip: image embedding created: 576 tokens
encode_image_with_clip: image encoded in 643.11 ms by CLIP ( 1.12 ms per image patch)
evaluated 610 image tokens in 6048914 us at 100.845 tok/sec
encode_image_with_clip: image embedding created: 576 tokens
encode_image_with_clip: image encoded in 664.80 ms by CLIP ( 1.15 ms per image patch)
evaluated 610 image tokens in 5293222 us at 115.242 tok/sec
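
Note (not part of the original log): the log shows the server listening at http://127.0.0.1:8080. A minimal sketch of how one might query it is below, assuming a llama.cpp-style HTTP server exposing a /completion endpoint; the "image_data" field and the "[img-12]" prompt marker are assumptions about this particular LLaVA server build, and "example.jpg" is a hypothetical local file.

    # sketch.py -- hedged example, not taken from the log above
    import base64
    import json
    import urllib.request

    # Encode a local image as base64 (hypothetical file name).
    with open("example.jpg", "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")

    # Request body for an assumed llama.cpp-style /completion endpoint.
    payload = {
        "prompt": "USER: [img-12]\nDescribe the image.\nASSISTANT:",
        "image_data": [{"data": image_b64, "id": 12}],
        "n_predict": 128,
    }

    req = urllib.request.Request(
        "http://127.0.0.1:8080/completion",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # The server is assumed to return JSON with a "content" field.
        print(json.loads(resp.read())["content"])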