Replies: 1 comment
Is the error the same every time?
I'm trying to run inference on my fine-tuned model. I tuned the HF weights, not the original ones, and I used a custom dataset:
```yaml
dataset:
  _component_: torchtune.datasets.text_completion_dataset
```
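For context, a `text_completion_dataset` section typically also specifies where the data comes from and which column holds the text. A minimal sketch, in which the `source`, `data_files` path, and `column` name are illustrative assumptions rather than my actual values:

```yaml
dataset:
  _component_: torchtune.datasets.text_completion_dataset
  source: json                        # any Hugging Face datasets source
  data_files: /path/to/my_data.json   # hypothetical path
  column: text                        # name of the text column
  max_seq_len: 2048
```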
I'm running:

```bash
tune run generate --config inference.yaml prompt="What are some interesting sites to visit in the Bay Area?"
```
with this config.
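(The full inference.yaml isn't reproduced here. For reference, a torchtune generation config is built around the model component, a checkpointer pointing at the fine-tuned checkpoint files, and the tokenizer. A minimal sketch with placeholder paths and filenames, assuming a Llama-2-style model and the HF checkpoint format; exact module paths vary across torchtune versions:)

```yaml
model:
  _component_: torchtune.models.llama2.llama2_7b

checkpointer:
  _component_: torchtune.utils.FullModelHFCheckpointer
  checkpoint_dir: /tmp/complete-model/
  checkpoint_files: [hf_model_0001_0.pt]   # placeholder filename
  output_dir: /tmp/complete-model/
  model_type: LLAMA2

tokenizer:
  _component_: torchtune.models.llama2.llama2_tokenizer
  path: /tmp/complete-model/tokenizer.model

prompt: "What are some interesting sites to visit in the Bay Area?"
max_new_tokens: 300
temperature: 0.6
top_k: 300
```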
Output:

```
DEBUG:torchtune.utils.logging:Setting manual seed to local seed 1234. Local seed is seed + rank = 1234 + 0
INFO:torchtune.utils.logging:Model is initialized with precision torch.bfloat16.
INFO:torchtune.utils.logging:What are some interesting sites to visit in the Bay Area?
INFO:torchtune.utils.logging:Time for inference: 0.65 sec total, 1.54 tokens/sec
INFO:torchtune.utils.logging:Bandwidth achieved: 31.64 GB/s
INFO:torchtune.utils.logging:Memory used: 20.62 GB
```
As you can see, the output is just the prompt echoed back with no generated text. I'd like help with how to run inference with this model.
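One detail worth noting: the timing line above (0.65 sec total at 1.54 tokens/sec) suggests only about one token was generated, which is what you would see if generation stopped almost immediately. The number of generated tokens is controlled by `max_new_tokens` in the generate config, and like `prompt` it can be overridden on the command line, e.g.:

```bash
tune run generate --config inference.yaml \
  prompt="What are some interesting sites to visit in the Bay Area?" \
  max_new_tokens=300
```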
I also tried converting the weights using convert_hf_to_gguf.py from llama.cpp, but I get:

```
RuntimeError: Internal: could not parse ModelProto from /tmp/complete-model/tokenizer.model
```
I have tried replacing the tokenizer and re-downloading it from HF, but nothing seems to work.
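For what it's worth, that RuntimeError comes from sentencepiece rather than from llama.cpp's own code: the conversion script tries to load tokenizer.model as a sentencepiece model. A quick way to check the file directly, a minimal sketch assuming sentencepiece is installed:

```python
# If this raises the same "could not parse ModelProto" error, the file at
# that path is not a valid sentencepiece model (it may be truncated, an
# LFS pointer stub, or a tiktoken-style BPE vocab as used by Llama 3).
import sentencepiece as spm

sp = spm.SentencePieceProcessor()
sp.load("/tmp/complete-model/tokenizer.model")
print("Loaded OK, vocab size:", sp.get_piece_size())
```

If the file turns out not to be a sentencepiece model at all (for example, Llama 3 ships a tiktoken-based tokenizer.model), that would explain why replacing or re-downloading it doesn't help.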