Traceback (most recent call last):
  File "/home/ubuntu/text-generation-webui/modules/callbacks.py", line 56, in gentask
    ret = self.mfunc(callback=_callback, *args, **self.kwargs)
  File "/home/ubuntu/text-generation-webui/modules/text_generation.py", line 361, in generate_with_callback
    shared.model.generate(**kwargs)
  File "/home/ubuntu/miniconda3/envs/textgen/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/ubuntu/miniconda3/envs/textgen/lib/python3.10/site-packages/transformers/generation/utils.py", line 1652, in generate
    return self.sample(
  File "/home/ubuntu/miniconda3/envs/textgen/lib/python3.10/site-packages/transformers/generation/utils.py", line 2734, in sample
    outputs = self(
  File "/home/ubuntu/text-generation-webui/modules/exllama_hf.py", line 96, in __call__
    self.ex_model.forward(seq_tensor[longest_prefix:-1].view(1, -1), ex_cache, preprocess_only=True, lora=self.lora)
  File "/home/ubuntu/miniconda3/envs/textgen/lib/python3.10/site-packages/exllama/model.py", line 972, in forward
    r = self._forward(input_ids[:, chunk_begin : chunk_end],
  File "/home/ubuntu/miniconda3/envs/textgen/lib/python3.10/site-packages/exllama/model.py", line 1058, in _forward
    hidden_states = decoder_layer.forward(hidden_states, cache, buffers[device], lora)
  File "/home/ubuntu/miniconda3/envs/textgen/lib/python3.10/site-packages/exllama/model.py", line 536, in forward
    hidden_states = self.self_attn.forward(hidden_states, cache, buffer, lora)
  File "/home/ubuntu/miniconda3/envs/textgen/lib/python3.10/site-packages/exllama/model.py", line 491, in forward
    attn_output = F.scaled_dot_product_attention(query_states, key_states, value_states, attn_mask = buffer.attn_mask, is_causal = False)
RuntimeError: The expanded size of the tensor (641) must match the existing size (640) at non-singleton dimension 3. Target sizes: [1, 40, 621, 641]. Tensor sizes: [1, 1, 621, 640]
This is on text-generation-webui v1.7 with exllama-0.0.17. The same prompt and SampleParam settings sometimes trigger this error and sometimes generate fine.
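For reference, the error itself is just a broadcast failure inside F.scaled_dot_product_attention: the attn_mask's last dimension has to match the key/value sequence length, and here it is one position short (640 vs 641). A minimal sketch that reproduces the same message, with shapes taken from the traceback (40 heads and head_dim=128 are assumptions, roughly a 13B-class model):

    import torch
    import torch.nn.functional as F

    # Shapes mirror the traceback; 40 heads and head_dim=128 are assumed.
    q = torch.randn(1, 40, 621, 128)    # queries for the current chunk
    k = torch.randn(1, 40, 641, 128)    # keys cover 641 positions (cache + chunk)
    v = torch.randn(1, 40, 641, 128)
    mask = torch.zeros(1, 1, 621, 640)  # mask is one key position short

    # Raises: RuntimeError: The expanded size of the tensor (641) must match the
    # existing size (640) at non-singleton dimension 3 ...
    F.scaled_dot_product_attention(q, k, v, attn_mask=mask, is_causal=False)

So my guess is that whatever builds buffer.attn_mask in exllama/model.py and the cache position advanced by exllama_hf.py (the seq_tensor[longest_prefix:-1] slice) disagree by exactly one token, which would also explain why the same prompt only fails intermittently: the off-by-one would depend on how much of the previous sequence is reused as the prefix.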