Issue with updated import and kernel compatibility for Qwen2_5_VL model #1582

Open
@berkesule

Description

The updated version of the script examples/multimodal_vision/qwen_2_5_vl_example.py no longer works for me with the same model and version as before: it fails with a linear kernel implementation error.

In the older version of the codebase, I was importing the model as:
from llmcompressor.transformers.tracing import TraceableQwen2_5_VLForConditionalGeneration

However, in the new version, the import is changed to:
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

With this new import, the same model and version fails with the linear kernel error shown under Errors below.

Is there any way to access the old import that worked previously?
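
For reference, one way a script could tolerate both import locations is a small fallback helper. This is only a sketch: `import_first_available` and `CANDIDATES` are hypothetical names made up for illustration, and whether the old `Traceable...` class is still present depends on the installed llmcompressor version.

```python
import importlib


def import_first_available(candidates):
    """Return the first attribute importable from a list of (module, name) pairs.

    Hypothetical helper, not part of llmcompressor or transformers.
    """
    errors = []
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr_name)
        except (ImportError, AttributeError) as exc:
            # Remember why this candidate failed and try the next one.
            errors.append(f"{module_name}.{attr_name}: {exc}")
    raise ImportError(
        "none of the candidates could be imported:\n" + "\n".join(errors)
    )


# Old location first, new location second (both taken from this issue).
CANDIDATES = [
    ("llmcompressor.transformers.tracing",
     "TraceableQwen2_5_VLForConditionalGeneration"),
    ("transformers", "Qwen2_5_VLForConditionalGeneration"),
]

# Usage (requires the libraries to be installed):
# ModelClass = import_first_available(CANDIDATES)
```

This only selects which class gets imported; it does not by itself change which kernels vLLM can use for the quantized weights.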

Environment
  1. OS: Google Colab
  2. Python 3.11.13
  3. llmcompressor==0.5.3
  4. torch 2.6.0

Errors
ValueError: Failed to find a kernel that can implement the WNA16 linear layer. Reasons:

  • MacheteLinearKernel requires capability 90, current compute capability is 80
  • AllSparkLinearKernel cannot implement due to: For Ampere GPU, AllSpark does not support group_size = 128. Only group_size = -1 are supported.
  • MarlinLinearKernel cannot implement due to: Weight output_size_per_partition = 3420 is not divisible by min_thread_n = 64. Consider reducing tensor_parallel_size or running with --quantization gptq.
  • BitBLASLinearKernel cannot implement due to: bitblas is not installed. Please install bitblas by running pip install bitblas>=0.1.0
  • ExllamaLinearKernel cannot implement due to: Output features must be a multiple of the pack factor (32 / num_bits) so that we can correctly pack the zero points
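
The shape-related reasons above can be reproduced with simple arithmetic. The following is a rough sketch, not vLLM's actual selection logic: `check_wna16_constraints` is a hypothetical helper, the thresholds (capability 90, `min_thread_n = 64`, pack factor `32 / num_bits`) are copied from the error text, and 4-bit weights are an assumption.

```python
def check_wna16_constraints(output_size_per_partition,
                            num_bits=4,            # assumption: 4-bit weights
                            min_thread_n=64,       # from the Marlin message
                            compute_capability=80):
    """Return the reasons a layer shape would be rejected.

    Hypothetical helper that mirrors the checks named in the error message.
    """
    reasons = []

    # Machete: needs compute capability 90 or higher.
    if compute_capability < 90:
        reasons.append(
            f"Machete: capability {compute_capability} < 90"
        )

    # Marlin: output size must be divisible by min_thread_n.
    remainder = output_size_per_partition % min_thread_n
    if remainder != 0:
        reasons.append(
            f"Marlin: {output_size_per_partition} % {min_thread_n} = {remainder}"
        )

    # Exllama: output features must be a multiple of the pack factor.
    pack_factor = 32 // num_bits
    remainder = output_size_per_partition % pack_factor
    if remainder != 0:
        reasons.append(
            f"Exllama: {output_size_per_partition} % {pack_factor} = {remainder}"
        )

    return reasons


for reason in check_wna16_constraints(3420):
    print(reason)
```

Under these assumptions, 3420 leaves a remainder of 28 when divided by 64 and a remainder of 4 when divided by 8, which matches the Marlin and Exllama rejections in the error.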

Metadata

Labels

bug (Something isn't working), vllm (Using vLLM)
