Commit 8cf408e

change default
Signed-off-by: Kyle Sayers <[email protected]>
1 parent 3765e3c commit 8cf408e

File tree

1 file changed: +4 additions, -3 deletions

  • src/llmcompressor/modifiers/quantization/gptq/base.py

src/llmcompressor/modifiers/quantization/gptq/base.py

Lines changed: 4 additions & 3 deletions
```diff
@@ -75,8 +75,9 @@ class GPTQModifier(Modifier, QuantizationMixin):
     :param block_size: Used to determine number of columns to compress in one pass
     :param dampening_frac: Amount of dampening to apply to H, as a fraction of the
         diagonal norm
-    :param actorder: order in which weight columns are quantized. For more information,
-        on actorder options, see https://github.com/vllm-project/vllm/pull/8135
+    :param actorder: order in which weight columns are quantized. Defaults to "static"
+        activation ordering, which achieves best accuracy recovery with no runtime cost.
+        For more information, see https://github.com/vllm-project/vllm/pull/8135
     :param offload_hessians: Set to True for decreased memory usage but increased
         runtime.

@@ -109,7 +110,7 @@ class GPTQModifier(Modifier, QuantizationMixin):
     sequential_targets: Union[str, List[str], None] = None
     block_size: int = 128
     dampening_frac: Optional[float] = 0.01
-    actorder: Optional[ActivationOrdering] = None
+    actorder: Optional[ActivationOrdering] = ActivationOrdering.STATIC
     offload_hessians: bool = False

     # private variables
```

0 commit comments