[Issue]: VAE decode step fails when batch size=4 #3830

putnam · 2025-03-18T09:00:34Z

Issue Description

Test case:

Use SD 3.5 Large Turbo, using default VAE (built-in)
Sampler = Euler FlowMatch, guidance scale=1, steps=4, 1024x1024, batch count=1
If batch size is not 4, everything is fine. If batch size == 4, the following error occurs:

RuntimeError: Given groups=1, weight of size [512, 16, 3, 3], expected input[16, 4, 128, 128] to have 16 channels, but got 4 channels instead
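
For readers unfamiliar with this error: a 2D convolution weight is laid out as [out_channels, in_channels / groups, kH, kW] and its input as [batch, channels, H, W], so the check compares dimension 1 of each. The sketch below reproduces that comparison with plain tuples. The function name `conv2d_channel_check` is hypothetical and is not part of SD.Next or PyTorch.

```python
# Minimal sketch of the channel check behind the RuntimeError above.
# Plain tuples stand in for tensor shapes, so torch is not required.

def conv2d_channel_check(weight_shape, input_shape, groups=1):
    """Return (expected_channels, got_channels, matches) for a conv2d call."""
    expected = weight_shape[1] * groups  # in_channels the layer was built for
    got = input_shape[1]                 # channels actually present in the input
    return expected, got, expected == got


# Shapes copied from the error message: the layer expects 16 channels, gets 4.
print(conv2d_channel_check((512, 16, 3, 3), (16, 4, 128, 128)))

# The same data with batch and channel dimensions in the usual order passes.
print(conv2d_channel_check((512, 16, 3, 3), (4, 16, 128, 128)))
```

The input in the error has 16 in the batch position and 4 in the channel position, which suggests the first two dimensions were swapped before decode.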

While debugging this myself, I found that the following if-block fires when batch size=4:

sdnext/modules/processing_vae.py, line 296 in 6633e8e:

if latents.shape[0] == 4 and latents.shape[1] != 4: # likely animatediff latent

This block seems to be responsible. It does not trigger when batch size != 4.
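
A rough sketch of why the condition misfires here, using plain tuples in place of tensors. The 16-channel latent size is taken from the weight shape in the error. The body of the if-block is not quoted in this report, so the swap of dimensions 0 and 1 is an assumption inferred from the [16, 4, 128, 128] input in the error message. All function names below are hypothetical.

```python
# Sketch only: plain tuples stand in for tensor shapes, so no torch is needed.

LATENT_CHANNELS = 16  # taken from the weight shape [512, 16, 3, 3] in the error


def heuristic_fires(shape):
    """The condition quoted from processing_vae.py line 296."""
    return shape[0] == 4 and shape[1] != 4


def swap_batch_and_channels(shape):
    """ASSUMPTION: the block swaps dims 0 and 1 (inferred from the error, not the source)."""
    return (shape[1], shape[0], *shape[2:])


def shape_sent_to_vae(batch_size):
    """Shape the decoder would receive for a 1024x1024 image (128x128 latent)."""
    shape = (batch_size, LATENT_CHANNELS, 128, 128)
    return swap_batch_and_channels(shape) if heuristic_fires(shape) else shape


for batch_size in (1, 2, 4, 8):
    shape = shape_sent_to_vae(batch_size)
    channels_ok = shape[1] == LATENT_CHANNELS
    print(f"batch={batch_size} decoder input={shape} channels ok={channels_ok}")
```

Under these assumptions only batch size 4 produces a decoder input of (16, 4, 128, 128), which matches the shape in the error. A model with 4-channel latents would not hit this path, because `latents.shape[1] != 4` would be false.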

You may notice a modified file in my log output. It's just debug logs I added to the if blocks in processing_vae.py around L296.

I don't have a ton of experience with this code base beyond just tracking this down or I'd try to fix it. Thanks for maintaining this project.

Version Platform Description

Version: app=sd.next updated=2025-03-14 hash=6633e8e5 branch=master

Using NVIDIA A6000 on current Debian testing.

Startup log:

sd  | 08:46:58-103362 INFO     Starting SD.Next
sd  | 08:46:58-106791 INFO     Logger: file="/sdnext-git/sdnext/sdnext.log"
sd  |                          level=DEBUG host="170aaa7ce976" size=79 mode=create
sd  | 08:46:58-107833 INFO     Python: version=3.11.2 platform=Linux
sd  |                          bin="/sdnext-git/sdnext/venv/bin/python3"
sd  |                          venv="/sdnext-git/sdnext/venv"
sd  | 08:46:58-132392 INFO     Version: app=sd.next updated=2025-03-14 hash=6633e8e5
sd  |                          branch=master
sd  |                          url=https://github.com/vladmandic/automatic.git/tree/ma
sd  |                          ster ui=main
sd  | 08:46:58-623648 INFO     Platform: arch=x86_64 cpu= system=Linux
sd  |                          release=6.12.17-amd64 python=3.11.2 locale=('en_US',
sd  |                          'UTF-8') docker=False
sd  | 08:46:58-626576 DEBUG    Packages: prefix=venv
sd  |                          site=['venv/lib/python3.11/site-packages']
sd  | 08:46:58-628670 INFO     Args: ['--listen', '--docs', '--experimental',
sd  |                          '--insecure', '--debug']
sd  | 08:46:58-630334 DEBUG    Setting environment tuning
sd  | 08:46:58-631972 DEBUG    Torch allocator:
sd  |                          "garbage_collection_threshold:0.80,max_split_size_mb:51
sd  |                          2"
sd  | 08:46:58-642714 DEBUG    Torch overrides: cuda=False rocm=False ipex=False
sd  |                          directml=False openvino=False zluda=False
sd  | 08:46:58-645391 INFO     CUDA: nVidia toolkit detected
sd  | 08:46:58-674628 WARNING  Modified files: ['modules/processing_vae.py']
sd  | 08:46:58-729548 INFO     Install: verifying requirements
sd  | 08:46:58-741834 DEBUG    Timestamp repository update time: Fri Mar 14 17:05:04
sd  |                          2025
sd  | 08:46:58-743062 INFO     Startup: standard
sd  | 08:46:58-743639 INFO     Verifying submodules
sd  | 08:46:59-135899 DEBUG    Git detached head detected:
sd  |                          folder="extensions-builtin/sd-extension-chainner"
sd  |                          reattach=main
sd  | 08:46:59-137183 DEBUG    Git submodule: extensions-builtin/sd-extension-chainner
sd  |                          / main
sd  | 08:46:59-154644 DEBUG    Git detached head detected:
sd  |                          folder="extensions-builtin/sd-extension-system-info"
sd  |                          reattach=main
sd  | 08:46:59-156758 DEBUG    Git submodule:
sd  |                          extensions-builtin/sd-extension-system-info / main
sd  | 08:46:59-174155 DEBUG    Git detached head detected:
sd  |                          folder="extensions-builtin/sd-webui-agent-scheduler"
sd  |                          reattach=main
sd  | 08:46:59-176248 DEBUG    Git submodule:
sd  |                          extensions-builtin/sd-webui-agent-scheduler / main
sd  | 08:46:59-193475 DEBUG    Git detached head detected:
sd  |                          folder="extensions-builtin/sdnext-modernui"
sd  |                          reattach=main
sd  | 08:46:59-194911 DEBUG    Git submodule: extensions-builtin/sdnext-modernui /
sd  |                          main
sd  | 08:46:59-230238 DEBUG    Git detached head detected:
sd  |                          folder="extensions-builtin/stable-diffusion-webui-rembg
sd  |                          " reattach=master
sd  | 08:46:59-232367 DEBUG    Git submodule:
sd  |                          extensions-builtin/stable-diffusion-webui-rembg /
sd  |                          master
sd  | 08:46:59-248312 DEBUG    Git detached head detected:
sd  |                          folder="modules/k-diffusion" reattach=master
sd  | 08:46:59-250408 DEBUG    Git submodule: modules/k-diffusion / master
sd  | 08:46:59-268378 DEBUG    Git detached head detected: folder="wiki"
sd  |                          reattach=master
sd  | 08:46:59-270510 DEBUG    Git submodule: wiki / master
sd  | 08:46:59-291438 DEBUG    Register paths
sd  | 08:46:59-345779 DEBUG    Installed packages: 223
sd  | 08:46:59-347064 DEBUG    Extensions all: ['sd-extension-chainner',
sd  |                          'sd-extension-system-info', 'sd-webui-agent-scheduler',
sd  |                          'stable-diffusion-webui-images-browser',
sd  |                          'stable-diffusion-webui-rembg', 'sdnext-modernui']
sd  | 08:46:59-404780 DEBUG    Extension installer:
sd  |                          /sdnext-git/sdnext/extensions-builtin/sd-webui-agent-sc
sd  |                          heduler/install.py
sd  | 08:47:01-758434 DEBUG    Extension installer:
sd  |                          /sdnext-git/sdnext/extensions-builtin/stable-diffusion-
sd  |                          webui-images-browser/install.py
sd  | 08:47:04-115878 DEBUG    Extension installer:
sd  |                          /sdnext-git/sdnext/extensions-builtin/stable-diffusion-
sd  |                          webui-rembg/install.py


sd  | 08:47:12-529383 DEBUG    Extensions all: ['sd-webui-api-payload-display']
sd  | 08:47:12-557005 INFO     Extensions enabled: ['sd-extension-chainner',
sd  |                          'sd-extension-system-info', 'sd-webui-agent-scheduler',
sd  |                          'stable-diffusion-webui-images-browser',
sd  |                          'stable-diffusion-webui-rembg', 'sdnext-modernui',
sd  |                          'sd-webui-api-payload-display']
sd  | 08:47:12-557855 INFO     Install: verifying requirements
sd  | 08:47:12-558518 DEBUG    Setup complete without errors: 1742287633
sd  | 08:47:12-560971 DEBUG    Extension preload: {'extensions-builtin': 0.0,
sd  |                          'extensions': 0.0}
sd  | 08:47:12-562067 INFO     Command line args: ['--listen', '--docs',
sd  |                          '--experimental', '--insecure', '--debug']
sd  |                          insecure=True listen=True experimental=True debug=True
sd  |                          docs=True args=[]
sd  | 08:47:12-563041 DEBUG    Env flags: []
sd  | 08:47:12-563652 DEBUG    Linker flags: preload="None"
sd  |                          path=":/sdnext-git/sdnext/venv/lib/"
sd  | 08:47:12-564411 DEBUG    Starting module: <module 'webui' from
sd  |                          '/sdnext-git/sdnext/webui.py'>
sd  | 08:47:17-575168 DEBUG    System: cores=48 affinity=48 threads=24
sd  | 08:47:17-576398 INFO     Torch: torch==2.5.1+cu124 torchvision==0.20.1+cu124
sd  | 08:47:17-577167 INFO     Packages: diffusers==0.33.0.dev0 transformers==4.46.2
sd  |                          accelerate==1.1.1 gradio==3.43.2 pydantic==1.10.15
sd  | 08:47:17-762676 INFO     Device detect: memory=47.5 default=balanced
sd  | 08:47:17-766647 DEBUG    Read: file="/sdnext-git/sdnext/config.json" json=17
sd  |                          bytes=806 time=0.000 fn=<module>:load
sd  | 08:47:18-079790 WARNING  Setting validation: unknown=['cross_attention_options']
sd  | 08:47:18-081554 INFO     Engine: backend=Backend.DIFFUSERS compute=cuda
sd  |                          device=cuda attention="Scaled-Dot-Product" mode=no_grad
sd  | 08:47:18-083010 DEBUG    Read: file="html/reference.json" json=65 bytes=33769
sd  |                          time=0.000 fn=_call_with_frames_removed:<module>
sd  | 08:47:18-083991 DEBUG    Torch attention: type="sdpa" flash=True memory=True
sd  |                          math=True
sd  | 08:47:18-122086 INFO     Torch parameters: backend=cuda device=cuda config=Auto
sd  |                          dtype=torch.bfloat16 context=no_grad nohalf=False
sd  |                          nohalfvae=False upcast=False deterministic=False
sd  |                          tunable=[False, True] fp16=pass bf16=pass
sd  |                          optimization="Scaled-Dot-Product"
sd  | 08:47:18-271228 DEBUG    ONNX: version=1.20.1 provider=CUDAExecutionProvider,
sd  |                          available=['AzureExecutionProvider',
sd  |                          'CPUExecutionProvider']
sd  | 08:47:18-326841 INFO     Device: device=NVIDIA RTX A6000 n=1 arch=sm_90
sd  |                          capability=(8, 6) cuda=12.4 cudnn=90100
sd  |                          driver=535.216.03
sd  | 08:47:18-389246 DEBUG    Entering start sequence
sd  | 08:47:18-390562 DEBUG    Initializing
sd  | 08:47:18-391397 DEBUG    Read: file="metadata.json" json=1 bytes=96 time=0.000
sd  |                          fn=initialize:init_metadata
sd  | 08:47:18-392518 DEBUG    Read: file="cache.json" json=1 bytes=189 time=0.000
sd  |                          fn=initialize:init_cache
sd  | 08:47:18-393408 DEBUG    Huggingface cache:
sd  |                          path="/config/.cache/huggingface/hub"
sd  | 08:47:18-420789 INFO     Available VAEs: path="models/VAE" items=1
sd  | 08:47:18-421885 INFO     Available UNets: path="models/UNET" items=0
sd  | 08:47:18-422668 INFO     Available TEs: path="models/Text-encoder" items=0
sd  | 08:47:18-423789 INFO     Available Models:
sd  |                          safetensors="models/Stable-diffusion":1
sd  |                          diffusers="models/Diffusers":3 items=4 time=0.00
sd  | 08:47:18-429492 INFO     Available LoRAs: path="models/Lora" items=0 folders=2
sd  |                          time=0.00
sd  | 08:47:18-438005 INFO     Available Styles: path="models/styles" items=288
sd  |                          time=0.01
sd  | 08:47:18-479118 INFO     Available Detailer: path="models/yolo" items=10
sd  |                          downloaded=0
sd  | 08:47:18-480330 DEBUG    Extensions: disabled=['Lora', 'sd-webui-controlnet']
sd  | 08:47:18-481016 INFO     Load extensions
sd  | 08:47:19-317828 INFO     Extension:
sd  |                          script='extensions-builtin/sd-webui-agent-scheduler/scr
sd  |                          ipts/task_scheduler.py' Using sqlite file:
sd  |                          extensions-builtin/sd-webui-agent-scheduler/task_schedu
sd  |                          ler.sqlite3
sd  | 08:47:19-486301 DEBUG    Extensions init time: total=1.00
sd  |                          sd-webui-agent-scheduler=0.75
sd  |                          stable-diffusion-webui-images-browser=0.16
sd  | 08:47:19-492401 DEBUG    Read: file="html/upscalers.json" json=4 bytes=2640
sd  |                          time=0.000 fn=__init__:__init__
sd  | 08:47:19-493532 DEBUG    Read:
sd  |                          file="extensions-builtin/sd-extension-chainner/models.j
sd  |                          son" json=24 bytes=2693 time=0.000
sd  |                          fn=__init__:find_scalers
sd  | 08:47:19-494771 DEBUG    chaiNNer models: path="models/chaiNNer" defined=24
sd  |                          discovered=0 downloaded=0
sd  | 08:47:19-496156 INFO     Available Upscalers: items=65 downloaded=0 user=0
sd  |                          time=0.01 types=['None', 'Resize', 'Latent',
sd  |                          'AsymmetricVAE', 'DCC', 'ChaiNNer', 'RealESRGAN',
sd  |                          'SCUNet', 'SwinIR', 'AuraSR', 'ESRGAN', 'Diffusion']
sd  | 08:47:19-498865 DEBUG    UI start sequence
sd  | 08:47:19-499691 INFO     UI locale: name="Auto"
sd  | 08:47:19-500389 INFO     UI theme: type=Modern name="Default" available=32
sd  | 08:47:19-502839 DEBUG    UI theme:
sd  |                          css="extensions-builtin/sdnext-modernui/themes/Default.
sd  |                          css" base="base.css" user="None"
sd  | 08:47:19-504873 DEBUG    UI initialize: txt2img
sd  | 08:47:19-538260 DEBUG    Networks: experimental model="SDXL Flash Mini"
sd  | 08:47:19-543996 DEBUG    Networks: type='model' items=69 subfolders=4
sd  |                          tab=txt2img folders=['models/Stable-diffusion',
sd  |                          'models/Diffusers', 'models/Reference'] list=0.02
sd  |                          thumb=0.00 desc=0.00 info=0.00 workers=8
sd  | 08:47:19-546092 DEBUG    Networks: type='lora' items=0 subfolders=1 tab=txt2img
sd  |                          folders=['models/Lora'] list=0.01 thumb=0.01 desc=0.00
sd  |                          info=0.00 workers=8
sd  | 08:47:19-556142 DEBUG    Networks: type='style' items=288 subfolders=3
sd  |                          tab=txt2img folders=['models/styles', 'html'] list=0.01
sd  |                          thumb=0.00 desc=0.00 info=0.00 workers=8
sd  | 08:47:19-559681 DEBUG    Networks: type='embedding' items=0 subfolders=1
sd  |                          tab=txt2img folders=['models/embeddings'] list=0.00
sd  |                          thumb=0.00 desc=0.00 info=0.00 workers=8
sd  | 08:47:19-561135 DEBUG    Networks: type='vae' items=1 subfolders=1 tab=txt2img
sd  |                          folders=['models/VAE'] list=0.00 thumb=0.00 desc=0.00
sd  |                          info=0.00 workers=8
sd  | 08:47:19-562519 DEBUG    Networks: type='history' items=0 subfolders=1
sd  |                          tab=txt2img folders=[] list=0.00 thumb=0.00 desc=0.00
sd  |                          info=0.00 workers=8
sd  | 08:47:19-868714 DEBUG    UI initialize: img2img
sd  | 08:47:20-078828 DEBUG    UI initialize: control models="models/control"
sd  | 08:47:20-290288 DEBUG    Script:
sd  |                          fn="extensions-builtin/sd-webui-agent-scheduler/scripts
sd  |                          /task_scheduler.py" type=control skip
sd  | 08:47:20-291432 DEBUG    Script:
sd  |                          fn="extensions/sd-webui-api-payload-display/scripts/api
sd  |                          _payload_display.py" type=control skip
sd  | 08:47:20-653728 DEBUG    Read: file="ui-config.json" json=0 bytes=2 time=0.000
sd  |                          fn=__init__:read_from_file
sd  | 08:47:22-001359 DEBUG    Extension list: processed=397 installed=9 enabled=7
sd  |                          disabled=2 visible=397 hidden=0
sd  | 08:47:22-208785 DEBUG    Root paths: ['/sdnext-git/sdnext']
sd  | 08:47:22-305474 INFO     Local URL: http://localhost:7860/
sd  | 08:47:22-459464 INFO     API Docs: http://localhost:7860/docs
sd  | 08:47:22-460108 INFO     API ReDocs: http://localhost:7860/redocs
sd  | 08:47:22-461394 DEBUG    API middleware: [<class
sd  |                          'starlette.middleware.base.BaseHTTPMiddleware'>, <class
sd  |                          'starlette.middleware.gzip.GZipMiddleware'>]
sd  | 08:47:22-462195 DEBUG    API initialize
sd  | 08:47:22-977045 INFO     [AgentScheduler] Task queue is empty
sd  | 08:47:22-980573 INFO     [AgentScheduler] Registering APIs
sd  | 08:47:23-084045 DEBUG    UI: connected
sd  | 08:47:23-151879 DEBUG    Scripts setup: time=0.506 ['K-Diffusion
sd  |                          Samplers:0.108', 'XYZ Grid:0.053', 'IP Adapters:0.045',
sd  |                          'Mixture-of-Diffusers: Tile Control:0.026', 'Face:
sd  |                          Multiple ID Transfers:0.025', 'FreeScale: Tuning-Free
sd  |                          Scale Fusion:0.015', 'Video: CogVideoX:0.013', 'Video:
sd  |                          LTX Video:0.012', 'Video: AnimateDiff:0.012',
sd  |                          'ConsiStory: Consistent Image Generation:0.011',
sd  |                          'Video: VGen Image-to-Video:0.011']
sd  | 08:47:23-153319 DEBUG    Model metadata: file="metadata.json" no changes
sd  | 08:47:23-154775 DEBUG    Model requested: fn=run:<lambda>
sd  | 08:47:23-156902 INFO     Load model:
sd  |                          select="Diffusers/stabilityai/stable-diffusion-3.5-larg
sd  |                          e-turbo [ec07796fc0]"

Relevant log output

2025-03-18 03:47:51.209	DEBUG	sd	sd_vae_approx	VAE load: type=approximate model='models/VAE-approx/model.pt'
2025-03-18 03:48:00.341	DEBUG	sd	launch	Server: alive=True requests=55 memory=27.83/125.64 status='running' task='Load' timestamp='20250318084723' id='task(a6w1k3n5padrksr)' job=0 jobs=0 total=1 step=0 steps=0 queued=0 uptime=43 elapsed=37.19 eta=None progress=0
2025-03-18 03:48:10.209	ERROR	sd	processing_vae	animatediff latent block
2025-03-18 03:48:10.210	ERROR	sd	processing_vae	full vae decode block
2025-03-18 03:48:10.282	ERROR	sd	processing_vae	VAE decode: Given groups=1, weight of size [512, 16, 3, 3], expected input[16, 4, 128, 128] to have 16 channels, but got 4 channels instead
2025-03-18 03:48:10.284	ERROR	sd	errors	VAE decode: RuntimeError
2025-03-18 03:48:10.894	DEBUG	sd	processing_vae	Decode: vae='default' upcast=False slicing=False tiling=False latents=[16, 4, 128, 128]:cuda:0:torch.bfloat16 time=0.683

Backend

Diffusers

UI

ModernUI

Branch

Master

Model

StableDiffusion 3.x

Acknowledgements

I have read the above and searched for existing issues
I confirm that this is classified correctly and it's not an extension issue


vladmandic · 2025-03-19T14:50:06Z

fixed. thanks for detailed analysis, helps a lot.

vladmandic closed this as completed Mar 19, 2025
