[Issue]: VAE decode step fails when batch size=4 #3830

Closed
2 tasks done
putnam opened this issue Mar 18, 2025 · 1 comment

putnam commented Mar 18, 2025

Issue Description

Test case:

  1. Use SD 3.5 Large Turbo, using default VAE (built-in)
  2. Sampler = Euler FlowMatch, guidance scale=1, steps=4, 1024x1024, batch count=1
  3. Any batch size other than 4 works. With batch size == 4, the VAE decode fails with the following error:
     RuntimeError: Given groups=1, weight of size [512, 16, 3, 3], expected input[16, 4, 128, 128] to have 16 channels, but got 4 channels instead
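
For readers unfamiliar with this error: a PyTorch 2D convolution weight has shape [out_channels, in_channels / groups, kH, kW], and its input has shape [N, C, H, W]. The sketch below replays that check using only the shapes quoted in the message. `conv_channel_mismatch` is a hypothetical helper written for illustration, not part of SD.Next or PyTorch.

```python
def conv_channel_mismatch(weight_shape, input_shape, groups=1):
    """Return (expected, got) channel counts if they differ, else None."""
    expected = weight_shape[1] * groups  # in_channels the layer was built for
    got = input_shape[1]                 # channel axis of an NCHW input
    return None if expected == got else (expected, got)


weight = (512, 16, 3, 3)           # weight shape from the error message
failing_input = (16, 4, 128, 128)  # input shape from the error message

print(conv_channel_mismatch(weight, failing_input))      # (16, 4)
print(conv_channel_mismatch(weight, (4, 16, 128, 128)))  # None
```

Read as NCHW, the failing input is 16 samples of 4 channels each, which suggests the batch and channel axes were swapped somewhere before the decode.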

While debugging this myself, I found that the following if-block fires when batch size=4:

if latents.shape[0] == 4 and latents.shape[1] != 4: # likely animatediff latent

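This block seems to be responsible. The shapes in the error suggest the SD 3.5 latents have 16 channels, so a batch of 4 has shape[0] == 4 and shape[1] == 16 and matches the condition even though it is not an AnimateDiff latent. The block does not trigger when batch size != 4.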
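
A minimal sketch of the collision, using plain shape tuples so it runs without torch. The condition is the one quoted above. `swap_first_two` is an assumption about what the AnimateDiff branch does: it is inferred from the [16, 4, 128, 128] shape in the error, not taken from the SD.Next source.

```python
def looks_like_animatediff(shape):
    # Same condition as the quoted if-block, applied to a shape tuple
    return shape[0] == 4 and shape[1] != 4


def swap_first_two(shape):
    # Assumption: the branch swaps the first two axes of the latent
    return (shape[1], shape[0]) + tuple(shape[2:])


for batch in (1, 2, 3, 4, 5):
    shape = (batch, 16, 128, 128)  # 16-channel latents at 1024x1024
    if looks_like_animatediff(shape):
        print(batch, "misdetected ->", swap_first_two(shape))
    else:
        print(batch, "ok ->", shape)
```

Only batch size 4 is misdetected, and the swapped shape (16, 4, 128, 128) matches the input shape reported in the RuntimeError. A 4-channel latent of shape (4, 4, H, W) never triggers the branch, which would explain why the problem shows up only with 16-channel models.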

You may notice a modified file in my log output. It's just debug logs I added to the if blocks in processing_vae.py around L296.

I don't have much experience with this code base beyond tracking this down, or I'd try to fix it myself. Thanks for maintaining this project.
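
One possible direction, offered only as a sketch and not as the fix that was eventually applied: compare the axes against the VAE's latent channel count instead of the literal 4. `is_channels_first_latent` and its `latent_channels` parameter are hypothetical names. In a real pipeline the channel count would have to come from the loaded VAE, which is also an assumption here.

```python
def is_channels_first_latent(shape, latent_channels):
    # Hypothetical stricter guard: treat the tensor as channels-first
    # (channels, frames, H, W) only when axis 0 equals the VAE channel
    # count and axis 1 does not.
    return shape[0] == latent_channels and shape[1] != latent_channels


# 16-channel model, batch of 4: no longer misdetected
print(is_channels_first_latent((4, 16, 128, 128), 16))  # False
# 4-channel model, 16 frames stored channels-first: still detected
print(is_channels_first_latent((4, 16, 64, 64), 4))     # True
```

This keeps the original behaviour for 4-channel models. It is still a shape heuristic, so a channels-first latent whose frame count equals the channel count would remain undetected, as it is with the original condition.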

Version Platform Description

Version: app=sd.next updated=2025-03-14 hash=6633e8e5 branch=master

Using NVIDIA A6000 on current Debian testing.

Startup log:

sd  | 08:46:58-103362 INFO     Starting SD.Next
sd  | 08:46:58-106791 INFO     Logger: file="/sdnext-git/sdnext/sdnext.log"
sd  |                          level=DEBUG host="170aaa7ce976" size=79 mode=create
sd  | 08:46:58-107833 INFO     Python: version=3.11.2 platform=Linux
sd  |                          bin="/sdnext-git/sdnext/venv/bin/python3"
sd  |                          venv="/sdnext-git/sdnext/venv"
sd  | 08:46:58-132392 INFO     Version: app=sd.next updated=2025-03-14 hash=6633e8e5
sd  |                          branch=master
sd  |                          url=https://github.com/vladmandic/automatic.git/tree/ma
sd  |                          ster ui=main
sd  | 08:46:58-623648 INFO     Platform: arch=x86_64 cpu= system=Linux
sd  |                          release=6.12.17-amd64 python=3.11.2 locale=('en_US',
sd  |                          'UTF-8') docker=False
sd  | 08:46:58-626576 DEBUG    Packages: prefix=venv
sd  |                          site=['venv/lib/python3.11/site-packages']
sd  | 08:46:58-628670 INFO     Args: ['--listen', '--docs', '--experimental',
sd  |                          '--insecure', '--debug']
sd  | 08:46:58-630334 DEBUG    Setting environment tuning
sd  | 08:46:58-631972 DEBUG    Torch allocator:
sd  |                          "garbage_collection_threshold:0.80,max_split_size_mb:51
sd  |                          2"
sd  | 08:46:58-642714 DEBUG    Torch overrides: cuda=False rocm=False ipex=False
sd  |                          directml=False openvino=False zluda=False
sd  | 08:46:58-645391 INFO     CUDA: nVidia toolkit detected
sd  | 08:46:58-674628 WARNING  Modified files: ['modules/processing_vae.py']
sd  | 08:46:58-729548 INFO     Install: verifying requirements
sd  | 08:46:58-741834 DEBUG    Timestamp repository update time: Fri Mar 14 17:05:04
sd  |                          2025
sd  | 08:46:58-743062 INFO     Startup: standard
sd  | 08:46:58-743639 INFO     Verifying submodules
sd  | 08:46:59-135899 DEBUG    Git detached head detected:
sd  |                          folder="extensions-builtin/sd-extension-chainner"
sd  |                          reattach=main
sd  | 08:46:59-137183 DEBUG    Git submodule: extensions-builtin/sd-extension-chainner
sd  |                          / main
sd  | 08:46:59-154644 DEBUG    Git detached head detected:
sd  |                          folder="extensions-builtin/sd-extension-system-info"
sd  |                          reattach=main
sd  | 08:46:59-156758 DEBUG    Git submodule:
sd  |                          extensions-builtin/sd-extension-system-info / main
sd  | 08:46:59-174155 DEBUG    Git detached head detected:
sd  |                          folder="extensions-builtin/sd-webui-agent-scheduler"
sd  |                          reattach=main
sd  | 08:46:59-176248 DEBUG    Git submodule:
sd  |                          extensions-builtin/sd-webui-agent-scheduler / main
sd  | 08:46:59-193475 DEBUG    Git detached head detected:
sd  |                          folder="extensions-builtin/sdnext-modernui"
sd  |                          reattach=main
sd  | 08:46:59-194911 DEBUG    Git submodule: extensions-builtin/sdnext-modernui /
sd  |                          main
sd  | 08:46:59-230238 DEBUG    Git detached head detected:
sd  |                          folder="extensions-builtin/stable-diffusion-webui-rembg
sd  |                          " reattach=master
sd  | 08:46:59-232367 DEBUG    Git submodule:
sd  |                          extensions-builtin/stable-diffusion-webui-rembg /
sd  |                          master
sd  | 08:46:59-248312 DEBUG    Git detached head detected:
sd  |                          folder="modules/k-diffusion" reattach=master
sd  | 08:46:59-250408 DEBUG    Git submodule: modules/k-diffusion / master
sd  | 08:46:59-268378 DEBUG    Git detached head detected: folder="wiki"
sd  |                          reattach=master
sd  | 08:46:59-270510 DEBUG    Git submodule: wiki / master
sd  | 08:46:59-291438 DEBUG    Register paths
sd  | 08:46:59-345779 DEBUG    Installed packages: 223
sd  | 08:46:59-347064 DEBUG    Extensions all: ['sd-extension-chainner',
sd  |                          'sd-extension-system-info', 'sd-webui-agent-scheduler',
sd  |                          'stable-diffusion-webui-images-browser',
sd  |                          'stable-diffusion-webui-rembg', 'sdnext-modernui']
sd  | 08:46:59-404780 DEBUG    Extension installer:
sd  |                          /sdnext-git/sdnext/extensions-builtin/sd-webui-agent-sc
sd  |                          heduler/install.py
sd  | 08:47:01-758434 DEBUG    Extension installer:
sd  |                          /sdnext-git/sdnext/extensions-builtin/stable-diffusion-
sd  |                          webui-images-browser/install.py
sd  | 08:47:04-115878 DEBUG    Extension installer:
sd  |                          /sdnext-git/sdnext/extensions-builtin/stable-diffusion-
sd  |                          webui-rembg/install.py


sd  | 08:47:12-529383 DEBUG    Extensions all: ['sd-webui-api-payload-display']
sd  | 08:47:12-557005 INFO     Extensions enabled: ['sd-extension-chainner',
sd  |                          'sd-extension-system-info', 'sd-webui-agent-scheduler',
sd  |                          'stable-diffusion-webui-images-browser',
sd  |                          'stable-diffusion-webui-rembg', 'sdnext-modernui',
sd  |                          'sd-webui-api-payload-display']
sd  | 08:47:12-557855 INFO     Install: verifying requirements
sd  | 08:47:12-558518 DEBUG    Setup complete without errors: 1742287633
sd  | 08:47:12-560971 DEBUG    Extension preload: {'extensions-builtin': 0.0,
sd  |                          'extensions': 0.0}
sd  | 08:47:12-562067 INFO     Command line args: ['--listen', '--docs',
sd  |                          '--experimental', '--insecure', '--debug']
sd  |                          insecure=True listen=True experimental=True debug=True
sd  |                          docs=True args=[]
sd  | 08:47:12-563041 DEBUG    Env flags: []
sd  | 08:47:12-563652 DEBUG    Linker flags: preload="None"
sd  |                          path=":/sdnext-git/sdnext/venv/lib/"
sd  | 08:47:12-564411 DEBUG    Starting module: <module 'webui' from
sd  |                          '/sdnext-git/sdnext/webui.py'>
sd  | 08:47:17-575168 DEBUG    System: cores=48 affinity=48 threads=24
sd  | 08:47:17-576398 INFO     Torch: torch==2.5.1+cu124 torchvision==0.20.1+cu124
sd  | 08:47:17-577167 INFO     Packages: diffusers==0.33.0.dev0 transformers==4.46.2
sd  |                          accelerate==1.1.1 gradio==3.43.2 pydantic==1.10.15
sd  | 08:47:17-762676 INFO     Device detect: memory=47.5 default=balanced
sd  | 08:47:17-766647 DEBUG    Read: file="/sdnext-git/sdnext/config.json" json=17
sd  |                          bytes=806 time=0.000 fn=<module>:load
sd  | 08:47:18-079790 WARNING  Setting validation: unknown=['cross_attention_options']
sd  | 08:47:18-081554 INFO     Engine: backend=Backend.DIFFUSERS compute=cuda
sd  |                          device=cuda attention="Scaled-Dot-Product" mode=no_grad
sd  | 08:47:18-083010 DEBUG    Read: file="html/reference.json" json=65 bytes=33769
sd  |                          time=0.000 fn=_call_with_frames_removed:<module>
sd  | 08:47:18-083991 DEBUG    Torch attention: type="sdpa" flash=True memory=True
sd  |                          math=True
sd  | 08:47:18-122086 INFO     Torch parameters: backend=cuda device=cuda config=Auto
sd  |                          dtype=torch.bfloat16 context=no_grad nohalf=False
sd  |                          nohalfvae=False upcast=False deterministic=False
sd  |                          tunable=[False, True] fp16=pass bf16=pass
sd  |                          optimization="Scaled-Dot-Product"
sd  | 08:47:18-271228 DEBUG    ONNX: version=1.20.1 provider=CUDAExecutionProvider,
sd  |                          available=['AzureExecutionProvider',
sd  |                          'CPUExecutionProvider']
sd  | 08:47:18-326841 INFO     Device: device=NVIDIA RTX A6000 n=1 arch=sm_90
sd  |                          capability=(8, 6) cuda=12.4 cudnn=90100
sd  |                          driver=535.216.03
sd  | 08:47:18-389246 DEBUG    Entering start sequence
sd  | 08:47:18-390562 DEBUG    Initializing
sd  | 08:47:18-391397 DEBUG    Read: file="metadata.json" json=1 bytes=96 time=0.000
sd  |                          fn=initialize:init_metadata
sd  | 08:47:18-392518 DEBUG    Read: file="cache.json" json=1 bytes=189 time=0.000
sd  |                          fn=initialize:init_cache
sd  | 08:47:18-393408 DEBUG    Huggingface cache:
sd  |                          path="/config/.cache/huggingface/hub"
sd  | 08:47:18-420789 INFO     Available VAEs: path="models/VAE" items=1
sd  | 08:47:18-421885 INFO     Available UNets: path="models/UNET" items=0
sd  | 08:47:18-422668 INFO     Available TEs: path="models/Text-encoder" items=0
sd  | 08:47:18-423789 INFO     Available Models:
sd  |                          safetensors="models/Stable-diffusion":1
sd  |                          diffusers="models/Diffusers":3 items=4 time=0.00
sd  | 08:47:18-429492 INFO     Available LoRAs: path="models/Lora" items=0 folders=2
sd  |                          time=0.00
sd  | 08:47:18-438005 INFO     Available Styles: path="models/styles" items=288
sd  |                          time=0.01
sd  | 08:47:18-479118 INFO     Available Detailer: path="models/yolo" items=10
sd  |                          downloaded=0
sd  | 08:47:18-480330 DEBUG    Extensions: disabled=['Lora', 'sd-webui-controlnet']
sd  | 08:47:18-481016 INFO     Load extensions
sd  | 08:47:19-317828 INFO     Extension:
sd  |                          script='extensions-builtin/sd-webui-agent-scheduler/scr
sd  |                          ipts/task_scheduler.py' Using sqlite file:
sd  |                          extensions-builtin/sd-webui-agent-scheduler/task_schedu
sd  |                          ler.sqlite3
sd  | 08:47:19-486301 DEBUG    Extensions init time: total=1.00
sd  |                          sd-webui-agent-scheduler=0.75
sd  |                          stable-diffusion-webui-images-browser=0.16
sd  | 08:47:19-492401 DEBUG    Read: file="html/upscalers.json" json=4 bytes=2640
sd  |                          time=0.000 fn=__init__:__init__
sd  | 08:47:19-493532 DEBUG    Read:
sd  |                          file="extensions-builtin/sd-extension-chainner/models.j
sd  |                          son" json=24 bytes=2693 time=0.000
sd  |                          fn=__init__:find_scalers
sd  | 08:47:19-494771 DEBUG    chaiNNer models: path="models/chaiNNer" defined=24
sd  |                          discovered=0 downloaded=0
sd  | 08:47:19-496156 INFO     Available Upscalers: items=65 downloaded=0 user=0
sd  |                          time=0.01 types=['None', 'Resize', 'Latent',
sd  |                          'AsymmetricVAE', 'DCC', 'ChaiNNer', 'RealESRGAN',
sd  |                          'SCUNet', 'SwinIR', 'AuraSR', 'ESRGAN', 'Diffusion']
sd  | 08:47:19-498865 DEBUG    UI start sequence
sd  | 08:47:19-499691 INFO     UI locale: name="Auto"
sd  | 08:47:19-500389 INFO     UI theme: type=Modern name="Default" available=32
sd  | 08:47:19-502839 DEBUG    UI theme:
sd  |                          css="extensions-builtin/sdnext-modernui/themes/Default.
sd  |                          css" base="base.css" user="None"
sd  | 08:47:19-504873 DEBUG    UI initialize: txt2img
sd  | 08:47:19-538260 DEBUG    Networks: experimental model="SDXL Flash Mini"
sd  | 08:47:19-543996 DEBUG    Networks: type='model' items=69 subfolders=4
sd  |                          tab=txt2img folders=['models/Stable-diffusion',
sd  |                          'models/Diffusers', 'models/Reference'] list=0.02
sd  |                          thumb=0.00 desc=0.00 info=0.00 workers=8
sd  | 08:47:19-546092 DEBUG    Networks: type='lora' items=0 subfolders=1 tab=txt2img
sd  |                          folders=['models/Lora'] list=0.01 thumb=0.01 desc=0.00
sd  |                          info=0.00 workers=8
sd  | 08:47:19-556142 DEBUG    Networks: type='style' items=288 subfolders=3
sd  |                          tab=txt2img folders=['models/styles', 'html'] list=0.01
sd  |                          thumb=0.00 desc=0.00 info=0.00 workers=8
sd  | 08:47:19-559681 DEBUG    Networks: type='embedding' items=0 subfolders=1
sd  |                          tab=txt2img folders=['models/embeddings'] list=0.00
sd  |                          thumb=0.00 desc=0.00 info=0.00 workers=8
sd  | 08:47:19-561135 DEBUG    Networks: type='vae' items=1 subfolders=1 tab=txt2img
sd  |                          folders=['models/VAE'] list=0.00 thumb=0.00 desc=0.00
sd  |                          info=0.00 workers=8
sd  | 08:47:19-562519 DEBUG    Networks: type='history' items=0 subfolders=1
sd  |                          tab=txt2img folders=[] list=0.00 thumb=0.00 desc=0.00
sd  |                          info=0.00 workers=8
sd  | 08:47:19-868714 DEBUG    UI initialize: img2img
sd  | 08:47:20-078828 DEBUG    UI initialize: control models="models/control"
sd  | 08:47:20-290288 DEBUG    Script:
sd  |                          fn="extensions-builtin/sd-webui-agent-scheduler/scripts
sd  |                          /task_scheduler.py" type=control skip
sd  | 08:47:20-291432 DEBUG    Script:
sd  |                          fn="extensions/sd-webui-api-payload-display/scripts/api
sd  |                          _payload_display.py" type=control skip
sd  | 08:47:20-653728 DEBUG    Read: file="ui-config.json" json=0 bytes=2 time=0.000
sd  |                          fn=__init__:read_from_file
sd  | 08:47:22-001359 DEBUG    Extension list: processed=397 installed=9 enabled=7
sd  |                          disabled=2 visible=397 hidden=0
sd  | 08:47:22-208785 DEBUG    Root paths: ['/sdnext-git/sdnext']
sd  | 08:47:22-305474 INFO     Local URL: http://localhost:7860/
sd  | 08:47:22-459464 INFO     API Docs: http://localhost:7860/docs
sd  | 08:47:22-460108 INFO     API ReDocs: http://localhost:7860/redocs
sd  | 08:47:22-461394 DEBUG    API middleware: [<class
sd  |                          'starlette.middleware.base.BaseHTTPMiddleware'>, <class
sd  |                          'starlette.middleware.gzip.GZipMiddleware'>]
sd  | 08:47:22-462195 DEBUG    API initialize
sd  | 08:47:22-977045 INFO     [AgentScheduler] Task queue is empty
sd  | 08:47:22-980573 INFO     [AgentScheduler] Registering APIs
sd  | 08:47:23-084045 DEBUG    UI: connected
sd  | 08:47:23-151879 DEBUG    Scripts setup: time=0.506 ['K-Diffusion
sd  |                          Samplers:0.108', 'XYZ Grid:0.053', 'IP Adapters:0.045',
sd  |                          'Mixture-of-Diffusers: Tile Control:0.026', 'Face:
sd  |                          Multiple ID Transfers:0.025', 'FreeScale: Tuning-Free
sd  |                          Scale Fusion:0.015', 'Video: CogVideoX:0.013', 'Video:
sd  |                          LTX Video:0.012', 'Video: AnimateDiff:0.012',
sd  |                          'ConsiStory: Consistent Image Generation:0.011',
sd  |                          'Video: VGen Image-to-Video:0.011']
sd  | 08:47:23-153319 DEBUG    Model metadata: file="metadata.json" no changes
sd  | 08:47:23-154775 DEBUG    Model requested: fn=run:<lambda>
sd  | 08:47:23-156902 INFO     Load model:
sd  |                          select="Diffusers/stabilityai/stable-diffusion-3.5-larg
sd  |                          e-turbo [ec07796fc0]"

Relevant log output

2025-03-18 03:47:51.209	DEBUG	sd	sd_vae_approx	VAE load: type=approximate model='models/VAE-approx/model.pt'
2025-03-18 03:48:00.341	DEBUG	sd	launch	Server: alive=True requests=55 memory=27.83/125.64 status='running' task='Load' timestamp='20250318084723' id='task(a6w1k3n5padrksr)' job=0 jobs=0 total=1 step=0 steps=0 queued=0 uptime=43 elapsed=37.19 eta=None progress=0
2025-03-18 03:48:10.209	ERROR	sd	processing_vae	animatediff latent block
2025-03-18 03:48:10.210	ERROR	sd	processing_vae	full vae decode block
2025-03-18 03:48:10.282	ERROR	sd	processing_vae	VAE decode: Given groups=1, weight of size [512, 16, 3, 3], expected input[16, 4, 128, 128] to have 16 channels, but got 4 channels instead
2025-03-18 03:48:10.284	ERROR	sd	errors	VAE decode: RuntimeError
2025-03-18 03:48:10.894	DEBUG	sd	processing_vae	Decode: vae='default' upcast=False slicing=False tiling=False latents=[16, 4, 128, 128]:cuda:0:torch.bfloat16 time=0.683

Backend

Diffusers

UI

ModernUI

Branch

Master

Model

StableDiffusion 3.x

Acknowledgements

  • I have read the above and searched for existing issues
  • I confirm that this is classified correctly and it's not an extension issue
@vladmandic
Owner

fixed. thanks for detailed analysis, helps a lot.
