RuntimeError: Given groups=1, weight of size [320, 8, 3, 3], expected input[46, 9, 32, 80] to have 8 channels, but got 9 channels instead #8
Comments
Update: I fixed it. One of the files in the StereoCrafter weights had been renamed.
Could you please tell me whether both steps can be run with 8 GB of VRAM? I have an RTX 2060 Super. Are there any other steps required to run on an 8 GB VRAM GPU?
Yeah. I'm using the regular RTX 2060, which has 6 GB of VRAM, so you are limited to certain resolutions. 1920x1080 content should be limited to 10 seconds for step 1; the higher the fps, the shorter each clip needs to be to stay within VRAM. For step 2, take your splatted video and downscale it to 2048x1024 or 2560x1280. 2048x1024 splatted videos should load into your VRAM without x2 tiling, and 2560x1280 might be possible too. Experiment with different resolutions and make sure both dimensions divide evenly by 128.
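If it helps, here is a minimal sketch (a hypothetical helper, not part of StereoCrafter) for snapping a target size down to the nearest multiples of 128 before downscaling:

# Hypothetical helper: round a target resolution down to multiples of 128
# before feeding a splatted video into step 2.
def snap_to_128(width: int, height: int) -> tuple[int, int]:
    return (width // 128) * 128, (height // 128) * 128

if __name__ == "__main__":
    print(snap_to_128(2048, 1024))  # -> (2048, 1024), already divides evenly
    print(snap_to_128(1920, 1080))  # -> (1920, 1024), 1080 is not a multiple of 128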
I was not able to run a 1080p video in step 1; 720x576 worked for me, although I have to test more. Step 1 was also extremely slow. Is there anything that can be done (8 GB VRAM, 16 GB RAM)? Is there a way to split step 1 so the depth map is created first and the splatting is done afterwards, or to use a custom depth map? Also, how do I limit the second step to SBS output only?
I recommend using depth videos already generated by DepthCrafter or Depth Anything; it's the easiest way to get 1080p splatted videos. I don't know how to make the output just be SBS by itself. I use a script that just outputs the right-eye view as a result.
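For reference, a rough sketch of stitching a left-eye and a right-eye video into a full-width SBS video with OpenCV could look like this; the file names and codec are assumptions, not part of StereoCrafter, and both inputs are assumed to have the same resolution, fps, and frame count:

# Hedged sketch: combine a left-eye and a right-eye video into a side-by-side (SBS) video.
import cv2

left = cv2.VideoCapture("left_eye.mp4")    # hypothetical file names
right = cv2.VideoCapture("right_eye.mp4")

fps = left.get(cv2.CAP_PROP_FPS)
w = int(left.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(left.get(cv2.CAP_PROP_FRAME_HEIGHT))

out = cv2.VideoWriter("sbs_output.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w * 2, h))

while True:
    ok_l, frame_l = left.read()
    ok_r, frame_r = right.read()
    if not (ok_l and ok_r):
        break
    out.write(cv2.hconcat([frame_l, frame_r]))  # left | right, full-width SBS

left.release()
right.release()
out.release()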
I have done more tests:
1. I could use a 1080p video with a 2-second clip. I have also tried lower resolutions like 720x406 and 720p.
2. Even when I use a 1080p or higher-resolution video, the right-side view lacks detail.
3. Lastly, could you please guide me on how to use an already generated depth video and then just do the splatting? What commands and input videos are needed?
Hi Archit01, maybe you could check this #2 (comment) for using an already generated depth video. |
Thank you. I have checked that comment, but as I'm not much of a technical person, an example of how to execute the commands with the proper arguments would be much appreciated. For example, if I have two video files, one being the left-eye view (the original video) and the other the depth-map video for it, how do I run the splatting?
I use the inference script from here: https://github.com/enoky/StereoCrafter
Download depth_splatting.py from the files section. Run it like this:
python depth_splatting.py --input_source_clips <path_to_input_videos> --input_depth_maps <path_to_pre_rendered_depth_maps> --output_splatted <path_to_output_videos> --unet_path <path_to_unet_model> --pre_trained_path <path_to_depthcrafter_model> --max_disp 20 --process_length -1 --batch_size 10
Change the parts in angle brackets to the folder paths on your PC. You can change --max_disp to 30 for a stronger 3D effect.
Make sure the 2d video and the depth video have the same filename, resolution, and fps for it to work. |
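A small sanity check along these lines might help rule out mismatches before running the splatting; the folder paths are placeholders, not StereoCrafter defaults:

# Hedged sketch: verify that each 2D clip has a depth-map video with the same
# filename, resolution, and fps before running depth_splatting.py.
import os
import cv2

clips_dir = "input_source_clips"   # placeholder path to the 2D videos
depth_dir = "input_depth_maps"     # placeholder path to the depth-map videos

def props(path):
    cap = cv2.VideoCapture(path)
    info = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)),
            round(cap.get(cv2.CAP_PROP_FPS), 2))
    cap.release()
    return info

for name in os.listdir(clips_dir):
    depth_path = os.path.join(depth_dir, name)
    if not os.path.exists(depth_path):
        print(f"missing depth map for {name}")
        continue
    a = props(os.path.join(clips_dir, name))
    b = props(depth_path)
    print(name, "OK" if a == b else f"MISMATCH: clip {a} vs depth {b}")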
Thank you for all the suggestions. I will try it today. |
Thank you, I got the custom maps working and can now process longer videos at 1080p. The only thing left is that the inpainted right view lacks detail, and for some unknown reason, even at high resolution it can't match the quality of the examples shown.
The 1st step of StereoCrafter runs with no problems on my RTX 2060.
The 2nd step gives me this when I try to run inpainting_inference.py:
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
Loading pipeline components...: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:04<00:00, 1.24it/s]
0%| | 0/8 [00:03<?, ?it/s]
Traceback (most recent call last):
File "inpainting_inference.py", line 297, in
Fire(main)
File "F:\anaconda3\envs\zoe\lib\site-packages\fire\core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "F:\anaconda3\envs\zoe\lib\site-packages\fire\core.py", line 475, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "F:\anaconda3\envs\zoe\lib\site-packages\fire\core.py", line 691, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "inpainting_inference.py", line 242, in main
video_latents = spatial_tiled_process(
File "inpainting_inference.py", line 77, in spatial_tiled_process
tile = process_func(
File "F:\anaconda3\envs\zoe\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "F:\anaconda3\envs\zoe\StereoCrafter\pipelines\stereo_video_inpainting.py", line 565, in call
noise_pred = self.unet(
File "F:\anaconda3\envs\zoe\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "F:\anaconda3\envs\zoe\lib\site-packages\diffusers\models\unets\unet_spatio_temporal_condition.py", line 428, in forward
sample = self.conv_in(sample)
File "F:\anaconda3\envs\zoe\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "F:\anaconda3\envs\zoe\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "F:\anaconda3\envs\zoe\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [320, 8, 3, 3], expected input[46, 9, 32, 80] to have 8 channels, but got 9 channels instead
The code was run on CUDA 11.8 with all the recommended requirements installed.
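For what it's worth, the 8-vs-9 channel mismatch in the traceback suggests the loaded UNet's conv_in expects 8 input channels (as in the plain SVD UNet) while the inpainting pipeline feeds it 9, which would fit the renamed-weights fix mentioned at the top of this thread. Assuming the weights are in diffusers format (as the traceback's unet_spatio_temporal_condition.py implies), a hedged way to check which UNet was actually loaded:

# Hedged sketch: inspect the conv_in layer of the loaded UNet. If it reports 8
# input channels, the weights being loaded are likely not the StereoCrafter
# inpainting UNet (the path below is a placeholder, not the official layout).
from diffusers import UNetSpatioTemporalConditionModel

unet = UNetSpatioTemporalConditionModel.from_pretrained(
    "path/to/StereoCrafter/unet"  # placeholder path to the inpainting UNet weights
)
print(unet.conv_in.in_channels)   # the pipeline in the traceback passes 9 channels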