
Request for Fp8 for the second inpainting step. #16

Open

Archit01 opened this issue Jan 16, 2025 · 2 comments

@Archit01

Hello,
Thank you for this wonderful project.

I would kindly like to request the following, due to hardware limitations (an 8 GB VRAM GPU):

1.) FP8 support for the second inpainting step. I can run FP16, but FP8 should be faster on limited hardware.
2.) Any possible way to improve the inpainting quality when using a resolution lower than 4K.
3.) Could we have frame-sequence input and output, so that longer videos can be processed? Each frame would be written out as soon as it is inpainted, rather than held in memory like the video that is currently saved at the end.
4.) Is CPU offloading enabled for the second step?
5.) Is it possible to use this technique with images (a single 2D image to 3D) instead of video?

@xiaoyu258
Contributor

  1. We use diffusers to define and load the model, which supports passing the torch_dtype parameter to the from_pretrained function to change the data type, but we have not tested it with FP8 (see the sketch after this list).

  2. Our method supports running the video at lower resolutions such as 1080p, and it works fine in our examples.

  3. Yes, the streaming approach could be implemented at test time (a sketch follows below).

  4. We do not use it in the second step, but you could enable it by adding the pipe.enable_model_cpu_offload() call from diffusers, also shown in the sketch below.

  5. Yes, you could test it on a video with a single frame as the image input.
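
A minimal sketch of points 1 and 4 in a diffusers-style script. The pipeline class and checkpoint path below are placeholders, not the exact ones from this repo; torch_dtype and enable_model_cpu_offload() are standard diffusers features, and FP8 remains untested here.

```python
import torch
from diffusers import StableVideoDiffusionPipeline  # placeholder class; use this repo's pipeline

# Placeholder checkpoint path; substitute the second-step inpainting weights.
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "path/to/second-step-checkpoint",
    torch_dtype=torch.float16,  # tested; torch.float8_e4m3fn is untested here
)

# Standard diffusers memory saver: keeps submodules on the CPU and moves each
# one to the GPU only for its forward pass, trading speed for VRAM.
pipe.enable_model_cpu_offload()
```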
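
And a rough sketch of the streaming idea from point 3, assuming the frames live as numbered image files on disk. The inpaint_chunk argument is a hypothetical stand-in for one second-step pipeline call; video diffusion pipelines typically process windows of frames rather than single frames, hence the chunking.

```python
import os
import imageio.v3 as iio

def stream_inpaint(in_dir, out_dir, inpaint_chunk, chunk_size=16):
    """Read frames from in_dir, inpaint them chunk by chunk, and write each
    result to out_dir immediately instead of keeping the whole video in RAM."""
    os.makedirs(out_dir, exist_ok=True)
    names = sorted(os.listdir(in_dir))
    for start in range(0, len(names), chunk_size):
        batch = names[start:start + chunk_size]
        frames = [iio.imread(os.path.join(in_dir, n)) for n in batch]
        results = inpaint_chunk(frames)  # hypothetical: one pipeline call per chunk
        for name, frame in zip(batch, results):
            iio.imwrite(os.path.join(out_dir, name), frame)
```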

@Archit01
Author


Thank you for the response.
Unfortunately I don't have much of a technical background in coding, so it would be very helpful if you could share the script for the second step with CPU offloading.

For point 2: yes, the method works at lower resolutions, but the inpainted right side lacks so much detail that it looks like an oil painting. A slight fix is to increase the min and max guidance to 1.5-2. A proper fix would be great, if there is one.
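
For what it's worth, that guidance tweak would look roughly like this, assuming the second-step pipeline exposes min_guidance_scale and max_guidance_scale the way diffusers' StableVideoDiffusionPipeline does; the inputs here are hypothetical placeholders, not this repo's exact arguments.

```python
# Hypothetical call shape; the real inputs come from this repo's second step.
result = pipe(
    frames,                  # placeholder: warped input frames
    mask,                    # placeholder: occlusion mask for the right view
    min_guidance_scale=1.5,  # raised to recover detail at sub-4K resolutions
    max_guidance_scale=2.0,
)
```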

For point 5, I tried it with single images, but the output on the right side is corrupted/doesn't work. I guess it is not meant for still images.
