update: FluxKontextInpaintPipeline support #11820


Merged: 7 commits merged into huggingface:main on Jul 2, 2025

Conversation

@vuongminh1907 (Contributor) commented Jun 27, 2025

🚀 What does this PR do?

Hello! 👋 I'm truly impressed with Flux Kontext, but I noticed that inpainting functionality hasn’t been fully integrated yet. This PR adds support for inpainting using the 🤗 Diffusers library.

This contribution introduces:

🎯 Inpainting with text only

  1. Example using FluxKontextInpaintPipeline with just a prompt (the pipeline is loaded the same way as in the image-conditioning example below):
import torch
from diffusers import FluxKontextInpaintPipeline
from diffusers.utils import load_image

pipe = FluxKontextInpaintPipeline.from_pretrained("black-forest-labs/FLUX.1-Kontext-dev", torch_dtype=torch.bfloat16)
pipe.to("cuda")

prompt = "Change the yellow dinosaur to green one"

img_url = "https://github.com/ZenAI-Vietnam/Flux-Kontext-pipelines/blob/main/assets/dinosaur_input.jpeg?raw=true"
mask_url = "https://github.com/ZenAI-Vietnam/Flux-Kontext-pipelines/blob/main/assets/dinosaur_mask.png?raw=true"

source = load_image(img_url)
mask = load_image(mask_url)

image = pipe(prompt=prompt, image=source, mask_image=mask, strength=1.0).images[0]
image.save("kontext_inpainting_normal.png")
  2. 🖼️ Original image and mask:
     [images: dinosaur input and mask]

  3. ✅ Result using FluxKontextInpaintPipeline:
     [image: inpainting result]

  4. ⚠️ When using the regular FluxKontext editing pipeline, the color change was not correctly applied to the target object:
     [image: regular Kontext editing result]
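A note on the strength argument used above: in diffusers image-to-image and inpainting pipelines, strength typically controls how many of the scheduled denoising steps are actually run, so strength=1.0 fully regenerates the masked region. This is a minimal sketch of that common pattern, not the exact pipeline source:

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    # The pipeline skips the early part of the schedule and only runs
    # the last `num_inference_steps * strength` steps (common diffusers
    # get_timesteps logic; illustrative only).
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    t_start = max(num_inference_steps - init_timestep, 0)
    return num_inference_steps - t_start

print(effective_steps(28, 1.0))  # all 28 steps: the mask is fully regenerated
print(effective_steps(28, 0.5))  # 14 steps: the output stays closer to the input
```

This is why lower strength values preserve more of the original pixels inside the mask.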

🧩 Inpainting with image conditioning

  1. In addition to text prompts, FluxKontextInpaintPipeline also supports conditioning on a reference image via the image_reference parameter:
import torch
from diffusers import FluxKontextInpaintPipeline
from diffusers.utils import load_image

pipe = FluxKontextInpaintPipeline.from_pretrained("black-forest-labs/FLUX.1-Kontext-dev", torch_dtype=torch.bfloat16)
pipe.to("cuda")

prompt = "Replace this ball"

img_url = "https://images.pexels.com/photos/39362/the-ball-stadion-football-the-pitch-39362.jpeg?auto=compress&cs=tinysrgb&dpr=1&w=500"
mask_url = "https://github.com/ZenAI-Vietnam/Flux-Kontext-pipelines/blob/main/assets/ball_mask.png?raw=true"
image_reference_url = "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTah3x6OL_ECMBaZ5ZlJJhNsyC-OSMLWAI-xw&s"

source = load_image(img_url)
mask = load_image(mask_url)
image_reference = load_image(image_reference_url)

mask = pipe.mask_processor.blur(mask, blur_factor=12)
image = pipe(prompt=prompt, image=source, mask_image=mask, image_reference=image_reference, strength=1.0).images[0]
image.save("kontext_inpainting_ref.png")
  2. 📥 Input image, mask, and reference image:
     [images: ball input, mask, and reference]

  3. 🎉 Output using FluxKontextInpaintPipeline:
     [image: inpainting result with reference]
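The mask_processor.blur call in the example softens the mask boundary so the inpainted region blends into its surroundings instead of showing a hard seam. The same idea can be sketched standalone with PIL (a Gaussian blur over a hard binary mask; the sizes and radius here are illustrative, not what the pipeline uses internally):

```python
from PIL import Image, ImageFilter

# Hard binary mask: white square (edit region) on a black background.
mask = Image.new("L", (64, 64), 0)
for x in range(16, 48):
    for y in range(16, 48):
        mask.putpixel((x, y), 255)

# Blurring turns the hard 0/255 edge into a smooth ramp, similar in
# spirit to pipe.mask_processor.blur(mask, blur_factor=12).
blurred = mask.filter(ImageFilter.GaussianBlur(radius=4))

center = blurred.getpixel((32, 32))  # deep inside the edit region: stays near 255
edge = blurred.getpixel((16, 32))    # on the original hard edge: intermediate value
corner = blurred.getpixel((0, 0))    # far outside the region: stays near 0
print(center, edge, corner)
```

The intermediate values along the boundary act as per-pixel blend weights between the generated and original pixels.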

I hope this PR will be helpful for the community and contribute positively to the Diffusers ecosystem! 🌱


@nitinmukesh

Awesome. 👍

@apolinario (Collaborator)

Fantastic! Would be cool to have a demo for it on Hugging Face Spaces while the PR gets reviewed :-)

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yiyixuxu (Collaborator) left a comment

Thanks for the super cool PR! I left some comments.
Would you be able to provide more results for different strength values, both with and without image_reference?

image: Optional[PipelineImageInput] = None,
image_reference: Optional[PipelineImageInput] = None,
mask_image: PipelineImageInput = None,
masked_image_latents: PipelineImageInput = None,
@yiyixuxu (Collaborator):

Let's not support masked_image_latents for now, so we can simplify the logic a bit here; it's not used in this pipeline.

else:
masked_image = masked_image_latents

mask, masked_image_latents = self.prepare_mask_latents(
@yiyixuxu (Collaborator):

I don't see masked_image_latents being used in this pipeline; let's remove logic we don't need here.

@vuongminh1907 (Contributor, Author) commented Jun 28, 2025

Hi @yiyixuxu, I just updated the code based on your comments — thank you so much!

About updating the results with more strength values, I did a quick test and here are the outputs:

Inpainting with text only

[image: results across strength values, text only]

Inpainting with image conditioning

[image: results across strength values, image conditioning]

It seems that KontextInpaint works quite well with strength=1.0.
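Side-by-side comparisons like the ones above can be rebuilt by tiling outputs at several strength values into one image. A minimal PIL sketch of that compositing step; the solid-color images are hypothetical stand-ins for what pipe(..., strength=s).images[0] would return:

```python
from PIL import Image

def make_grid(images, cols):
    """Tile same-sized PIL images into a single grid image, row-major."""
    w, h = images[0].size
    rows = (len(images) + cols - 1) // cols
    grid = Image.new("RGB", (cols * w, rows * h))
    for i, im in enumerate(images):
        grid.paste(im, ((i % cols) * w, (i // cols) * h))
    return grid

# Dummy stand-ins for pipeline outputs at different strength values.
outputs = [Image.new("RGB", (128, 128), c) for c in ("red", "green", "blue", "white")]
grid = make_grid(outputs, cols=2)
grid.save("strength_comparison.png")
```

In practice the outputs list would be filled by looping the pipeline call over strength values such as 0.25, 0.5, 0.75, and 1.0.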

@vuongminh1907 (Contributor, Author)

> Fantastic! Would be cool to have a demo for it on Hugging Face Spaces while the PR gets reviewed :-)

While waiting for the PR to be reviewed, feel free to test it using:
pip install git+https://github.com/vuongminh1907/diffusers

@nitinmukesh commented Jun 30, 2025

Thanks again @vuongminh1907

Tested using
pip install git+https://github.com/huggingface/diffusers.git@refs/pull/11820/head

Working well with the Nunchaku optimization. Will do more tests.

Kindly review and merge.

Nunchaku output:

[image: kontext_inpainting_normal]

@yiyixuxu (Collaborator) left a comment

Can we add a test case? Will merge soon.

@vuongminh1907 (Contributor, Author)

Just to confirm @yiyixuxu, should I add another test case, or will you take care of it?

@nne998 commented Jul 1, 2025

Hi! Nice work! Can you add a custom node to ComfyUI to support this? @vuongminh1907

@vuongminh1907 (Contributor, Author)

> Hi! Nice work! Can you add a custom node to ComfyUI to support this? @vuongminh1907

Okay @nne998, I will take a look soon.

@vuongminh1907 (Contributor, Author)

@nne998, I think ComfyUI can also work; here are two examples:

  1. Prompt: Make the girl take a phone instead of bread
     [image: phone result]

  2. Prompt: Make the girl take eating apple instead of bread
     [image: apple result]

@strawberrymelonpanda commented Jul 1, 2025

> I think ComfyUI can also work, these are 2 examples:

@vuongminh1907 How about the image_reference parameter? Will that work with existing nodes?
Great work, and thanks!

Edit: Apologies for being somewhat off-topic with a ComfyUI question. I was linked here from a ComfyUI issue and didn't realize I'd switched repos at first.

@vuongminh1907 (Contributor, Author)

@strawberrymelonpanda, I’ll take a look at the image_reference later. In the meantime, I’ve just released the Kontext Inpainting Node here: ComfyUI-Kontext-Inpainting

@yiyixuxu (Collaborator) commented Jul 1, 2025

Would you be able to add a test case? You can reference the current tests for the Flux pipelines: https://github.com/huggingface/diffusers/tree/main/tests/pipelines/flux

@vuongminh1907 (Contributor, Author)

Thanks @yiyixuxu, I added a test case and it passed all checks.

@yiyixuxu (Collaborator) commented Jul 2, 2025

@bot /style


github-actions bot commented Jul 2, 2025

Style bot fixed some files and pushed the changes.

@yiyixuxu (Collaborator) left a comment

thank you so much!

@yiyixuxu yiyixuxu merged commit d6fa329 into huggingface:main Jul 2, 2025
17 of 18 checks passed

7 participants