
update: FluxKontextInpaintPipeline support #11820

Open · wants to merge 2 commits into main

Conversation


@vuongminh1907 vuongminh1907 commented Jun 27, 2025

🚀 What does this PR do?

Hello! 👋 I'm truly impressed with Flux Kontext, but I noticed that inpainting functionality hasn’t been fully integrated yet. This PR adds support for inpainting using the 🤗 Diffusers library.

This contribution introduces:

🎯 Inpainting with text only

  1. Example using FluxKontextInpaintPipeline with just a prompt:
import torch
from diffusers import FluxKontextInpaintPipeline
from diffusers.utils import load_image

pipe = FluxKontextInpaintPipeline.from_pretrained("black-forest-labs/FLUX.1-Kontext-dev", torch_dtype=torch.bfloat16)
pipe.to("cuda")

prompt = "Change the yellow dinosaur to green one"

img_url = "https://github.com/ZenAI-Vietnam/Flux-Kontext-pipelines/blob/main/assets/dinosaur_input.jpeg?raw=true"
mask_url = "https://github.com/ZenAI-Vietnam/Flux-Kontext-pipelines/blob/main/assets/dinosaur_mask.png?raw=true"

source = load_image(img_url)
mask = load_image(mask_url)

# strength=1.0 fully repaints the masked region according to the prompt
image = pipe(prompt=prompt, image=source, mask_image=mask, strength=1.0).images[0]
image.save("kontext_inpainting_normal.png")
  2. 🖼️ Original image and mask:
    (image attachments: input photo and mask)

  3. ✅ Result using FluxKontextInpaintPipeline:
    (image attachment: flux_inpainting)

  4. ⚠️ When using the regular FluxKontext editing pipeline, the color change was not correctly applied to the target object (a baseline sketch follows this list):
    (image attachment: regular FluxKontext output)
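
For context, the baseline in item 4 can be reproduced with the existing FluxKontextPipeline roughly as follows. This is a sketch: the exact settings behind the image above are not stated in the PR, and guidance_scale=2.5 is an assumed value.

import torch
from diffusers import FluxKontextPipeline
from diffusers.utils import load_image

# Regular Kontext editing pipeline: no mask, the edit is steered by the prompt alone
pipe = FluxKontextPipeline.from_pretrained("black-forest-labs/FLUX.1-Kontext-dev", torch_dtype=torch.bfloat16)
pipe.to("cuda")

source = load_image("https://github.com/ZenAI-Vietnam/Flux-Kontext-pipelines/blob/main/assets/dinosaur_input.jpeg?raw=true")
image = pipe(prompt="Change the yellow dinosaur to green one", image=source, guidance_scale=2.5).images[0]
image.save("kontext_edit_baseline.png")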

🧩 Inpainting with image conditioning

  1. In addition to text prompts, FluxKontextInpaintPipeline also supports conditioning on a reference image via the image_reference parameter:
import torch
from diffusers import FluxKontextInpaintPipeline
from diffusers.utils import load_image

pipe = FluxKontextInpaintPipeline.from_pretrained("black-forest-labs/FLUX.1-Kontext-dev", torch_dtype=torch.bfloat16)
pipe.to("cuda")

prompt = "Replace this ball"

img_url = "https://images.pexels.com/photos/39362/the-ball-stadion-football-the-pitch-39362.jpeg?auto=compress&cs=tinysrgb&dpr=1&w=500"
mask_url = "https://github.com/ZenAI-Vietnam/Flux-Kontext-pipelines/blob/main/assets/ball_mask.png?raw=true"
image_reference_url = "https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTah3x6OL_ECMBaZ5ZlJJhNsyC-OSMLWAI-xw&s"

source = load_image(img_url)
mask = load_image(mask_url)
image_reference = load_image(image_reference_url)

# Soften the mask edges so the inpainted region blends more smoothly
mask = pipe.mask_processor.blur(mask, blur_factor=12)
image = pipe(prompt=prompt, image=source, mask_image=mask, image_reference=image_reference, strength=1.0).images[0]
image.save("kontext_inpainting_ref.png")
  2. 📥 Input image, mask, and reference image:
    (image attachments: input photo, mask, and reference image)

  3. 🎉 Output using FluxKontextInpaintPipeline:
    (image attachment: flux_inpainting_ball)
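
FLUX.1 Kontext is a large model, so a note for anyone trying the examples on limited VRAM: the standard Diffusers offloading helper should also apply to this pipeline (my assumption; offloading is not covered in the PR itself).

import torch
from diffusers import FluxKontextInpaintPipeline

# Assumption: the new pipeline inherits the usual offloading hooks from DiffusionPipeline
pipe = FluxKontextInpaintPipeline.from_pretrained("black-forest-labs/FLUX.1-Kontext-dev", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # instead of pipe.to("cuda"); keeps submodules on CPU except while they run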

I hope this PR will be helpful for the community and contribute positively to the Diffusers ecosystem! 🌱

Core library:

@nitinmukesh

Awesome. 👍

@apolinario
Collaborator

Fantastic! Would be cool to have a demo for it on Hugging Face Spaces while the PR gets reviewed :-)

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

@yiyixuxu yiyixuxu left a comment


Thanks for the super cool PR! I left some comments.
Would you be able to provide more results for different strength values, both with and without image_reference?

image: Optional[PipelineImageInput] = None,
image_reference: Optional[PipelineImageInput] = None,
mask_image: PipelineImageInput = None,
masked_image_latents: PipelineImageInput = None,

let's not support masked_image_latents for now so that we can simplify the logic a bit here - it's not used here

else:
masked_image = masked_image_latents

mask, masked_image_latents = self.prepare_mask_latents(

I don't see masked_image_latents being used in this pipeline; let's remove logic that we don't need here

self._interrupt = False

# 2. Preprocess image
if image is not None and not (isinstance(image, torch.Tensor) and image.size(1) == self.latent_channels):

I think image is not optional for this pipeline, no?
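
A minimal version of the guard being suggested might look like this (hypothetical sketch; the names follow the call signature shown above, but this is not the actual diff in the PR):

# Hypothetical check_inputs-style guard: inpainting needs both an image and a mask
if image is None:
    raise ValueError("`image` is required for FluxKontextInpaintPipeline.")
if mask_image is None:
    raise ValueError("`mask_image` is required for FluxKontextInpaintPipeline.")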

@vuongminh1907
Author

vuongminh1907 commented Jun 28, 2025

Hi @yiyixuxu, I just updated the code based on your comments — thank you so much!

As for the results with more strength values, I did a quick test and here are the outputs:

Inpainting with text only

(image attachment: strength comparison, text only)

Inpainting with image conditioning

(image attachment: strength comparison, with image conditioning)

It seems that KontextInpaint works quite well with strength=1.0.
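
For anyone who wants to reproduce the sweep, a simple loop over strength should do it. This is a sketch reusing pipe, prompt, source, and mask from the examples above; the exact grid used for the images here is not stated.

# Lower strength keeps more of the original pixels inside the mask; 1.0 repaints the region fully
for strength in (0.4, 0.6, 0.8, 1.0):
    out = pipe(prompt=prompt, image=source, mask_image=mask, strength=strength).images[0]
    out.save(f"kontext_inpainting_strength_{strength:.1f}.png")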

@vuongminh1907
Author

> Fantastic! Would be cool to have a demo for it on Hugging Face Spaces while the PR gets reviewed :-)

While waiting for the PR to be reviewed, feel free to test it using:
pip install git+https://github.com/vuongminh1907/diffusers

@nitinmukesh

nitinmukesh commented Jun 30, 2025

Thanks again @vuongminh1907

Tested using
pip install git+https://github.com/huggingface/diffusers.git@refs/pull/11820/head

Works well with nunchaku optimization. Will do more tests.

Kindly review and merge.

Nunchaku output:

(image attachment: kontext_inpainting_normal)
