When will the Data Preparation doc be released? #1

Open
ModulatedConvolutionalNetworks opened this issue Jun 19, 2023 · 10 comments

Comments

@ModulatedConvolutionalNetworks

Thanks for the wonderful job!
I want to know when the Data Preparation doc will be released.
Thanks very much!

@guyuchao
Contributor

This week, I will release the main branch for application purposes. Then I will start to write docs and give examples to use this codebase for the customized multi-concept generation.

@ModulatedConvolutionalNetworks
Author

> This week, I will release the main branch for application purposes. Then I will start to write docs and give examples to use this codebase for the customized multi-concept generation.

Hello, I hope to reproduce and reuse this impressive work. When will the docs and examples be released?

@guyuchao
Contributor

guyuchao commented Jul 4, 2023

Sorry, I am on vacation leave and will update after I am back.

@wangdong2023

Hi, any updates on this? :D

@TalhaUsuf

Hi, you can find the link to the dataset sample (a Google Drive link) that they provided in the README. Just follow the same directory structure; you can do the captioning using the Kohya repo.

@rezkanas

rezkanas commented Mar 20, 2024

Any update on data preprocessing? After debugging, I have figured out so far that I need to resize all input photos and masks to a 512, 288 shape and convert both masks and photos to PNG format. After that, training started, but I am now stuck on the following error:

2024-03-20 23:12:48,823 INFO: ***** Running training *****
2024-03-20 23:12:48,823 INFO:   Num examples = 8500
2024-03-20 23:12:48,823 INFO:   Instantaneous batch size per device = 2
2024-03-20 23:12:48,823 INFO:   Total train batch size (w. parallel, distributed & accumulation) = 2
2024-03-20 23:12:48,823 INFO:   Total optimization steps = 4250.0
Traceback (most recent call last):
  File "/home/anasrezklinux/test_pycharm_link/main/mix_of_show/mix_of_show_git_repository/train_edlora.py", line 198, in <module>
    train(root_path, args)
  File "/home/anasrezklinux/test_pycharm_link/main/mix_of_show/mix_of_show_git_repository/train_edlora.py", line 119, in train
    loss = EDLoRA_trainer(batch['images'], batch['prompts'], masks, batch['img_masks'])
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/accelerate/utils/operations.py", line 680, in forward
    return model_forward(*args, **kwargs)
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/accelerate/utils/operations.py", line 668, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
  File "/home/anasrezklinux/.local/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 16, in decorate_autocast
    return func(*args, **kwargs)
  File "/home/anasrezklinux/test_pycharm_link/main/mix_of_show/mix_of_show_git_repository/mixofshow/pipelines/trainer_edlora.py", line 256, in forward
    attention_loss = self.cal_attn_reg(attention_maps, masks, text_input_ids)
  File "/home/anasrezklinux/test_pycharm_link/main/mix_of_show/mix_of_show_git_repository/mixofshow/pipelines/trainer_edlora.py", line 298, in cal_attn_reg
    map_adjective, map_subject = cross_map[..., 0], cross_map[..., 1]
IndexError: index 0 is out of bounds for dimension 1 with size 0

Please update.

In addition:

  • docs/Dataset.md is empty.
  • Your generic pointer to "data processing in sd-scripts" is not much help unless you can say precisely which Python file in that repository you are referring to.
  • I also saw that your YAML file has an instance_transform section. Do I still need to preprocess the data before feeding it into training? Can you explain the apparent redundancy?

@guyuchao
Contributor

Hi, I have recently been busy with other projects. The instance transform automatically crops the human image, but it requires the short edge of your input image to be > 512. You could try again after resizing all your input images. "Data processing in sd-scripts" refers to the tagging process, not image pre-processing. We feed the images into sd-webui to get captions.
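A minimal sketch of the short-edge requirement described above, assuming it is read literally as "strictly greater than 512". The function name `too_small` and the sample file names are hypothetical and are not part of the repository. It shows why a 512, 288 resize would not satisfy the requirement: the shorter edge is 288.

```python
from typing import Iterable, List, Tuple

# Threshold quoted in the comment above ("short edge > 512").
MIN_SHORT_EDGE = 512


def too_small(
    sizes: Iterable[Tuple[str, int, int]],
    min_short_edge: int = MIN_SHORT_EDGE,
) -> List[str]:
    """Return the names of images whose shorter edge is not strictly
    greater than min_short_edge.

    Each entry in sizes is (name, width, height).
    """
    return [name for name, w, h in sizes if min(w, h) <= min_short_edge]


# Hypothetical file names; the sizes are taken from this thread.
sizes = [
    ("a.png", 512, 288),  # short edge 288: fails
    ("b.png", 550, 771),  # short edge 550: passes
    ("c.png", 512, 512),  # short edge exactly 512: fails a strict check
]
print(too_small(sizes))
```

Whether an edge of exactly 512 is accepted by the actual transform is not stated in this thread, so the strict comparison here is an assumption.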

@rezkanas

Thank you @guyuchao for your support. I have changed my code to resize each input image while keeping the aspect ratio, so that the shorter edge is 550 (larger than 512). I produce masks with the same resized dimensions using yolo8m. The sizes of my images are:

1 original width/height: 480 673 after resizing width/height: 550 771
2 original width/height: 2736 3648 after resizing width/height: 550 733
3 original width/height: 1124 1497 after resizing width/height: 550 732
4 original width/height: 1080 1920 after resizing width/height: 550 977
5 original width/height: 1080 1920 after resizing width/height: 550 977
6 original width/height: 1152 2048 after resizing width/height: 550 977
7 original width/height: 3840 5120 after resizing width/height: 550 733
8 original width/height: 2736 3648 after resizing width/height: 550 733
9 original width/height: 1538 2048 after resizing width/height: 550 732
10 original width/height: 2736 3648 after resizing width/height: 550 733
11 original width/height: 2736 3648 after resizing width/height: 550 733
12 original width/height: 1124 1497 after resizing width/height: 550 732
13 original width/height: 2736 3648 after resizing width/height: 550 733
14 original width/height: 2736 3648 after resizing width/height: 550 733
15 original width/height: 1125 1500 after resizing width/height: 550 733
16 original width/height: 1125 1500 after resizing width/height: 550 733

But I still end up with the same error message as described above:

File "/home/anasrezklinux/test_pycharm_link/main/mix_of_show/mix_of_show_git_repository/mixofshow/pipelines/trainer_edlora.py", line 298, in cal_attn_reg
    map_adjective, map_subject = cross_map[..., 0], cross_map[..., 1]
IndexError: index 0 is out of bounds for dimension 1 with size 0

Let me know if you need any more details to check this issue.
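For anyone following along, the aspect-preserving resize described in this comment could be sketched roughly as below. This is not the code actually used in this thread; the function names are hypothetical. Floor division is an assumption chosen because it reproduces the sizes listed above (for example 480 x 673 becomes 550 x 771, and 1080 x 1920 becomes 550 x 977). The second helper assumes Pillow is installed and that each mask has the same original size as its photo.

```python
from typing import Tuple


def short_edge_resize_dims(
    width: int, height: int, short_edge: int = 550
) -> Tuple[int, int]:
    """Return (new_width, new_height) so that the shorter edge equals
    short_edge and the aspect ratio is kept (long edge rounded down)."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if width <= height:
        return short_edge, height * short_edge // width
    return width * short_edge // height, short_edge


def resize_image_and_mask(
    image_path: str,
    mask_path: str,
    out_image_path: str,
    out_mask_path: str,
    short_edge: int = 550,
) -> None:
    """Resize a photo and its mask to the same size and save both.

    Use output paths ending in .png so that Pillow writes PNG files.
    """
    from PIL import Image  # imported lazily; requires Pillow

    with Image.open(image_path) as img, Image.open(mask_path) as mask:
        new_size = short_edge_resize_dims(img.width, img.height, short_edge)
        # Smooth resampling for the photo.
        img.resize(new_size, Image.LANCZOS).save(out_image_path)
        # Nearest-neighbour for the mask so label values are not blended.
        mask.resize(new_size, Image.NEAREST).save(out_mask_path)


print(short_edge_resize_dims(480, 673))
print(short_edge_resize_dims(1080, 1920))
```

Nearest-neighbour resampling for the mask is a design choice to keep it binary; any other interpolation would introduce intermediate grey values along the mask border.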

@rezkanas

rezkanas commented Apr 2, 2024

@guyuchao any update on this error?

@rezkanas

rezkanas commented Apr 8, 2024

I have finally found the solution to the above error. For anyone interested: the photo captions must include the new concept tokens.
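A small pre-training check along these lines may help catch the problem early. It is only a sketch: the function name, the file names, and the token strings `<TOK1>` and `<TOK2>` are hypothetical placeholders, and the real tokens are whatever your training config defines. The traceback indexes `cross_map[..., 0]` and `cross_map[..., 1]` (adjective and subject), so a dimension of size 0 is consistent with no concept-token positions being found in a caption, though that reading is an inference and has not been confirmed by the maintainers.

```python
from typing import Dict, List, Sequence


def captions_missing_tokens(
    captions: Dict[str, str], concept_tokens: Sequence[str]
) -> List[str]:
    """Return the names of images whose caption lacks at least one of
    the concept tokens (plain substring match)."""
    return [
        name
        for name, caption in captions.items()
        if not all(token in caption for token in concept_tokens)
    ]


# Hypothetical placeholder tokens and captions, for illustration only.
concept_tokens = ["<TOK1>", "<TOK2>"]
captions = {
    "001.png": "a photo of <TOK1> <TOK2>, standing",
    "002.png": "a photo of a man, standing",
}
print(captions_missing_tokens(captions, concept_tokens))
```

Any file name reported by this check has a caption that should be edited to include the concept tokens before training starts.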
