
Memory issue? #11
Open
HOMGH opened this issue Nov 7, 2022 · 7 comments

HOMGH commented Nov 7, 2022

Hi,
Thanks for sharing your code.
I got "scripts/ddpm/train_interpreter.sh: line 6: 3842 Killed python train_interpreter.py --exp experiments/${DATASET}/ddpm.json $MODEL_FLAGS" error.

I have ~65G RAM available on my Ubuntu. Considering your note that "it requires ~210Gb for 50 training images of 256x256."
Does it mean that it's not feasible to train the model on my system? How about evaluation?
Thanks in advance.

dbaranchuk (Contributor) commented:

You can try to: (i) reduce the number of training images, (ii) reduce the feature dimensionality, or (iii) store the features on disk and process them in batches.
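For example, a minimal sketch of option (iii) could look like the snippet below, assuming the per-pixel features are written into one memory-mapped .npy file and the pixel classifier is then trained on batches read back from disk. The file name, shapes, and the extract_features_for_image helper are only illustrative placeholders, not the repo's actual code:

```python
import numpy as np

# Illustrative sizes: 30 images, 8448-dim features, 256x256 resolution.
n_imgs, dim, h, w = 30, 8448, 256, 256

# Allocate the feature matrix on disk instead of in RAM.
X = np.lib.format.open_memmap(
    'features.npy', mode='w+', dtype=np.float16,
    shape=(n_imgs * h * w, dim),
)

for i in range(n_imgs):
    feats = extract_features_for_image(i)   # hypothetical helper, returns [dim, h, w]
    X[i * h * w:(i + 1) * h * w] = feats.reshape(dim, -1).T
X.flush()

# Later, read the file back memory-mapped and train on one batch at a time,
# so only a single batch ever lives in RAM.
X = np.load('features.npy', mmap_mode='r')
batch_size = 65536
for start in range(0, X.shape[0], batch_size):
    batch = np.asarray(X[start:start + batch_size], dtype=np.float32)
    # ... feed `batch` (and the matching slice of labels) to the pixel classifier ...
```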

Yi-Lynn commented Feb 4, 2023

> You can try to: (i) reduce the number of training images, (ii) reduce the feature dimensionality, or (iii) store the features on disk and process them in batches.

Hi, thanks for the wonderful work and code you provided. Can you explain how I would implement the last one (iii)? How do I process the features in batches if they are so huge that I cannot even store them on disk? I have encountered the same problem as @HOMGH: the extracted pixel representations are so huge that the process running the script is killed before the prepare_data() function returns the data. For example, when I run ddpm on the cat_15 dataset with the original experiment setting of 30 training images, the process crashes when the program reaches the following two lines.

```python
X = X.transpose(1, 0, 2, 3).reshape(d, -1).transpose(1, 0)  # X of shape [30, 8448, 256, 256] becomes [30*256*256, 8448]
y = y.flatten()
```

My solution is to write another prepare_data() function that processes one image at a time instead of all labelled training images at once. Then, during training of the pixel classifier, at each epoch I create a dataloader for one image and iterate through all training images, as sketched below. However, there is a gap between the final evaluation results I get and yours. Do you have any suggestions for that?
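A rough sketch of this per-image approach, assuming one .npy feature array of shape [dim, H, W] and one .npy label array of shape [H, W] per training image (the names below are simplified placeholders, not my exact code):

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class SingleImagePixelDataset(Dataset):
    """Pixel features and labels of ONE training image, loaded from disk on demand."""

    def __init__(self, feature_path, label_path):
        feats = np.load(feature_path)    # [dim, H, W]
        labels = np.load(label_path)     # [H, W]
        dim = feats.shape[0]
        self.X = torch.from_numpy(feats.reshape(dim, -1).T).float()  # [H*W, dim]
        self.y = torch.from_numpy(labels.reshape(-1)).long()         # [H*W]

    def __len__(self):
        return self.X.shape[0]

    def __getitem__(self, idx):
        return self.X[idx], self.y[idx]


def train_one_epoch(classifier, optimizer, criterion, image_paths):
    # One dataloader per image, so only one image's pixel features are in memory.
    for feature_path, label_path in image_paths:
        loader = DataLoader(SingleImagePixelDataset(feature_path, label_path),
                            batch_size=4096, shuffle=True)
        for X_batch, y_batch in loader:
            optimizer.zero_grad()
            loss = criterion(classifier(X_batch), y_batch)
            loss.backward()
            optimizer.step()
```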

Thanks a lot for your help.

Qyunhao commented Feb 11, 2023

> My solution is to write another prepare_data() function that processes one image at a time instead of all labelled training images at once. …

Hi, can you send me a copy of your code? I really need it. Thank you very much!

MariemOualha commented:


Hello @Yi-Lynn, can you send me your code? Thank you.

SahadevPoudel commented:

@Yi-Lynn Could you please send me your code?
@MariemOualha Did you implement it? If so, could you send me the code?

choidaedae commented Aug 17, 2023

@SahadevPoudel Hi, Poudel! I had the same problem, and I succeeded in implementing it in the way described above. My version declares a class called 'DividedImageLabelDataset' in src/datasets.py, which is imported and used by the training code. Please comment if you still want the code!

yunzhuC commented Feb 4, 2024

@choidaedae Hello! I'm experiencing the same problem, but I don't know how to fix it. Could you share the code you modified? My email address is [email protected], or you can use any other method that is convenient for you. Thanks a lot.
