
Possible Memory Leak, Colab Crashes #534

Open

GrahamGGreig opened this issue Feb 26, 2025 · 5 comments

Comments

@GrahamGGreig

Hi Meridian Team,

I have been testing out Meridian as an MMM replacement for Robyn. However, one issue I keep running into is Colab crashing after 2-3 runs of the model due to an out-of-RAM error.

I have tried this in both Colab and Colab Enterprise with 12.7 GB of RAM on a T4 with attached GPU. I'm using 2 years of weekly data at the national level, with 8 input columns and 6 one-hot encoded controls (modelling important event dates). After each run the RAM usage goes up by 5-6 GB, and running again adds to this. I have tried deleting all the variables manually before re-running, but this has no effect.

Is there anything I could do to fix this issue, or is there a known workaround? Right now it is making model experimentation extremely slow and an automated grid search of potential priors impossible.
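
For reference, the loop that triggers it looks roughly like this (a minimal sketch; `input_data` and `candidate_specs` are hypothetical stand-ins for my actual data and prior grid):

```python
import gc

from meridian.model import model

# Hypothetical grid search: one fit per candidate ModelSpec.
# In practice, RAM grows by ~5-6 GB per fit and is never released.
for candidate_spec in candidate_specs:
    mmm = model.Meridian(input_data=input_data, model_spec=candidate_spec)
    mmm.sample_posterior(n_chains=4, n_adapt=500, n_burnin=500, n_keep=1000)
    # ... score the fit here ...
    del mmm
    gc.collect()  # has no visible effect on allocated RAM
```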

@AdimDrewnik

Same here, but I can only run the Meridian demo code once. A second run runs out of RAM.

@sonriks6
Collaborator

Yes, the sampling process takes a lot of RAM. I recommend using `del` on unused objects when possible and calling `gc.collect()` as well. Even so, an average model takes more than 6 GB!

Any suggestions appreciated here!

@GrahamGGreig
Author

Ah, using `del` and `gc.collect()` is what I have been doing, but it has no impact on RAM allocation. In fact, I have deleted every variable held in memory, followed by `gc.collect()`, and I still have 8.3 GB of RAM allocated. From looking at the code, it is coming either from the `posterior_sampler_callable` or from the large number of cached functions. Is it possible the posterior sampler has a memory leak, or is caching data anonymously?
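
For what it's worth, here is how I'm measuring it (a sketch using `psutil`, which Colab has preinstalled):

```python
import gc

import psutil

def rss_gb() -> float:
    """Resident set size of the current Python process, in GB."""
    return psutil.Process().memory_info().rss / 1e9

# After deleting every large object in the notebook:
del mmm  # and any other variables holding model state
gc.collect()
print(f"RAM still allocated after cleanup: {rss_gb():.1f} GB")  # ~8.3 GB
```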

@cpulavarthi
Collaborator

Hello @GrahamGGreig,

Thank you for contacting us!

Meridian uses TensorFlow internally, and memory leaks are a known TensorFlow issue. Here are the recommendations we can provide as of now:

- If you are running out of memory during a single model run, use n_chains.
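
One commonly suggested workaround for TensorFlow memory growth between runs (general TensorFlow advice, not Meridian-specific, and not guaranteed to release everything here) is to clear TensorFlow's global state along with Python garbage collection:

```python
import gc

import tensorflow as tf

# Between model runs: drop references, clear TensorFlow's global state,
# then force Python garbage collection.
del mmm  # your fitted Meridian model object
tf.keras.backend.clear_session()
gc.collect()
```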

I will update here if I come across any better solution for this issue. Feel free to reach out for any further queries.

Thank you

Google Meridian Support Team

@AdimDrewnik

"If you are running out of memory during a single model run, use n_chains"
Do you mean reduce the number of chains, or run them serially rather than in parallel?
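
For concreteness, the two readings (a sketch using the demo's `sample_posterior` arguments; the second assumes `n_chains` also accepts a list of per-batch chain counts that are sampled serially):

```python
# Reading 1: reduce the number of chains, lowering peak RAM per run.
mmm.sample_posterior(n_chains=4, n_adapt=500, n_burnin=500, n_keep=1000)

# Reading 2: keep the total chain count, but sample in serial batches
# (assuming n_chains accepts a list of batch sizes).
mmm.sample_posterior(n_chains=[4, 3], n_adapt=500, n_burnin=500, n_keep=1000)
```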
