simulator.py taking very long to run and RAM usage above 768 GB #76
Comments
Hi Andres, The problem is with sd. The sd is the standard deviation of the log-normal distribution (that is, on the log scale), not the standard deviation of the read lengths themselves, so you will need to convert it according to the wiki. If your sd is too large, it will generate some extremely long or short sequences, which are then discarded because they are longer than the genome size or shorter than the minimum threshold. Let me know if you have further questions. Chen |
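For readers following along, the conversion Chen refers to can be sketched as below. This is only a sketch under stated assumptions: it assumes the sd expected by the simulator is sigma of the natural log of the read length, and that the median is exp(mu). The function names `lognormal_params` and `linear_sd` are made up for illustration and are not part of simulator.py.

```python
import math


def lognormal_params(median, sd):
    """Return (mu, sigma) of ln(length) for a log-normal distribution with
    the given median and the given standard deviation on the linear scale.

    Assumption: the simulator's sd is sigma of the natural log of length.
    """
    if median <= 0 or sd < 0:
        raise ValueError("median must be positive and sd non-negative")
    ratio_sq = (sd / median) ** 2
    # Linear variance is (e^{s2} - 1) * e^{s2} * median^2 with s2 = sigma^2.
    # Setting x = e^{s2} gives x^2 - x - (sd / median)^2 = 0.
    x = (1.0 + math.sqrt(1.0 + 4.0 * ratio_sq)) / 2.0
    sigma = math.sqrt(math.log(x))
    mu = math.log(median)
    return mu, sigma


def linear_sd(mu, sigma):
    """Standard deviation on the linear scale, used as a round-trip check."""
    variance = (math.exp(sigma ** 2) - 1.0) * math.exp(2.0 * mu + sigma ** 2)
    return math.sqrt(variance)


if __name__ == "__main__":
    for median, sd in [(8000, 200), (20000, 10000)]:
        mu, sigma = lognormal_params(median, sd)
        print(f"median={median} sd={sd} -> log-scale sd={sigma:.4f} "
              f"(round trip: {linear_sd(mu, sigma):.1f})")
```

Under these assumptions, a median of 8000 with a linear-scale sd of 200 corresponds to a log-scale sd of about 0.025, which is very far from 200.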
Hi Chen, Say we want a median of 8000, what Thank you |
Sorry for the late reply. The standard deviation is independent of genome size, and it purely depends on how much you want the reads to spread. I'd suggest |
Hi @cheny19, I also had a similar issue. Compared to the default setting, simulator.py takes very long to run in the setting of Many thanks, |
@HLHsieh |
I executed the following
My goal is to simulate reads with a length distribution of median = 20 kb and std = 10 kb. I also tried running that command with the default values of median and std, and it went smoothly.
Please advise. Thanks! |
Hi @kmnip, I would like to follow up on this issue. Any suggestions would be appreciated. PS. My version is 3.1.0. Best, |
I did the characterization with E. coli and an SRA run from NCBI. Then I used the generated profile in simulator.py. Everything works well if neither -med nor -sd is used. However, when I ask for a median of 8000 and an sd of 200, the simulation gets stuck and takes very long. After a few hours, it uses all the RAM and the job is killed by our HPC scheduler.
See below the code being used:
Any advice is greatly appreciated.
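As a rough illustration of the discarding described in the comments (lengths above the genome size or below the minimum threshold are thrown away), the sketch below estimates what fraction of log-normal draws would survive such a filter. It is not a model of how simulator.py is implemented. The genome size of 4.6 Mb (roughly E. coli) and the minimum length of 50 are assumed values, and `accepted_fraction` is a hypothetical helper.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def accepted_fraction(median, log_sd, min_len, max_len):
    """Fraction of log-normal draws with min_len <= length <= max_len.

    Assumption: ln(length) ~ Normal(ln(median), log_sd).
    Working in log space avoids overflow for very large log_sd.
    """
    mu = math.log(median)
    z_lo = (math.log(min_len) - mu) / log_sd
    z_hi = (math.log(max_len) - mu) / log_sd
    return normal_cdf(z_hi) - normal_cdf(z_lo)


if __name__ == "__main__":
    genome_size = 4_600_000  # assumed, roughly E. coli
    min_len = 50             # assumed minimum read length threshold
    for log_sd in (0.025, 0.5, 200.0):
        frac = accepted_fraction(8000, log_sd, min_len, genome_size)
        print(f"log-scale sd={log_sd}: accepted fraction = {frac:.4f}")
```

With these assumed bounds, an sd of 200 taken on the log scale leaves only a few percent of the draws inside the accepted range, while an sd near 0.025 keeps essentially all of them.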