Releases · bcgsc/NanoSim

23 Dec 23:35

cheny19

v2.5.0

5f8d7f9

v2.5.0

In this release, we implemented a few new features and resolved a few bugs.

New features:

Multiprocessing in the simulation stage. Based on our experience, 4 to 12 processors strike a good balance between runtime and memory usage when simulating 1 million reads. Memory usage increases roughly linearly with the number of processors due to the nature of Python multiprocessing. As a rough estimate, simulating the human transcriptome with 4 processors takes less than 5 GB of memory.
Homopolymer simulation. For this parameter, we provide three options, each targeting a specific basecaller: Albacore, Guppy, and Guppy + flipflop model.
Aligned reads are simulated first, followed by unaligned reads. The two types of reads are stored in separate files for a better user experience.
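As an illustration of why memory grows roughly linearly with the number of processes, the sketch below splits a read count across a pool of worker processes; each worker is a separate Python process that holds its own copy of whatever data it needs. The names split_reads, simulate_chunk, and simulate, as well as the toy read generator, are hypothetical and are not NanoSim's actual code.

```python
import random
from multiprocessing import Pool


def split_reads(total_reads, num_procs):
    """Divide total_reads into num_procs chunk sizes that sum to total_reads."""
    base, extra = divmod(total_reads, num_procs)
    return [base + (1 if i < extra else 0) for i in range(num_procs)]


def simulate_chunk(args):
    """Hypothetical worker: emit `count` random toy reads using its own seed."""
    count, seed = args
    rng = random.Random(seed)
    return ["".join(rng.choice("ACGT") for _ in range(20)) for _ in range(count)]


def simulate(total_reads, num_procs, seed=0):
    """Run the toy simulation over num_procs worker processes."""
    sizes = split_reads(total_reads, num_procs)
    jobs = [(n, seed + i) for i, n in enumerate(sizes)]
    with Pool(num_procs) as pool:
        chunks = pool.map(simulate_chunk, jobs)
    # Flatten the per-worker results into one list of reads.
    return [read for chunk in chunks for read in chunk]


if __name__ == "__main__":
    reads = simulate(total_reads=1000, num_procs=4)
    print(len(reads))
```

Giving each worker its own seed keeps the chunks from repeating one another, and the chunk sizes always add up to the requested total.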

Bug fixes:

Fixed retained intron / deleted exon problem in error calculation
Fixed index out of range bug in the simulation stage


18 Jul 21:51

SaberHQ

v2.4-beta

f900404

Simulating transcriptome ONT reads

Pre-release

This pre-release can simulate both genomic and transcriptomic (cDNA and direct RNA) ONT reads, with further improved performance. Users may run the pipeline in "genome" or "transcriptome" mode. The transcriptome mode also models features of the library preparation protocols used, including intron retention events in cDNA and direct RNA reads. Further, it profiles transcript expression patterns.

The README file gives detailed instructions on how to run the pipeline in both modes.

Users who have tried Trans-NanoSim before can now rely on this version to simulate transcriptome ONT reads.

Major updates since pre-release v2.3-beta:

Added an optional flag (--uracil) to convert thymine (T) bases to uracil (U) in the FASTA output. This is helpful when dealing with direct RNA reads.
Fixed a bug related to input file requirements when --no_model_ir is used. See #63.
Substantially increased simulation speed when IR modelling is disabled (--no_model_ir); it now runs 5-fold faster. We also removed redundant code to improve the overall performance of the pipeline.
"Perfect" reads (--perfect) now take expression profiles into account, so error-free reads also follow your desired expression levels.
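The effect of --uracil can be pictured as a T-to-U substitution on each output sequence. The sketch below is a minimal, hypothetical illustration: the function to_uracil is not part of NanoSim, and it assumes that FASTA header lines should be left untouched and that lowercase bases are converted as well.

```python
def to_uracil(fasta_text):
    """Replace T/t with U/u in sequence lines, leaving '>' header lines as-is."""
    table = str.maketrans("Tt", "Uu")
    out = []
    for line in fasta_text.splitlines():
        # Header lines may legitimately contain the letter T, so skip them.
        out.append(line if line.startswith(">") else line.translate(table))
    return "\n".join(out)


if __name__ == "__main__":
    record = ">read_T1\nACGTTGCA"
    print(to_uracil(record))
```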

Please keep using the pipeline and share your thoughts on it. Cheers!


17 Jun 23:31

SaberHQ

v2.3-beta

6f44514

Simulating transcriptome ONT reads

Pre-release

NOTE: Please do not use this release as it has an input requirement bug.

The README file gives detailed instructions on how to run the pipeline in both modes.

Users who have tried Trans-NanoSim before can now rely on this version to simulate transcriptome ONT reads.

This version has been tested on Python 2.7 and Python 3.6, each with its latest compatible packages.


17 Jan 19:27

cheny19

v2.2.1-beta

77a4393

V2.2.1-beta

Pre-release

Bug fix:

Fixed a bug that could generate negative read lengths when using the log-normal distribution for simulation.


04 Dec 22:32

cheny19

v2.2.0

b3b8578

V2.2.0

This version has been tested on Python 2.7 and Python 3.6, each with its latest compatible packages. In this release, we made a few big changes, so the pre-trained model profiles on our FTP site are no longer compatible; users are still welcome to use the FASTA files for training. We will provide pre-trained models soon.

Major changes:

Use kernel density estimation (KDE) instead of the empirical cumulative distribution function (ECDF) to simulate the length distribution of reads (aligned and unaligned).
Removed the binning strategy for simulating the alignment ratio of each read, so the length distribution of simulated reads is smoother.
Introduced the --median_len and --sd_len options. Users can use these two options to control the median read length and the standard deviation; the read lengths will then follow a lognormal distribution instead of the empirical length distribution of the training reads.
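To make the difference from an empirical distribution concrete: sampling from a Gaussian kernel density estimate amounts to picking a training value at random and adding Gaussian noise whose scale is the bandwidth, which yields lengths between the observed values instead of only repeating them. The sketch below is a generic illustration under assumed choices (Gaussian kernel, fixed bandwidth, redrawing of non-positive lengths); the function name and parameters are hypothetical and do not reflect NanoSim's actual kernel, bandwidth, or code.

```python
import random


def sample_kde_lengths(training_lengths, n, bandwidth, seed=None):
    """Draw n read lengths from a Gaussian KDE built on training_lengths.

    Each draw picks one training length uniformly at random and perturbs it
    with Gaussian noise of standard deviation `bandwidth`. Draws that round
    to a non-positive length are rejected and redrawn (an assumption made
    here so that every simulated read has at least one base).
    """
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        centre = rng.choice(training_lengths)
        length = int(round(rng.gauss(centre, bandwidth)))
        if length > 0:
            samples.append(length)
    return samples


if __name__ == "__main__":
    observed = [850, 1200, 3100, 5600, 5900, 12000]  # made-up training lengths
    print(sample_kde_lengths(observed, n=5, bandwidth=200.0, seed=1))
```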

Note:

For ONT reads, the median length and the mean length are quite different. Read lengths generally follow a lognormal distribution, so please refer to Wikipedia for details about these two parameters. The values are --median_len 5642 and --sd_len 1.015 for R9 1D reads, and they are roughly the same for other libraries.
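The gap between median and mean can be checked numerically. The sketch below assumes that --median_len is the median of the distribution, so that mu = ln(median), and that --sd_len is the standard deviation of the log-lengths (sigma). These interpretations are assumptions based on the values quoted above, not confirmed from the NanoSim source, and the function names are hypothetical. Under them the mean equals median * exp(sigma^2 / 2).

```python
import math
import random
import statistics


def lognormal_lengths(median_len, sd_len, n, seed=None):
    """Sample n read lengths from a lognormal with the given median and log-sd."""
    rng = random.Random(seed)
    mu = math.log(median_len)  # for a lognormal, median = exp(mu)
    return [rng.lognormvariate(mu, sd_len) for _ in range(n)]


def lognormal_mean(median_len, sd_len):
    """Theoretical mean of a lognormal: exp(mu + sigma^2 / 2)."""
    return median_len * math.exp(sd_len ** 2 / 2)


if __name__ == "__main__":
    lengths = lognormal_lengths(5642, 1.015, n=100_000, seed=42)
    print("sample median:", round(statistics.median(lengths)))
    print("sample mean:", round(statistics.mean(lengths)))
    print("theoretical mean:", round(lognormal_mean(5642, 1.015)))
```

With the R9 1D values, the theoretical mean comes out at roughly 9,400 bases, well above the median of 5,642, which is why the two parameters should not be confused.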


24 May 00:24

cheny19

v2.1.0

afad587

V2.1.0

Changes:

The model fitting stage now runs in Python and supports multiprocessing. R is no longer required for any part of the NanoSim pipeline.
Improved runtime in the model fitting stage.

Notice:
We noticed that the proposed mixture model may not fit well for indels inferred from minimap2 alignments, and NanoSim will issue warnings when the fit is poor. Users can still use the best available parameters to simulate, and the overall error rate will not be greatly affected.

We are working on new models for minimap2. Stay tuned.


27 Apr 00:15

cheny19

v2.0.0

0eebd4f

NanoSim v2.0.0

Major changes:

Added the -a option to specify the aligner, Minimap2 or LAST. Minimap2 is the default aligner in the read analysis stage. Users can also supply their own MAF file or SAM file (with MD string).

HTSeq is used to parse alignment files.

Tested on Python 2.7 and Python 3.6.


16 Mar 20:11

cheny19

v1.3.0

8e317bf

v1.3.0

Bug fixes:

Handles lowercase bases in the reference genome

New features:

Outputs a separate file containing mismatch rate, insertion rate, and deletion rate
Introduced a random seed parameter so that users can generate identical outputs across runs.
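Reproducibility here works the way seeded pseudo-random generators generally do: the same seed yields the same sequence of draws. A minimal sketch, using Python's standard random module and a hypothetical simulate_reads function rather than NanoSim's actual implementation:

```python
import random


def simulate_reads(n, length, seed):
    """Hypothetical toy simulator: n random reads, fully determined by seed."""
    rng = random.Random(seed)  # private generator, independent of global state
    return ["".join(rng.choice("ACGT") for _ in range(length)) for _ in range(n)]


if __name__ == "__main__":
    first = simulate_reads(3, 10, seed=7)
    second = simulate_reads(3, 10, seed=7)
    print(first == second)
```

Using a private random.Random instance instead of the module-level functions keeps the result independent of any other code that draws random numbers in the same process.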


18 Jul 17:23

cheny19

v1.2.0

785cba2

NanoSim v1.2.0

Speed improvement:

Reading in the reference genome, which is useful when simulating large genomes (e.g. human)
Converting ambiguous bases


31 Mar 23:02

cheny19

v1.1.0

aec1111

v1.1.0

Bug fix:

For multi-chromosome genomes, the total length of each read is now constrained to be smaller than the largest chromosome
head/tail unassigned in extreme cases

Compatibility:
Thanks to @karel-brinda, NanoSim now works with Python 2.6, 2.7, 3.2, 3.3, 3.4, 3.5, and 3.6.

