Releases: FoxoTech/methylprep
v1.7.0 100% sesame/minfi compatible and works with parquet format
What's Changed
- Feature/v1.7.0 -- adds parquet support by @marcmaxson in #114
- Feature/v1.6.3 by @marcmaxson in #108
- Feature/v1.6.1 -- minor bug fixes by @marcmaxson
- Feature/v1.5.7 -- unlocks pandas (v 1.3.0+ supported now) by @marcmaxson in #96
- patch to help read_geo_processed.py detect txt csv files better by @marcmaxson in #88
- Update pOOBAH pval calculation by @notmaurox in #107
- correct both addressA and addressB for OOB probes when channel swapping by @notmaurox in #111
- Addition of negative control based pvalue calculation by @notmaurox in #109
- Bump lxml from 4.7.1 to 4.9.1 by @dependabot in #113
- Bump urllib3 from 1.26.4 to 1.26.5 by @dependabot in #86
New Contributors
- @dependabot made their first contribution in #86
- @notmaurox made their first contribution in #107
Full Changelog: v1.5.0...v1.7.0
version 1.5.0 - complete mouse array support and sesame/minfi compatibility confirmed
v1.5.0
- MAJOR refactor/overhaul of all the internal classes. This was necessary to fully support the mouse array.
- new SigSet class object that mirrors sesame's SigSet and SigDF objects.
- Combines idats, manifest, and sample sheet into one object that is inherited by SampleDataContainer
- RawDataset, MethylationDataset, ProbeSubtype all deprecated and replaced by SigSet
- SampleDataContainer class is now basically the SigSet plus all pipeline processing settings
- new mouse manifest covers all probes and matches sesame's output
- Processing will work even if a batch of IDATs has differing probe counts for the same array_type, though the differing probes in question may not be saved.
- Unit tests confirm that beta values output by methylprep, sesame, and minfi now match to within 1% of each other. Note that the intermediate stages of processing (after NOOB and after DYE) do not match sesame in this version. Values can differ by +/- 100 intensity units, likely due to differences in the order of steps and/or the oob/mask probes used.
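To make the agreement claim above concrete, here is a minimal sketch of what such a check could look like. It is not taken from methylprep's unit tests: the function names, the beta values, and the reading of "within 1%" as an absolute difference of 0.01 on the 0 to 1 beta scale are all assumptions made for illustration.

```python
from typing import Dict


def max_beta_difference(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Return the largest absolute beta difference over probes found in both inputs."""
    shared = a.keys() & b.keys()
    if not shared:
        raise ValueError("no shared probes to compare")
    return max(abs(a[probe] - b[probe]) for probe in shared)


def betas_agree(a: Dict[str, float], b: Dict[str, float], tolerance: float = 0.01) -> bool:
    """True if every shared probe differs by no more than `tolerance` (assumed absolute)."""
    return max_beta_difference(a, b) <= tolerance


# Made-up beta values keyed by probe ID, for illustration only.
methylprep_betas = {"cg00000029": 0.852, "cg00000108": 0.941, "cg00000109": 0.120}
sesame_betas = {"cg00000029": 0.848, "cg00000108": 0.945, "cg00000109": 0.118}

print(betas_agree(methylprep_betas, sesame_betas))
```

Only probes present in both outputs are compared, which mirrors the note above that some differing probes may not be saved.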
GEO support, composite data sets, meta data, extra probes
Adds:
- Improved documentation
- option to save Control / SNP probes
- GEO: download a bunch of data sets and only save samples that match a pattern
- GEO: load and use preprocessed data
- GEO: parse MiniML meta data into dataframe for analysis later
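As a rough picture of pattern-based sample selection, the sketch below filters sample metadata records with a regular expression. The record layout, the field name "title", the sample IDs, and the select_samples helper are hypothetical; they do not reflect methylprep's actual GEO code or options.

```python
import re
from typing import Dict, List


def select_samples(
    records: List[Dict[str, str]], pattern: str, field: str = "title"
) -> List[Dict[str, str]]:
    """Keep only the records whose `field` matches `pattern` (case-insensitive)."""
    regex = re.compile(pattern, flags=re.IGNORECASE)
    return [record for record in records if regex.search(record.get(field, ""))]


# Hypothetical metadata rows; IDs and titles are invented for this example.
records = [
    {"sample_id": "GSM0000001", "title": "Whole blood, control"},
    {"sample_id": "GSM0000002", "title": "Saliva, control"},
    {"sample_id": "GSM0000003", "title": "whole blood, case"},
]

blood_samples = select_samples(records, r"blood")
print([record["sample_id"] for record in blood_samples])
```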
GEO/ArrayExpress data ingester, batch_size, and bug fixes
- the CLI now includes a download option. Supply the GEO ID or ArrayExpress ID and it will locate the files, download the idats, process them, and build a dataframe of the associated meta data. This dataframe format should be compatible with methylcheck and methylize.
- When processing large batches of raw .idat files, specify --batch_size to break the processing up into smaller batches so the computer's memory won't overload. This is off by default when using process but is ON when using download and set to batch_size of 100.
- Now includes support for older 27k arrays.
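The effect of --batch_size can be pictured as splitting the sample list into fixed-size chunks that are processed one at a time, so only one chunk needs to be held in memory. The sketch below illustrates that idea only; it is not methylprep's internal code, and split_into_batches and the sample names are hypothetical.

```python
from typing import Iterator, List, Sequence


def split_into_batches(samples: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `batch_size` samples."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(samples), batch_size):
        yield list(samples[start:start + batch_size])


# 250 invented sample names, split with the download default of 100.
sample_names = [f"sample_{i:03d}" for i in range(250)]
batches = list(split_into_batches(sample_names, batch_size=100))
batch_sizes = [len(batch) for batch in batches]

print(batch_sizes)
```

The last batch is simply smaller when the sample count is not a multiple of the batch size.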
Functional processing 1.0
Provides a command line interface for processing methylation data (a batch of idat files and associated sample sheet csv).