Releases: FoxoTech/methylprep

v1.7.0 100% sesame/minfi compatible and works with parquet format

14 Jul 16:34
8a540ce

Full Changelog: v1.5.0...v1.7.0

version 1.5.0 - complete mouse array support and sesame/minfi compatibility confirmed

07 Jul 19:46
4ce78af

v1.5.0

  • MAJOR refactor/overhaul of all the internal classes. This was necessary to fully support the mouse array.
    • new SigSet class object that mirrors sesame's SigSet and SigDF objects.
      • Combines idats, manifest, and sample sheet into one object that is inherited by SampleDataContainer
    • RawDataset, MethylationDataset, and ProbeSubtype are all deprecated and replaced by SigSet
    • SampleDataContainer class is now basically the SigSet plus all pipeline processing settings
    • new mouse manifest covers all probes and matches sesame's output
    • Processing will work even if a batch of IDATs has differing probe counts for the same array_type, though the
      differing probes may not be saved.
    • unit tests confirm that methylprep, sesame, and minfi beta values now match to within 1% of each other. Note that
      the intermediate stages of processing (after NOOB and after DYE) do not match sesame in this version. They can
      differ by +/- 100 intensity units, likely due to differences in the order of steps and/or the oob/mask probes used.
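
A check like the one described in the last bullet can be sketched in plain Python. This is not the project's actual unit test: the function names and probe IDs are hypothetical, the values are made up, and "within 1%" is assumed here to mean an absolute difference of 0.01 on the 0-1 beta scale.

```python
def max_beta_difference(betas_a, betas_b):
    """Return the largest absolute beta difference over probes present in both inputs."""
    shared_probes = betas_a.keys() & betas_b.keys()
    if not shared_probes:
        raise ValueError("the two results share no probes")
    return max(abs(betas_a[probe] - betas_b[probe]) for probe in shared_probes)


def betas_agree(betas_a, betas_b, tolerance=0.01):
    """True if every shared probe differs by at most `tolerance`.

    Assumption: the tolerance is absolute and measured on the 0-1 beta scale.
    """
    return max_beta_difference(betas_a, betas_b) <= tolerance


# Made-up values for illustration only; not real output from either package.
methylprep_betas = {"cg_example_1": 0.120, "cg_example_2": 0.875, "cg_example_3": 0.500}
sesame_betas = {"cg_example_1": 0.125, "cg_example_2": 0.871, "cg_example_4": 0.300}

print(betas_agree(methylprep_betas, sesame_betas))
```

Only probes present in both results are compared, which matches the note above that probes missing from some samples may not be saved.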

GEO support, composite data sets, meta data, extra probes

24 Jan 19:38
fbb1fa9

Adds:

  • Improved documentation
  • option to save Control / SNP probes
  • GEO: download a bunch of data sets and only save samples that match a pattern
  • GEO: load and use preprocessed data
  • GEO: parse MiniML meta data into dataframe for analysis later
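
The pattern-matching idea in the second GEO bullet can be illustrated with a small filter over sample metadata. The notes do not say which field is matched or what pattern syntax is used, so this sketch assumes a case-insensitive regular expression applied to a hypothetical `title` field; `filter_samples` and the sample records are invented for illustration and are not part of methylprep.

```python
import re


def filter_samples(samples, pattern, field="title"):
    """Keep only the sample records whose `field` matches `pattern` (case-insensitive)."""
    regex = re.compile(pattern, flags=re.IGNORECASE)
    return [sample for sample in samples if regex.search(sample.get(field, ""))]


# Hypothetical metadata records; real GEO metadata has many more fields.
samples = [
    {"sample_id": "sample_A", "title": "whole blood, control"},
    {"sample_id": "sample_B", "title": "liver biopsy"},
    {"sample_id": "sample_C", "title": "Blood, treated"},
]

kept = filter_samples(samples, r"blood")
print([sample["sample_id"] for sample in kept])  # ['sample_A', 'sample_C']
```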

GEO/ArrayExpress data ingester, batch_size, and bug fixes

09 Sep 17:13
f81c757
  • The CLI now includes a download option. Supply the GEO ID or ArrayExpress ID and it will locate the files, download the idats, process them, and build a dataframe of the associated meta data. This dataframe format should be compatible with methylcheck and methylize.
  • When processing large batches of raw .idat files, specify --batch_size to split the work into smaller batches so that it fits in the computer's memory. Batching is off by default when using process, but is on by default when using download, with a batch_size of 100.
  • Now includes support for older 27k arrays.
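
The batching behaviour described above can be sketched as follows. This is a generic illustration of splitting a sample list into fixed-size chunks, not methylprep's internal code; `split_into_batches` and the sample names are hypothetical.

```python
def split_into_batches(samples, batch_size=None):
    """Split samples into consecutive batches of at most `batch_size` items.

    With batch_size=None everything stays in one batch, mirroring the
    "off by default" behaviour described for process.
    """
    samples = list(samples)
    if batch_size is None:
        return [samples]
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    return [samples[start:start + batch_size] for start in range(0, len(samples), batch_size)]


# 250 hypothetical samples with the download default of 100 per batch.
sample_names = [f"sample_{i:03d}" for i in range(250)]
batches = split_into_batches(sample_names, batch_size=100)
print([len(batch) for batch in batches])  # [100, 100, 50]
```

Each batch can then be processed and written out before the next one is loaded, so peak memory depends on the batch size rather than on the total number of samples.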

Functional processing 1.0

26 Jun 20:39
9290b39
Compare
Choose a tag to compare

Provides a command line interface for processing methylation data (a batch of idat files and associated sample sheet csv).