Handle compression with Apache commons-compress #16

Open
magicDGS opened this issue Jul 19, 2018 · 0 comments

Using Apache commons-compress would allow us to read and write files compressed in several formats, including snappy and gzip. Some advantages of using it would be:

  • Better gzip support. Because of issues in java.util.zip.GZIPInputStream (unsafe use of available()), a previous version of HTSJDK was broken when reading compressed tribble files. The commons-compress implementation might be more robust here.
  • Reading and writing FASTQ files compressed with bz2 or other formats, not only the hard-coded ones.
  • Better format detection, based on more than the file extension, might be possible (I have not explored this enough).
  • Use of a well-maintained interface for compression. This would allow bgzip IO to be implemented as a compressor, and would make room for other algorithms and optimizations (e.g., GKL could implement the commons-compress interfaces).
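
To make the format-detection point concrete, here is a minimal JDK-only sketch of what detecting compression from content rather than from the extension could look like. The class name `CompressionSniffer` and its methods are hypothetical and are not part of HTSJDK or commons-compress. It only checks the gzip and bzip2 magic bytes, and because the JDK ships no bzip2 decoder, it can recognise bz2 input but not read it, which is the kind of gap commons-compress would fill.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: sniffs the compression format from the leading bytes.
public class CompressionSniffer {

    /** Returns "gz", "bzip2" or "none". The stream must support mark/reset. */
    public static String detect(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        byte[] sig = new byte[3];
        in.mark(sig.length);
        int n = 0;
        // Read up to 3 bytes; a single read() call may return fewer.
        while (n < sig.length) {
            int r = in.read(sig, n, sig.length - n);
            if (r < 0) {
                break;
            }
            n += r;
        }
        in.reset(); // leave the stream positioned at its first byte
        if (n >= 2 && (sig[0] & 0xff) == 0x1f && (sig[1] & 0xff) == 0x8b) {
            return "gz"; // gzip magic number
        }
        if (n >= 3 && sig[0] == 'B' && sig[1] == 'Z' && sig[2] == 'h') {
            return "bzip2"; // bzip2 magic "BZh"
        }
        return "none";
    }

    /** Opens a stream for reading, decompressing gzip when it is detected. */
    public static InputStream open(InputStream raw) throws IOException {
        InputStream in = new BufferedInputStream(raw);
        String format = detect(in);
        if (format.equals("gz")) {
            return new GZIPInputStream(in);
        }
        if (format.equals("bzip2")) {
            // Assumption of this sketch: no bzip2 decoder is available in the JDK.
            throw new UnsupportedOperationException("bzip2 needs an external decoder");
        }
        return in;
    }

    public static void main(String[] args) throws IOException {
        byte[] fastq = "@r1\nACGT\n+\nIIII\n".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(fastq);
        }
        // The same call works for compressed and plain input, whatever the file name.
        System.out.println(detect(new ByteArrayInputStream(bos.toByteArray())));
        System.out.println(detect(new ByteArrayInputStream(fastq)));
    }
}
```

With commons-compress, the detection and the choice of decoder would presumably be delegated to the library instead of being hand-written like this, so supporting a new format would not require touching the reader code.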

Opinions on this, @samtools/htsjdk-next-maintainers?
