
Revamp BioGenerics.Automa.State: Recoverable readers #103

Open
jakobnissen opened this issue Jul 29, 2022 · 1 comment · May be fixed by BioJulia/BioGenerics.jl#17

Comments

@jakobnissen
Member

BioGenerics.jl currently has an object BioGenerics.Automa.State, which is used to track the state of Readers. It has several problems:

  1. It's in BioGenerics. Why? It's clearly an Automa thing, and it even relies on Automa internals.
  2. It contains linenum, despite offering no guarantee that downstream users keep track of it correctly. I believe that, if readers keep track of this, it should live in the Reader objects themselves.
  3. It contains an unneeded filled bool value, which I would rather get rid of if possible.
  4. It does NOT contain the stream position, which is arguably the most important state of all! That means readers are unrecoverable: if you reach a bad FASTA record, you can't reset the reader or have it tell you its position. Ideally, when encountering malformed data, a reader should report the error, then reset itself to the last correct position.

So: fix these issues. This is a breaking change, so it should go into the upcoming breaking release. It also requires a breaking change to BioGenerics.
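
To make point 4 concrete, here is a rough sketch of what "recoverable" could mean. It is not a proposal for the actual API: `RecoverableLineReader`, `last_good`, and `read_record!` are hypothetical names, and the "format" is a toy one where every record is a single line starting with `>`. It assumes the underlying stream is seekable.

```julia
# Hypothetical sketch: none of these names exist in BioGenerics or Automa.
mutable struct RecoverableLineReader{T<:IO}
    io::T
    last_good::Int64   # stream position just after the last valid record
end

RecoverableLineReader(io::IO) = RecoverableLineReader(io, Int64(position(io)))

# Toy "format": each record is one line that must start with ">".
function read_record!(reader::RecoverableLineReader)
    line = readline(reader.io)
    if !startswith(line, ">")
        bad_at = reader.last_good
        seek(reader.io, bad_at)   # rewind to the last correct position
        error("malformed record at byte offset $bad_at")
    end
    reader.last_good = position(reader.io)
    return line
end

io = IOBuffer(">a\nbroken\n>b\n")
reader = RecoverableLineReader(io)
first_record = read_record!(reader)
failed = try
    read_record!(reader)
    false
catch
    true
end
```

After the failed read, `position(io)` is back at the end of the first record, so the caller can decide whether to skip ahead, report the offset, or stop.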

@TransGirlCodes
Member

I agree. I'm also not opposed to just breaking BioGenerics, or doing without it full stop. As time has gone on, I've come around to the view that trying to anticipate in advance what readers should look like, defining that in BioGenerics, and then making every format reader adhere to it is a bad idea. We can instead just let each format package define its own reader, records, and functions. If those are well defined and have good interfaces such as iteration, it should be trivial to write generic code on top of them anyway, and they don't need to care about BioGenerics.
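
As a small illustration of that last point, here is a sketch under assumed names: `ToyRecord`, `ToyReader`, and `count_records` are hypothetical and not taken from any BioJulia package. The reader only implements Base's iteration interface, and the generic function needs no shared supertype.

```julia
# Hypothetical format package: all names are illustrative only.
struct ToyRecord
    name::String
end

struct ToyReader{T<:IO}
    io::T
end

# Implementing Base's iteration interface is the whole "contract".
function Base.iterate(reader::ToyReader, state=nothing)
    eof(reader.io) && return nothing
    return (ToyRecord(readline(reader.io)), nothing)
end
Base.eltype(::Type{<:ToyReader}) = ToyRecord
Base.IteratorSize(::Type{<:ToyReader}) = Base.SizeUnknown()

# Generic code that works for any iterable reader.
count_records(reader) = count(_ -> true, reader)

n = count_records(ToyReader(IOBuffer("a\nb\nc\n")))
names = [r.name for r in ToyReader(IOBuffer("x\ny\n"))]
```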

@kescobo kescobo changed the title Revamb BioGenerics.Automa.State: Recoverable readers Revamp BioGenerics.Automa.State: Recoverable readers Aug 3, 2022
@jakobnissen jakobnissen linked a pull request Oct 14, 2023 that will close this issue

2 participants