[RFC]: Broaden characters allowed in FASTA sequences #75
This PR allows all printable ASCII characters in FASTA sequences. That is, all bytes represented by the characters `'!':'~'`, but not `>`. Like before, horizontal whitespace, i.e. `\t`, `\v` and space, is allowed inside sequences, but is not considered part of the sequence.

I think this character set is the broadest possible set that is practically parseable. Expanding it further would mean allowing non-printable characters, which would be a complete mess, or Unicode, which would be another complete mess.
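To make the proposed character set concrete, here is a minimal Python sketch of the rule described above. It is only an illustration, not the code in this PR: the names `ALLOWED_SEQUENCE_BYTES`, `SKIPPED_WHITESPACE` and `clean_sequence_line` are hypothetical, and raising an error on a disallowed byte is an assumption about how a parser might react.

```python
# Illustrative sketch only; all names here are hypothetical, not from the PR.

# Whitespace that may appear inside a sequence but is not part of it:
# tab, vertical tab and space.
SKIPPED_WHITESPACE = frozenset(b"\t\v ")

# Printable ASCII from '!' (0x21) through '~' (0x7e), excluding '>'.
ALLOWED_SEQUENCE_BYTES = frozenset(range(ord("!"), ord("~") + 1)) - {ord(">")}


def clean_sequence_line(line: bytes) -> bytes:
    """Return the sequence bytes of one line, dropping skipped whitespace.

    Raises ValueError on any byte outside the allowed set (assumed behaviour).
    """
    out = bytearray()
    for pos, byte in enumerate(line):
        if byte in SKIPPED_WHITESPACE:
            continue  # allowed in the file, but not part of the sequence
        if byte not in ALLOWED_SEQUENCE_BYTES:
            raise ValueError(f"invalid byte 0x{byte:02x} at position {pos}")
        out.append(byte)
    return bytes(out)
```

Under this sketch, `clean_sequence_line(b"AC GT\t*-")` keeps the symbols `*` and `-` and drops the space and tab, while a line containing `>` or a control byte such as `\x00` is rejected.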
This PR is meant just to toss the idea out there, for debate. I have no strong intuition it is actually a good idea.