Breaking changes for V1 #95

Merged
merged 60 commits into from
Mar 9, 2023
jakobnissen commented Jul 15, 2022

Superseded by #119

Todo

Closes #71
Closes #80
Closes #82
Closes #91
Closes #111
Closes #115
Closes #116

codecov bot commented Jul 16, 2022

Codecov Report

Patch coverage: 92.50% and project coverage change: +4.92 🎉

Comparison is base (7c6a696) 89.48% compared to head (071b549) 94.40%.

❗ Current head 071b549 differs from pull request most recent head d2b4a1c. Consider uploading reports for the commit d2b4a1c to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master      #95      +/-   ##
==========================================
+ Coverage   89.48%   94.40%   +4.92%     
==========================================
  Files          14       15       +1     
  Lines        1683     1735      +52     
==========================================
+ Hits         1506     1638     +132     
+ Misses        177       97      -80     
| Flag | Coverage Δ |
|---|---|
| unittests | 94.40% <92.50%> (+4.92%) ⬆️ |

| Impacted Files | Coverage Δ |
|---|---|
| src/Automa.jl | 100.00% <ø> (ø) |
| src/action.jl | 97.43% <ø> (ø) |
| src/dot.jl | 95.78% <ø> (ø) |
| src/precond.jl | 86.36% <ø> (+2.27%) ⬆️ |
| src/codegen.jl | 90.60% <84.78%> (-0.98%) ⬇️ |
| src/workload.jl | 88.88% <88.88%> (ø) |
| src/re.jl | 93.56% <94.73%> (+0.88%) ⬆️ |
| src/machine.jl | 81.81% <96.15%> (+2.77%) ⬆️ |
| src/tokenizer.jl | 96.92% <96.72%> (+4.87%) ⬆️ |
| ext/AutomaStream.jl | 96.92% <96.92%> (ø) |

... and 9 more


Instead of doing `re.actions[:exit] = [:foo]`, do `onexit(re, :foo)`. It's
cleaner, and avoids messing with the internal fields of the struct.

There is no reason to use `import` over `using`. On the contrary, it muddles
the difference between exported and unexported names, and also allows extending
foreign methods without warning.

Currently, Automa allows you to add a :final action to a regex like "a*", where
it is not possible to determine, when reading a byte, whether it is the final
byte matching the regex.
The current behaviour is to execute the action for every byte that could conceivably
be the final byte, whether or not it is.
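
The problem can be illustrated with a small sketch in plain Julia. It does not use Automa; the function name `final_action_positions` is hypothetical and only mimics a matcher for `a*` that runs a :final action on every byte that might be the last one.

```julia
# Hypothetical illustration, not Automa code: mimic a matcher for the regex `a*`
# that runs a :final action on every byte that might be the last matching byte.
function final_action_positions(data::AbstractString)
    fired = Int[]
    for (i, byte) in enumerate(codeunits(data))
        byte == UInt8('a') || break   # stop at the first byte outside `a*`
        # While reading byte i we cannot know whether another 'a' follows,
        # so the action runs here "just in case".
        push!(fired, i)
    end
    return fired
end

final_action_positions("aaa")  # the action fires at positions 1, 2 and 3
```

Only position 3 is actually the final byte of the match, yet the action has already run at positions 1 and 2.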

Disallow this behaviour by throwing an error in re2nfa for :final actions in
looping regexes.

Many potential users of Automa are not interested in parsing from IOs, but only
buffers. For those users, the IO-parsing functionality of Automa is not needed,
and so there is no need for a dependency on TranscodingStreams.

NFAs with ambiguities often contain multiple ambiguities. Displaying the
simplest ambiguity when erroring makes debugging easier, especially compared
to when the shown ambiguity can never happen due to another ambiguity.

An oversight in the ambiguity check meant that actions placed on non-epsilon
edges were accidentally not included in the paths for validation.
MWE: `compile(onfinal!(re"a", :a) | onfinal!(re"a", :b))`

This breaks tokenizers, so we manually skip ambiguity check in tokenizers.
In the case of conflicting actions in tokenizers, this will cause the longest
matching token to be emitted.
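
As a rough sketch of the longest-match rule mentioned above, written in plain Julia with `Base` regexes rather than Automa: the `RULES` table and the `longest_match` function are hypothetical names for illustration, and the tie-break (the earlier rule wins when lengths are equal) is an assumption of this sketch, not a statement about Automa's behaviour.

```julia
# Hypothetical token rules for illustration; these are not Automa's API.
const RULES = [
    (:keyword, r"^if"),
    (:identifier, r"^[a-z]+"),
]

# Return the kind and byte length of the longest rule matching at the start
# of `input`, or `nothing` if no rule matches.
function longest_match(input::AbstractString)
    best = nothing
    for (kind, rx) in RULES
        m = match(rx, input)
        m === nothing && continue
        len = ncodeunits(m.match)
        # Strict `>` keeps the earlier rule on ties (an assumption of this sketch).
        if best === nothing || len > best[2]
            best = (kind, len)
        end
    end
    return best
end

longest_match("iffy")  # → (:identifier, 4)
longest_match("if")    # → (:keyword, 2)
```

Both rules match a prefix of `"iffy"`, but the identifier rule matches more bytes, so it is the one reported.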

The tokenizer has a completely new design and API.
* It's now much easier to use
* It's now lazy by default
* It's much faster, although not completely optimised. Its API is amenable to
  further optimisation
* It handles errors automatically

See issue BioJulia#116

Users should not have access to the module directly. Instead, export the RE
struct, and also allow users to construct a regex with `RE(str)`.

Instead of buffering an entire line, simply keep track of the number of columns
cleared from the buffer. This reaches further into TranscodingStreams internals,
but it's well tested.

jakobnissen changed the base branch from master to v1 March 9, 2023 08:28
jakobnissen merged commit e1ec103 into BioJulia:v1 Mar 9, 2023
jakobnissen deleted the v1 branch July 19, 2023 11:58