Breaking changes for V1 #95

jakobnissen · 2022-07-15T19:46:51Z

Superseded by #119

Todo

Review preconditions (see Taking preconditions seriously #102)
Get test coverage near 100%
Figure out if you can get CI on M1 mac
Update README.md

Closes #71
Closes #80
Closes #82
Closes #91
Closes #111
Closes #115
Closes #116

codecov · 2022-07-16T13:05:31Z

Codecov Report

Patch coverage: 92.50% and project coverage change: +4.92 🎉

Comparison is base (7c6a696) 89.48% compared to head (071b549) 94.40%.

❗ Current head 071b549 differs from pull request most recent head d2b4a1c. Consider uploading reports for the commit d2b4a1c to get more accurate results

Additional details and impacted files

@@            Coverage Diff             @@
##           master      #95      +/-   ##
==========================================
+ Coverage   89.48%   94.40%   +4.92%     
==========================================
  Files          14       15       +1     
  Lines        1683     1735      +52     
==========================================
+ Hits         1506     1638     +132     
+ Misses        177       97      -80

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `94.40% <92.50%> (+4.92%)` | ⬆️ |

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/Automa.jl | `100.00% <ø> (ø)` | |
| src/action.jl | `97.43% <ø> (ø)` | |
| src/dot.jl | `95.78% <ø> (ø)` | |
| src/precond.jl | `86.36% <ø> (+2.27%)` | ⬆️ |
| src/codegen.jl | `90.60% <84.78%> (-0.98%)` | ⬇️ |
| src/workload.jl | `88.88% <88.88%> (ø)` | |
| src/re.jl | `93.56% <94.73%> (+0.88%)` | ⬆️ |
| src/machine.jl | `81.81% <96.15%> (+2.77%)` | ⬆️ |
| src/tokenizer.jl | `96.92% <96.72%> (+4.87%)` | ⬆️ |
| ext/AutomaStream.jl | `96.92% <96.92%> (ø)` | |
... and 9 more


Currently, Automa allows you to add a `:final` action to a regex like "a*", where it is not possible to determine, when reading a byte, whether it is the final byte matching the regex. The current behaviour is to execute the action for every byte that could conceivably be the final byte, whether or not it is. Disallow this behaviour by throwing an error in `re2nfa` for `:final` actions in looping regex.
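To illustrate why a looping regex makes the "final byte" undecidable while reading, here is a minimal sketch that uses Base `Regex` instead of Automa. The helper `final_candidates` is hypothetical and only lists the positions at which the input read so far is already a complete match.

```julia
# Hypothetical helper, for illustration only (not part of Automa):
# return the byte indices at which the input read so far is a complete
# match of `rx`, i.e. the positions where a `:final` action could fire.
function final_candidates(rx::Regex, input::String)
    return [i for i in eachindex(input) if occursin(rx, SubString(input, 1, i))]
end

# Looping regex: every byte could be the last one.
looping = final_candidates(r"^a*$", "aaa")

# Non-looping regex: only one byte can be the last one.
fixed = final_candidates(r"^ab$", "ab")

println(looping)
println(fixed)
```

With the looping pattern, every position is a candidate, so an action attached to the final byte has no single place to run.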

An oversight in the ambiguity check meant that actions placed on non-epsilon edges were accidentally not included in the paths for validation. MWE: `compile(onfinal!(re"a", :a) | onfinal!(re"a", :b))`. Fixing this breaks tokenizers, so the ambiguity check is manually skipped in tokenizers. In the case of conflicting actions in tokenizers, this will cause the longest matching token to be emitted.

jakobnissen force-pushed the v1 branch 5 times, most recently from 6036343 to a1649be · July 24, 2022 08:22

jakobnissen mentioned this pull request Jul 24, 2022

Verify keys in RE.actions are correct on setindex! #100

Closed

jakobnissen force-pushed the v1 branch from 18a3059 to 64c0d29 · July 25, 2022 09:33

jakobnissen force-pushed the v1 branch from 6e5f508 to ec1748d · August 3, 2022 08:01

jakobnissen force-pushed the v1 branch 16 times, most recently from 3617ebf to ca4aad7 · March 1, 2023 18:57

jakobnissen force-pushed the v1 branch 5 times, most recently from 6883b0f to ea1f2b3 · March 8, 2023 17:32

jakobnissen added 16 commits March 8, 2023 18:58

Disallow direct modification of actions field

c26a88e

Instead of doing `re.actions[:exit] = [:foo]`, do `onexit(re, :foo)`. It's cleaner, and avoids messing with internal fields of struct.
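A rough sketch of the design idea, not Automa's actual code: routing writes through one function gives a single place to validate the key, which direct `re.actions[...] = ...` assignment does not. `ToyRE`, `VALID_STAGES`, and `set_actions!` are hypothetical names invented for this illustration, and the `onexit!` defined here is a local toy function.

```julia
# Hypothetical stand-in for a regex object; illustrative only.
struct ToyRE
    pattern::String
    actions::Dict{Symbol, Vector{Symbol}}  # treated as an internal field
end
ToyRE(pattern::AbstractString) = ToyRE(pattern, Dict{Symbol, Vector{Symbol}}())

# Assumed set of valid keys for this sketch.
const VALID_STAGES = (:enter, :exit, :final, :all)

# Single choke point: validate the key, then store the action names.
function set_actions!(re::ToyRE, stage::Symbol, names::Vector{Symbol})
    stage in VALID_STAGES || throw(ArgumentError("unknown stage: $stage"))
    re.actions[stage] = names
    return re
end

# Toy accessor in the spirit of the commit message.
onexit!(re::ToyRE, names::Symbol...) = set_actions!(re, :exit, Symbol[names...])

re = onexit!(ToyRE("a+"), :foo)

# A misspelled key is rejected here, whereas direct dictionary
# assignment would have accepted it silently.
rejected = try
    set_actions!(re, :exti, [:foo])
    false
catch err
    err isa ArgumentError
end
```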

Use using over import

97448f8

There is no reason to use `import` over `using`. On the contrary, it muddles the difference between exported and unexported names, and also allows extending foreign methods without warning.
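The difference can be seen in a small self-contained sketch. The modules `Shapes`, `ViaImport`, and `ViaUsing` are made up for illustration and have nothing to do with Automa's source.

```julia
module Shapes
    export area
    area(r::Real) = pi * r^2
end

module ViaImport
    import ..Shapes: area
    # This silently adds a method to Shapes.area, a function this module
    # does not own. Nothing in the definition signals that.
    area(sides::Tuple{Real, Real}) = sides[1] * sides[2]
end

module ViaUsing
    using ..Shapes
    # With `using`, extending the foreign function requires qualifying
    # the name, so the intent is visible at the definition site.
    Shapes.area(s::NamedTuple{(:w, :h)}) = s.w * s.h
end

println(Shapes.area((2, 3)))
println(Shapes.area((w = 2, h = 5)))
```

Both definitions end up as methods of `Shapes.area`, but only the `using` version makes the extension explicit.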

Update FASTA example

4cec9bf

WIP: Tokenizer: Remove deprecated method

5bb87df

Make TranscodingStreams an optional dependency

0cca005

Many potential users of Automa are not interested in parsing from IOs, but only buffers. For those users, the IO-parsing functionality of Automa is not needed, and so there is no need for dependency on TranscodingStreams.

Error with shortest known ambiguity

da2320f

NFAs with ambiguities often contain multiple ambiguities. Displaying the simplest ambiguity when erroring makes debugging easier - especially compared to when the shown ambiguity can never happen due to another ambiguity.

Rewrite tokenizer

55d81c9

The tokenizer has a completely new design and API.
* It's now much easier to use
* It's now lazy by default
* It's much faster, although not completely optimised. Its API is amenable to further optimisation
* It handles errors automatically

See issue BioJulia#116
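As a loose illustration of what "lazy" and "handles errors automatically" can mean for a tokenizer, here is a toy iterator built on Base `Regex`. It is not Automa's tokenizer or its API: `ToyTokenizer`, the `(start, length, kind)` tuple layout, the one-character `:error` token, and the first-rule-wins tie-break are all assumptions of this sketch.

```julia
# Toy lazy tokenizer, for illustration only (not Automa's implementation).
# Each rule's regex must be anchored with `^`.
struct ToyTokenizer
    input::String
    rules::Vector{Pair{Symbol, Regex}}
end

Base.IteratorSize(::Type{ToyTokenizer}) = Base.SizeUnknown()
Base.eltype(::Type{ToyTokenizer}) = Tuple{Int, Int, Symbol}

# Tokens are produced one at a time, only when the caller asks for them.
function Base.iterate(t::ToyTokenizer, pos::Int = 1)
    pos > ncodeunits(t.input) && return nothing
    rest = SubString(t.input, pos)
    best_len, best_kind = 0, :error
    for (kind, rx) in t.rules
        m = match(rx, rest)
        m === nothing && continue
        len = ncodeunits(m.match)
        # Longest match wins; on a tie the earlier rule is kept
        # (an arbitrary choice made for this sketch).
        if len > best_len
            best_len, best_kind = len, kind
        end
    end
    # No rule matched: emit an :error token covering one character
    # instead of throwing, so iteration can continue.
    if best_len == 0
        best_len = nextind(t.input, pos) - pos
    end
    return (pos, best_len, best_kind), pos + best_len
end

rules = [:kw => r"^if", :ident => r"^[a-z]+", :space => r"^ +"]
tokens = collect(ToyTokenizer("if iffy ?", rules))
foreach(println, tokens)
```

Because the tokenizer is an ordinary iterator, `first(ToyTokenizer(...))` scans only as far as the first token.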

Rename generate_validator_function

cb7fe5f

Export regex struct instead of module

603a783

Users should not have access to the module directly. Instead, export the RE struct, and also allow users to construct regex with `RE(str)`.

Tweak: Allow | and & ops b/w chars/str and RE

ef05382

Remove report_col kwarg

87d1f83

Instead of buffering an entire line, simply keep track of the number of columns cleared from the buffer. This reaches some more into TranscodingStreams privates, but it's well tested.

Add SnoopPrecompile

e5b22a5

Rewrite documentation

7f2db10

Add figures to docs

4f3acbe

jakobnissen force-pushed the v1 branch from 353bb6d to 4f3acbe · March 8, 2023 17:58

jakobnissen added 2 commits March 8, 2023 19:15

Bump CI version to Julia 1.6

4229921

Support Julia version 1.6

4868df3

jakobnissen force-pushed the v1 branch from 690970d to 4868df3 · March 8, 2023 18:16

jakobnissen added 5 commits March 8, 2023 22:52

Make generate_buffer_validator goto into kwarg

2683fce

SnoopPrecompile more stuff

9eec66c

Various small fixes to documentation

e0b5e55

Update README.md

6c4e244

Add documentation preview

d2b4a1c

jakobnissen force-pushed the v1 branch from 071b549 to d2b4a1c · March 9, 2023 08:25

jakobnissen changed the base branch from master to v1 March 9, 2023 08:28

jakobnissen merged commit e1ec103 into BioJulia:v1 Mar 9, 2023

jakobnissen mentioned this pull request Mar 9, 2023

Breaking changes for v1 #119

Merged

jakobnissen deleted the v1 branch July 19, 2023 11:58
