Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(lib): add forbid_unsafe feature to disable unsafe code #413

Merged
merged 29 commits into from
Sep 13, 2024
Merged
Show file tree
Hide file tree
Changes from 26 commits
Commits
Show all changes
29 commits
Select commit Hold shift + click to select a range
dc7b212
add default 'unsafe' feature, remove 'read_byte_unchecked'
davidkern Aug 18, 2024
ef7e5ba
feature flag 'slice_unchecked'
davidkern Aug 18, 2024
6ddaef9
feature flag remaining unsafety
davidkern Aug 18, 2024
c1ac617
fix misnamed feature flags
davidkern Aug 18, 2024
d4a6877
fix a few more feature flags
davidkern Aug 18, 2024
13cd8bc
lints and formatting
davidkern Aug 19, 2024
884d9ba
correct the auto-correct
davidkern Aug 19, 2024
42e43be
Rename feature to forbid_unsafe so unsafe code is the default
davidkern Aug 19, 2024
223e3b5
reintroduce 'read_byte_unchecked'
davidkern Aug 19, 2024
ac3d019
don't use --all-features as the default, as that removes unsafe code …
davidkern Aug 19, 2024
7cd591a
add safe path to codegen and fix regression on unsafe path
davidkern Aug 19, 2024
f97c2d4
enable 'forbid_unsafe' on logos-derive/codegen if enabled on 'logos' …
davidkern Aug 19, 2024
e2e5ed4
fmt
davidkern Aug 19, 2024
8c13adf
additionally run benchmarks against forbid_unsafe
davidkern Aug 21, 2024
b4a2ceb
add --features forbid_unsafe to the new benchmarks
davidkern Aug 21, 2024
40d2cab
handle case of base branch not supporting forbid_unsafe feature
davidkern Aug 21, 2024
0997c34
include comment if comparing against defaults instead of before
davidkern Aug 21, 2024
87ed40f
troubleshoot why the two benchmark runs seem to be the same
davidkern Aug 21, 2024
c888fc1
try renaming the baselines
davidkern Aug 21, 2024
c169614
seems baseline names may only include underscores
davidkern Aug 21, 2024
fc950bf
and display the correct results for forbid_unsafe
davidkern Aug 21, 2024
af82cc9
test default features and forbid unsafe in test matrix
davidkern Aug 21, 2024
a446f68
add forbid_unsafe to integration test
davidkern Aug 21, 2024
3a44e83
Feature `forbid_unsafe` tests and benchmarks
davidkern Aug 21, 2024
d2ff9f5
add safety note to source trait
davidkern Sep 9, 2024
0242426
add safety note to source trait
davidkern Sep 9, 2024
7fcd8c6
wordsmithing
davidkern Sep 11, 2024
35a24bf
include note on feature unification
davidkern Sep 11, 2024
45a3b68
fmt
davidkern Sep 12, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .github/workflows/coverage.yml
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@ jobs:

- name: Generate code coverage
run: |
cargo +nightly tarpaulin --verbose --all-features --workspace --timeout 120 --out Xml
cargo +nightly tarpaulin --verbose --features debug --workspace --timeout 120 --out Xml

- name: Upload to codecov.io
uses: codecov/codecov-action@v4
Expand Down
40 changes: 36 additions & 4 deletions .github/workflows/rustbench.yml
Original file line number Diff line number Diff line change
Expand Up @@ -29,32 +29,59 @@ jobs:
version: latest

- name: Run Benchmarks on changes
run: cargo bench --workspace --bench bench -- --save-baseline changes
run: |
cargo bench --workspace --bench bench -- --save-baseline default_changes
cargo bench --workspace --bench bench --features forbid_unsafe -- --save-baseline forbid_unsafe_changes

- uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.base.sha }}
clean: false

- name: Run Benchmarks before changes
run: cargo bench --workspace --bench bench -- --save-baseline before
run: |
cargo bench --workspace --bench bench -- --save-baseline default_before

# Skip benchmarking for forbid_unsafe feature if the PR base commit doesn't have the feature
if [[ "$(grep forbid_unsafe Cargo.toml)" ]]; then
cargo bench --workspace --bench bench --features forbid_unsafe -- --save-baseline forbid_unsafe_before
fi

- name: Compare benchmarks
run: |
echo 'results<<EOF' >> $GITHUB_OUTPUT
critcmp before changes >> $GITHUB_OUTPUT
critcmp default_before default_changes >> $GITHUB_OUTPUT
echo 'EOF' >> $GITHUB_OUTPUT
id: compare

- name: Compare benchmarks for forbid-unsafe
run: |
echo 'results<<EOF' >> $GITHUB_OUTPUT
if [[ "$(grep forbid_unsafe Cargo.toml)" ]]; then
critcmp forbid_unsafe_before forbid_unsafe_changes >> $GITHUB_OUTPUT
else
echo 'NOTE: PR base commit does not support the "forbid_unsafe" feature;' >> $GITHUB_OUTPUT
echo 'comparing forbid_unsafe against the default features on base instead.' >> $GITHUB_OUTPUT
critcmp default_before forbid_unsafe_changes >> $GITHUB_OUTPUT
fi
echo 'EOF' >> $GITHUB_OUTPUT
id: compare-forbid-unsafe

- name: Comment PR with benchmarks
uses: thollander/actions-comment-pull-request@v2
continue-on-error: true
with:
message: |
Benchmark results:
Benchmark results with default features:
```
${{ steps.compare.outputs.results }}
```

Benchmark results with feature "forbid_unsafe":
```
${{ steps.compare-forbid-unsafe.outputs.results }}
```

comment_tag: benchmarks

id: comment
Expand All @@ -66,3 +93,8 @@ jobs:
echo '```' >> $GITHUB_STEP_SUMMARY
echo '${{ steps.compare.outputs.results }}' >> $GITHUB_STEP_SUMMARY
echo '```' >> $GITHUB_STEP_SUMMARY

echo '### Benchmark results with forbid-unsafe' >> $GITHUB_STEP_SUMMARY
echo '```' >> $GITHUB_STEP_SUMMARY
echo '${{ steps.compare-forbid-unsafe.outputs.results }}' >> $GITHUB_STEP_SUMMARY
echo '```' >> $GITHUB_STEP_SUMMARY
2 changes: 1 addition & 1 deletion .github/workflows/rustdoc.yml
Original file line number Diff line number Diff line change
Expand Up @@ -22,4 +22,4 @@ jobs:
uses: Swatinem/rust-cache@v2

- name: Check rustdoc build
run: RUSTDOCFLAGS='--cfg docsrs' cargo +nightly doc --all-features -Zunstable-options -Zrustdoc-scrape-examples
run: RUSTDOCFLAGS='--cfg docsrs' cargo +nightly doc --features debug -Zunstable-options -Zrustdoc-scrape-examples
7 changes: 5 additions & 2 deletions .github/workflows/rustlib.yml
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,7 @@ jobs:
uses: Swatinem/rust-cache@v2

- name: Check rustdoc build
run: RUSTDOCFLAGS='--cfg docsrs' cargo +nightly doc --all-features -Zunstable-options -Zrustdoc-scrape-examples
run: RUSTDOCFLAGS='--cfg docsrs' cargo +nightly doc --features debug -Zunstable-options -Zrustdoc-scrape-examples

tests:
name: Tests
Expand All @@ -36,6 +36,9 @@ jobs:
- macos-latest
- ubuntu-latest
- windows-latest
features:
- "" # default features
- "--features forbid_unsafe"

runs-on: ${{ matrix.os }}
steps:
Expand All @@ -52,4 +55,4 @@ jobs:
uses: actions-rs/cargo@v1
with:
command: test
args: --workspace --verbose
args: --workspace --verbose ${{ matrix.features }}
2 changes: 1 addition & 1 deletion .github/workflows/rustlints.yml
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@ jobs:
components: clippy

- name: Check clippy
run: cargo clippy --all-features -- -D warnings
run: cargo clippy --features debug -- -D warnings

rustfmt:
name: Rustfmt
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/rustmsrv.yml
Original file line number Diff line number Diff line change
Expand Up @@ -24,4 +24,4 @@ jobs:
version: ^0.15.1

- name: Check MSRV
run: cargo msrv verify -- cargo check --workspace --all-features
run: cargo msrv verify -- cargo check --workspace --features debug
4 changes: 3 additions & 1 deletion Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -51,9 +51,11 @@ default = ["export_derive", "std"]
export_derive = ["logos-derive"]
# Should the crate use the standard library?
std = []
# Use safe alternatives for unsafe code (may impact performance)?
forbid_unsafe = ["logos-derive?/forbid_unsafe"]

[package.metadata.docs.rs]
all-features = true
features = ["debug"]
cargo-args = ["-Zunstable-options", "-Zrustdoc-scrape-examples"]
rustdoc-args = ["--cfg", "docsrs"]

Expand Down
1 change: 1 addition & 0 deletions book/src/SUMMARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@
+ [Using callbacks](./callbacks.md)
+ [Common regular expressions](./common-regex.md)
+ [Debugging](./debugging.md)
+ [Unsafe Code](./unsafe.md)
+ [Examples](./examples.md)
+ [Brainfuck interpreter](./examples/brainfuck.md)
+ [JSON parser](./examples/json.md)
Expand Down
2 changes: 1 addition & 1 deletion book/src/contributing/setup.md
Original file line number Diff line number Diff line change
Expand Up @@ -81,7 +81,7 @@ configuration to the one used by [docs.rs](https://docs.rs/logos/latest/logos/):

```bash
RUSTDOCFLAGS="--cfg docsrs" cargo +nightly doc \
--all-features \
--features debug \
-Zunstable-options \
-Zrustdoc-scrape-examples \
--no-deps \
Expand Down
32 changes: 32 additions & 0 deletions book/src/unsafe.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
# Unsafe Code

By default, **Logos** uses unsafe code to avoid unnecessary bounds checks while
accessing slices of the input `Source`.

This unsafe code also exists in the code generated by the `Logos` derive macro,
which generates a deterministic finite automata (DFA). Reasoning about the correctness
of this generated code can be difficult - if the derivation of the DFA in `Logos`
is correct, then this generated code will be correct and any mistakes in implementation
would be caught given sufficient fuzz testing.

Use of unsafe code is the default as this typically provides the fastest parser.

## Disabling Unsafe Code

However, for applications accepting untrusted input in a trusted context, this
may not be a sufficient correctness justification.

For those applications which cannot tolerate unsafe code, the feature `forbid-unsafe`
may be enabled. This replaces unchecked accesses in the `Logos` crate with safe,
checked alternatives which will panic on out-of-bounds access rather than cause
undefined behavior. Additionally, code generated by the macro will not use the
unsafe keyword, so generated code may be used in a crates using the
`#![forbid(unsafe_code)]` attribute.

Disabling unsafe code will generally result in a slower parser overall, though rarely
may end up slightly faster.
jeertmans marked this conversation as resolved.
Show resolved Hide resolved

There are too many variables to consider between compiler optimizations, the specific
grammar being parsed, and the target processor to make definitive statements about
performance of safe-only code. The automated benchmarks of this crate shows around a
10% slowdown in safe-only code as of the time of this writing.
jeertmans marked this conversation as resolved.
Show resolved Hide resolved
2 changes: 2 additions & 0 deletions logos-codegen/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,8 @@ rstest = "0.18.2"
debug = []
# Exports out internal methods for fuzzing
fuzzing = []
# Don't use or generate unsafe code
forbid_unsafe = []

[lib]
bench = false
Expand Down
12 changes: 10 additions & 2 deletions logos-codegen/src/generator/context.rs
Original file line number Diff line number Diff line change
Expand Up @@ -59,12 +59,20 @@ impl Context {
self.available.saturating_sub(self.at)
}

pub fn read_byte_unchecked(&mut self) -> TokenStream {
pub fn read_byte(&mut self) -> TokenStream {
let at = self.at;

self.advance(1);

quote!(lex.read_byte_unchecked(#at))
#[cfg(not(feature = "forbid_unsafe"))]
{
quote!(unsafe { lex.read_byte_unchecked(#at) })
}

#[cfg(feature = "forbid_unsafe")]
{
quote!(lex.read_byte(#at))
}
}

pub fn read(&mut self, len: usize) -> TokenStream {
Expand Down
4 changes: 2 additions & 2 deletions logos-codegen/src/generator/fork.rs
Original file line number Diff line number Diff line change
Expand Up @@ -136,9 +136,9 @@ impl<'a> Generator<'a> {
let min_read = self.meta[this].min_read;

if ctx.remainder() >= max(min_read, 1) {
let read = ctx.read_byte_unchecked();
let read = ctx.read_byte();

return (quote!(byte), quote!(let byte = unsafe { #read };));
return (quote!(byte), quote!(let byte = #read;));
}

match min_read {
Expand Down
2 changes: 2 additions & 0 deletions logos-derive/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,8 @@ logos-codegen = {version = "0.14.1", path = "../logos-codegen"}
[features]
# Enables debug messages
debug = ["logos-codegen/debug"]
# Don't use or generate unsafe code
forbid_unsafe = ["logos-codegen/forbid_unsafe"]

[lib]
bench = false
Expand Down
5 changes: 5 additions & 0 deletions src/internal.rs
Original file line number Diff line number Diff line change
Expand Up @@ -17,8 +17,13 @@ pub trait LexerInternal<'source> {
fn read_at<T: Chunk<'source>>(&self, n: usize) -> Option<T>;

/// Unchecked read a byte at current position, offset by `n`.
#[cfg(not(feature = "forbid_unsafe"))]
unsafe fn read_byte_unchecked(&self, n: usize) -> u8;

/// Checked read a byte at current position, offset by `n`.
#[cfg(feature = "forbid_unsafe")]
fn read_byte(&self, n: usize) -> u8;

/// Test a chunk at current position with a closure.
fn test<T: Chunk<'source>, F: FnOnce(T) -> bool>(&self, test: F) -> bool;

Expand Down
Loading