Improved unicode support in mutator, flattener, and more #2662
+146
−93
solc, foundry, hardhat, everything reports source_map offsets denominated in bytes (solidity#14733 is a false positive). And (almost) all slither detectors properly index the source code byte-wise. Great.
But some tools, notably the mutator and flattener, use per-byte offsets to index strings per-character. Once this PR is merged, they will not.
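To make the mismatch concrete, here is a minimal sketch. It is not code from this PR; the Solidity snippet and variable names are made up for illustration. It shows what happens when a byte offset is applied to a Python `str`:

```python
# Hypothetical Solidity source with one multi-byte character ("é" is 2 bytes in UTF-8).
source = "// café\ncontract C {}\n"
target = "contract C {}"

# Offsets as a compiler would report them: counted in UTF-8 bytes.
source_bytes = source.encode("utf8")
byte_start = source_bytes.index(target.encode("utf8"))
byte_length = len(target.encode("utf8"))

# Wrong: a Python str is indexed per character, so the byte offset overshoots by one.
wrong = source[byte_start : byte_start + byte_length]

# Right: slice the encoded bytes, then decode the result.
right = source_bytes[byte_start : byte_start + byte_length].decode("utf8")

print(repr(wrong))  # 'ontract C {}\n'
print(repr(right))  # 'contract C {}'
```

Each multi-byte character before the span shifts the per-character slice further off, which is why ASCII-only sources never show the problem.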
Summary of changes:
- `source_mapping.content` to index source code correctly, and used this property instead of manual indexing in the flat/mutate tools
- `src_mapping`, `source_code`, and `utf8` encodings to ensure we aren't applying byte offsets to strings anywhere else
- `"utf-8"` and `"utf8"` to just the latter

Note that the last 3 of these were merged into this branch from PR #2648 because most of the changes in that PR would have needed to be duplicated in this one.
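As a rough illustration of the byte-wise approach, a `content`-style property could look like the sketch below. `ByteSpan` and `replace_with` are hypothetical names invented for this example and are not slither's actual classes; the only assumption carried over from the description is that `start` and `length` are UTF-8 byte offsets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ByteSpan:
    """Hypothetical stand-in for a source mapping; offsets are UTF-8 byte counts."""

    source: str
    start: int
    length: int

    @property
    def content(self) -> str:
        # Encode first so the byte offsets line up, then decode the slice.
        raw = self.source.encode("utf8")
        return raw[self.start : self.start + self.length].decode("utf8")

    def replace_with(self, new_text: str) -> str:
        # Splice at the byte level, as a mutator-like tool would need to.
        raw = self.source.encode("utf8")
        end = self.start + self.length
        return (raw[: self.start] + new_text.encode("utf8") + raw[end:]).decode("utf8")


source = 'string s = "héllo"; uint x = 1;'
statement = b"uint x = 1;"
span = ByteSpan(source, source.encode("utf8").index(statement), len(statement))

print(span.content)                      # uint x = 1;
print(span.replace_with("uint x = 2;"))  # string s = "héllo"; uint x = 2;
```

Keeping the encode/slice/decode steps in one place means callers never touch raw offsets, so a tool cannot accidentally apply them to a `str`.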