Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Provide helpful error msg when deserializing big timestamp #494

Merged
merged 6 commits into from
Jun 14, 2024

Conversation

Xanewok
Copy link
Contributor

@Xanewok Xanewok commented Jun 6, 2024

Fixes #435

In place of the existing U64OrUsize we provide a newtype Timestamp around u64 with a custom Deserialize impl that leniently first deserializes to a number/string and then attempts to parse the value to see if it's within the valid range.

We keep providing From and Into for the new type, so the existing code keeps using the impls whenever possible.

Copy link

changeset-bot bot commented Jun 6, 2024

🦋 Changeset detected

Latest commit: 913edb7

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@nomicfoundation/edr Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

Comment on lines 397 to 401
if let Some(hex_str) = value.strip_prefix("0x") {
u64::from_str_radix(hex_str, 16).map_err(|_err| error_msg())
} else {
value.parse::<u64>().map_err(|_err| error_msg())
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This deviates from what used to happen. We don't want to allow any type of string here, but one that conforms to the Ethereum format: ^0x([1-9a-f]+[0-9a-f]*|0)$ (also see this).

I recommend using the U64 type to deserialize the string here, as that already implements this logic. All you'd have to do is to first check the number of bytes in the hex string after 0x (this part is required). If it exceeds 8 bytes, then return the custom error.

Copy link
Contributor Author

@Xanewok Xanewok Jun 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked ruint's types and it doesn't have a special case for disallowing leading zeroes, apart from supporting also octal o and binary b out of the box. See from_str et al. in https://docs.rs/crate/ruint/latest/source/src/string.rs.

For example, this passes on main:

    #[test]
    fn test_deserialize_quantity() {
        let json = r#""0x01""#;

        let mut deserializer = serde_json::Deserializer::from_str(json);
        let quantity = deserialize_quantity(&mut deserializer).unwrap();

        assert_eq!(quantity, U256::from(0x01));
    }

Did this uncover a bug in our other deserializers as well? If so, do you want me to fix this as part of this PR or separately?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also seeing some more inconsistencies, for example we only have Option<u64> using default Deserialize in AccountOverrideOptions for eth_call

pub nonce: Option<u64>,

but the official API schema says it accepts only a hex string of the quantity format you mentioned:

image

Maybe it'd make sense to separate a work item of "go over all of our (de)serialize logic and make sure we're compliant"?

On the other hand per the common networking adage of "be conservative in what you send, be liberal in what you accept" and the fact that we're supposed to be a development platform, I'm not sure if us not being as strict as the spec is necessary the bad thing here.

I'll add it for the agenda for tomorrow's sync; maybe @fvictorio you have some more context or opinions on this.

Copy link
Member

@Wodann Wodann Jun 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See https://github.com/recmo/uint/blob/main/src/support/serde.rs#L136 for the serde implementations of ruint. They have additional logic on top of the FromStr implementation.

It also already handles visit_u64 and visit_u128, so it voids the need for any of the additional code you wrote, apart from the helpful error message.

Copy link
Member

@Wodann Wodann Jun 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, this passes on main:

    #[test]
    fn test_deserialize_quantity() {
        let json = r#""0x01""#;

        let mut deserializer = serde_json::Deserializer::from_str(json);
        let quantity = deserialize_quantity(&mut deserializer).unwrap();

        assert_eq!(quantity, U256::from(0x01));
    }

Did this uncover a bug in our other deserializers as well? If so, do you want me to fix this as part of this PR or separately?

The example you wrote is valid. 0x01 is a single byte equal to 1. There are no leading zeroes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Wodann are you sure? This would also accept 0X prefix as well as octal 0o/0O and binary 0b/0B which I'm pretty sure we do not want to accept, given that we accept to conform only to the spec 0x: https://github.com/recmo/uint/blob/7b8b0d313a4960091c07eb3356c6a2512a46b585/src/string.rs#L200-L210

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to prolong this but since we're already splitting hairs here, it's just worth noting that it'd not be consistent with what we do elsewhere, which is String::deserialize, then check for 0x and then call U64::from_str (on a value that's guaranteed to start with 0x, so okay).

Maybe just being consistent with the rest of the code would be okay, where we do the flow outlined above like everywhere else, and we could focus on compliance/correctness in a separate PR? WDYT?

Copy link
Contributor Author

@Xanewok Xanewok Jun 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I forgot that we support direct numbers, which is where your suggestion for U64::deserialize came from, but we should not accept the different modes, so I tried to make it more explicit what we support and why in the code and in the error message in 7635fc0.

Since we also accept JS numbers as the extension to the regular JSON-RPC API, it'd be weird not to support them as a regular string (so also a base 10 number). I've added a comment and made it more explicit about the base, let me know what you think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only skimmed this discussion (let me know if there's something in particular I should pay attention to) but this caught my eye:

This would also accept 0X prefix as well as octal 0o/0O and binary 0b/0B which I'm pretty sure we do not want to accept

I checked and the released version of EDR accepts all of these (which is wrong by the spec, and geth doesn't accept):

send("eth_getBlockByNumber", [0, false])
send("eth_getBlockByNumber", ["0", false])
send("eth_getBlockByNumber", ["0b0", false])
send("eth_getBlockByNumber", ["0o0", false])

I will open an issue about this, but we don't need to do anything about that in this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

crates/edr_provider/src/requests/serde.rs Outdated Show resolved Hide resolved
@Xanewok Xanewok force-pushed the refactor/improve-error-message-for-big-timestamps branch from d927e4d to 1d5d2dc Compare June 10, 2024 12:56
@Xanewok
Copy link
Contributor Author

Xanewok commented Jun 10, 2024

Reshuffled the tests to be first (to double-check what we do on main already) and force-pushed with expanded deserialize_timestamp test; hope you don't mind 🙏

Too permissive derialization for the timestamp along with accepting
leading zeroes in other values should be done in a separate PR.
crates/edr_provider/src/requests/serde.rs Outdated Show resolved Hide resolved
"evm_increaseTime",
["0x1111111111111111111111111111111111111"],
"Timestamp must be a positive integer or a string not exceeding 2^64 - 1"
);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add 4 test cases more here?

  1. Using exactly 2^64 produces the error too (using both hex-prefixed strings and decimal numbers)
  2. Using exactly 2^64-1 doesn't produce the error (both hex-prefixed and decimal)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did that for the evm_setNextBlockTimestamp.

However, I now realized that evm_increaseTime actually internally converts the value to an i64... Does that mean we should support negative as well? This also means supporting up to 2**63-1. Is this something we want to pursue as part of this PR?

Copy link
Member

@Wodann Wodann Jun 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, evm_increaseTime was allowed to receive negative values in the past, but this was a bug. We fixed that in EDR, so no need for allowing negative values in the JSON-RPC.

The reason that we use an i64 in ProviderData is that the fn block_time_offset_seconds can return a negative value, if the first block is mined in the past.

Potentially we want to change that in the future and return an error. This might have been designed this way in Hardhat on purpose though, so I'll defer to @fvictorio for historic context.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially we want to change that in the future and return an error.

No, this is not an error condition, it happens all the time. I don't remember exactly why; I think it's because in Hardhat the config is eagerly loaded (which then uses now() for the initial timestamp) but the node is created lazily after the first request, so normally the time offset will be -N, where N is the number of seconds that happened between the start of the task and the first request.

I'm not sure about the i64 part, but I'm happy with just having those tests for evm_setNextBlockTimestamp, so no need to block on that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks the situation may not be obvious on the i64 front, so I'll wait for @Wodann here. When everyone agrees, I think we could finally merge this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@Xanewok Xanewok merged commit 245fd07 into main Jun 14, 2024
41 checks passed
@Xanewok Xanewok deleted the refactor/improve-error-message-for-big-timestamps branch June 14, 2024 13:47
@Wodann Wodann added this to the EDR v0.4.1 milestone Jul 11, 2024
@Wodann Wodann removed their assignment Jul 11, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 11, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve error message for large block timestamps
3 participants