DateTimeComponents.Format parse throws an exception when timeZoneId() is UTC #528


Open: wants to merge 26 commits into base: master

@DmitryNekrasov (Contributor) commented May 16, 2025

TimeZone parser implementation using Finite State Automaton

This PR introduces a robust parser for timezone identifiers using a finite state automaton approach.

The parser handles several timezone representation formats: special named timezones (UTC, GMT, UT, Z), fixed offsets in different notation styles (+01:00, +0100, +01), and combined formats (UTC+01:00, GMT-05:00).

At the core of the implementation is a state machine that transitions through different states as it processes the input character by character. The automaton begins in a START state and transitions through states such as AFTER_PREFIX (after parsing UTC/GMT/UT), AFTER_SIGN (after + or -), AFTER_HOUR, and AFTER_COLON based on the input characters.
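The transitions described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: only the state names START, AFTER_PREFIX, AFTER_SIGN, AFTER_HOUR and AFTER_COLON come from the description, while `AFTER_MINUTE`, `INVALID`, `finalState` and `isTimeZoneId` are hypothetical names. The sketch assumes two-digit hours and minutes and leaves out `Z` and range checks.

```kotlin
// Hypothetical sketch: only the state names START, AFTER_PREFIX, AFTER_SIGN, AFTER_HOUR and
// AFTER_COLON come from the PR description; AFTER_MINUTE, INVALID and all functions are assumed.
enum class State { START, AFTER_PREFIX, AFTER_SIGN, AFTER_HOUR, AFTER_COLON, AFTER_MINUTE, INVALID }

// "UTC" is listed before "UT" so that the longer prefix is matched first.
val PREFIXES = listOf("UTC", "GMT", "UT")

// States in which the input read so far forms a complete identifier.
val ACCEPTING = setOf(State.AFTER_PREFIX, State.AFTER_HOUR, State.AFTER_MINUTE)

fun finalState(input: String): State {
    // True if two ASCII digits start at position pos.
    fun twoDigitsAt(pos: Int) =
        pos + 1 < input.length && input[pos] in '0'..'9' && input[pos + 1] in '0'..'9'

    var state = State.START
    var i = 0
    while (i < input.length) {
        val c = input[i]
        when (state) {
            State.START -> {
                val prefix = PREFIXES.firstOrNull { input.startsWith(it) }
                if (prefix != null) {
                    state = State.AFTER_PREFIX
                    i += prefix.length
                } else if (c == '+' || c == '-') {
                    state = State.AFTER_SIGN
                    i++
                } else {
                    return State.INVALID
                }
            }
            State.AFTER_PREFIX -> {
                if (c != '+' && c != '-') return State.INVALID
                state = State.AFTER_SIGN
                i++
            }
            State.AFTER_SIGN, State.AFTER_COLON -> {
                if (!twoDigitsAt(i)) return State.INVALID
                state = if (state == State.AFTER_SIGN) State.AFTER_HOUR else State.AFTER_MINUTE
                i += 2
            }
            State.AFTER_HOUR -> {
                if (c == ':') {
                    state = State.AFTER_COLON
                    i++
                } else if (twoDigitsAt(i)) {
                    state = State.AFTER_MINUTE
                    i += 2
                } else {
                    return State.INVALID
                }
            }
            State.AFTER_MINUTE, State.INVALID -> return State.INVALID
        }
    }
    return state
}

// The whole input is a valid identifier only if the automaton stops in an accepting state.
fun isTimeZoneId(input: String): Boolean = finalState(input) in ACCEPTING
```

Because this sketch does not validate ranges, it would still accept a value such as `+25:00`; rejecting it needs a separate bounds check on the parsed hours and minutes.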

[Image: Automata]

The parser stores the last valid index encountered during parsing, which enables it to provide meaningful error messages and handle cases where the timezone identifier is followed by other characters.
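One way to picture the "last valid index" idea is a scanner that remembers where the most recent complete match ended, so trailing characters do not invalidate what was already recognised. The helper below is a hypothetical sketch, not the PR's code: `longestOffsetEnd` is an assumed name, and it only handles the `+HH` and `+HH:MM` forms.

```kotlin
// Hypothetical helper, not from the PR: returns the exclusive end index of the longest
// valid "+HH" or "+HH:MM" offset that begins at `start`, or -1 if no offset matches there.
fun longestOffsetEnd(input: String, start: Int = 0): Int {
    fun isDigitAt(pos: Int) = pos < input.length && input[pos] in '0'..'9'

    var lastValid = -1
    var i = start
    // A sign is mandatory.
    if (i >= input.length || (input[i] != '+' && input[i] != '-')) return lastValid
    i++
    // Two hour digits are mandatory in this sketch.
    if (!isDigitAt(i) || !isDigitAt(i + 1)) return lastValid
    i += 2
    lastValid = i // "+HH" is already a complete offset
    // Only extend the match if ":MM" is fully present; otherwise keep the shorter match.
    if (i < input.length && input[i] == ':' && isDigitAt(i + 1) && isDigitAt(i + 2)) {
        lastValid = i + 3
    }
    return lastValid
}
```

With this approach, an input such as `+01:00[Europe/Berlin]` yields the end of `+01:00`, while `+01:` falls back to the end of `+01` because the colon is never followed by two digits.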

Comprehensive test coverage validates the parser's behavior across various use cases, including handling of edge cases like invalid offset values (e.g., +25:00), malformed offsets (e.g., +1:), and rejection of non-standard identifiers. The tests confirm that the parser correctly validates timezone identifiers according to expected standards while appropriately rejecting invalid formats.

@DmitryNekrasov DmitryNekrasov marked this pull request as draft May 16, 2025 11:41
@DmitryNekrasov (Contributor, Author):

@dkhalanskyjb Dmitry, hello! Could you please look at the added tests? Do they correctly document the expected behavior and fully cover the possible scenarios? Thank you!

@DmitryNekrasov DmitryNekrasov self-assigned this May 16, 2025
@DmitryNekrasov DmitryNekrasov added the bug Something isn't working label May 16, 2025
@DmitryNekrasov DmitryNekrasov marked this pull request as ready for review May 19, 2025 13:26

fun validateHH() = validateTimeComponent(2, 18)
fun validateH() = validateTimeComponent(1, 9)
fun validateMM() = validateTimeComponent(2, 59) { minutes -> hours < 18 || minutes == 0 }
Review comment (Collaborator):

There are also seconds (UTC+011530, UTC+01:15:30 can both be passed to TimeZone.of), but before implementing their support, let's have a discussion tomorrow about which behaviors we want.
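For reference, the range rule that the quoted `validateHH` and `validateMM` lines appear to encode (hours up to 18, minutes up to 59, and no non-zero minutes at hour 18) can be written as a standalone predicate. This is a hypothetical illustration: `isOffsetInRange` is an assumed name, and it deliberately ignores seconds, which are still under discussion.

```kotlin
// Hypothetical standalone check mirroring the predicate in the quoted diff:
// hours in 0..18, minutes in 0..59, and minutes must be 0 when hours == 18.
fun isOffsetInRange(hours: Int, minutes: Int): Boolean =
    hours in 0..18 && minutes in 0..59 && (hours < 18 || minutes == 0)
```

Under this rule `18:00` is the largest accepted magnitude, so `18:01` and `25:00` are both rejected.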
