DateTimeComponents.Format parse throws an exception when timeZoneId() is UTC #528

DmitryNekrasov · 2025-05-16T11:41:01Z

TimeZone parser implementation using Finite State Automaton

This PR introduces a parser for timezone identifiers, implemented as a finite state automaton.

The parser handles several timezone formats: special named timezones (UTC, GMT, UT, Z), fixed offsets in different notations (+01:00, +0100, +01), and combined forms (UTC+01:00, GMT-05:00).

At the core of the implementation is a state machine that transitions through different states as it processes the input character by character. The automaton begins in a START state and transitions through states such as AFTER_PREFIX (after parsing UTC/GMT/UT), AFTER_SIGN (after + or -), AFTER_HOUR, and AFTER_COLON based on the input characters.

The parser stores the last valid index encountered during parsing, which enables it to provide meaningful error messages and handle cases where the timezone identifier is followed by other characters.
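To make the two paragraphs above concrete, here is a minimal sketch of such an automaton. It is not the code from this PR: the state names START, AFTER_PREFIX, AFTER_SIGN, AFTER_HOUR and AFTER_COLON are taken from the description, while the DONE state, the function name `longestTimeZonePrefix`, and the exact transitions are assumptions made for illustration. The sketch only covers two-digit hours and ignores timezone database identifiers such as Europe/Berlin.

```kotlin
// Illustrative sketch only. DONE is an assumed terminal state, not named in the PR.
enum class State { START, AFTER_PREFIX, AFTER_SIGN, AFTER_HOUR, AFTER_COLON, DONE }

/**
 * Returns the exclusive end index of the longest valid timezone identifier
 * at the start of [input], or -1 if there is none.
 */
fun longestTimeZonePrefix(input: String): Int {
    var state = State.START
    var i = 0          // index of the next character to read
    var lastValid = -1 // end of the longest valid identifier seen so far
    var hours = 0

    fun isSign(at: Int) = at < input.length && (input[at] == '+' || input[at] == '-')

    // Reads exactly two ASCII digits starting at [at], or returns null.
    fun twoDigits(at: Int): Int? =
        if (at + 1 < input.length && input[at] in '0'..'9' && input[at + 1] in '0'..'9')
            (input[at] - '0') * 10 + (input[at + 1] - '0')
        else null

    while (state != State.DONE) {
        state = when (state) {
            State.START -> {
                // "UTC" is listed before "UT" so that the longer prefix wins.
                val prefix = listOf("UTC", "GMT", "UT", "Z").firstOrNull { input.startsWith(it) }
                when {
                    prefix == "Z" -> { lastValid = 1; State.DONE }
                    prefix != null -> { i = prefix.length; lastValid = i; State.AFTER_PREFIX }
                    isSign(0) -> { i = 1; State.AFTER_SIGN }
                    else -> State.DONE
                }
            }
            State.AFTER_PREFIX ->
                if (isSign(i)) { i++; State.AFTER_SIGN } else State.DONE
            State.AFTER_SIGN -> {
                val h = twoDigits(i)
                if (h == null || h > 18) State.DONE
                else { hours = h; i += 2; lastValid = i; State.AFTER_HOUR }
            }
            State.AFTER_HOUR -> {
                if (i < input.length && input[i] == ':') i++ // the colon is optional
                State.AFTER_COLON
            }
            State.AFTER_COLON -> {
                // Minutes are accepted only if the total offset stays within 18:00.
                val m = twoDigits(i)
                if (m != null && m <= 59 && (hours < 18 || m == 0)) lastValid = i + 2
                State.DONE
            }
            State.DONE -> State.DONE
        }
    }
    return lastValid
}
```

Because `lastValid` is only advanced when a complete identifier has been read, an input such as `UTC+01:00 rest` yields the end of `UTC+01:00`, and an input such as `UTC+` falls back to the end of `UTC`.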

The tests cover edge cases such as invalid offset values (e.g., +25:00), malformed offsets (e.g., +1:), and rejection of non-standard identifiers. They confirm that the parser accepts valid timezone identifiers and rejects invalid formats.

DmitryNekrasov · 2025-05-16T11:44:37Z

@dkhalanskyjb Dmitry, hello! Could you please look at the added tests? Do they correctly document the expected behavior and fully cover the possible scenarios? Thank you!

core/common/test/TimeZoneTest.kt

core/common/src/internal/format/parser/ParserOperation.kt

dkhalanskyjb · 2025-05-20T12:16:16Z

core/common/src/internal/format/parser/ParserOperation.kt

+
+            fun validateHH() = validateTimeComponent(2, 18)
+            fun validateH() = validateTimeComponent(1, 9)
+            fun validateMM() = validateTimeComponent(2, 59) { minutes -> hours < 18 || minutes == 0 }
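The bounds in the lines above (hours up to 18, minutes up to 59, and minutes forced to 0 when the hour is 18) can be illustrated with a small standalone helper. This is a hedged sketch: `readTimeComponent` is a hypothetical stand-in, since the diff does not show the real signature or return type of `validateTimeComponent`.

```kotlin
// Hypothetical stand-in for validateTimeComponent; the real signature is not shown in the diff.
// Reads exactly [length] ASCII digits at position [at] and returns their value
// if it is at most [max] and passes [extraCheck]; otherwise returns null.
fun readTimeComponent(
    input: String,
    at: Int,
    length: Int,
    max: Int,
    extraCheck: (Int) -> Boolean = { true },
): Int? {
    if (at < 0 || at + length > input.length) return null
    val digits = input.substring(at, at + length)
    if (!digits.all { it in '0'..'9' }) return null
    val value = digits.toInt()
    return value.takeIf { it <= max && extraCheck(it) }
}

// Mirrors the minute rule from the diff: minutes must be 0 once the hour is 18.
fun readMinutes(input: String, at: Int, hours: Int): Int? =
    readTimeComponent(input, at, length = 2, max = 59) { minutes -> hours < 18 || minutes == 0 }
```

With this helper, `readTimeComponent("18:00", 0, 2, 18)` returns 18, `readMinutes("18:00", 3, 18)` returns 0, and `readMinutes("18:01", 3, 18)` returns null, which matches the +18:01 case handled later in this PR.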


There are also offsets with seconds (UTC+011530 and UTC+01:15:30 can both be passed to TimeZone.of), but before implementing support for them, let's discuss tomorrow which behaviors we want.

DmitryNekrasov added 10 commits May 16, 2025 14:55

#444: Add testSpecialNamedTimezones

d38a5a6

#444: Add testFixedOffsets

3714a12

#444: Add testUTCGMTWithOffsets

8ba6552

#444: Add testTimezoneDBIdentifiers

7b6f6e5

#444: Remove import

93ba0db

#444: Update date on license

d087dbb

#444: Rollback formatting

1229eee

#444: Fix compilation

e55e74a

#444: Fix formatting

5f6e5c4

#444: Fix formatting

67e0d49

DmitryNekrasov requested a review from dkhalanskyjb May 16, 2025 11:41

DmitryNekrasov marked this pull request as draft May 16, 2025 11:41

dkhalanskyjb reviewed May 16, 2025

core/common/test/TimeZoneTest.kt (outdated, resolved)

DmitryNekrasov self-assigned this May 16, 2025

DmitryNekrasov added the bug label May 16, 2025

DmitryNekrasov added 10 commits May 19, 2025 12:26

#444: Fix timeZoneId check in assertTimeZoneIdCanBeParsed

ffe3097

#444: Remove "SYSTEM" from test

6fe71ba

#444: Add reject tests

5ef14dc

#444: Add "Z" parsing

801ebf7

#444: Add validateTimezone method

672ffdd

#444: Refactor validateTimezone

a90171d

#444: Refactoring

04706c3

#444: Add validateTimezone call

ea983ee

#444: Refactoring

7950367

#444: Add test testParseUntilRightBound, fix bugs

d8a1cd7

DmitryNekrasov requested review from fzhinkin and dkhalanskyjb May 19, 2025 13:26

DmitryNekrasov marked this pull request as ready for review May 19, 2025 13:26

#444: Refactoring

5b5efe3

dkhalanskyjb requested changes May 20, 2025

core/common/src/internal/format/parser/ParserOperation.kt (resolved)

core/common/src/internal/format/parser/ParserOperation.kt (outdated, resolved)

DmitryNekrasov added 5 commits May 20, 2025 11:38

#444: Add test cases to validate +H format offset

85deaa0

#444: Add +H format validation

cc0b7fe

#444: Fix upper bound for offset

690bd58

#444: Fix +18:01 offset case

12a9665

#444: Refactor +18:01 offset case

fcfe719

DmitryNekrasov requested a review from dkhalanskyjb May 20, 2025 08:42

dkhalanskyjb reviewed May 20, 2025
