`\w` is equivalent to `[\p{L}\p{Mn}\p{Nd}\p{Pc}]` in .NET instead of `[\p{Alpha}\p{M}\p{Nd}\p{Pc}\p{Join_Control}]`:

- It incorrectly uses `GC=Letter` instead of `Alphabetic=Yes`; the latter includes more code points!
- It doesn't match all of `GC=Mark`, only `GC=Nonspacing_Mark`
- It doesn't match `Join_Control=Yes`
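
To make the three gaps concrete, here is a rough sketch in Python. It does not run .NET; it only emulates the character class quoted above by checking general categories with the standard `unicodedata` module. The name `in_dotnet_w` and the probe code points are my own picks for illustration, one per gap.

```python
import unicodedata

# General categories covered by [\p{L}\p{Mn}\p{Nd}\p{Pc}], the class this
# issue attributes to .NET's \w. Assumption: this emulates that class via
# Python's Unicode tables; it is not a call into the .NET regex engine.
DOTNET_W_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Mn", "Nd", "Pc"}


def in_dotnet_w(ch: str) -> bool:
    """Return True if `ch` falls in the emulated .NET \\w class."""
    return unicodedata.category(ch) in DOTNET_W_CATEGORIES


# Hand-picked probes (illustrative only), one for each gap in the list.
PROBES = {
    "\u2160": "ROMAN NUMERAL ONE: GC=Nl, but Alphabetic=Yes",
    "\u0903": "DEVANAGARI SIGN VISARGA: GC=Mc, a spacing mark",
    "\u200d": "ZERO WIDTH JOINER: GC=Cf, Join_Control=Yes",
}

for ch, note in PROBES.items():
    category = unicodedata.category(ch)
    print(f"U+{ord(ch):04X} GC={category} in_dotnet_w={in_dotnet_w(ch)} ({note})")
```

Each probe is a code point that the Unicode-conforming class would accept but that falls outside the four categories listed for .NET, so a `\b` next to any of them would land in a different place.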

AFAIK there's nothing we can do other than emitting a warning: `\p{Alpha}` doesn't work in .NET, so we can't polyfill it. But a warning adds noise and doesn't help much when there isn't a straightforward fix.
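
Aloso changed the title from "`\w` (and by extension `\b` and `\B`) don't implement Unicode properly; `\w` is equivalent to `[\p{L}\p{Mn}\p{Nd}\p{Pc}]` instead of `[\p{Alpha}\p{M}\p{Nd}\p{Pc}\p{Join_Control}]` (see Rust regex' documentation)" to ".NET: `\w` (and by extension `\b` and `\B`) don't conform to Unicode" on Mar 28, 2023.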