8354273: Restore even more pointless unicode characters to ASCII #24567

magicus · 2025-04-10T10:18:08Z

As a follow-up to JDK-8354213, I found some additional places where Unicode characters are used unnecessarily instead of pure ASCII.

Progress

Change must be properly reviewed (1 review required, with at least 1 Reviewer)
Change must not contain extraneous whitespace
Commit message must refer to an issue

Issue

JDK-8354273: Restore even more pointless unicode characters to ASCII (Bug - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/24567/head:pull/24567
$ git checkout pull/24567

Update a local copy of the PR:
$ git checkout pull/24567
$ git pull https://git.openjdk.org/jdk.git pull/24567/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 24567

View PR using the GUI difftool:
$ git pr show -t 24567

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/24567.diff

Using Webrev

Link to Webrev Comment

bridgekeeper · 2025-04-10T10:19:06Z

👋 Welcome back ihse! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk · 2025-04-10T10:20:03Z

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

openjdk · 2025-04-10T10:20:46Z

@magicus The following labels will be automatically applied to this pull request:

client
core-libs
i18n

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

mlbridge · 2025-04-10T10:23:55Z

Webrevs

01: Full - Incremental (876708c2)
00: Full (d9527eb9)

magicus · 2025-04-10T10:36:34Z

src/java.xml/share/legal/xmlxsd.md

@@ -26,7 +26,7 @@ modifications:
    [$year-of-document] World Wide Web Consortium.
    https://www.w3.org/copyright/software-license-2023/"

-Disclaimers §anchor


This is an incorrectly copied piece of HTML; compare how the very same license is handled in e.g. src/java.xml/share/legal/schema10part1.md. The § is the non-ASCII character that triggered my detection of this, but the entire "anchor" string is incorrect here.
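The comment above does not say how the non-ASCII character was detected. As a minimal sketch of one possible approach, the following scans text for code points above U+007F and reports their position and Unicode name. The class `NonAsciiScan`, the record `Hit`, and the method `findNonAscii` are hypothetical names for illustration; they are not part of the JDK or of this PR.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not the tool used for this PR. */
final class NonAsciiScan {

    /** One non-ASCII code point and where it was found (1-based line and column). */
    record Hit(int line, int column, int codePoint) {
        @Override
        public String toString() {
            return String.format("%d:%d U+%04X %s",
                    line, column, codePoint, Character.getName(codePoint));
        }
    }

    /** Returns every code point above U+007F found in the given text. */
    static List<Hit> findNonAscii(String text) {
        List<Hit> hits = new ArrayList<>();
        String[] lines = text.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i];
            for (int j = 0; j < line.length(); ) {
                int cp = line.codePointAt(j);
                if (cp > 0x7F) {
                    hits.add(new Hit(i + 1, j + 1, cp));
                }
                // Advance by one or two chars so surrogate pairs count once.
                j += Character.charCount(cp);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Sample input modeled on the line removed from xmlxsd.md;
        // the section sign is written as an escape to keep this source ASCII.
        findNonAscii("Disclaimers \u00A7anchor\n").forEach(System.out::println);
    }
}
```

A real check would read each file with an explicit charset (for example via `Files.readString(path, StandardCharsets.UTF_8)`) and pass the contents to `findNonAscii`.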

prrace · 2025-04-16T04:39:22Z

src/java.xml/share/legal/xhtml11.md

@@ -47,7 +47,7 @@ The notice is:
 "Copyright © 2023 W3C®. This software or document includes material copied from
 or derived from [title and URI of the W3C document]."

-Disclaimers §anchor


Did that come from an upstream file?

magicus · 2025-04-16

No, it is copy/pasted from a textual rendering of the HTML file specified in the URL above. This is what you get if you naïvely select the text in Firefox and press Ctrl-C. The §anchor part is not rendered on screen.

prrace · 2025-04-16T04:40:35Z

test/jdk/java/awt/geom/Path2D/GetBounds2DPrecisionTest.java

@@ -189,7 +189,7 @@ private static String toUniformString(double value) {
        int DIGIT_COUNT = 40;
        String str = decimal.toPlainString();
        if (str.length() >= DIGIT_COUNT) {
-            str = str.substring(0,DIGIT_COUNT-1)+"…";
+            str = str.substring(0,DIGIT_COUNT-1)+"...";
        }


How did you test this? Please say more than tiers 1-3, because this test isn't run until tier4.

magicus · 2025-04-16

I did not test tier4. Will do so now. Thanks!

eirbjo · 2025-04-18T06:36:34Z

While the changes here look okay, I think the issue/PR title could be improved.

The replacement of Unicode "En Dash" with ASCII hyphen-minus and the similar replacement of the Unicode "Horizontal Ellipsis" with three ASCII periods are not really "restoring" much, and these Unicode characters are hardly "pointless", as they may carry different semantic meaning, behavior and rendering.

It's a valid choice to normalize them into ASCII though, but perhaps a title like "Replace even more Unicode characters with ASCII" would be more "fair" to these poor Unicode characters :-)
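To make the point about differing behavior concrete, here is a small sketch modeled on the truncation in GetBounds2DPrecisionTest.java. It assumes the same `DIGIT_COUNT` of 40; the class `EllipsisDemo` and the method `truncate` are hypothetical names, not code from the PR. It shows that swapping the one-character ellipsis for three periods changes the length of the truncated string, and it prints the Unicode names of the characters involved.

```java
/** Hypothetical demo for illustration; not code from the PR. */
final class EllipsisDemo {
    static final int DIGIT_COUNT = 40;

    /**
     * Truncates str to DIGIT_COUNT - 1 characters and appends the marker,
     * following the shape of the test's toUniformString.
     */
    static String truncate(String str, String marker) {
        if (str.length() >= DIGIT_COUNT) {
            return str.substring(0, DIGIT_COUNT - 1) + marker;
        }
        return str;
    }

    public static void main(String[] args) {
        String digits = "1234567890".repeat(5); // 50 characters

        // U+2026 is a single char, so the result is 39 + 1 characters long.
        System.out.println(truncate(digits, "\u2026").length()); // 40

        // Three ASCII periods make the result 39 + 3 characters long.
        System.out.println(truncate(digits, "...").length());    // 42

        // Unicode names of the characters mentioned in the comment above.
        System.out.println(Character.getName(0x2026));
        System.out.println(Character.getName(0x2013));
        System.out.println(Character.getName('-'));
    }
}
```

Whether the two extra characters matter depends on how the output is consumed; in a test that only prints the string for diagnostics, the difference is likely cosmetic.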

8354273: Restore even more pointless unicode characters to ASCII

d9527eb

openjdk bot added the rfr (Pull request is ready for review) label Apr 10, 2025

openjdk bot added the client, core-libs, and i18n labels Apr 10, 2025

Remove incorrectly copied "§anchor"

876708c

prrace reviewed Apr 16, 2025
