Use emoji presentation for emoji found in normal text #1107

Open
gnprice opened this issue Dec 6, 2024 · 2 comments
Labels
a-content Parsing and rendering Zulip HTML content, notably message contents

Comments

@gnprice
Member

gnprice commented Dec 6, 2024

This is a follow-up to:

That issue was narrowed to cover situations where we explicitly know we're working with an emoji. Still open are situations where a literal emoji character appears as part of some user-generated text:

  • TextNode in message content, corresponding to a text element in the HTML. As far as I know there's no longer a way to generate a message with emoji in a text node, though there used to be — compare this test message yesterday, where the only emoji is in an emoji span, to the message from 2020 it's quoting, where the same Markdown source produced some literal emoji following an emoji span.
    • … Oh, here's one way: write a literal emoji character inside a code span or code block. See these example messages.
  • Topics, channel names, users' names, organization names, and other places where more-or-less-arbitrary plain text appears.

Like #1104, this doesn't affect most emoji — newer emoji, including those added in the emoji boom, have only emoji presentations. Probably ❤ U+2764 HEAVY BLACK HEART, aka :heart: in Zulip, is the most conspicuous affected example: currently we show it as ❤︎ (the text presentation), but we should instead show ❤️ (the emoji presentation).

Implementation

Probably the way to control this is through the ordering of font choices: put an emoji font first, before plain-text fonts. The tricky part is that we'll need a reduced emoji font (at least compared to Noto Color Emoji): one that doesn't have glyphs for characters like U+0020 SPACE and U+0030..0039 DIGIT ZERO..DIGIT NINE, which are perfectly normal characters for ordinary text and should still be shown in text presentation.

For details, see #1104 (comment) and #1104 (comment) .
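To illustrate why the emoji font would need to be reduced, here is a toy model of per-character font fallback. It is only a sketch: the font names and coverage sets are made up for illustration, and real text shaping is more involved than picking the first font that has a glyph.

```python
# Toy model of font fallback: each "font" is a name plus the set of code
# points it has glyphs for. All names and coverage data are hypothetical.
DIGITS = range(0x30, 0x3A)
LETTERS = range(0x41, 0x7B)

FULL_EMOJI = ("FullEmoji", {0x20, *DIGITS, 0x2764, 0x1F600})
REDUCED_EMOJI = ("ReducedEmoji", {0x2764, 0x1F600})
TEXT = ("Text", {0x20, *DIGITS, *LETTERS, 0x2764})


def pick_font(ch, font_stack):
    """Return the name of the first font in the stack with a glyph for ch."""
    for name, coverage in font_stack:
        if ord(ch) in coverage:
            return name
    return None


def render_plan(text, font_stack):
    """Return, for each character of text, the font that would draw it."""
    return [pick_font(ch, font_stack) for ch in text]


sample = "A 1\u2764"  # letter, space, digit, HEAVY BLACK HEART

# Text font first: the heart gets the text presentation.
print(render_plan(sample, [TEXT, REDUCED_EMOJI]))
# Full emoji font first: the space and digit are taken over too.
print(render_plan(sample, [FULL_EMOJI, TEXT]))
# Reduced emoji font first: only the heart comes from the emoji font.
print(render_plan(sample, [REDUCED_EMOJI, TEXT]))
```

In this model, only the last ordering leaves the space and digit in the text font while drawing the heart from the emoji font.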

One further thought beyond those comments: for drawing the line in a principled way between characters like U+0020 SPACE that should stay in text presentation despite appearing in Noto Color Emoji, and characters like U+2764 HEAVY BLACK HEART that should get the emoji presentation, one candidate is to use the code point's Unicode General_Category value:

  • for code points in category So (Symbol, Other) that appear in the emoji font, assume they should get the emoji presentation;
  • for code points in any other category, assume they shouldn't, even if they're in the emoji font.

That rule gives the right answer for U+0020 SPACE, U+2764 HEAVY BLACK HEART, and all the other examples I looked at. More study would be needed to validate the rule before running with it.
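As a rough sketch of that candidate rule, the check could look like the following, using Python's standard `unicodedata` module. The function name and the small coverage set are hypothetical stand-ins; a real check would read the coverage from the emoji font itself.

```python
import unicodedata

# Hypothetical stand-in for the set of code points the emoji font covers.
EMOJI_FONT_COVERAGE = {0x20, *range(0x30, 0x3A), 0x2764, 0x1F600}


def wants_emoji_presentation(ch, coverage=EMOJI_FONT_COVERAGE):
    """Candidate rule: emoji presentation only for characters that are in
    the emoji font and have General_Category So (Symbol, Other)."""
    return ord(ch) in coverage and unicodedata.category(ch) == "So"


for ch in [" ", "7", "\u2764", "\U0001F600", "A"]:
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), wants_emoji_presentation(ch))
```

Running the same check over every code point in the real font's coverage would be one way to do the validation mentioned above.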

@gnprice gnprice added the a-content Parsing and rendering Zulip HTML content, notably message contents label Dec 6, 2024
@gnprice gnprice added this to the M7: Future milestone Dec 6, 2024
@gnprice
Member Author

gnprice commented Dec 6, 2024

I think having any emoji in these situations is fairly uncommon. Combined with most emoji not being affected (as discussed above), that means this issue should rarely be visible, so I think it's low-priority.

As the implementation section (and the linked comments) discusses, this also may be a fairly involved task to carry out.

@gnprice
Member Author

gnprice commented Dec 6, 2024

Zulip web appears to have the same issue — again, when a literal emoji does make it through. See this example message and the one after it for literal emoji in message content; or this for an emoji in a topic.

OTOH Zulip web does respect U+FE0F VARIATION SELECTOR-16 when it appears after a literal emoji, which is supposed to request emoji presentation. See this message for such an emoji in message content and this one for a topic.

(These observations of Zulip web are what I see at the moment in Chrome on my Linux desktop.)

By contrast our current behavior (even after #1108) is that we show the text presentation even when the emoji code point is followed by U+FE0F VARIATION SELECTOR-16; see those same example messages.

So Zulip web's behavior when a literal emoji does get through is what Unicode TR 51 prescribes for contexts like "plain web pages", but better would be to follow what's prescribed for "informal environments like texting and chats". The latter is what Zulip web already does for almost all emoji inside message content.
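For reference, here is a minimal sketch of detecting an explicit presentation request in a string. The function name is hypothetical, and it only looks at a single code point followed by a variation selector; it doesn't try to handle longer emoji sequences.

```python
VS15 = "\uFE0E"  # VARIATION SELECTOR-15: requests text presentation
VS16 = "\uFE0F"  # VARIATION SELECTOR-16: requests emoji presentation


def requested_presentations(text):
    """Return (character, request) pairs, where request is "emoji", "text",
    or None when no variation selector follows the character."""
    result = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if nxt == VS16:
            result.append((ch, "emoji"))
            i += 2  # skip the selector we just consumed
        elif nxt == VS15:
            result.append((ch, "text"))
            i += 2
        else:
            result.append((ch, None))
            i += 1
    return result


# A heart with VS16, then a bare heart with no selector.
print(requested_presentations("a\u2764\uFE0F\u2764"))
```

Under the "informal environments" behavior, the `None` case for an emoji character would get the emoji presentation too, rather than falling back to text presentation.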
