Enhance the PDF glyph handling to handle advanced kerning and ligatures (#2058) #2059

Conversation

speckyspooky
Contributor

No description provided.

@speckyspooky speckyspooky added the Enhancement Small change to improve the current supported functionality label Feb 23, 2025
@speckyspooky speckyspooky added this to the 4.19 milestone Feb 23, 2025
@speckyspooky speckyspooky self-assigned this Feb 23, 2025
@speckyspooky speckyspooky requested a review from hvbtup February 24, 2025 05:09
Contributor

@hvbtup hvbtup left a comment

This may feel like nit-picking, but enabling kerning and ligatures involves a little overhead and of course it changes the width of text, thus possibly resulting in different line-breaking.

There is probably a reason why this is not enabled by default in OpenPDF.
So I think it should not be enabled by default in BIRT, instead it should be an option for the PDF emitter.

@speckyspooky
Contributor Author

OK, makes sense, so I will take a look at the emitter configuration.

@hvbtup
Contributor

hvbtup commented Feb 24, 2025

Furthermore, enabling kerning and ligatures must be considered in the font width calculation. I don't know if OpenPDF takes this into account.

Besides, it is somewhat strange that OpenPDF has a LayoutProcessor class with static methods.
So using kerning and ligatures is all or nothing and cannot be decided per font instance!?

Decades ago I added kerning and ligature support for the ReportLab Python PDF creation library in my "wordaxe" repo https://deco-cow.sourceforge.net/. However, I only ever tested this for latin languages.

@speckyspooky
Contributor Author

I wouldn't view OpenPDF's handling too negatively.
Yes, there may be a reason why the glyph handling is not enabled by default. I assume it is more a performance topic than a handling topic, because the glyphs were already handled before based on the different fonts, just not correctly at all.

The corresponding OpenPDF article/reference document explains the topic and refers to different examples.
The static approach may be a reaction to the complexity of OpenPDF and font handling.

Please be aware that BIRT's font handling isn't easy either, and I invested a lot of time to figure out why the config switch wasn't working.

@speckyspooky
Contributor Author

@hvbtup
Using a user property or the PDF configuration isn't possible because the font handling is based entirely at the layout-engine level. I mean, it is a completely different processing stream from the output rendering.

So if we need some kind of configuration switch, we cannot use a user property or a PDF renderer config;
we would need a global configuration, like the ECMAScript switch property at the JVM level.

@wimjongman & @merks
What is your opinion on this switch option?

  • Switch: Yes/No
  • Global switch?

@hvbtup
Contributor

hvbtup commented Feb 25, 2025

If we change things here, we should at least:

  • Test and document what happens for an extreme text like "AV AV AV AV AV AV AV AV AV AV AV AV AV" when this text is used in a right-aligned paragraph or a justified paragraph.
  • Test what happens when text is copied from the PDF into a text editor using Windows-1252 encoding when the text contains ligatures like ffl or ff.
  • Test the tagged output with PAC 2024 if we create PDF/UA (the idea is similar to the previous test, but the input is based on the tagged PDF). Hint: I saw that LayoutProcessor.java contains a method to output text with a boolean argument actualText (probably meant to create an ActualText attribute in the PDF).

I'm not saying that all of this must work, but we should at least know what works and what does not.
And if some of these tests show issues, then IMHO the default should be "no kerning and ligatures".
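
The copy-and-paste concern in the second bullet can be illustrated without any PDF library. The following is a minimal sketch, assuming that a viewer hands back the single ligature code point U+FB04 (ffl) when text is copied; whether a given PDF actually does so depends on how the text is mapped in the file. The class and method names are made up for this example.

```java
import java.nio.charset.Charset;
import java.text.Normalizer;

final class LigatureCopyCheck {

    /** True if every character of the text can be encoded in the given charset. */
    static boolean encodable(String text, String charsetName) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    /** Expands compatibility characters such as U+FB04 back into plain letters. */
    static String expandLigatures(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        // "waffle" written with the single ligature code point U+FB04 (assumed copy result)
        String copied = "wa\uFB04e";
        System.out.println(encodable(copied, "windows-1252"));
        System.out.println(expandLigatures(copied));
        System.out.println(encodable(expandLigatures(copied), "windows-1252"));
    }
}
```

Windows-1252 has no slot for the ligature code point, so an editor using that encoding cannot store it unless the text is expanded back to individual letters first.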

@speckyspooky
Contributor Author

Some details are mixed up here: copying text to an editor is only a hint about the text display, not a test case.
PDF/UA will work due to embedded fonts; I don't see problems there.
The only thing would be the AV case, and in my testing so far I have seen no effect.

@hvbtup
Contributor

hvbtup commented Feb 25, 2025

I agree with Thomas insofar as it is impossible to change the setting per font.
This is because the LayoutProcessor class uses static properties to store state and BIRT is multi-threaded.
If BIRT is used in a multi-threaded fashion, then calling a LayoutProcessor static method in thread A influences the rendering in thread B.
Thus LayoutProcessor is not thread-safe.
A switch should be based on a system property to make it clear that it cannot be changed at runtime.
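
A switch of this kind could look roughly like the sketch below. The property name `example.pdf.kerning` and the class name are hypothetical, chosen only for illustration; nothing here is taken from BIRT or OpenPDF. The point is that the value is read once at class initialisation, so a later change at runtime has no effect.

```java
final class KerningSwitch {

    /** Hypothetical property name, used only in this sketch. */
    static final String PROPERTY = "example.pdf.kerning";

    /** Evaluated once when the class is initialised; later property changes are ignored. */
    static final boolean ENABLED = Boolean.getBoolean(PROPERTY);

    private KerningSwitch() {
    }

    public static void main(String[] args) {
        System.out.println("kerning enabled: " + ENABLED);
        // Changing the property now does not alter the already initialised constant.
        System.setProperty(PROPERTY, String.valueOf(!ENABLED));
        System.out.println("after runtime change: " + ENABLED);
    }
}
```

Started with `-Dexample.pdf.kerning=true`, both lines report `true`; without the flag, both report `false`.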

@hvbtup
Contributor

hvbtup commented Feb 25, 2025

Temp.zip

A test report showing that enabling kerning breaks right alignment and centered alignment, while justified text works.
This report is also a test case for #2057
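
One plausible mechanism for the alignment problem, not verified against the BIRT source, is that the layout engine measures the text without kerning while the renderer draws it with kerning. The toy calculation below uses made-up glyph metrics (in 1/1000 em) and a made-up kerning table to show the size of the resulting gap at the right margin.

```java
import java.util.Map;

final class KerningAlignmentDemo {

    // Made-up metrics in 1/1000 em; real fonts carry their own values.
    static final Map<Character, Integer> ADVANCE = Map.of('A', 722, 'V', 722, ' ', 278);
    static final Map<String, Integer> KERN = Map.of("AV", -80, "VA", -80);

    /** Sum of glyph advances, optionally adjusted by pair kerning. */
    static int width(String text, boolean kerning) {
        int w = 0;
        for (int i = 0; i < text.length(); i++) {
            w += ADVANCE.get(text.charAt(i));
            if (kerning && i + 1 < text.length()) {
                w += KERN.getOrDefault(text.substring(i, i + 2), 0);
            }
        }
        return w;
    }

    /** Gap between the right margin and the drawn text end when layout ignores kerning. */
    static int rightAlignGap(String text) {
        return width(text, false) - width(text, true);
    }

    public static void main(String[] args) {
        String text = "AV AV AV";
        System.out.println("measured=" + width(text, false)
                + " drawn=" + width(text, true)
                + " gap=" + rightAlignGap(text));
        // prints: measured=4888 drawn=4648 gap=240
    }
}
```

With these assumed numbers the text ends 240 units short of the right margin; for centered text the visible offset would be half of that.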

@wimjongman
Contributor

wimjongman commented Feb 25, 2025

@wimjongman & @merks What is your opinion on this switch option?

Guys, I am just baffled by your knowledge about this topic. 🙏 Reading your comments, I think the logical thing to do is to make it a global switch which is off by default.

Great work!! ⭐⭐⭐⭐⭐

@speckyspooky speckyspooky requested a review from hvbtup February 25, 2025 22:31
@speckyspooky
Contributor Author

So, I figured out an alternative instead of a global switch with a system property.
I have added a new font-configuration-tag "font-kerning" to the fontsConfig*.xml.
With this configuration the kerning can be enabled or disabled (default: disabled/unused).

This switch can be configured in every font XML file because the internal logic is the same for all setups/configurations,
meaning the setting is checked in all font config files.

Let me know if I should also add a system property, but I think the config solution should be good enough.

@wimjongman
Contributor

How do people set this property? Does it have easy access?

Contributor

@hvbtup hvbtup left a comment

Please remove your personal directories from the config file.
I think the wording should be improved.

Apart from that:
The overall code looks good to me and I like the idea of configuring this in fontsConfig.xml.

@hvbtup
Contributor

hvbtup commented Feb 26, 2025

How do people set this property? Does it have easy access?

The fontsConfig.xml file is supposed to be editable, e.g. to add custom TTF font paths, so I would say: Yes, easy enough.

@speckyspooky speckyspooky force-pushed the enhance_pdf_advanced_glyph_handling_2058 branch 2 times, most recently from 0f2bc1f to 401bfcd Compare February 26, 2025 19:46
@speckyspooky speckyspooky requested a review from hvbtup February 26, 2025 20:06
@@ -54,10 +63,14 @@ public class FontMappingManager {
this.fontAliases.putAll(parent.getFontAliases());
this.fontEncodings.putAll(parent.getFontEncodings());
this.compositeFonts.putAll(parent.getCompositeFonts());
if (!this.fontKerningAdvancedUsage)
Contributor

I think what you mean is: if it is not explicitly defined, get it from parent.
But then the variable should probably be a Boolean object and the test should be fontKerningAdvancedUsage == null.

Contributor Author

The property is not a Boolean object; it is the primitive type boolean, which is also the case on the parent side, hence this comparison (and no null check).

Contributor

I think the point is that you can't distinguish the three states:

  • Explicitly set to false.
  • Explicitly set to true.
  • Not set at all.

with a boolean value that can only have two states...
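
The three states can be modelled with a nullable `Boolean`. The sketch below is a simplified stand-in, not the actual FontMappingManager code; the class `KerningSetting` and its fields are hypothetical.

```java
final class KerningSetting {

    /** null means "not set in this config file". */
    private final Boolean explicit;
    private final KerningSetting parent;

    KerningSetting(Boolean explicit, KerningSetting parent) {
        this.explicit = explicit;
        this.parent = parent;
    }

    /** Own value if set, otherwise the parent's value, otherwise the default (off). */
    boolean effective() {
        if (explicit != null) {
            return explicit;
        }
        return parent != null && parent.effective();
    }

    public static void main(String[] args) {
        KerningSetting parent = new KerningSetting(Boolean.TRUE, null);
        System.out.println(new KerningSetting(null, parent).effective());          // not set: inherits true
        System.out.println(new KerningSetting(Boolean.FALSE, parent).effective()); // explicit false wins
    }
}
```

With a primitive `boolean` that defaults to `false`, the second case cannot be expressed: an explicit `false` looks the same as "not set" and would be overwritten by the parent's `true`.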

Contributor Author

OK, maybe I'm really slow today. My property is defined with an initial value, and that value is "false".
A boolean (primitive type) must always have a value, so at which moment would false switch to "null"? (I'm only trying to understand this point.)

Contributor

Line 66:
My first thought was that your intention was "the property should inherit from parent only if not set explicitly".
Then the case "this should not use advanced kerning, even though the parent does" cannot work.
But now I wonder if the condition is necessary at all?
In lines 62 .. 65 the other properties are copied from parent without a condition. So why do you use a condition in line 66?

}
this.fontEncodings.putAll(config.fontEncodings);
this.searchSequences.putAll(config.searchSequences);
this.fontAliases.putAll(config.fontAliases);
if (!this.fontKerningAdvancedUsage)
Contributor

See above

Contributor

@hvbtup hvbtup left a comment

In PdfPage.java, if I understand it correctly, the LayoutProcessor.enableKernLiga() is called without checking the option in fontsConfig.xml.
This makes the option pointless!?
And since the LayoutProcessor is basically a singleton, I think it makes sense to enable it (or not, depending on the option) ideally exactly once, and not inside a frequently called method like PdfPage.drawText.
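
The "exactly once" idea could be sketched as follows. This is an illustration only: `enableGlobally()` is a stand-in for a global call such as LayoutProcessor.enableKernLiga(), and the class name and counter are hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

final class OneTimeKerningSetup {

    private static final AtomicBoolean DONE = new AtomicBoolean(false);

    /** Counts how often the stand-in for the global enable call ran. */
    static final AtomicInteger ENABLE_CALLS = new AtomicInteger();

    /** Stand-in for a global call such as LayoutProcessor.enableKernLiga(). */
    private static void enableGlobally() {
        ENABLE_CALLS.incrementAndGet();
    }

    /** Cheap to call from hot paths and from several threads; acts only on the first call. */
    static void ensureConfigured(boolean kerningOption) {
        if (DONE.compareAndSet(false, true) && kerningOption) {
            enableGlobally();
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            ensureConfigured(true); // e.g. reached from every text-drawing call
        }
        System.out.println(ENABLE_CALLS.get()); // prints 1
    }
}
```

The configured option is checked together with the one-time guard, so the global call is skipped entirely when the option is off.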

@speckyspooky speckyspooky force-pushed the enhance_pdf_advanced_glyph_handling_2058 branch from 401bfcd to 57fb373 Compare February 27, 2025 19:21
@speckyspooky speckyspooky force-pushed the enhance_pdf_advanced_glyph_handling_2058 branch from 57fb373 to 05bdc9d Compare February 27, 2025 19:50
@speckyspooky speckyspooky requested a review from hvbtup February 28, 2025 09:39
Contributor

@hvbtup hvbtup left a comment

I still don't understand the two if statements in FontMappingManager.java, but if it works, so what...

@speckyspooky speckyspooky merged commit 4bb9439 into eclipse-birt:master Feb 28, 2025
3 checks passed
Labels
Enhancement Small change to improve the current supported functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PDF, advanced glyp handling: Khmer Unicode (Hanuman Font) Not Rendering Correctly
4 participants