Does this support Chinese Unicode? #22

Open
twjjack opened this issue Aug 4, 2017 · 3 comments

Comments

twjjack commented Aug 4, 2017

Hi all,

Thanks! It works perfectly for English text, but it does not work when the RTF contains Chinese characters, which are stored as Unicode escapes.

Here is my code:

    // $reader is assumed to be the library's reader object, created earlier (not shown in the original post)
    $rtf = '{\rtf1\ansi\ansicpg1252\uc0\deff0{\fonttbl {\f0\fswiss\fcharset0\fprq2 Arial;} {\f1\fnil\fcharset0\fprq2 SimSun;} {\f2\froman\fcharset2\fprq2 Symbol;}} {\colortbl;\red0\green0\blue0;\red255\green255\blue255;} {\stylesheet{\s0\itap0\f0\fs24 [Normal];}{\*\cs10\additive Default Paragraph Font;}} {\*\generator TX_RTF32 11.0.401.501;} \deftab1134\paperw11907\paperh16443\margl567\margt567\margr567\margb567\pard\itap0\plain\f1\fs20\loch\f1\hich\f1\u20320\u22909\u21527\par }';

    // Parse the RTF string, then format the parsed tree as HTML
    $result = $reader->Parse($rtf);
    $formatter = new RtfHtml();
    $test = $formatter->Format($reader->root);

and it gives me this result:
◊u22909◊par

I expected to get \u20320\u22909\u21527, which I could then translate back into Chinese characters.

Has anyone here had a similar issue, and if so, what was the solution?

Cheers,
Jack
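
For readers who land here with the same goal, the "translate back to Chinese characters" step can be sketched as below. This is only an illustration, not part of the library: `rtfUnicodeEscapesToUtf8` is a hypothetical helper name, and the sketch assumes PHP 7.2+ with the mbstring extension (for `mb_chr`), `\uc0` as in the RTF above (no fallback characters after each escape), and non-negative code points in the Basic Multilingual Plane.

```php
<?php
// Hypothetical helper (not part of the library): replaces each RTF \uN escape
// with the UTF-8 encoding of code point N.
// Assumptions: PHP 7.2+ with mbstring, \uc0 (no fallback characters),
// and N is a non-negative BMP code point.
function rtfUnicodeEscapesToUtf8(string $fragment): string
{
    return preg_replace_callback(
        // Literal backslash, "u", a decimal number, and an optional delimiter space
        '/\\\\u(\d+) ?/',
        function (array $m) {
            return mb_chr((int) $m[1], 'UTF-8');
        },
        $fragment
    );
}

// The escapes from the RTF in this issue
echo rtfUnicodeEscapesToUtf8('\u20320\u22909\u21527'), "\n"; // prints 你好吗
```

The pattern requires a digit right after `\u`, so control words such as `\uc0` are left untouched.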

sipryan (Contributor) commented Nov 4, 2017

Support for Unicode characters was recently added. Please check whether the problem persists. Thanks.

@humblecoder

I am attempting to parse mixed English/Cantonese documents, and the Cantonese text comes out garbled. I am also getting a great deal of extra output such as:

    ...
    WORD rtf (1)
    WORD adeflang (1025)
    WORD ansi (1)
    WORD ansicpg (1252)
    WORD uc (1)
    WORD adeff (0)
    WORD deff (0)
    ...

sipryan (Contributor) commented Oct 2, 2018

Unfortunately, Far Eastern languages (UTF-16 and UTF-32) are not yet implemented.
You can help us by uploading your RTF file.
Thanks.
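
As background on the UTF-16 side of this: RTF stores each `\uN` value as a signed 16-bit integer, so code units above 32767 appear as negative numbers, and characters outside the Basic Multilingual Plane are typically written as two consecutive escapes forming a surrogate pair. A minimal sketch of one way to decode such a run follows. It is not the library's implementation: `rtfUnicodeRunToUtf8` is a hypothetical name, and the sketch assumes the mbstring extension and `\uc0` (no fallback characters to skip; with `\uc1` the character after each escape would also have to be dropped).

```php
<?php
// Hypothetical helper (not the library's implementation): decodes a run of
// RTF \uN escapes into UTF-8 by treating each N as a UTF-16 code unit.
// Assumptions: mbstring is available and \uc0 is in effect.
function rtfUnicodeRunToUtf8(string $run): string
{
    // Collect every signed decimal value that follows a literal "\u"
    preg_match_all('/\\\\u(-?\d+) ?/', $run, $matches);

    $utf16 = '';
    foreach ($matches[1] as $value) {
        // Mask to 16 bits so negative values map back to 0x8000..0xFFFF,
        // then append the unit as two big-endian bytes
        $utf16 .= pack('n', ((int) $value) & 0xFFFF);
    }

    // Decoding as UTF-16BE joins surrogate pairs into single code points
    return mb_convert_encoding($utf16, 'UTF-8', 'UTF-16BE');
}

echo rtfUnicodeRunToUtf8('\u20320\u22909'), "\n";  // two BMP characters
echo rtfUnicodeRunToUtf8('\u-255'), "\n";          // U+FF01, stored as a negative value
echo rtfUnicodeRunToUtf8('\u-10179\u-8704'), "\n"; // surrogate pair for U+1F600
```

Going through UTF-16 instead of converting each value on its own keeps surrogate pairs together, which per-escape conversion cannot do.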


3 participants