error rendering on Windows laptop with home directory name in Chinese #8530

Open
paciorek opened this issue Feb 1, 2024 · 17 comments
Labels
bug Something isn't working pandoc pandoc-lua Issues with our Lua helper functions, filters, etc. in Pandoc windows

@paciorek

paciorek commented Feb 1, 2024

Bug description

I'm trying to help a student who seems to have the same problem on Windows reported in issue #4103.
(Let me know if I should reopen that issue instead.)

We've run the commands suggested by @cderv at the end of that thread, and here are the results. Are there any suggestions that I can try with the student?

PS C:\Users\王茜舒> Get-ItemPropertyValue -Path "HKLM:\SYSTEM\CurrentControlSet\Control\Nls\CodePage" -Name ACP
936
PS C:\Users\王茜舒> Get-ItemPropertyValue -Path "HKCU:\SYSTEM\CurrentControlSet\Control\Nls\CodePage" -Name ACP
Get-ItemPropertyValue : Cannot find path 'HKCU:\SYSTEM\CurrentControlSet\Control\Nls\CodePage' because it does not exist.
At line:1 char:1
+ Get-ItemPropertyValue -Path "HKCU:\SYSTEM\CurrentControlSet\Control\N ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (HKCU:\SYSTEM\Cu...ol\Nls\CodePage:String) [Get-ItemPropertyValue], Item
   NotFoundException
    + FullyQualifiedErrorId : PathNotFound,Microsoft.PowerShell.Commands.GetItemPropertyValueCommand

PS C:\Users\王茜舒> Get-ChildItem -Path "HKLM:\SYSTEM\CurrentControlSet\Control\Nls\CodePage"


    Hive: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage


Name                           Property
----                           --------
EUDCCodeRange                  932 : F040-F9FC
                               936 : AAA1-AFFE,F8A1-FEFE,A140-A7A0
                               949 : C9A1-C9FE,FEA1-FEFE
                               950 : FA40-FEFE,8E40-A0FE,8140-8DFE,C6A1-C8FE


PS C:\Users\王茜舒> Get-ChildItem -Path "HKCU:\SYSTEM\CurrentControlSet\Control\Nls\CodePage"
Get-ChildItem : Cannot find path 'HKEY_CURRENT_USER\SYSTEM\CurrentControlSet\Control\Nls\CodePage' because it does not exist.
At line:1 char:1
+ Get-ChildItem -Path "HKCU:\SYSTEM\CurrentControlSet\Control\Nls\CodeP ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (HKEY_CURRENT_US...ol\Nls\CodePage:String) [Get-ChildItem], ItemNotFound
   Exception
    + FullyQualifiedErrorId : PathNotFound,Microsoft.PowerShell.Commands.GetChildItemCommand

PS C:\Users\王茜舒> Get-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Control\Nls\CodePage"


10000             : c_10000.nls
10001             : c_10001.nls
10002             : c_10002.nls
10003             : c_10003.nls
10004             : c_10004.nls
10005             : c_10005.nls
10006             : c_10006.nls
10007             : c_10007.nls
10008             : c_10008.nls
10010             : c_10010.nls
10017             : c_10017.nls
10021             : c_10021.nls
10029             : c_10029.nls
10079             : c_10079.nls
10081             : c_10081.nls
10082             : c_10082.nls
1026              : c_1026.nls
1047              : c_1047.nls
1140              : c_1140.nls
1141              : c_1141.nls
1142              : c_1142.nls
1143              : c_1143.nls
1144              : c_1144.nls
1145              : c_1145.nls
1146              : c_1146.nls
1147              : c_1147.nls
1148              : c_1148.nls
1149              : c_1149.nls
1250              : c_1250.nls
1251              : c_1251.nls
1252              : c_1252.nls
1253              : c_1253.nls
1254              : c_1254.nls
1255              : c_1255.nls
1256              : c_1256.nls
1257              : c_1257.nls
1258              : c_1258.nls
1361              : c_1361.nls
20000             : c_20000.nls
20001             : c_20001.nls
20002             : c_20002.nls
20003             : c_20003.nls
20004             : c_20004.nls
20005             : c_20005.nls
20105             : c_20105.nls
20106             : c_20106.nls
20107             : c_20107.nls
20108             : c_20108.nls
20127             : c_20127.nls
20261             : c_20261.nls
20269             : c_20269.nls
20273             : c_20273.nls
20277             : c_20277.nls
20278             : c_20278.nls
20280             : c_20280.nls
20284             : c_20284.nls
20285             : c_20285.nls
20290             : c_20290.nls
20297             : c_20297.nls
20420             : c_20420.nls
20423             : c_20423.nls
20424             : c_20424.nls
20833             : c_20833.nls
20838             : c_20838.nls
20866             : c_20866.nls
20871             : c_20871.nls
20880             : c_20880.nls
20905             : c_20905.nls
20924             : c_20924.nls
20932             : c_20932.nls
20936             : c_20936.nls
20949             : c_20949.nls
21025             : c_21025.nls
21027             : c_21027.nls
21866             : c_21866.nls
28591             : C_28591.NLS
28592             : C_28592.NLS
28593             : c_28593.nls
28594             : C_28594.NLS
28595             : C_28595.NLS
28596             : C_28596.NLS
28597             : C_28597.NLS
28598             : c_28598.nls
28599             : c_28599.nls
28603             : c_28603.nls
28605             : c_28605.nls
37                : c_037.nls
38598             : c_28598.nls
437               : c_437.nls
500               : c_500.nls
50220             : c_is2022.dll
50221             : c_is2022.dll
50222             : c_is2022.dll
50225             : c_is2022.dll
50227             : c_is2022.dll
50229             : c_is2022.dll
51949             : c_20949.nls
52936             : c_is2022.dll
54936             : c_g18030.dll
55000             : c_gsm7.dll
55001             : c_gsm7.dll
55002             : c_gsm7.dll
55003             : c_gsm7.dll
55004             : c_gsm7.dll
57002             : c_iscii.dll
57003             : c_iscii.dll
57004             : c_iscii.dll
57005             : c_iscii.dll
57006             : c_iscii.dll
57007             : c_iscii.dll
57008             : c_iscii.dll
57009             : c_iscii.dll
57010             : c_iscii.dll
57011             : c_iscii.dll
708               : c_708.nls
720               : c_720.nls
737               : c_737.nls
775               : c_775.nls
850               : c_850.nls
852               : c_852.nls
855               : c_855.nls
857               : c_857.nls
858               : c_858.nls
860               : c_860.nls
861               : c_861.nls
862               : c_862.nls
863               : c_863.nls
864               : c_864.nls
865               : c_865.nls
866               : c_866.nls
869               : c_869.nls
870               : c_870.nls
874               : c_874.nls
875               : c_875.nls
932               : c_932.nls
936               : c_936.nls
949               : c_949.nls
950               : c_950.nls
AllowDeprecatedCP : 1111573537
OEMHAL            : vgaoem.fon
ACP               : 936
OEMCP             : 936
MACCP             : 10008
PSPath            : Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePag
                    e
PSParentPath      : Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls
PSChildName       : CodePage
PSDrive           : HKLM
PSProvider        : Microsoft.PowerShell.Core\Registry



PS C:\Users\王茜舒> Get-ItemProperty -Path "HKCU:\SYSTEM\CurrentControlSet\Control\Nls\CodePage"
Get-ItemProperty : Cannot find path 'HKCU:\SYSTEM\CurrentControlSet\Control\Nls\CodePage' because it does not exist.
At line:1 char:1
+ Get-ItemProperty -Path "HKCU:\SYSTEM\CurrentControlSet\Control\Nls\Co ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (HKCU:\SYSTEM\Cu...ol\Nls\CodePage:String) [Get-ItemProperty], ItemNotFo
   undException
    + FullyQualifiedErrorId : PathNotFound,Microsoft.PowerShell.Commands.GetItemPropertyCommand

Steps to reproduce

Rendering any basic Quarto document causes the problem.

Expected behavior

One should see the rendered doc.

Actual behavior

processing file: 333.qmd
                                                                                                            
output file: 333.knit.md

pandoc 
  to: html
  output-file: 333.html
  standalone: true
  section-divs: true
  html-math-method: mathjax
  wrap: none
  default-image-extension: png
  
metadata
  document-css: false
  link-citations: true
  date-format: long
  lang: en
  title: testfile
  editor: visual
  
Error running filter D:/下载/RStudio/resources/app/bin/quarto/share/filters/main.lua:
[string "..."]:268: cannot open file 'C:\Users\??????\AppData\Local\Temp\quarto-session65e9675c\8949e2c2\39889c71' (Invalid argument)
stack traceback:
	[string "..."]:268: in function 'io.lines'
	[string "..."]:1501: in field 'processDependencies'
	.../RStudio/resources/app/bin/quarto/share/filters/main.lua:5386: in function <.../RStudio/resources/app/bin/quarto/share/filters/main.lua:5383>
	[C]: in ?
	[C]: in method 'walk'
	.../RStudio/resources/app/bin/quarto/share/filters/main.lua:171: in function 'run_emulated_filter'
	.../RStudio/resources/app/bin/quarto/share/filters/main.lua:449: in local 'callback'
	.../RStudio/resources/app/bin/quarto/share/filters/main.lua:454: in upvalue 'run_emulated_filter_chain'
	.../RStudio/resources/app/bin/quarto/share/filters/main.lua:495: in function <.../RStudio/resources/app/bin/quarto/share/filters/main.lua:476>
stack traceback:
	.../RStudio/resources/app/bin/quarto/share/filters/main.lua:171: in function 'run_emulated_filter'
	.../RStudio/resources/app/bin/quarto/share/filters/main.lua:449: in local 'callback'
	.../RStudio/resources/app/bin/quarto/share/filters/main.lua:454: in upvalue 'run_emulated_filter_chain'
	.../RStudio/resources/app/bin/quarto/share/filters/main.lua:495: in function <.../RStudio/resources/app/bin/quarto/share/filters/main.lua:476>

Your environment

  • OS: Windows 11
  • RStudio: 2023.12.1-403

Quarto check output

D:\Desktop\test>quarto check
A new release of Deno is available: 1.28.2 → 1.40.2 Run `deno upgrade` to install it.

[>] Checking versions of quarto binary dependencies...
      Pandoc version 3.1.1: OK
      Dart Sass version 1.55.0: OK
[>] Checking versions of quarto dependencies......OK
[>] Checking Quarto installation......OK
      Version: 1.3.450
      Path: D:\下载\RStudio\resources\app\bin\quarto\bin
      CodePage: unknown

(|) Checking basic markdown render....Error running filter D:/下载/RStudio/resources/app/bin/quarto/share/filters/main.lua:
[string "..."]:268: cannot open file 'C:\Users\??????\AppData\Local\Temp\quarto-session3fcbaa4b\381da427\5f379e51' (Invalid argument)
stack traceback:
        [string "..."]:268: in function 'io.lines'
        [string "..."]:1501: in field 'processDependencies'
        .../RStudio/resources/app/bin/quarto/share/filters/main.lua:5386: in function <.../RStudio/resources/app/bin/quarto/share/filters/main.lua:5383>
        [C]: in ?
        [C]: in method 'walk'
        .../RStudio/resources/app/bin/quarto/share/filters/main.lua:171: in function 'run_emulated_filter'
        .../RStudio/resources/app/bin/quarto/share/filters/main.lua:449: in local 'callback'
        .../RStudio/resources/app/bin/quarto/share/filters/main.lua:454: in upvalue 'run_emulated_filter_chain'
        .../RStudio/resources/app/bin/quarto/share/filters/main.lua:495: in function <.../RStudio/resources/app/bin/quarto/share/filters/main.lua:476>
stack traceback:
        .../RStudio/resources/app/bin/quarto/share/filters/main.lua:171: in function 'run_emulated_filter'
        .../RStudio/resources/app/bin/quarto/share/filters/main.lua:449: in local 'callback'
        .../RStudio/resources/app/bin/quarto/share/filters/main.lua:454: in upvalue 'run_emulated_filter_chain'
        .../RStudio/resources/app/bin/quarto/share/filters/main.lua:495: in function <.../RStudio/resources/app/bin/quarto/share/filters/main.lua:476>
[>] Checking basic markdown render....OK
@paciorek paciorek added the bug Something isn't working label Feb 1, 2024
@cscheid cscheid added the windows label Feb 1, 2024
@cscheid cscheid added the triaged-to Issues that were not self-assigned, signals that an issue was assigned to someone. label Feb 1, 2024
@cderv
Collaborator

cderv commented Feb 1, 2024

Thanks for the report @paciorek !

It seems the code page in use on this computer is 936. For some reason, though, quarto check returns unknown, and I don't know why the value isn't being read properly.

Can you update to the latest stable release of Quarto 1.4 and run quarto check again?

@dragonstyle do you still have a local Windows 11 setup with code page 936 available? Otherwise, I'll try to set one up.

@rpbartczuk

rpbartczuk commented Feb 9, 2024

Hi, I have a very similar issue. I updated Quarto and it still doesn't read my code page.
C:\Program Files\Quarto\bin>chcp
Active code page: 852

C:\Program Files\Quarto\bin>quarto check
Quarto 1.4.549
[>] Checking versions of quarto binary dependencies...
      Pandoc version 3.1.11: OK
      Dart Sass version 1.69.5: OK
      Deno version 1.37.2: OK
[>] Checking versions of quarto dependencies......OK
[>] Checking Quarto installation......OK
      Version: 1.4.549
      Path: C:\Program Files\Quarto\bin
      CodePage: unknown

[>] Checking tools....................OK
      TinyTeX: (not installed)
      Chromium: (not installed)

[>] Checking LaTeX....................OK
      Tex:  (not detected)


(|) Checking basic markdown render....Error running filter C:/Program Files/Quarto/share/filters/main.lua:
[string "..."]:267: cannot open file 'C:\Users\Rafa?\AppData\Local\Temp\quarto-sessiona3371d6a\60be6b22\128fbc86' (Invalid argument)
stack traceback:
        [string "..."]:267: in function 'io.lines'
        [string "..."]:1593: in field 'processDependencies'
        C:/Program Files/Quarto/share/filters/main.lua:7347: in field 'Meta'
        C:/Program Files/Quarto/share/filters/main.lua:240: in function 'run_emulated_filter'
        C:/Program Files/Quarto/share/filters/main.lua:936: in local 'callback'
        C:/Program Files/Quarto/share/filters/main.lua:954: in upvalue 'run_emulated_filter_chain'
        C:/Program Files/Quarto/share/filters/main.lua:990: in function <C:/Program Files/Quarto/share/filters/main.lua:987>
[>] Checking basic markdown render....OK

@cderv
Collaborator

cderv commented Feb 9, 2024

@rpbartczuk what is your username here? The error shows C:\Users\Rafa?. I believe the ? stands in for another character that is not being read correctly?

@eitsupi
Contributor

eitsupi commented Feb 10, 2024

I saw a similar error on a Japanese version of Windows (user name contains multibyte characters).

I think they (Pandoc?) are trying to interpret a non-UTF-8 string as UTF-8 and not interpreting the path correctly.

@dragonstyle
Collaborator

I am able to reproduce an error when I place the path to the quarto-cli in a directory with Unicode characters (using code page 936). I haven't yet pinned down the issue, though it appears to be a file that we are passing to pandoc that is perhaps encoded incorrectly. A basic pandoc render in the path works fine.

It's likely that the code page isn't being displayed because, when there is a render exception, we clear the code page from the cache (where it is read). I'm guessing this causes the cached code page to disappear, and the check command may be using that cached value rather than computing it.

@cderv cderv removed the triaged-to Issues that were not self-assigned, signals that an issue was assigned to someone. label Feb 20, 2024
@cderv cderv added this to the v1.5 milestone Feb 20, 2024
@dragonstyle
Collaborator

This will reproduce in pure pandoc when the pandoc executable is placed in a path containing Unicode characters on a system with a non-English code page:

C:\Users\ct\你好>chcp
Active code page: 936
C:\Users\ct\你好>dir
 Volume in drive C has no label.
 Volume Serial Number is D466-B618

 Directory of C:\Users\ct\你好

02/20/2024  12:52 PM    <DIR>          .
02/20/2024  12:52 PM    <DIR>          ..
12/16/2023  03:20 AM       214,419,968 pandoc.exe
02/20/2024  12:47 PM                64 test.lua
02/20/2024  10:46 AM                 0 test.md

test.lua

function Pandoc(doc)
  package.path = ""
  require("foo")
end

test.md

Command

C:\Users\ct\你好>pandoc.exe test.md -L test.lua

@cscheid
Collaborator

cscheid commented Feb 21, 2024

(cc @tarleb)

The issue here is that Lua's package.path built-in includes non-UTF-8 characters.

Those confuse pandoc.path functions, which assume UTF-8 and ultimately corrupt path strings.
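
As a rough illustration of that kind of corruption, here is a minimal Python sketch. It is not the Lua or Pandoc code path; it only assumes that a path arrives as bytes in code page 936 and is then decoded as if it were UTF-8. The sample path reuses the reproduction directory from this thread, and the helper names are hypothetical.

```python
# Illustration only: Python stands in for the Lua/Pandoc code path.
# Sample path taken from the reproduction directory earlier in the thread.
ORIGINAL = "C:\\Users\\ct\\你好"

# The bytes a code page 936 (GBK) environment would produce for this path.
raw = ORIGINAL.encode("cp936")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes form a well-formed UTF-8 sequence."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def decode_as_utf8(data: bytes) -> str:
    """Decode as UTF-8, replacing invalid sequences with U+FFFD."""
    return data.decode("utf-8", errors="replace")


corrupted = decode_as_utf8(raw)

if __name__ == "__main__":
    print("valid UTF-8:", is_valid_utf8(raw))
    print("decoded as UTF-8:", corrupted)
    print("decoded as cp936:", raw.decode("cp936"))
```

Decoding with the matching code page round-trips the path, while the UTF-8 assumption loses the Chinese characters, which is consistent with the ? placeholders in the error messages above.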

@cscheid
Collaborator

cscheid commented Feb 21, 2024

@cderv Charles and I are thinking that we should run the Windows test suite on a non-standard code page, even if we only do it once a week or so. The root cause here is that we're not actually seeing the behavior regress on these code pages, and we'd like to prevent it in the future.

@cderv
Collaborator

cderv commented Feb 21, 2024

> @cderv Charles and I are thinking that we should run the Windows test suite on a non-standard code page, even if we only do it once a week or so. The root cause here is that we're not actually seeing the behavior regress on these code pages, and we'd like to prevent it in the future.

That makes sense. I can add a nightly run for that special case. (We could also run it for each pre-release tag created.) Is this just a different code page, or also some path tweaking with special characters?

We can sync directly and I'll add it to the CI updates planned for 1.5.

@cscheid
Collaborator

cscheid commented Feb 21, 2024

> Is this just different codepage or also some path tweaking with special character ?

We currently don't think we will be able to fully support Quarto installed on a path with non-ASCII characters (it's a combination of Lua, Pandoc, and Windows bugs that we simply can't work around in general right now). But we believe that there might be more bugs lurking if we were to even run the test suite on non-standard code pages, and we would like to support that use case well.

So we should start fixing the simpler cases first.

@dragonstyle
Collaborator

dragonstyle commented Feb 22, 2024

Work in progress here:

@dragonstyle
Collaborator

The work in progress addresses the core issues under the following configuration:

  • Windows OS Code Page 936
  • User home directory includes Unicode characters
  • Place Quarto within the user home directory
  • Run tests
  • Known issue with Python path handling: we are failing to initialize the logger in log.py#19, likely due to incorrect path encoding
  • Known issue that can occur when attempting to read the temp file path (this has been transient, so I haven't pinned down when it happens yet)

@cderv cderv added the early-in-release An issue that should be worked on early in the release (likely due to risk) label Nov 6, 2024
@cscheid cscheid modified the milestones: v1.7, Future Dec 12, 2024
@cscheid cscheid added pandoc pandoc-lua Issues with our Lua helper functions, filters, etc. in Pandoc and removed early-in-release An issue that should be worked on early in the release (likely due to risk) labels Dec 12, 2024
@hongyuanjia

I encountered a similar issue and found a possible workaround through some trial and error. On Windows, simply setting the environment variables %TEMP% and %TMP% to a path that contains only ASCII characters seems to resolve the problem. Specifically, you can create a new folder (e.g., C:\Temp) and point these two environment variables to that new path.
Related issue #8530 #4103
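
To check whether a machine is in the situation this workaround targets, a small Python sketch along these lines could flag TEMP and TMP values that contain non-ASCII characters. The function name and the sample values are hypothetical; this is not part of Quarto.

```python
import os
from typing import List, Mapping


def non_ascii_temp_vars(env: Mapping[str, str]) -> List[str]:
    """Return the names among TEMP/TMP whose value has non-ASCII characters."""
    return [
        name
        for name in ("TEMP", "TMP")
        if name in env and not env[name].isascii()
    ]


if __name__ == "__main__":
    # Hypothetical values: TEMP under a non-ASCII directory, TMP on C:\Temp.
    sample = {"TEMP": "C:\\Users\\你好\\AppData\\Local\\Temp", "TMP": "C:\\Temp"}
    print(non_ascii_temp_vars(sample))
    # Check the variables of the current process as well.
    print(non_ascii_temp_vars(os.environ))
```

An empty list means both variables are either unset or already point to ASCII-only paths.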

@cderv
Collaborator

cderv commented Jan 6, 2025

Thanks a lot for sharing this @hongyuanjia! IIUC, you are saying that changing only the path to the temporary folder on Windows solved the issue for you?

I am asking for confirmation because this would mean the problem is not entirely tied to the username, or at least that ensuring a temporary path with only ASCII characters is an easier fix. 🤔

@hongyuanjia

Yes, that's right. Changing only the temporary paths fixes the problem.

@cderv
Collaborator

cderv commented Jan 6, 2025

TEMP on Windows is set to %LOCALAPPDATA%/Temp, which sits inside the user home directory:

❯ $env:TEMP
C:\Users\chris\AppData\Local\Temp
❯ $env:LOCALAPPDATA
C:\Users\chris\AppData\Local

So this is part of the known problem with the temp path. Can you share your initial TEMP path? And where is Quarto installed?

This will give us more details to test with as we keep working on a fix. Thank you!
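
The relationship between TEMP and the home directory described above can be sketched in Python as follows. The helper name is hypothetical, and the paths are the ones shown in this comment plus the C:\Temp example from the workaround.

```python
from pathlib import PureWindowsPath


def temp_is_under_home(temp: str, home: str) -> bool:
    """Return True if the temp directory lives inside the home directory."""
    try:
        # relative_to raises ValueError when temp is not below home.
        PureWindowsPath(temp).relative_to(PureWindowsPath(home))
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    home = r"C:\Users\chris"
    print(temp_is_under_home(r"C:\Users\chris\AppData\Local\Temp", home))
    print(temp_is_under_home(r"C:\Temp", home))
```

With the default layout, any non-ASCII character in the user name therefore ends up in the temp path too, which is why moving TEMP elsewhere sidesteps it.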

@hongyuanjia

It was from one of my students' laptops. The original TEMP path is the Windows default one, i.e. C:\Users\曾晓蕊\AppData\Local\Temp. Quarto was installed in the default location.

Care may be needed when deciding where to put the new TEMP path. I set the new TEMP path to C:\temp and also changed the folder permissions to allow writing for all users. I recall that the Quarto installer requires administrator privileges, so maybe this will not be an issue.
