Better default minisearch tokenizer for Chinese documents #4049
Comments
Have you tried doing vitepress build then run vitepress preview? |
I did try building it. You can preview my production link: https://web.leyen.me |
lucaong/minisearch#201 (comment) -- This comment kind of works, but still needs improvement I guess. There are also some other people doing this - https://github.com/search?q=vitepress+segmenter+language:JavaScript+OR+language:TypeScript+NOT+is:fork&type=code Not sure but bm25 parameters might help too - https://github.com/search?q=vitepress+searchOptions+bm25+language:JavaScript+OR+language:TypeScript+NOT+is:fork&type=code (I haven't checked how they work yet.) |
I'm keeping this open. There should be some defaults here instead of needing Chinese users to manually configure it. |
I tried it out and found that the problem also recurs when a page has no title, not just with Chinese content. (demo) |
With the current logic, titles are needed: there should be an h1 on every page. There is an open PR to make it more robust in handling such content; we'll see. |
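For readers landing here, the segmenter approach linked in the comments above can be sketched roughly as follows. This is a minimal sketch, not the default VitePress behavior and not necessarily what the linked comment does: `tokenize` and `searchConfig` are hypothetical names, the `'zh'` locale is an assumption, and the config shape assumes VitePress's local search forwards `miniSearch.options` to MiniSearch.

```javascript
// Assumption: the runtime ships Intl.Segmenter (recent Node.js and browsers).
const segmenter = new Intl.Segmenter('zh', { granularity: 'word' });

// Hypothetical tokenizer: split text into word-like segments,
// dropping whitespace and punctuation segments.
function tokenize(text) {
  const tokens = [];
  for (const { segment, isWordLike } of segmenter.segment(text)) {
    if (isWordLike) tokens.push(segment);
  }
  return tokens;
}

// Assumed shape of the VitePress `themeConfig.search` entry
// (it would go inside the object exported from .vitepress/config).
const searchConfig = {
  provider: 'local',
  options: {
    miniSearch: {
      // Passed to MiniSearch, which calls `tokenize` on each indexed field.
      options: { tokenize },
    },
  },
};
```

How Chinese text is split depends on the segmentation data bundled with the runtime, so results can differ between environments, and the index has to be rebuilt after changing the tokenizer.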
Describe the bug
Search cannot find the content.
Reproduction
Chinese content
Expected behavior
System Info
Additional context
No response
Validations