This repository was archived by the owner on May 3, 2025. It is now read-only.
thinkberg/wiki2text
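This simple project converts Wikipedia dumps to plain text.

Usage:

java -Xmx2G -Dfile.encoding=UTF-8 -jar wiki2text-1.0-jar-with-dependencies.jar nlwiki-20120203-pages-articles.xml.bz2 > nl.txt

The input is a bzip2-compressed Wikipedia pages-articles XML dump (here the Dutch dump of 2012-02-03), and the plain text is written to standard output, which the example redirects to nl.txt. -Xmx2G raises the maximum JVM heap to 2 GB, and -Dfile.encoding=UTF-8 sets the JVM default encoding to UTF-8.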
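To give a rough idea of what such a conversion involves, here is a minimal sketch in Java. It is not the code of this project: the class name `DumpToTextSketch` and the methods `extractArticles` and `stripMarkup` are hypothetical, and the markup stripping handles only a few simple cases (links, bold/italic quotes, non-nested templates). It assumes the dump follows the MediaWiki export layout, where each article body sits in a `<text>` element, and that the input has already been decompressed, for example by piping `bzcat` into the program.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration only; not the wiki2text implementation.
class DumpToTextSketch {

    /** Streams a decompressed MediaWiki XML export and returns the stripped body of each <text> element. */
    static List<String> extractArticles(InputStream xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml, "UTF-8");
        List<String> articles = new ArrayList<>();
        while (reader.hasNext()) {
            // Pull events one at a time so the whole dump is never held in memory.
            if (reader.next() == XMLStreamConstants.START_ELEMENT
                    && "text".equals(reader.getLocalName())) {
                articles.add(stripMarkup(reader.getElementText()));
            }
        }
        reader.close();
        return articles;
    }

    /** Very rough wiki markup removal; real wikitext needs a proper parser. */
    static String stripMarkup(String wikiText) {
        return wikiText
                // [[target|label]] -> label, [[target]] -> target
                .replaceAll("\\[\\[(?:[^\\]|]*\\|)?([^\\]]*)\\]\\]", "$1")
                // drop non-nested templates such as {{stub}}
                .replaceAll("\\{\\{[^{}]*\\}\\}", "")
                // drop bold/italic quote runs ('' and ''')
                .replaceAll("'{2,}", "")
                .trim();
    }

    public static void main(String[] args) throws XMLStreamException {
        // Reads XML from standard input and prints one article per line.
        for (String article : extractArticles(System.in)) {
            System.out.println(article.replace('\n', ' '));
        }
    }
}
```

Assuming the file is saved as DumpToTextSketch.java and a JDK 11 or later is available, it could be tried with something like `bzcat nlwiki-20120203-pages-articles.xml.bz2 | java -Dfile.encoding=UTF-8 DumpToTextSketch.java > nl.txt`. The sketch collects all articles in a list for simplicity; for a full dump it would be better to print each article as soon as it is read.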
Convert Wikipedia dumps to plain text for data analysis or similar