
Utility to create to-do lists out of Wikipedia dumps for AWB.

typoscan is a utility to create to-do lists from Wikipedia
dumps for AutoWikiBrowser.

It retrieves the list of regular expressions to scan for
from http://en.wikipedia.org/wiki/WP:AWB/T, reads a dump
file on STDIN, and writes to STDOUT the titles of all pages
that match any of the regular expressions.  Diagnostic
output is directed to STDERR.

AWB ignores some parts of an article, such as the contents
of <nowiki> tags, so the output may contain false positives:
pages that are listed but that AWB will not change.
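
The scanning step described above can be sketched roughly as
follows.  This is a minimal illustration in Python, not
typoscan's actual code: the function name scan_dump, the
sample dump, and the two demo patterns are all assumptions
made up for the example, and the real tool takes its
patterns from WP:AWB/T instead.

```python
import io
import re
import xml.etree.ElementTree as ET

# Hypothetical patterns for illustration only; the real list
# comes from WP:AWB/T.
DEMO_PATTERNS = [re.compile(r"\bteh\b"), re.compile(r"\brecieve")]

# Tiny made-up dump in the general shape of a MediaWiki export.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.8/">
  <page><title>Alpha</title><revision><id>1</id>
    <text>This is teh first page.</text></revision></page>
  <page><title>Beta</title><revision><id>2</id>
    <text>Nothing wrong here.</text></revision></page>
</mediawiki>"""


def local_name(tag):
    """Strip an XML namespace prefix: '{ns}title' -> 'title'."""
    return tag.rsplit("}", 1)[-1]


def scan_dump(stream, patterns):
    """Yield the title of each page whose text matches any pattern."""
    title, text = None, ""
    # iterparse streams the file, so the dump is never fully in memory.
    for _, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            if title is not None and any(p.search(text) for p in patterns):
                yield title
            title, text = None, ""
            elem.clear()  # drop the page's children once it is processed


for page_title in scan_dump(io.BytesIO(SAMPLE), DEMO_PATTERNS):
    print(page_title)  # prints "Alpha"
```

Passing sys.stdin.buffer instead of the in-memory sample
would read a dump from STDIN as described above.  Note that
the sketch matches against the whole page text, which is
exactly why pages whose only matches sit in parts AWB
ignores would still be listed.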


Licence
=======

typoscan is licensed under the GNU General Public License
version 3 or later.  You can find a copy of the licence in
COPYING.

The test data files (tests/*.xml, tests/typos-patterns.wiki)
are licensed under the Creative Commons
Attribution-ShareAlike 3.0 Unported License.  You can find a
copy of the licence at
http://creativecommons.org/licenses/by-sa/3.0/.

For the article files (tests/*.xml), you can find the list
of contributors at
http://en.wikipedia.org/w/index.php?oldid=$REVISIONID where
$REVISIONID is the value of /page/revision/id.  For
tests/typos-patterns.wiki, you can find it at
http://en.wikipedia.org/w/index.php?oldid=559776106.
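
As a rough sketch of the lookup described above, the
following Python snippet builds the contributor URL from
each /page/revision/id value in an article file.  The
function name contributor_urls and the sample XML (including
its revision ID) are made up for illustration and are not
part of typoscan.

```python
import io
import xml.etree.ElementTree as ET

URL_TEMPLATE = "http://en.wikipedia.org/w/index.php?oldid={}"

# Made-up article file; 12345 is a placeholder revision ID.
SAMPLE = (b"<page><title>T</title><id>7</id><revision><id>12345</id>"
          b"<contributor><id>9</id></contributor><text>x</text>"
          b"</revision></page>")


def contributor_urls(stream):
    """Return one oldid URL per /page/revision/id in the stream."""
    urls = []
    path = []  # local names of the currently open elements
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        name = elem.tag.rsplit("}", 1)[-1]  # ignore any XML namespace
        if event == "start":
            path.append(name)
            continue
        # Only the revision's own <id> counts, not the page or
        # contributor IDs.
        if path[-3:] == ["page", "revision", "id"] and elem.text:
            urls.append(URL_TEMPLATE.format(elem.text.strip()))
        path.pop()
    return urls


for url in contributor_urls(io.BytesIO(SAMPLE)):
    print(url)
```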
