Module pywikipedia.windows_chars
Script to replace bad Windows-1252 (cp1252) characters with
HTML entities on ISO 8859-1 wikis. Don't run this script on a UTF-8 wiki.
Syntax: python windows_chars.py [pageTitle] [-file[:filename]] [-sql[:filename]]
Command line options:
-file:XYZ reads a list of pages, which can for example be obtained from
Looxix's robot. XYZ is the name of the file from which the
list is taken. If XYZ is not given, the user is asked for a
filename.
Page titles should be in [[double-square brackets]].
-sql:XYZ reads a local SQL cur dump, available at
http://download.wikimedia.org/. Searches for pages with
Windows-1252 characters, and tries to repair them on the live
wiki. Example:
python windows_chars.py -sql:20040711_cur_table.sql.sql -lang:es
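The core idea described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the script's actual implementation: the function names cp1252_to_entities and extract_titles are hypothetical, and it assumes the page text was decoded as ISO 8859-1, so that stray Windows-1252 bytes show up as code points U+0080 to U+009F. It emits numeric entities, whereas the real script takes its mapping from the imported replacements variable, whose contents are not shown here.

```python
import re

# Hypothetical helper: matches page titles written as [[Title]] in a list file.
TITLE_PATTERN = re.compile(r"\[\[(.+?)\]\]")


def extract_titles(text):
    """Return the page titles found inside [[double-square brackets]]."""
    return TITLE_PATTERN.findall(text)


def cp1252_to_entities(text):
    """Replace characters in U+0080..U+009F with numeric HTML entities.

    Assumes `text` was decoded as ISO 8859-1, so such characters are
    really Windows-1252 bytes (curly quotes, euro sign, dashes, ...).
    """
    out = []
    for ch in text:
        code = ord(ch)
        if 0x80 <= code <= 0x9F:
            try:
                real = bytes([code]).decode("cp1252")
            except UnicodeDecodeError:
                # A few bytes (e.g. 0x81, 0x8D) are undefined in cp1252;
                # leave them untouched rather than guess.
                out.append(ch)
                continue
            out.append("&#%d;" % ord(real))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    sample = b"\x93quoted\x94 costs \x80 5".decode("latin-1")
    print(cp1252_to_entities(sample))
    print(extract_titles("[[Foo]]\n[[Bar baz]]"))
```

Characters outside the U+0080 to U+009F range, including legitimate ISO 8859-1 letters such as é, pass through unchanged, which is why the approach only makes sense on an ISO 8859-1 wiki and not on a UTF-8 one.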
- Imported modules: pywikipedia.config, pywikipedia.pagegenerators, re, pywikipedia.replace, sys, pywikipedia.wikipedia
- Imported variables: __version__, msg, replacements