Module pywikipedia.extract_wikilinks

Script to extract all wiki page names that a given HTML file points to, in
interwiki-link format.

The output can be used as input to interwiki.py.

This script takes a single file name argument; the file should be an HTML file
as captured from one of the Wikipedia servers.

Arguments:
-bare       Extract as internal links: [[Title]] instead of [[Family:xx:Title]]
-sorted     Print the pages sorted alphabetically (default: the order in which
            they occur in the HTML file)
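
The effect of the two options can be illustrated with a minimal sketch. It is
not the script's actual code: the function name format_links, its parameters,
and the family/language values "wikipedia" and "en" are assumptions made for
illustration, and only the two output shapes ([[Title]] and
[[Family:xx:Title]]) come from the option descriptions above.

```python
def format_links(titles, family="wikipedia", lang="en", bare=False, sort=False):
    """Format page titles as wiki links (hypothetical helper).

    bare -- emit [[Title]] instead of [[Family:xx:Title]]
    sort -- emit titles alphabetically instead of in order of occurrence
    """
    if sort:
        # sorted() returns a new list, so the caller's list is left untouched.
        titles = sorted(titles)
    if bare:
        return ["[[%s]]" % title for title in titles]
    return ["[[%s:%s:%s]]" % (family, lang, title) for title in titles]


# Made-up titles, in the order they might occur in an HTML file.
pages = ["Utrecht", "Amsterdam"]

default_links = format_links(pages)
bare_sorted_links = format_links(pages, bare=True, sort=True)

print(default_links)      # ['[[wikipedia:en:Utrecht]]', '[[wikipedia:en:Amsterdam]]']
print(bare_sorted_links)  # ['[[Amsterdam]]', '[[Utrecht]]']
```

The sketch names its flag sort rather than sorted so that the built-in
sorted() stays usable inside the function.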

Variable Summary
str __version__ = '$Id: extract_wikilinks.py,v 1.7 2005/06/0...
bool complete = True
list fn = []
list list = []
SRE_Pattern R = /wiki/(.*?)" *
bool sorted = False

Imported modules:
codecs, re, sys, pywikipedia.wikipedia

Variable Details

__version__

Type:
str
Value:
'$Id: extract_wikilinks.py,v 1.7 2005/06/07 17:49:42 wikipedian Exp $'

complete

Type:
bool
Value:
True                                                                   

fn

Type:
list
Value:
[]                                                                     

list

Type:
list
Value:
[]                                                                     

R

Type:
SRE_Pattern
Value:
/wiki/(.*?)" *                                                         

sorted

Type:
bool
Value:
False                                                                  
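
As an illustration of the pattern listed for R above, the following sketch
applies it to a small HTML fragment. The fragment is invented for the example
and the variable names html and titles are assumptions; only the pattern
itself is taken from this page. The non-greedy group captures everything
between /wiki/ and the next double quote.

```python
import re

# Pattern as shown in the Variable Details entry for R.
R = re.compile(r'/wiki/(.*?)" *')

# Hypothetical HTML fragment, made up for illustration.
html = (
    '<a href="/wiki/Amsterdam" title="Amsterdam">Amsterdam</a> '
    '<a href="/wiki/New_York_City" title="New York City">NYC</a>'
)

# With a single capturing group, findall returns the captured text only,
# in the order the links occur in the input.
titles = R.findall(html)
print(titles)  # ['Amsterdam', 'New_York_City']
```

Titles come back in URL form, with underscores in place of spaces.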

Generated by Epydoc 2.1 on Sun Jul 03 17:07:34 2005 http://epydoc.sf.net