Package pywikipedia :: Module wikipedia
=======================================
Library to get and put pages on a MediaWiki. Contents of the library (objects and functions to be used outside; situation as of late August 2004):

Classes:

Page: A MediaWiki page.
  __init__: Page(xx, Title) - the page with title Title on the language xx wiki
  linkname: The name of the page, in a form suitable for an interwiki link
  urlname: The name of the page, in a form suitable for a URL
  catname: The name of the page, with the namespace part removed
  section: The section of the page (the part of the name after '#')
  sectionFreeLinkname: The name without the section part
  aslink: The name of the page in the form [[Title]] or [[lang:Title]]
  site: The wiki this page is on
  encoding: The encoding the page is in
  get (*): The text of the page
  exists (*): True if the page actually exists, False otherwise
  isRedirectPage (*): True if the page is a redirect, False otherwise
  isEmpty (*): True if the page has 4 characters or less of content, not counting interwiki and category links
  interwiki (*): The interwiki links from the page (list of Pages)
  categories (*): The categories the page is in (list of Pages)
  rawcategories (*): Like categories, but if a link contains a |, the part after the | is included
  linkedPages (*): The normal pages linked from the page (list of Pages)
  imagelinks (*): The pictures on the page (list of strings)
  templates (*): All templates referenced on the page (list of strings)
  getRedirectTarget (*): The page this page redirects to
  isCategory: True if the page is a category, False otherwise
  isImage: True if the page is an image, False otherwise
  isDisambig (*): True if the page is a disambiguation page
  getReferences: The pages linking to this page
  namespace: The namespace the page is in
  put(newtext): Saves the page
  delete: Deletes the page (requires being logged in)

(*): These load the page if it has not been loaded before.

Other functions:

getall(xx, Pages): Get all pages in Pages (where Pages is a list of Pages and xx is the language the pages are on)
setAction(text): Use 'text' instead of "Wikipedia python library" in edit summaries
allpages(): Get all page titles in one's home language as Pages (or all pages from 'Start' onward if allpages(start='Start') is used)
checkLogin(): Returns True if the bot is logged in on the home language wiki, False otherwise
argHandler(text): Checks whether text is an argument defined in wikipedia.py (these are -family, -lang, and -log)
translate(xx, dict): dict is a dictionary giving a text per language, and xx is a language code; returns the text in the most applicable language for the xx wiki
output(text): Prints the text 'text' in the encoding of the user's console
input(text): Asks for input from the user, printing the text 'text' first
showDiff(oldtext, newtext): Prints the differences between oldtext and newtext on the screen
getLanguageLinks(text, xx): Gets all interlanguage links in the wikicode text 'text' (links in the form xx:pagename)
removeLanguageLinks(text): Gives the wiki-code 'text' without any interlanguage links
replaceLanguageLinks(oldtext, new): In the wiki-code 'oldtext', removes the language links and replaces them with the language links in new, a dictionary with languages as keys and either Pages or linknames as values
getCategoryLinks(text, xx): Gets all category links in the text 'text' (links in the form xx:pagename)
removeCategoryLinks(text, xx): Removes all category links from 'text'
replaceCategoryLinks(oldtext, new): Replaces the category links in oldtext with those in new (new being a list of category Pages)
stopme(): Call this when a bot is not, or no longer, communicating with the wiki. It removes the bot from the list of running processes, so it no longer slows down other bot threads.
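Taken together, a typical bot script follows a get/modify/put cycle. The sketch below is illustrative only: it assumes the module is imported as wikipedia and that Page accepts a Site object from getSite(), as the summaries on this page suggest; the page title is hypothetical.

    import wikipedia

    site = wikipedia.getSite('en')            # assumed defaults; see getSite(code, fam, user) below
    page = wikipedia.Page(site, 'Sandbox')    # hypothetical page title

    text = page.get()                         # (*) loads the page on first access
    wikipedia.setAction('testing the library')
    page.put(text + u'\n<!-- test -->')       # saves the page with the summary set above

    wikipedia.stopme()                        # unregister this bot from the running-process list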
Classes
-------

GetAll
MyURLopener
Page: A page on the wiki.
Site
Throttle
WikimediaXmlHandler
Exceptions
----------

EditConflict: There has been an edit conflict while uploading the page.
Error: Wikipedia error.
IsNotRedirectPage: Wikipedia page is not a redirect page.
IsRedirectPage: Wikipedia page is a redirect page.
LockedPage: Wikipedia page is locked.
NoNamespace: Wikipedia page is not in a special namespace.
NoPage: Wikipedia page does not exist.
NoSuchEntity: No entity exists for this character.
NotLoggedIn: Anonymous editing of Wikipedia is not possible.
PageInList: Trying to add a page to a list in which it is already included.
PageNotFound: Page not found in list.
SectionError: The section specified by # does not exist.
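Several of these are raised by Page.get() and related loaders. A minimal handling sketch (assuming the module imports as wikipedia and the Page constructor shown in the summary above; the title is hypothetical):

    import wikipedia

    page = wikipedia.Page(wikipedia.getSite('en'), 'Some title')
    try:
        text = page.get()
    except wikipedia.NoPage:
        wikipedia.output(u'The page does not exist.')
    except wikipedia.IsRedirectPage:
        wikipedia.output(u'The page is a redirect; see getRedirectTarget().')
    except wikipedia.LockedPage:
        wikipedia.output(u'The page is locked for editing.')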
Function Summary
----------------

addEntity(name): Convert a unicode name into an ascii name with entities.
allpages(start='!', site=None, namespace=0, throttle=True): Generator which yields all articles in the home language in alphanumerical order, starting at a given page.
argHandler(arg, moduleName): Takes a command-line parameter, converts it to unicode, and returns it unless it is one of the global parameters such as -lang or -log.
checkLogin(site)
Family(fam=None, fatal=True): Import the named family.
categoryFormat(links, insite=None): Create a suitable string encoding all category links for a wikipedia page.
getall(site, pages, throttle)
getCategoryLinks(text, site, raw=False): Returns a list of category links.
getEditPage(site, name, read_only=False, do_quote=True, get_redirect=False, throttle=True): Get the contents of page 'name' from the 'site' wiki. Do not use this directly; for 99% of use cases the Page object suffices.
getLanguageLinks(text, insite=None): Returns a dictionary of other-language links mentioned in the text, in the form {code: pagename}.
getSite(code, fam, user)
getUrl(site, path): Low-level routine to get a URL from the wiki.
html2unicode(name, site, altsite)
interwikiFormat(links, insite=None): Create a suitable string encoding all interwiki links for a wikipedia page.
isInterwikiLink(s, site=None): Try to check whether s is in the form "xx:link" where xx: is a known language.
link2url(name, site, insite=None): Convert an interwiki link name of a page to the proper name to be used in a URL for that page.
myencoding(): The character encoding used by the home wiki.
newpages(number=10, repeat=False, site=None): Generator which yields newly created articles one after another.
normalWhitespace(text)
putPage(site, name, text, comment=None, watchArticle=False, minorEdit=True, newPage=False, token=None, gettoken=False): Upload 'text' to page 'name' on the 'site' wiki.
redirectRe(site)
removeCategoryLinks(text, site): Given the wiki-text of a page, return that text with all category links removed.
removeEntity(name)
removeLanguageLinks(text, site=None): Given the wiki-text of a page, return that text with all interwiki links removed.
replaceCategoryLinks(oldtext, new, site=None): Replace the category links in the wikitext oldtext with the new links given in new.
replaceLanguageLinks(oldtext, new, site=None): Replace the interwiki language links in the wikitext oldtext with the new links given in new.
setAction(s): Set a summary to use for changed-page submissions.
space2underline(name)
underline2space(name)
unescape(s): Replace escaped HTML-special characters with their originals.
unicode2html(x, encoding): Attempt to encode a unicode string in the desired encoding, falling back to HTML # entities if that fails.
unicodeName(name, site, altsite)
UnicodeToAsciiHtml(s)
url2link(percentname, insite, site): Convert a url-name of a page into a proper name for an interwiki link; 'insite' specifies the target wiki.
url2unicode(percentname, site)
urlencode(query): Encode a query so that it can be sent as the body of an HTTP POST request.
Variable Summary
----------------

Imported modules: codecs, pywikipedia.config, datetime, difflib, htmlentitydefs, httplib, locale, math, pywikipedia.mediawiki_messages, os, re, socket, sys, time, traceback, urllib, warnings, xml, set

Module variables: __version__, action, edittime, generators, get_throttle, put_throttle, Rmorespaces, Rmoreunderlines
Function Details
----------------
addEntity(name): Convert a unicode name into an ascii name with entities.

allpages(start='!', site=None, namespace=0, throttle=True): Generator which yields all articles in the home language in alphanumerical order, starting at a given page. By default it starts at '!', so it should yield all pages. The objects returned by this generator are all Page()s.
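A sketch of iterating the generator (the 'A' starting point and the early break are this example's choices; linkname is assumed callable, per the Page summary above):

    import wikipedia

    count = 0
    for page in wikipedia.allpages(start='A'):
        wikipedia.output(page.linkname())   # page is a Page object
        count += 1
        if count >= 500:                    # stop after 500 titles; the generator itself is unbounded
            break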
argHandler(arg, moduleName): Takes a command-line parameter, converts it to unicode, and returns it unless it is one of the global parameters such as -lang or -log. If it is a global parameter, it is processed and None is returned. moduleName should be the name of the module calling this function; this is required because the -help option loads the module's docstring, and because the module name is used for the filename of the log.
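The usual pattern is a loop over sys.argv in which global options are consumed (argHandler returns None) and everything else is kept for the script itself. A sketch, with 'mybot' standing in for the calling module's name:

    import sys
    import wikipedia

    titleParts = []
    for arg in sys.argv[1:]:
        arg = wikipedia.argHandler(arg, 'mybot')
        if arg is not None:             # not a global option, so it belongs to this script
            titleParts.append(arg)
    wikipedia.output(u'Remaining arguments: %s' % u' '.join(titleParts))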
categoryFormat(links, insite=None): Create a suitable string encoding all category links for a wikipedia page. 'links' should be a list of category pagelink objects. The string is formatted for inclusion in insite.

Family(fam=None, fatal=True): Import the named family.

getCategoryLinks(text, site, raw=False): Returns a list of category links found in the text. Do not call this routine directly; use Page objects instead.

getEditPage(site, name, read_only=False, do_quote=True, get_redirect=False, throttle=True): Get the contents of page 'name' from the 'site' wiki. Do not use this directly; for 99% of use cases you can use the Page object instead. Arguments:
  site - the wiki site
  name - the page name
  read_only - if True, does not raise LockedPage exceptions
  do_quote - ??? (TODO: what is this for?)
  get_redirect - get the contents even if the page is a redirect
This routine returns a unicode string containing the wiki text.
getLanguageLinks(text, insite=None): Returns a dictionary of other-language links mentioned in the text, in the form {code: pagename}. Do not call this routine directly; use Page objects instead.

getUrl(site, path): Low-level routine to get a URL from the wiki. site is a Site object and path is the absolute path. Returns the HTML text of the page, converted to unicode.

interwikiFormat(links, insite=None): Create a suitable string encoding all interwiki links for a wikipedia page. 'links' should be a dictionary with language names as keys and either Page objects or the link-names of the pages as values. The string is formatted for inclusion in insite (defaulting to your own wiki).

isInterwikiLink(s, site=None): Try to check whether s is in the form "xx:link", where xx: is a known language. In such a case we are dealing with an interwiki link.

link2url(name, site, insite=None): Convert an interwiki link name of a page to the proper name to be used in a URL for that page. site should specify the language of the link.

myencoding(): The character encoding used by the home wiki.

newpages(number=10, repeat=False, site=None): Generator which yields newly created articles one after another. It starts with the article created 'number' articles ago. When these have all been yielded, it fetches NewPages again; if there is no new page, it blocks until one appears, sleeping between subsequent fetches of NewPages. The objects yielded are dictionaries, with keys: date (a datetime object), title (a pagelink), length (an int), comment (a string), and either user_login (a string, if the user is logged in) or user_anon (a string, if the user is not logged in). Throttling is important here, so it is always enabled.
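A sketch of consuming the generator once (key names as documented above; the output formatting is this example's choice):

    import wikipedia

    for entry in wikipedia.newpages(number=10):
        # entry is a dictionary; 'title' is a pagelink, 'date' a datetime object
        wikipedia.output(u'%s  %s' % (entry['date'].isoformat(), entry['title'].linkname()))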
putPage(site, name, text, comment=None, watchArticle=False, minorEdit=True, newPage=False, token=None, gettoken=False): Upload 'text' to page 'name' on the 'site' wiki. Use of this routine can normally be avoided; use Page.put instead.

removeCategoryLinks(text, site): Given the wiki-text of a page, return that text with all category links removed.

removeLanguageLinks(text, site=None): Given the wiki-text of a page, return that text with all interwiki links removed. If a link to an unknown language is encountered, a warning is printed.

replaceCategoryLinks(oldtext, new, site=None): Replace the category links in the wikitext oldtext with the new links given in new. 'new' should be a list of category pagelink objects.

replaceLanguageLinks(oldtext, new, site=None): Replace the interwiki language links in the wikitext oldtext with the new links given in new. 'new' should be a dictionary with language names as keys and either Page objects or the link-names of the pages as values.
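getLanguageLinks and replaceLanguageLinks are designed to round-trip, and the category functions work the same way with lists. A sketch of adding one interwiki link (the page title and link target are hypothetical; in real code, prefer the Page methods where possible):

    import wikipedia

    site = wikipedia.getSite('en')
    page = wikipedia.Page(site, 'Example')
    old = page.get()

    links = wikipedia.getLanguageLinks(old, insite=site)   # {code: pagename}
    links['de'] = u'Beispiel'                              # add or correct the German link
    new = wikipedia.replaceLanguageLinks(old, links, site=site)

    if new != old:
        wikipedia.setAction('updating interwiki links')
        page.put(new)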
setAction(s): Set a summary to use for changed-page submissions.

unescape(s): Replace escaped HTML-special characters with their originals.

unicode2html(x, encoding): Attempt to encode a unicode string into the desired encoding; if that fails, encode the unicode into HTML # entities instead. If it succeeds, the string is returned unchanged.
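For example (expected results inferred from the description above, not verified against the code):

    unicode2html(u'abc', 'iso-8859-1')     # fits in Latin-1, so returned unchanged: u'abc'
    unicode2html(u'\u0416', 'iso-8859-1')  # Cyrillic Zhe does not fit, so it becomes u'&#1046;'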
url2link(percentname, insite, site): Convert a url-name of a page into a proper name for an interwiki link. The argument 'insite' specifies the target wiki.

urlencode(query): Encode a query so that it can be sent as the body of an HTTP POST request.
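Presumably the query is a sequence of key/value pairs, in the spirit of the standard urllib.urlencode; a sketch of the intended use (the form field names here are illustrative only):

    data = urlencode((('title', 'Sandbox'), ('action', 'edit')))
    # 'data' can now be sent as the body of an HTTP POST request.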
Generated by Epydoc 2.1 on Sun Jul 03 17:07:35 2005. http://epydoc.sf.net