Site
- class mwclient.client.Site(host, path='/w/', ext='.php', pool=None, retry_timeout=30, max_retries=25, wait_callback=<function Site.<lambda>>, clients_useragent=None, max_lag=3, compress=True, force_login=True, do_init=True, httpauth=None, reqs=None, consumer_token=None, consumer_secret=None, access_token=None, access_secret=None, client_certificate=None, custom_headers=None, scheme='https')
A MediaWiki site identified by its hostname.
Examples
>>> import mwclient
>>> wikipedia_site = mwclient.Site('en.wikipedia.org')
>>> wikia_site = mwclient.Site('vim.wikia.com', path='/')
- Parameters:
host (str) – The hostname of a MediaWiki instance. Must not include a scheme (e.g. https://) - use the scheme argument instead.
path (str) – The instance's script path (where the index.php and api.php scripts are located). Must contain a trailing slash (/). Defaults to /w/.
ext (str) – The file extension used by the MediaWiki API scripts. Defaults to .php.
pool (requests.Session) – A preexisting Session to be used when executing API requests.
retry_timeout (int) – The number of seconds to sleep for each past retry of a failing API request. Defaults to 30.
max_retries (int) – The maximum number of retries to perform for failing API requests. Defaults to 25.
wait_callback (Callable) – A callback function to be executed for each failing API request.
clients_useragent (str) – A prefix to be added to the default mwclient user-agent. Should follow the pattern ‘{tool_name}/{tool_version} ({contact})’. Check the User-Agent policy for more information.
max_lag (int) – A maxlag parameter to be used in index.php calls. Consult the documentation for more information. Defaults to 3.
compress (bool) – Whether to request and accept gzip compressed API responses. Defaults to True.
force_login (bool) – Whether to require authentication when editing pages. Set to False to allow unauthenticated edits. Defaults to True.
do_init (bool) – Whether to automatically initialize the Site when it is created. When set to False, the Site must be initialized manually using the site_init() method. Defaults to True.
httpauth (Union[tuple[basestring, basestring], requests.auth.AuthBase]) – An authentication method to be used when making API requests. This can be either an authentication object as provided by the requests library, or a tuple in the form (username, password). Usernames and passwords provided as text strings are encoded as UTF-8. If dealing with a server that cannot handle UTF-8, please provide the username and password already encoded with the appropriate encoding.
reqs (Dict[str, Any]) – Additional arguments to be passed to the requests.Session.request() method when performing API calls. If the timeout key is empty, a default timeout of 30 seconds is added.
consumer_token (str) – OAuth1 consumer key for owner-only consumers.
consumer_secret (str) – OAuth1 consumer secret for owner-only consumers.
access_token (str) – OAuth1 access key for owner-only consumers.
access_secret (str) – OAuth1 access secret for owner-only consumers.
client_certificate (Union[str, tuple[str, str]]) – A client certificate to be added to the session.
custom_headers (Dict[str, str]) – A dictionary of custom headers to be added to all API requests.
scheme (str) – The URI scheme to use. This should be either http or https in most cases. Defaults to https.
- Raises:
RuntimeError – The authentication passed to the httpauth parameter is invalid. You must pass either a tuple or a requests.auth.AuthBase object.
errors.OAuthAuthorizationError – The OAuth authorization is invalid.
errors.LoginError – Login failed. The reason can be obtained from e.code and e.info (where e is the exception object) and will be one of the API:Login errors. The most common error code is “Failed”, indicating a wrong username or password.
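The host, path and scheme rules above can be awkward when all you have is a full wiki URL. The following is a minimal sketch, not part of mwclient, that splits such a URL into the three constructor arguments and formats a user-agent prefix in the recommended pattern. The helper names site_kwargs_from_url and user_agent_prefix, the tool name and the contact address are all hypothetical.

```python
from urllib.parse import urlsplit


def site_kwargs_from_url(url):
    """Split a full wiki URL into host, path and scheme keyword arguments.

    Hypothetical helper, not part of mwclient. It only enforces the two rules
    stated in the parameter list: no scheme inside host, and a trailing slash
    on path.
    """
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        raise ValueError('expected a full URL such as https://host/w/')
    path = parts.path or '/'
    if not path.endswith('/'):
        path += '/'
    return {'host': parts.netloc, 'path': path, 'scheme': parts.scheme}


def user_agent_prefix(tool_name, tool_version, contact):
    """Format a prefix as '{tool_name}/{tool_version} ({contact})'."""
    return '{}/{} ({})'.format(tool_name, tool_version, contact)


kwargs = site_kwargs_from_url('https://en.wikipedia.org/w')
kwargs['clients_useragent'] = user_agent_prefix('MyTool', '0.1', 'me@example.org')
print(kwargs)
# With mwclient installed, these could then be passed on:
#     site = mwclient.Site(**kwargs)
```

Keeping the URL handling outside the constructor call means a malformed URL fails early with a clear message rather than at the first API request.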
- allcategories(start=None, prefix=None, dir='ascending', limit=None, generator=True, end=None)
Retrieve all categories on the wiki as a generator.
- allimages(start=None, prefix=None, minsize=None, maxsize=None, limit=None, dir='ascending', sha1=None, sha1base36=None, generator=True, end=None)
Retrieve all images on the wiki as a generator.
- alllinks(start=None, prefix=None, unique=False, prop='title', namespace='0', limit=None, generator=True, end=None)
Retrieve a list of all links on the wiki as a generator.
- allpages(start=None, prefix=None, namespace='0', filterredir='all', minsize=None, maxsize=None, prtype=None, prlevel=None, limit=None, dir='ascending', filterlanglinks='all', generator=True, end=None)
Retrieve all pages on the wiki as a generator.
- allusers(start=None, prefix=None, group=None, prop=None, limit=None, witheditsonly=False, activeusers=False, rights=None, end=None)
Retrieve all users on the wiki as a generator.
- api(action, http_method='POST', *args, **kwargs)
Perform a generic API call and handle errors.
All arguments will be passed on.
- Parameters:
action (str) – The MediaWiki API action to be performed.
http_method (str) – The HTTP method to use.
Example
To get coordinates from the GeoData MediaWiki extension at English Wikipedia:
>>> site = Site('en.wikipedia.org')
>>> result = site.api('query', prop='coordinates', titles='Oslo|Copenhagen')
>>> for page in result['query']['pages'].values():
...     if 'coordinates' in page:
...         print('{} {} {}'.format(page['title'],
...                                 page['coordinates'][0]['lat'],
...                                 page['coordinates'][0]['lon']))
Oslo 59.95 10.75
Copenhagen 55.6761 12.5683
- Returns:
The raw response from the API call, as a dictionary.
- ask(query, title=None)
Ask a query against Semantic MediaWiki.
API doc: https://semantic-mediawiki.org/wiki/Ask_API
- Parameters:
query (str) – The SMW query to be executed.
- Returns:
Generator for retrieving all search results, with each answer as a dictionary. If the query is invalid, an APIError is raised. A valid query with zero results will not raise any error.
Examples
>>> query = "[[Category:my cat]]|[[Has name::a name]]|?Has property"
>>> for answer in site.ask(query):
...     for title, data in answer.items():
...         print(title)
...         print(data)
- blocks(start=None, end=None, dir='older', ids=None, users=None, limit=None, prop='id|user|by|timestamp|expiry|reason|flags')
Retrieve blocks as a generator.
API doc: https://www.mediawiki.org/wiki/API:Blocks
- Returns:
- Generator yielding dicts, each dict containing:
user: The username or IP address of the user
id: The ID of the block
timestamp: When the block was added
expiry: When the block runs out (infinity for indefinite blocks)
reason: The reason they are blocked
allowusertalk: Key is present (empty string) if the user is allowed to edit their user talk page
by: The administrator who blocked the user
nocreate: Key is present (empty string) if the user’s ability to create accounts has been disabled
- Return type:
mwclient.listings.List
See also
When using the users filter to search for blocked users, only one block per given user will be returned. If you want to retrieve the entire block log for a specific user, you can use the Site.logevents() method with type=block and title='User:JohnDoe'.
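Two details of blocks() are easy to get wrong: flags such as allowusertalk and nocreate are signalled by key presence with an empty-string value, and the full block history of one user comes from logevents() rather than blocks(). The sketch below illustrates both. The helper names has_flag and block_log are hypothetical, and site is assumed to be an initialized Site object.

```python
def has_flag(block, flag):
    """Return True if a flag such as 'allowusertalk' or 'nocreate' is set.

    Flags are signalled by key presence and carry an empty string, so a
    truthiness check like block.get(flag) would wrongly report False.
    """
    return flag in block


def block_log(site, username):
    """Return every block log entry for one user as a list.

    `site` is expected to behave like a Site; the call mirrors the
    logevents() usage suggested in the note above.
    """
    return list(site.logevents(type='block', title='User:' + username))


# A dictionary with the shape of a single entry yielded by blocks():
sample = {'user': 'JohnDoe', 'id': 1, 'expiry': 'infinity', 'nocreate': ''}
print(has_flag(sample, 'nocreate'))       # True
print(has_flag(sample, 'allowusertalk'))  # False
```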
- checkuserlog(user=None, target=None, limit=10, dir='older', start=None, end=None)
Retrieve checkuserlog items as a generator.
- chunk_upload(file, filename, ignorewarnings, comment, text)
Upload a file to the site in chunks.
This method is called by Site.upload if you are connecting to a newer MediaWiki installation, so it’s normally not necessary to call this method directly.
- Parameters:
file (file-like object) – File object or stream to upload.
params (dict) – Dict containing upload parameters.
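To make the idea of a chunked upload concrete, here is a minimal sketch of reading a file-like object in fixed-size pieces and tracking the byte offset of each piece. It is not mwclient's implementation; the helper name iter_chunks and the chunk size of 4 bytes are assumptions chosen for illustration.

```python
import io


def iter_chunks(file, chunk_size):
    """Yield (offset, data) pairs read sequentially from a file-like object."""
    if chunk_size <= 0:
        raise ValueError('chunk_size must be positive')
    offset = 0
    while True:
        data = file.read(chunk_size)
        if not data:
            break
        yield offset, data
        offset += len(data)


stream = io.BytesIO(b'abcdefghij')
for offset, data in iter_chunks(stream, 4):
    print(offset, data)
```

The last chunk is shorter than the others whenever the file size is not a multiple of the chunk size.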
- clientlogin(cookies=None, **kwargs)
Log in to the wiki using a username and password. The method returns True on success, or the API response if a multi-step login process has been started. On failure it raises one of the errors listed below.
Example of a classic username and password clientlogin request:
>>> try:
...     site.clientlogin(username='myusername', password='secret')
... except mwclient.errors.LoginError as e:
...     print('Could not login to MediaWiki: %s' % e)
- Parameters:
cookies (dict) – Custom cookies to include with the log-in request.
**kwargs (dict) – Custom variables used for clientlogin:
- loginmergerequestfields
- loginpreservestate
- loginreturnurl
- logincontinue
- logintoken
- *: additional params depending on the available auth requests.
To log in with a classic username and password, pass username and password.
See https://www.mediawiki.org/wiki/API:Login#Method_2._clientlogin
- Raises:
LoginError (mwclient.errors.LoginError) – Login failed. The reason can be obtained from e.code and e.info (where e is the exception object) and will be one of the API:Login errors. The most common error code is “Failed”, indicating a wrong username or password.
MaximumRetriesExceeded – API call to log in failed and was retried until all retries were exhausted. This will not occur if the credentials are merely incorrect. See MaximumRetriesExceeded for possible reasons.
APIError – An API error occurred. Rare, usually indicates an internal server error.
- email(user, text, subject, cc=False)
Send email to a specified user on the wiki.
>>> try:
...     site.email('SomeUser', 'Some message', 'Some subject')
... except mwclient.errors.NoSpecifiedEmail:
...     print('User does not accept email, or has no email address.')
- Parameters:
user (str) – User name of the recipient
text (str) – Body of the email
subject (str) – Subject of the email
cc (bool) – True to send a copy of the email to yourself (default is False)
- Returns:
Dictionary of the JSON response
- Raises:
NoSpecifiedEmail (mwclient.errors.NoSpecifiedEmail) – User doesn’t accept email
EmailError (mwclient.errors.EmailError) – Other email errors
- expandtemplates(text, title=None, generatexml=False)
Takes wikitext (text) and expands templates.
API doc: https://www.mediawiki.org/wiki/API:Expandtemplates
- Parameters:
text (str) – Wikitext to convert.
title (str) – Title of the page.
generatexml (bool) – Generate the XML parse tree. Defaults to False.
- exturlusage(query, prop=None, protocol='http', namespace=None, limit=None)
Retrieve the list of pages that link to a particular domain or URL, as a generator.
This API call mirrors the Special:LinkSearch function on-wiki.
Query can be a domain like ‘bbc.co.uk’. Wildcards can be used, e.g. ‘*.bbc.co.uk’. Alternatively, a query can contain a full domain name and some or all of a URL: e.g. ‘*.wikipedia.org/wiki/*’
See <https://meta.wikimedia.org/wiki/Help:Linksearch> for details.
- Returns:
- Generator yielding dicts, each dict containing:
url: The URL linked to.
ns: Namespace of the wiki page
pageid: The ID of the wiki page
title: The page title.
- Return type:
mwclient.listings.List
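Because exturlusage() yields one dictionary per link, a page that links to the same domain several times appears several times. A minimal sketch of grouping the results by page follows. The helper name links_by_page is hypothetical, site is assumed to be an initialized Site object, and the sketch assumes the default prop, so that each dictionary carries the url and title keys listed above.

```python
from collections import defaultdict


def links_by_page(site, query, **kwargs):
    """Group the URLs found by exturlusage() under the linking page's title."""
    grouped = defaultdict(list)
    for hit in site.exturlusage(query, **kwargs):
        grouped[hit['title']].append(hit['url'])
    return dict(grouped)


# Possible usage with a real Site object:
#     for title, urls in links_by_page(site, '*.bbc.co.uk').items():
#         print(title, len(urls))
```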
- get(action, *args, **kwargs)
Perform a generic API call using GET.
This is just a shorthand for calling api() with http_method='GET'. All arguments will be passed on.
- Parameters:
action (str) – The MediaWiki API action to be performed.
- Returns:
The raw response from the API call, as a dictionary.
- get_token(type, force=False, title=None)
Request a MediaWiki access token of the given type.
- Parameters:
type (str) – The type of token to request.
force (bool) – Force the request of a new token, even if a token of that type has already been cached.
title (str) – The page title for which to request a token. Only used for MediaWiki versions below 1.24.
- Returns:
A MediaWiki token of the requested type.
- Raises:
errors.APIError – A token of the given type could not be retrieved.
- handle_api_result(info, kwargs=None, sleeper=None)
Checks the given API response, raising an appropriate exception or sleeping if necessary.
- Parameters:
info (dict) – The API result.
kwargs (dict) – Additional arguments to be passed when raising an errors.APIError.
sleeper (sleep.Sleeper) – A Sleeper instance to use when sleeping.
- Returns:
False if the given API response contains an exception, else True.
- logevents(type=None, prop=None, start=None, end=None, dir='older', user=None, title=None, limit=None, action=None)
Retrieve logevents as a generator.
- login(username=None, password=None, cookies=None, domain=None)
Log in to the wiki using a username and bot password. The method returns nothing if the login was successful, but raises an error if it was not. If you use MediaWiki >= 1.27 and try to log in with a normal account (not a bot password account), you should use clientlogin instead, because the login action has been deprecated for normal accounts since 1.27 and will stop working in the near future. See these pages to learn more:
Note: at least until v1.33.1, bot password accounts do not seem to have the “userrights” permission. Updating a user’s groups requires this permission, so you must use clientlogin with a user who has it (for example, a bureaucrat).
- Parameters:
username (str) – MediaWiki username
password (str) – MediaWiki password
cookies (dict) – Custom cookies to include with the log-in request.
domain (str) – Sends domain name for authentication; used by some MediaWiki plug-ins like the ‘LDAP Authentication’ extension.
- Raises:
LoginError (mwclient.errors.LoginError) – Login failed. The reason can be obtained from e.code and e.info (where e is the exception object) and will be one of the API:Login errors. The most common error code is “Failed”, indicating a wrong username or password.
MaximumRetriesExceeded – API call to log in failed and was retried until all retries were exhausted. This will not occur if the credentials are merely incorrect. See MaximumRetriesExceeded for possible reasons.
APIError – An API error occurred. Rare, usually indicates an internal server error.
- parse(text=None, title=None, page=None, prop=None, redirects=False, mobileformat=False)
Parses the given content and returns parser output.
- Parameters:
text (str) – Text to parse.
title (str) – Title of page the text belongs to.
page (str) – The name of a page to parse. Cannot be used together with text and title.
prop (str) – Which pieces of information to get. Multiple values should be separated using the pipe (|) character.
redirects (bool) – Resolve the redirect, if the given page is a redirect. Defaults to False.
mobileformat (bool) – Return parse output in a format suitable for mobile devices. Defaults to False.
- Returns:
The parse output as generated by MediaWiki.
- patrol(rcid=None, revid=None, tags=None)
Patrol a page or a revision. Either rcid or revid (but not both) must be given. The rcid and revid arguments may be obtained using the Site.recentchanges() function.
API doc: https://www.mediawiki.org/wiki/API:Patrol
- Parameters:
rcid (int) – The recentchanges ID to patrol.
revid (int) – The revision ID to patrol.
tags (str) – Change tags to apply to the entry in the patrol log. Multiple tags can be given, by separating them with the pipe (|) character.
- Returns:
The API response as a dictionary containing:
rcid (int): The recentchanges id.
nsid (int): The namespace id.
title (str): The page title.
- Return type:
Dict[str, Any]
- Raises:
errors.APIError – The MediaWiki API returned an error.
Notes
autopatrol rights are required in order to use this function.
revid requires at least MediaWiki 1.22.
tags requires at least MediaWiki 1.27.
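The "either rcid or revid, but not both" rule and the pipe-separated tags format can be checked before any request is sent. The following is a small sketch under that reading of the documentation; the helper name patrol_kwargs and the tag names are hypothetical and not part of mwclient.

```python
def patrol_kwargs(rcid=None, revid=None, tags=None):
    """Validate and assemble keyword arguments for a patrol() call."""
    # Exactly one of the two identifiers must be supplied.
    if (rcid is None) == (revid is None):
        raise ValueError('give exactly one of rcid or revid')
    kwargs = {'rcid': rcid} if rcid is not None else {'revid': revid}
    if tags:
        # Accept a single string or an iterable of tag names.
        kwargs['tags'] = tags if isinstance(tags, str) else '|'.join(tags)
    return kwargs


print(patrol_kwargs(revid=12345, tags=['bot', 'reviewed']))
# Possible usage with a real Site object:
#     site.patrol(**patrol_kwargs(revid=12345))
```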
- post(action, *args, **kwargs)
Perform a generic API call using POST.
This is just a shorthand for calling api() with http_method='POST'. All arguments will be passed on.
- Parameters:
action (str) – The MediaWiki API action to be performed.
- Returns:
The raw response from the API call, as a dictionary.
- random(namespace, limit=20)
Retrieve a generator of random pages from a particular namespace.
limit specifies the number of random articles retrieved. namespace is a namespace identifier integer.
Each item yielded by the generator is a dictionary with the namespace, page ID and title.
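A minimal sketch of collecting a fixed number of random page titles follows. The helper name random_titles is hypothetical, site is assumed to be an initialized Site object, and the dictionary key 'title' is an assumption based on the description above. The explicit islice cap is a defensive choice in case the generator keeps yielding beyond limit.

```python
from itertools import islice


def random_titles(site, namespace=0, count=5):
    """Collect the titles of up to `count` random pages from one namespace."""
    pages = site.random(namespace, limit=count)
    # islice stops after `count` items even if the generator would continue.
    return [page['title'] for page in islice(pages, count)]


# Possible usage with a real Site object:
#     print(random_titles(site, namespace=0, count=3))
```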
- raw_api(action, http_method='POST', retry_on_error=True, *args, **kwargs)
Send a call to the API.
- Parameters:
action (str) – The MediaWiki API action to perform.
http_method (str) – The HTTP method to use in the request.
retry_on_error (bool) – Whether to retry API call on connection errors.
*args (Tuple[str, Any]) – Arguments to be passed to the api.php script as data.
**kwargs (Any) – Arguments to be passed to the api.php script as data.
- Returns:
The API response.
- Raises:
errors.APIDisabledError – The MediaWiki API is disabled for this instance.
errors.InvalidResponse – The API response could not be decoded from JSON.
errors.MaximumRetriesExceeded – The API request failed and the maximum number of retries was exceeded.
requests.exceptions.HTTPError – Received an invalid HTTP response, or a status code in the 4xx range.
requests.exceptions.ConnectionError – Encountered an unexpected error while performing the API request.
requests.exceptions.Timeout – The API request timed out.
- raw_call(script, data, files=None, retry_on_error=True, http_method='POST')
Perform a generic request and return the raw text.
In the event of a network problem, or an HTTP response with status code 5XX, we’ll wait and retry the configured number of times before giving up if retry_on_error is True.
requests.exceptions.HTTPError is still raised directly for HTTP responses with status codes in the 4XX range, and invalid HTTP responses.
- Parameters:
script (str) – Script name, usually ‘api’.
data (dict) – Post data
files (dict) – Files to upload
retry_on_error (bool) – Retry on connection error
http_method (str) – The HTTP method, defaults to ‘POST’
- Returns:
The raw text response.
- Raises:
errors.MaximumRetriesExceeded – The API request failed and the maximum number of retries was exceeded.
requests.exceptions.HTTPError – Received an invalid HTTP response, or a status code in the 4xx range.
requests.exceptions.ConnectionError – Encountered an unexpected error while performing the API request.
requests.exceptions.Timeout – The API request timed out.
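The wait-and-retry behaviour described for raw_call can be pictured with the generic loop below. This is an illustrative sketch, not mwclient's code: the names call_with_retries and RetriesExhausted are hypothetical, Python's built-in ConnectionError and TimeoutError stand in for the requests exceptions, and the delay formula (retry_timeout multiplied by the number of past retries) is an assumption based on a literal reading of the retry_timeout description.

```python
import time


class RetriesExhausted(Exception):
    """Stand-in for errors.MaximumRetriesExceeded in this sketch."""


def call_with_retries(request, max_retries=25, retry_timeout=30,
                      sleep=time.sleep):
    """Call request() until it succeeds or max_retries is exceeded."""
    retries = 0
    while True:
        try:
            return request()
        except (ConnectionError, TimeoutError) as exc:
            retries += 1
            if retries > max_retries:
                raise RetriesExhausted(str(exc)) from exc
            # Assumed schedule: sleep grows with the number of past retries.
            sleep(retry_timeout * (retries - 1))


attempts = []
delays = []


def flaky():
    """Fail twice with a network error, then succeed."""
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError('temporary network problem')
    return 'ok'


# delays.append replaces time.sleep so the demo records delays without waiting.
print(call_with_retries(flaky, sleep=delays.append), delays)
```

Passing the sleep function in as an argument keeps the loop testable without real waiting.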
- raw_index(action, http_method='POST', *args, **kwargs)
Sends a call to index.php rather than the API.
- Parameters:
action (str) – The MediaWiki API action to perform.
http_method (str) – The HTTP method to use in the request.
*args (Tuple[str, Any]) – Arguments to be passed to the index.php script as data.
**kwargs (Any) – Arguments to be passed to the index.php script as data.
- Returns:
The API response.
- Raises:
errors.MaximumRetriesExceeded – The API request failed and the maximum number of retries was exceeded.
requests.exceptions.HTTPError – Received an invalid HTTP response, or a status code in the 4xx range.
requests.exceptions.ConnectionError – Encountered an unexpected error while performing the API request.
requests.exceptions.Timeout – The API request timed out.
- recentchanges(start=None, end=None, dir='older', namespace=None, prop=None, show=None, limit=None, type=None, toponly=None)
List recent changes to the wiki, à la Special:Recentchanges.
- require(major, minor, revision=None, raise_error=True)
Check whether the current wiki matches the required version.
- Parameters:
major (int) – The required major version.
minor (int) – The required minor version.
revision (int) – The required revision.
raise_error (bool) – Whether to throw an error if the version of the current wiki is below the required version. Defaults to True.
- Returns:
False if the version of the current wiki is below the required version, else True. If either raise_error=True or the site is uninitialized and raise_error=None then nothing is returned.
- Raises:
errors.MediaWikiVersionError – The current wiki is below the required version and raise_error=True.
RuntimeError – If raise_error is None and the version attribute is unset. The version is usually set automatically on construction of the Site, unless do_init=False is passed to the constructor. After instantiation, the site_init() function can be used to retrieve and set the version.
NotImplementedError – If the revision argument was passed. The logic for this is currently unimplemented.
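The major and minor check can be pictured as a tuple comparison. The sketch below is a simplified illustration and not mwclient's implementation; the helper name meets_requirement is hypothetical, and it assumes the version is available as a tuple whose first two elements are integers, such as (1, 27, 0).

```python
def meets_requirement(version, major, minor):
    """Return True if a version tuple is at least major.minor."""
    # Only the first two elements take part, matching the fact that the
    # revision argument is documented as unimplemented.
    return tuple(version[:2]) >= (major, minor)


print(meets_requirement((1, 27, 0), 1, 24))
print(meets_requirement((1, 19, 3), 1, 24))
```

Tuples compare element by element, so (1, 27) >= (1, 24) holds while (1, 19) >= (1, 24) does not.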
- revisions(revids, prop='ids|timestamp|flags|comment|user')
Get data about a list of revisions.
See also the Page.revisions() method.
API doc: https://www.mediawiki.org/wiki/API:Revisions
Example: Get revision text for two revisions:
>>> for revision in site.revisions([689697696, 689816909], prop='content'):
...     print(revision['*'])
- Parameters:
revids (list) – A list of at most 50 revision IDs.
prop (str) – Which properties to get for each revision.
- Returns:
A list of revisions
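Since revids is limited to 50 entries per call, longer lists have to be split by the caller. A minimal sketch of doing so follows. The helper names batches and all_revisions are hypothetical, and site is assumed to be an initialized Site object.

```python
def batches(revids, size=50):
    """Split revision IDs into lists of at most `size` items."""
    revids = list(revids)
    for start in range(0, len(revids), size):
        yield revids[start:start + size]


def all_revisions(site, revids, **kwargs):
    """Yield revision data for any number of IDs, one batch per call."""
    for batch in batches(revids):
        for revision in site.revisions(batch, **kwargs):
            yield revision


print([len(batch) for batch in batches(range(120))])
# Possible usage with a real Site object:
#     for revision in all_revisions(site, many_ids, prop='ids|timestamp'):
#         print(revision)
```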
- search(search, namespace='0', what=None, redirects=False, limit=None)
Perform a full text search.
API doc: https://www.mediawiki.org/wiki/API:Search
Example
>>> for result in site.search('prefix:Template:Citation/'):
...     print(result.get('title'))
- Parameters:
search (str) – The query string
namespace (int) – The namespace to search (default: 0)
what (str) – Search scope: ‘text’ for fulltext, or ‘title’ for titles only. Depending on the search backend, both options may not be available. For instance CirrusSearch doesn’t support ‘title’, but instead provides an “intitle:” query string filter.
redirects (bool) – Include redirect pages in the search (option removed in MediaWiki 1.23).
- Returns:
Search results iterator
- Return type:
mwclient.listings.List
- site_init()
Populates the object with information about the current user and site. This is done automatically when creating the object, unless explicitly disabled using the do_init=False constructor argument.
- upload(file=None, filename=None, description='', ignore=False, file_size=None, url=None, filekey=None, comment=None)
Upload a file to the site.
Note that one of file, filekey and url must be specified, but not more than one. For normal uploads, you specify file.
- Parameters:
file (str) – File object or stream to upload.
filename (str) – Destination filename. Do not include a namespace prefix such as ‘File:’.
description (str) – Wikitext for the file description page.
ignore (bool) – True to upload despite any warnings.
file_size (int) – Deprecated in mwclient 0.7
url (str) – URL to fetch the file from.
filekey (str) – Key that identifies a previous upload that was stashed temporarily.
comment (str) – Upload comment. Also used as the initial page text for new files if description is not specified.
Example
>>> client.upload(open('somefile', 'rb'), filename='somefile.jpg', description='Some description')
- Returns:
JSON result from the API.
- Raises:
- usercontributions(user, start=None, end=None, dir='older', namespace=None, prop=None, show=None, limit=None, uselang=None)
List the contributions made by a given user to the wiki.
- static version_tuple_from_generator(string, prefix='MediaWiki ')
Return a version tuple from a MediaWiki Generator string.
Example
>>> Site.version_tuple_from_generator("MediaWiki 1.5.1")
(1, 5, 1)
- Parameters:
string (str) – The MediaWiki Generator string.
prefix (str) – The expected prefix of the string.
- Returns:
A tuple containing the individual elements of the given version number.
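For readers curious what such a conversion involves, here is a simplified sketch that handles plain dotted version numbers only. It is not mwclient's implementation and may treat suffixed versions differently from the real method; the function name parse_generator is hypothetical.

```python
import re


def parse_generator(string, prefix='MediaWiki '):
    """Turn a generator string such as 'MediaWiki 1.5.1' into a tuple of ints."""
    if not string.startswith(prefix):
        raise ValueError('unexpected generator string: {!r}'.format(string))
    # Take the leading run of dot-separated numbers and ignore any suffix.
    numbers = re.match(r'\d+(?:\.\d+)*', string[len(prefix):])
    if numbers is None:
        raise ValueError('no version number in: {!r}'.format(string))
    return tuple(int(part) for part in numbers.group(0).split('.'))


print(parse_generator('MediaWiki 1.5.1'))
```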