Site

class mwclient.client.Site(host, path='/w/', ext='.php', pool=None, retry_timeout=30, max_retries=25, wait_callback=<function Site.<lambda>>, clients_useragent=None, max_lag=3, compress=True, force_login=True, do_init=True, httpauth=None, reqs=None, consumer_token=None, consumer_secret=None, access_token=None, access_secret=None, client_certificate=None, custom_headers=None, scheme='https')

A MediaWiki site identified by its hostname.

Examples

>>> import mwclient
>>> wikipedia_site = mwclient.Site('en.wikipedia.org')
>>> wikia_site = mwclient.Site('vim.wikia.com', path='/')
Parameters:
  • host (str) – The hostname of a MediaWiki instance. Must not include a scheme (e.g. https://) - use the scheme argument instead.

  • path (str) – The instance's script path (where the index.php and api.php scripts are located). Must contain a trailing slash (/). Defaults to /w/.

  • ext (str) – The file extension used by the MediaWiki API scripts. Defaults to .php.

  • pool (requests.Session) – A preexisting Session to be used when executing API requests.

  • retry_timeout (int) – The number of seconds to sleep for each past retry of a failing API request. Defaults to 30.

  • max_retries (int) – The maximum number of retries to perform for failing API requests. Defaults to 25.

  • wait_callback (Callable) – A callback function to be executed for each failing API request.

  • clients_useragent (str) – A prefix to be added to the default mwclient user-agent. Should follow the pattern ‘{tool_name}/{tool_version} ({contact})’. Check the User-Agent policy for more information.

  • max_lag (int) – A maxlag parameter to be used in index.php calls. Consult the documentation for more information. Defaults to 3.

  • compress (bool) – Whether to request and accept gzip compressed API responses. Defaults to True.

  • force_login (bool) – Whether to require authentication when editing pages. Set to False to allow unauthenticated edits. Defaults to True.

  • do_init (bool) – Whether to automatically initialize the Site on initialization. When set to False, the Site must be initialized manually using the site_init() method. Defaults to True.

  • httpauth (Union[tuple[basestring, basestring], requests.auth.AuthBase]) – An authentication method to be used when making API requests. This can be either an authentication object as provided by the requests library, or a tuple in the form (username, password). Usernames and passwords provided as text strings are encoded as UTF-8. If dealing with a server that cannot handle UTF-8, provide the username and password already encoded with the appropriate encoding.

  • reqs (Dict[str, Any]) – Additional arguments to be passed to the requests.Session.request() method when performing API calls. If the timeout key is empty, a default timeout of 30 seconds is added.

  • consumer_token (str) – OAuth1 consumer key for owner-only consumers.

  • consumer_secret (str) – OAuth1 consumer secret for owner-only consumers.

  • access_token (str) – OAuth1 access key for owner-only consumers.

  • access_secret (str) – OAuth1 access secret for owner-only consumers.

  • client_certificate (Union[str, tuple[str, str]]) – A client certificate to be added to the session.

  • custom_headers (Dict[str, str]) – A dictionary of custom headers to be added to all API requests.

  • scheme (str) – The URI scheme to use. This should be either http or https in most cases. Defaults to https.

Raises:
  • RuntimeError – The authentication passed to the httpauth parameter is invalid. You must pass either a tuple or a requests.auth.AuthBase object.

  • errors.OAuthAuthorizationError – The OAuth authorization is invalid.

  • errors.LoginError – Login failed, the reason can be obtained from e.code and e.info (where e is the exception object) and will be one of the API:Login errors. The most common error code is “Failed”, indicating a wrong username or password.
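As a rough sketch of how some of the constructor arguments above fit together, the helper below assembles keyword arguments for mwclient.Site. The helper name site_options, the tool name and the contact address are hypothetical; only the keyword names (path, clients_useragent, scheme) come from the parameter list above.

```python
def site_options(tool_name, tool_version, contact, path="/w/", scheme="https"):
    """Build keyword arguments for mwclient.Site (hypothetical helper)."""
    # The script path must end with a slash, per the `path` parameter docs.
    if not path.endswith("/"):
        path += "/"
    return {
        "path": path,
        # Prefix following the '{tool_name}/{tool_version} ({contact})' pattern.
        "clients_useragent": "{}/{} ({})".format(tool_name, tool_version, contact),
        "scheme": scheme,
    }


options = site_options("MyTool", "0.1", "me@example.org")
print(options["clients_useragent"])

# Intended use (requires mwclient and network access):
#   import mwclient
#   site = mwclient.Site("en.wikipedia.org", **options)
```

Keeping the options in a plain dictionary makes it easy to reuse the same user-agent prefix across several Site objects.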

allcategories(start=None, prefix=None, dir='ascending', limit=None, generator=True, end=None)

Retrieve all categories on the wiki as a generator.

allimages(start=None, prefix=None, minsize=None, maxsize=None, limit=None, dir='ascending', sha1=None, sha1base36=None, generator=True, end=None)

Retrieve all images on the wiki as a generator.

Retrieve a list of all links on the wiki as a generator.

allpages(start=None, prefix=None, namespace='0', filterredir='all', minsize=None, maxsize=None, prtype=None, prlevel=None, limit=None, dir='ascending', filterlanglinks='all', generator=True, end=None)

Retrieve all pages on the wiki as a generator.

allusers(start=None, prefix=None, group=None, prop=None, limit=None, witheditsonly=False, activeusers=False, rights=None, end=None)

Retrieve all users on the wiki as a generator.

api(action, http_method='POST', *args, **kwargs)

Perform a generic API call and handle errors.

All arguments will be passed on.

Parameters:
  • action (str) – The MediaWiki API action to be performed.

  • http_method (str) – The HTTP method to use.

Example

To get coordinates from the GeoData MediaWiki extension at English Wikipedia:

>>> site = Site('en.wikipedia.org')
>>> result = site.api('query', prop='coordinates', titles='Oslo|Copenhagen')
>>> for page in result['query']['pages'].values():
...     if 'coordinates' in page:
...         print('{} {} {}'.format(page['title'],
...             page['coordinates'][0]['lat'],
...             page['coordinates'][0]['lon']))
Oslo 59.95 10.75
Copenhagen 55.6761 12.5683
Returns:

The raw response from the API call, as a dictionary.
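The GeoData example above can be wrapped in a small function that returns the coordinates instead of printing them. This is only a sketch: fetch_coordinates is a hypothetical helper, not part of mwclient, and it assumes the response layout shown in the example (result['query']['pages']).

```python
def fetch_coordinates(site, titles):
    """Return {title: (lat, lon)} for the pages that have coordinates.

    `site` is expected to behave like mwclient.Site; `titles` is a list of
    page titles. Hypothetical helper, not part of mwclient.
    """
    # The MediaWiki API takes multiple titles as one pipe-separated string.
    result = site.api("query", prop="coordinates", titles="|".join(titles))
    coordinates = {}
    for page in result.get("query", {}).get("pages", {}).values():
        # Pages without GeoData coordinates have no 'coordinates' key.
        if "coordinates" in page:
            first = page["coordinates"][0]
            coordinates[page["title"]] = (first["lat"], first["lon"])
    return coordinates
```

With a real site object, a call might look like fetch_coordinates(site, ['Oslo', 'Copenhagen']).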

ask(query, title=None)

Ask a query against Semantic MediaWiki.

API doc: https://semantic-mediawiki.org/wiki/Ask_API

Parameters:

query (str) – The SMW query to be executed.

Returns:

Generator for retrieving all search results, with each answer as a dictionary. If the query is invalid, an APIError is raised. A valid query with zero results will not raise any error.

Examples

>>> query = "[[Category:my cat]]|[[Has name::a name]]|?Has property"
>>> for answer in site.ask(query):
...     for title, data in answer.items():
...         print(title)
...         print(data)
blocks(start=None, end=None, dir='older', ids=None, users=None, limit=None, prop='id|user|by|timestamp|expiry|reason|flags')

Retrieve blocks as a generator.

API doc: https://www.mediawiki.org/wiki/API:Blocks

Returns:

Generator yielding dicts, each dict containing:
  • user: The username or IP address of the user

  • id: The ID of the block

  • timestamp: When the block was added

  • expiry: When the block runs out (infinity for indefinite blocks)

  • reason: The reason they are blocked

  • allowusertalk: Key is present (empty string) if the user is allowed to edit their user talk page

  • by: The administrator who blocked the user

  • nocreate: Key is present (empty string) if the user’s ability to create accounts has been disabled.

Return type:

mwclient.listings.List

See also

When using the users filter to search for blocked users, only one block per given user will be returned. If you want to retrieve the entire block log for a specific user, you can use the Site.logevents() method with type=block and title='User:JohnDoe'.
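Because flags such as allowusertalk are signalled by the presence of a key rather than by its value, filtering blocks needs a membership test. The sketch below illustrates this; indefinite_blocks is a hypothetical helper, and it assumes only the dictionary keys listed above.

```python
def indefinite_blocks(site, **filters):
    """Yield (user, reason, can_edit_talk) for blocks that never expire.

    `filters` are passed through to site.blocks(). Hypothetical helper.
    """
    for block in site.blocks(**filters):
        # Indefinite blocks report the literal expiry value 'infinity'.
        if block.get("expiry") != "infinity":
            continue
        # Flag keys such as 'allowusertalk' are present (as an empty string)
        # only when set, so test for the key rather than for its value.
        yield block.get("user"), block.get("reason"), "allowusertalk" in block
```

Testing `if block['allowusertalk']` instead would always be false, since the value is an empty string when the key is present.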

checkuserlog(user=None, target=None, limit=10, dir='older', start=None, end=None)

Retrieve checkuserlog items as a generator.

chunk_upload(file, filename, ignorewarnings, comment, text)

Upload a file to the site in chunks.

This method is called by Site.upload if you are connecting to a newer MediaWiki installation, so it’s normally not necessary to call this method directly.

Parameters:
  • file (file-like object) – File object or stream to upload.

  • params (dict) – Dict containing upload parameters.

clientlogin(cookies=None, **kwargs)

Log in to the wiki using a username and password. The method returns True on success, or the API response if the call started a multi-step login process. On failure, it raises one of the errors listed below.

Example for classic username / password clientlogin request:
>>> try:
...     site.clientlogin(username='myusername', password='secret')
... except mwclient.errors.LoginError as e:
...     print('Could not login to MediaWiki: %s' % e)
Parameters:
  • cookies (dict) – Custom cookies to include with the log-in request.

  • **kwargs (dict) –

    Custom variables passed to clientlogin:

      - loginmergerequestfields
      - loginpreservestate
      - loginreturnurl
      - logincontinue
      - logintoken
      - *: additional parameters, depending on the available auth requests.

    To log in with a classic username and password, pass username and password.

    See https://www.mediawiki.org/wiki/API:Login#Method_2._clientlogin

Raises:
  • LoginError (mwclient.errors.LoginError) – Login failed, the reason can be obtained from e.code and e.info (where e is the exception object) and will be one of the API:Login errors. The most common error code is “Failed”, indicating a wrong username or password.

  • MaximumRetriesExceeded – API call to log in failed and was retried until all retries were exhausted. This will not occur if the credentials are merely incorrect. See MaximumRetriesExceeded for possible reasons.

  • APIError – An API error occurred. Rare, usually indicates an internal server error.
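Since clientlogin() returns either True or a response object for a multi-step login, callers may want to tell the two cases apart explicitly. The following is a minimal sketch; start_clientlogin is a hypothetical wrapper, and the contents of the multi-step response are not described here, so the wrapper simply hands it back.

```python
def start_clientlogin(site, username, password):
    """Return (True, None) on success, or (False, response) if more steps remain.

    Errors raised by site.clientlogin() (for example LoginError) are not
    caught here and propagate to the caller. Hypothetical helper.
    """
    result = site.clientlogin(username=username, password=password)
    # Compare with `is True`: a multi-step response would usually be truthy
    # as well, so a plain `if result:` could not distinguish the two cases.
    if result is True:
        return True, None
    return False, result
```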

email(user, text, subject, cc=False)

Send email to a specified user on the wiki.

>>> try:
...     site.email('SomeUser', 'Some message', 'Some subject')
... except mwclient.errors.NoSpecifiedEmail:
...     print('User does not accept email, or has no email address.')
Parameters:
  • user (str) – User name of the recipient

  • text (str) – Body of the email

  • subject (str) – Subject of the email

  • cc (bool) – True to send a copy of the email to yourself (default is False)

Returns:

Dictionary of the JSON response

Raises:
  • NoSpecifiedEmail (mwclient.errors.NoSpecifiedEmail) – User doesn’t accept email

  • EmailError (mwclient.errors.EmailError) – Other email errors

expandtemplates(text, title=None, generatexml=False)

Takes wikitext (text) and expands templates.

API doc: https://www.mediawiki.org/wiki/API:Expandtemplates

Parameters:
  • text (str) – Wikitext to convert.

  • title (str) – Title of the page.

  • generatexml (bool) – Generate the XML parse tree. Defaults to False.

exturlusage(query, prop=None, protocol='http', namespace=None, limit=None)

Retrieve the list of pages that link to a particular domain or URL, as a generator.

This API call mirrors the Special:LinkSearch function on-wiki.

Query can be a domain like ‘bbc.co.uk’. Wildcards can be used, e.g. ‘*.bbc.co.uk’. Alternatively, a query can contain a full domain name and some or all of a URL: e.g. ‘*.wikipedia.org/wiki/*’

See <https://meta.wikimedia.org/wiki/Help:Linksearch> for details.

Returns:

Generator yielding dicts, each dict containing:
  • url: The URL linked to.

  • ns: Namespace of the wiki page

  • pageid: The ID of the wiki page

  • title: The page title.

Return type:

mwclient.listings.List

get(action, *args, **kwargs)

Perform a generic API call using GET.

This is just a shorthand for calling api() with http_method='GET'. All arguments will be passed on.

Parameters:

action (str) – The MediaWiki API action to be performed.

Returns:

The raw response from the API call, as a dictionary.
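A read-only query is a natural fit for get(). The sketch below requests general site information using the standard MediaWiki query parameters meta='siteinfo' and siprop='general'; general_siteinfo is a hypothetical helper, not part of mwclient.

```python
def general_siteinfo(site):
    """Fetch general site information with a GET request (hypothetical helper)."""
    # Equivalent to:
    #   site.api('query', http_method='GET', meta='siteinfo', siprop='general')
    result = site.get("query", meta="siteinfo", siprop="general")
    # The siteinfo module places its 'general' data under the 'query' key.
    return result["query"]["general"]
```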

get_token(type, force=False, title=None)

Request a MediaWiki access token of the given type.

Parameters:
  • type (str) – The type of token to request.

  • force (bool) – Force the request of a new token, even if a token of that type has already been cached.

  • title (str) – The page title for which to request a token. Only used for MediaWiki versions below 1.24.

Returns:

A MediaWiki token of the requested type.

Raises:

errors.APIError – A token of the given type could not be retrieved.

handle_api_result(info, kwargs=None, sleeper=None)

Checks the given API response, raising an appropriate exception or sleeping if necessary.

Parameters:
  • info (dict) – The API result.

  • kwargs (dict) – Additional arguments to be passed when raising an errors.APIError.

  • sleeper (sleep.Sleeper) – A Sleeper instance to use when sleeping.

Returns:

False if the given API response contains an exception, else True.

logevents(type=None, prop=None, start=None, end=None, dir='older', user=None, title=None, limit=None, action=None)

Retrieve logevents as a generator.

login(username=None, password=None, cookies=None, domain=None)

Log in to the wiki using a username and bot password. The method returns nothing if the login was successful, but raises an error if it was not. If you use MediaWiki >= 1.27 and want to log in with a normal account (not a bot password account), use clientlogin instead: the login action has been deprecated for normal accounts since 1.27 and will stop working in the near future. See the API:Login documentation on mediawiki.org to learn more.

Note: at least until v1.33.1, bot password accounts do not seem to have the “userrights” permission. Updating a user’s groups requires this permission, so for that you must use clientlogin with a user who has the userrights permission (a bureaucrat, for example).

Parameters:
  • username (str) – MediaWiki username

  • password (str) – MediaWiki password

  • cookies (dict) – Custom cookies to include with the log-in request.

  • domain (str) – Sends domain name for authentication; used by some MediaWiki plug-ins like the ‘LDAP Authentication’ extension.

Raises:
  • LoginError (mwclient.errors.LoginError) – Login failed, the reason can be obtained from e.code and e.info (where e is the exception object) and will be one of the API:Login errors. The most common error code is “Failed”, indicating a wrong username or password.

  • MaximumRetriesExceeded – API call to log in failed and was retried until all retries were exhausted. This will not occur if the credentials are merely incorrect. See MaximumRetriesExceeded for possible reasons.

  • APIError – An API error occurred. Rare, usually indicates an internal server error.

parse(text=None, title=None, page=None, prop=None, redirects=False, mobileformat=False)

Parses the given content and returns parser output.

Parameters:
  • text (str) – Text to parse.

  • title (str) – Title of page the text belongs to.

  • page (str) – The name of a page to parse. Cannot be used together with text and title.

  • prop (str) – Which pieces of information to get. Multiple values should be separated using the pipe (|) character.

  • redirects (bool) – Resolve the redirect, if the given page is a redirect. Defaults to False.

  • mobileformat (bool) – Return parse output in a format suitable for mobile devices. Defaults to False.

Returns:

The parse output as generated by MediaWiki.

patrol(rcid=None, revid=None, tags=None)

Patrol a page or a revision. Either rcid or revid (but not both) must be given. The rcid and revid arguments may be obtained using the Site.recentchanges() function.

API doc: https://www.mediawiki.org/wiki/API:Patrol

Parameters:
  • rcid (int) – The recentchanges ID to patrol.

  • revid (int) – The revision ID to patrol.

  • tags (str) – Change tags to apply to the entry in the patrol log. Multiple tags can be given, by separating them with the pipe (|) character.

Returns:

The API response as a dictionary containing:

  • rcid (int): The recentchanges id.

  • nsid (int): The namespace id.

  • title (str): The page title.

Return type:

Dict[str, Any]

Raises:

errors.APIError – The MediaWiki API returned an error.

Notes

  • autopatrol rights are required in order to use this function.

  • revid requires at least MediaWiki 1.22.

  • tags requires at least MediaWiki 1.27.
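The rule that exactly one of rcid and revid must be given can be checked before any request is sent. The wrapper below is a sketch of that check; patrol_change is a hypothetical name, and joining a list of tags with the pipe character is a convenience added here, not something patrol() is documented to do itself.

```python
def patrol_change(site, rcid=None, revid=None, tags=None):
    """Call site.patrol() after checking that exactly one ID was given."""
    # True when both are None or both are set; either case is invalid.
    if (rcid is None) == (revid is None):
        raise ValueError("Pass exactly one of rcid or revid")
    # Convenience (assumption of this sketch): accept a list of tags and
    # join it into the pipe-separated string that the tags parameter expects.
    if isinstance(tags, (list, tuple)):
        tags = "|".join(tags)
    return site.patrol(rcid=rcid, revid=revid, tags=tags)
```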

post(action, *args, **kwargs)

Perform a generic API call using POST.

This is just a shorthand for calling api() with http_method='POST'. All arguments will be passed on.

Parameters:

action (str) – The MediaWiki API action to be performed.

Returns:

The raw response from the API call, as a dictionary.

random(namespace, limit=20)

Retrieve a generator of random pages from a particular namespace.

limit specifies the number of random articles retrieved. namespace is a namespace identifier integer.

The generator yields dictionaries containing the namespace, page ID and title of each page.

raw_api(action, http_method='POST', retry_on_error=True, *args, **kwargs)

Send a call to the API.

Parameters:
  • action (str) – The MediaWiki API action to perform.

  • http_method (str) – The HTTP method to use in the request.

  • retry_on_error (bool) – Whether to retry API call on connection errors.

  • *args (Tuple[str, Any]) – Arguments to be passed to the api.php script as data.

  • **kwargs (Any) – Arguments to be passed to the api.php script as data.

Returns:

The API response.

Raises:
  • errors.APIDisabledError – The MediaWiki API is disabled for this instance.

  • errors.InvalidResponse – The API response could not be decoded from JSON.

  • errors.MaximumRetriesExceeded – The API request failed and the maximum number of retries was exceeded.

  • requests.exceptions.HTTPError – Received an invalid HTTP response, or a status code in the 4xx range.

  • requests.exceptions.ConnectionError – Encountered an unexpected error while performing the API request.

  • requests.exceptions.Timeout – The API request timed out.

raw_call(script, data, files=None, retry_on_error=True, http_method='POST')

Perform a generic request and return the raw text.

In the event of a network problem, or an HTTP response with status code 5XX, we’ll wait and retry the configured number of times before giving up if retry_on_error is True.

requests.exceptions.HTTPError is still raised directly for HTTP responses with status codes in the 4XX range, and invalid HTTP responses.

Parameters:
  • script (str) – Script name, usually ‘api’.

  • data (dict) – Post data

  • files (dict) – Files to upload

  • retry_on_error (bool) – Retry on connection error

  • http_method (str) – The HTTP method, defaults to ‘POST’

Returns:

The raw text response.
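The retry behaviour described above (retry on 5XX, fail immediately on 4XX) can be illustrated with a standalone sketch. This is not mwclient's implementation: the function, the two exception classes and the `send` callable are all hypothetical stand-ins, and the back-off formula is an assumption based on the retry_timeout description ("seconds to sleep for each past retry").

```python
import time


class ClientError(Exception):
    """Stand-in for an HTTP 4XX failure (hypothetical)."""


class RetriesExhausted(Exception):
    """Stand-in for giving up after too many attempts (hypothetical)."""


def call_with_retries(send, max_retries=25, retry_timeout=30, sleep=time.sleep):
    """Call send() until it returns a status code below 500.

    `send` is any zero-argument callable returning (status_code, text).
    """
    retries = 0
    while True:
        status, text = send()
        if 400 <= status < 500:
            # Client errors are raised directly and never retried.
            raise ClientError(status)
        if status < 500:
            return text
        retries += 1
        if retries > max_retries:
            raise RetriesExhausted(status)
        # Assumed back-off: retry_timeout seconds for each past retry,
        # so the first retry happens without waiting.
        sleep(retry_timeout * (retries - 1))
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real waiting.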

raw_index(action, http_method='POST', *args, **kwargs)

Sends a call to index.php rather than the API.

Parameters:
  • action (str) – The MediaWiki API action to perform.

  • http_method (str) – The HTTP method to use in the request.

  • *args (Tuple[str, Any]) – Arguments to be passed to the index.php script as data.

  • **kwargs (Any) – Arguments to be passed to the index.php script as data.

Returns:

The API response.

recentchanges(start=None, end=None, dir='older', namespace=None, prop=None, show=None, limit=None, type=None, toponly=None)

List recent changes to the wiki, à la Special:Recentchanges.

require(major, minor, revision=None, raise_error=True)

Check whether the current wiki matches the required version.

Parameters:
  • major (int) – The required major version.

  • minor (int) – The required minor version.

  • revision (int) – The required revision.

  • raise_error (bool) – Whether to throw an error if the version of the current wiki is below the required version. Defaults to True.

Returns:

False if the version of the current wiki is below the required version, else True. If either raise_error=True or the site is uninitialized and raise_error=None then nothing is returned.

Raises:
  • errors.MediaWikiVersionError – The current wiki is below the required version and raise_error=True.

  • RuntimeError – If raise_error is None and the version attribute is unset. The version is usually set automatically on construction of the Site, unless do_init=False is passed to the constructor. After instantiation, the site_init() function can be used to retrieve and set the version.

  • NotImplementedError – If the revision argument was passed. The logic for this is currently unimplemented.
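When a version check should only steer a decision rather than abort, raise_error=False turns require() into a boolean test. A minimal sketch, assuming an initialized site; supports is a hypothetical helper name.

```python
def supports(site, major, minor):
    """Return True if the wiki runs at least MediaWiki major.minor."""
    # With raise_error=False, require() reports the outcome as a boolean
    # instead of raising errors.MediaWikiVersionError. bool() maps an
    # unexpected None result to False.
    return bool(site.require(major, minor, raise_error=False))
```

For example, a caller could pass tags to patrol() only when supports(site, 1, 27) is true.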

revisions(revids, prop='ids|timestamp|flags|comment|user')

Get data about a list of revisions.

See also the Page.revisions() method.

API doc: https://www.mediawiki.org/wiki/API:Revisions

Example: Get revision text for two revisions:

>>> for revision in site.revisions([689697696, 689816909], prop='content'):
...     print(revision['*'])
Parameters:
  • revids (list) – A list of (max 50) revisions.

  • prop (str) – Which properties to get for each revision.

Returns:

A list of revisions
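Because revids accepts at most 50 revisions per call, longer lists have to be split into batches. The generator below sketches one way to do that; iter_revisions and its batch_size parameter are hypothetical and not part of mwclient.

```python
def iter_revisions(site, revids, prop="ids|timestamp|flags|comment|user",
                   batch_size=50):
    """Yield revision data for any number of revision IDs, in batches."""
    revids = list(revids)
    for start in range(0, len(revids), batch_size):
        batch = revids[start:start + batch_size]
        # Each call stays within the documented limit of 50 revisions.
        for revision in site.revisions(batch, prop=prop):
            yield revision
```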

search(search, namespace='0', what=None, redirects=False, limit=None)

Perform a full text search.

API doc: https://www.mediawiki.org/wiki/API:Search

Example

>>> for result in site.search('prefix:Template:Citation/'):
...     print(result.get('title'))
Parameters:
  • search (str) – The query string

  • namespace (int) – The namespace to search (default: 0)

  • what (str) – Search scope: ‘text’ for fulltext, or ‘title’ for titles only. Depending on the search backend, both options may not be available. For instance CirrusSearch doesn’t support ‘title’, but instead provides an “intitle:” query string filter.

  • redirects (bool) – Include redirect pages in the search (option removed in MediaWiki 1.23).

Returns:

Search results iterator

Return type:

mwclient.listings.List
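Since the result is an iterator, itertools.islice is a convenient way to take only the first few hits. A small sketch, with first_titles as a hypothetical helper:

```python
from itertools import islice


def first_titles(site, query, count=10, namespace="0"):
    """Return up to `count` page titles matching a full-text search."""
    results = site.search(query, namespace=namespace)
    # islice stops consuming the iterator after `count` results.
    return [result.get("title") for result in islice(results, count)]
```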

site_init()

Populates the object with information about the current user and site. This is done automatically when creating the object, unless explicitly disabled using the do_init=False constructor argument.

upload(file=None, filename=None, description='', ignore=False, file_size=None, url=None, filekey=None, comment=None)

Upload a file to the site.

Note that one of file, filekey and url must be specified, but not more than one. For normal uploads, you specify file.

Parameters:
  • file (str) – File object or stream to upload.

  • filename (str) – Destination filename, don’t include namespace prefix like ‘File:’

  • description (str) – Wikitext for the file description page.

  • ignore (bool) – True to upload despite any warnings.

  • file_size (int) – Deprecated in mwclient 0.7

  • url (str) – URL to fetch the file from.

  • filekey (str) – Key that identifies a previous upload that was stashed temporarily.

  • comment (str) – Upload comment. Also used as the initial page text for new files if description is not specified.

Example

>>> client.upload(open('somefile', 'rb'), filename='somefile.jpg',
...               description='Some description')
Returns:

JSON result from the API.

usercontributions(user, start=None, end=None, dir='older', namespace=None, prop=None, show=None, limit=None, uselang=None)

List the contributions made by a given user to the wiki.

API doc: https://www.mediawiki.org/wiki/API:Usercontribs

users(users, prop='blockinfo|groups|editcount')

Get information about a list of users.

API doc: https://www.mediawiki.org/wiki/API:Users

static version_tuple_from_generator(string, prefix='MediaWiki ')

Return a version tuple from a MediaWiki Generator string.

Example

>>> Site.version_tuple_from_generator("MediaWiki 1.5.1")
(1, 5, 1)
Parameters:
  • string (str) – The MediaWiki Generator string.

  • prefix (str) – The expected prefix of the string.

Returns:

A tuple containing the individual elements of the given version number.
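To illustrate the idea behind this method, here is a simplified parser for purely numeric version strings. It is a sketch only, not mwclient's implementation: parse_generator is a hypothetical name, and real generator strings may carry suffixes that this version does not handle.

```python
def parse_generator(string, prefix="MediaWiki "):
    """Parse 'MediaWiki X.Y.Z' into a tuple of integers (simplified sketch)."""
    if not string.startswith(prefix):
        raise ValueError("Unexpected generator string: {!r}".format(string))
    # Drop the prefix, then split the remaining version number on dots.
    parts = string[len(prefix):].split(".")
    # int() raises ValueError for non-numeric parts, which this sketch
    # deliberately does not try to interpret.
    return tuple(int(part) for part in parts)
```

Tuples compare element by element, so a result such as (1, 5, 1) can be compared directly against a required version like (1, 5).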

watchlist(allrev=False, start=None, end=None, namespace=None, dir='older', prop=None, show=None, limit=None)

List the pages on the current user’s watchlist.

API doc: https://www.mediawiki.org/wiki/API:Watchlist