API documentation

Using Sphinx’s sphinx.ext.autodoc plugin, it is possible to auto-generate documentation of a Python module.

Tip

To keep type annotations out of function signatures in autodoc output, set the following options in conf.py:

# -- Options for autodoc ----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html#configuration

# Automatically extract typehints when specified and place them in
# descriptions of the relevant function/method.
autodoc_typehints = "description"

# Don't show class signature with the class' name.
autodoc_class_signature = "separated"
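The options above only take effect once the extension itself is enabled. A minimal conf.py sketch follows; the project name and the choice of default options are placeholders for illustration, not settings taken from this documentation.

```python
# conf.py (sketch; "example-project" is a placeholder name)
project = "example-project"

# Enable autodoc so that directives such as ``automodule`` are available.
extensions = ["sphinx.ext.autodoc"]

# Assumed defaults for this sketch: document all members, including
# those without docstrings, unless a directive says otherwise.
autodoc_default_options = {
    "members": True,
    "undoc-members": True,
}
```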

An extensible library for opening URLs using a variety of protocols

The simplest way to use this module is to call the urlopen function, which accepts a string containing a URL or a Request object (described below). It opens the URL and returns the results as a file-like object; the returned object has some extra methods described below.

The OpenerDirector manages a collection of Handler objects that do all the actual work. Each Handler implements a particular protocol or option. The OpenerDirector is a composite object that invokes the Handlers needed to open the requested URL. For example, the HTTPHandler performs HTTP GET and POST requests and deals with non-error returns. The HTTPRedirectHandler automatically deals with HTTP 301, 302, 303 and 307 redirect errors, and the HTTPDigestAuthHandler deals with digest authentication.

urlopen(url, data=None) – Basic usage is the same as the original urllib: pass the URL and optionally data to post to an HTTP URL, and get a file-like object back. One difference is that you can also pass a Request instance instead of a URL. Raises a URLError (a subclass of OSError); for HTTP errors, raises an HTTPError, which can also be treated as a valid response.

build_opener – Function that creates a new OpenerDirector instance. Will install the default handlers. Accepts one or more Handlers as arguments, either instances or Handler classes that it will instantiate. If one of the arguments is a subclass of a default handler, the argument will be installed instead of the default.

install_opener – Installs a new opener as the default opener.
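As a rough illustration of the urlopen behaviour described above, the sketch below opens a URL and separates the two error types. The helper name fetch is hypothetical, and a data: URL is used so the example runs without network access; the status attribute assumes Python 3.9 or later.

```python
import urllib.error
import urllib.request


def fetch(url, data=None):
    """Return (status, body) for url, or (None, reason) if it cannot be opened."""
    try:
        with urllib.request.urlopen(url, data=data) as response:
            # status is None for non-HTTP schemes such as data: or file:
            return response.status, response.read()
    except urllib.error.HTTPError as err:
        # HTTPError must be caught first: it is a subclass of URLError and
        # doubles as a response object with a code and a readable body.
        return err.code, err.read()
    except urllib.error.URLError as err:
        # Raised for unknown schemes, DNS failures, refused connections, etc.
        return None, str(err.reason)


# A data: URL keeps the sketch runnable offline.
status, body = fetch("data:text/plain;charset=utf-8,hello")
```

Catching HTTPError before URLError matters because of the subclass relationship; with the order reversed, the HTTPError branch would never run.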

objects of interest:

OpenerDirector – Sets up the User Agent as the Python-urllib client and manages the Handler classes, while dealing with requests and responses.

Request – An object that encapsulates the state of a request. The state can be as simple as the URL. It can also include extra HTTP headers, e.g. a User-Agent.

BaseHandler –

internals: BaseHandler and parent _call_chain conventions
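A Request can be built and inspected without sending anything, which makes its state easy to explore. The following is a minimal sketch; example.org and the header values are placeholders.

```python
import urllib.request

# example.org and the header values are placeholders.
req = urllib.request.Request(
    "https://example.org/api",
    headers={"User-Agent": "docs-example/1.0"},
)
req.add_header("Accept", "application/json")

# Header names are normalised with str.capitalize(), so look them up that way.
user_agent = req.get_header("User-agent")

# Without data the method defaults to GET; attaching bytes switches it to POST.
method_without_data = req.get_method()
req.data = b"key=value"
method_with_data = req.get_method()
```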

Example usage:

import urllib.request

# set up authentication info
authinfo = urllib.request.HTTPBasicAuthHandler()
authinfo.add_password(realm='PDQ Application',
                      uri='https://mahler:8092/site-updates.py',
                      user='klem',
                      passwd='geheim$parole')

proxy_support = urllib.request.ProxyHandler({"http": "http://ahad-haam:3128"})

# build a new opener that adds authentication and caching FTP handlers
opener = urllib.request.build_opener(proxy_support, authinfo,
                                     urllib.request.CacheFTPHandler)

# install it
urllib.request.install_opener(opener)

f = urllib.request.urlopen('https://www.python.org/')

class urllib.request.FancyURLopener(*args, **kwargs)

Derived class with handlers for errors we can handle (perhaps).

http_error_301(url, fp, errcode, errmsg, headers, data=None)

Error 301 – also relocated (permanently).

http_error_302(url, fp, errcode, errmsg, headers, data=None)

Error 302 – relocated (temporarily).

http_error_303(url, fp, errcode, errmsg, headers, data=None)

Error 303 – also relocated (essentially identical to 302).

http_error_307(url, fp, errcode, errmsg, headers, data=None)

Error 307 – relocated, but turn POST into error.

http_error_401(url, fp, errcode, errmsg, headers, data=None, retry=False)

Error 401 – authentication required. This function supports Basic authentication only.

http_error_407(url, fp, errcode, errmsg, headers, data=None, retry=False)

Error 407 – proxy authentication required. This function supports Basic authentication only.

http_error_default(url, fp, errcode, errmsg, headers)

Default error handling – don’t raise an exception.

prompt_user_passwd(host, realm)

Override this in a GUI environment!

class urllib.request.HTTPDigestAuthHandler(passwd=None)

An authentication protocol defined by RFC 2069.

Digest authentication improves on basic authentication because it does not transmit passwords in the clear.
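One plausible way to wire up digest authentication is to register credentials with a password manager and pass it to the handler. The site and credentials below are placeholders, and nothing is sent over the network in this sketch.

```python
import urllib.request

# Placeholder site and credentials, for illustration only.
password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
# A realm of None makes these credentials the fallback for any realm.
password_mgr.add_password(None, "https://example.org/", "klem", "secret")

digest_handler = urllib.request.HTTPDigestAuthHandler(password_mgr)
opener = urllib.request.build_opener(digest_handler)

# The manager resolves credentials for any URL below the registered prefix.
credentials = password_mgr.find_user_password(
    "any realm", "https://example.org/private/"
)
```

Requests made through opener.open() would then answer a digest challenge from that site with the stored credentials.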

class urllib.request.HTTPErrorProcessor

Process HTTP error responses.

class urllib.request.URLopener(proxies=None, **x509)

Class to open URLs. This is a class rather than just a subroutine because we may need more than one set of global protocol-specific options. Note – this is a base class for those who don’t want the automatic handling of errors type 302 (relocated) and 401 (authorization needed).

addheader(*args)

Add a header to be used by the HTTP interface only, e.g. u.addheader('Accept', 'sound/basic')

http_error(url, fp, errcode, errmsg, headers, data=None)

Handle http errors.

Derived class can override this, or provide specific handlers named http_error_DDD where DDD is the 3-digit error code.

http_error_default(url, fp, errcode, errmsg, headers)

Default error handler: close the connection and raise OSError.

open(fullurl, data=None)

Use URLopener().open(file) instead of open(file, 'r').

open_data(url, data=None)

Use “data” URL.

open_file(url)

Use local file or FTP depending on form of URL.

open_ftp(url)

Use FTP protocol.

open_http(url, data=None)

Use HTTP protocol.

open_https(url, data=None)

Use HTTPS protocol.

open_local_file(url)

Use local file.

open_unknown(fullurl, data=None)

Overridable interface to open unknown URL type.

open_unknown_proxy(proxy, fullurl, data=None)

Overridable interface to open unknown URL type.

retrieve(url, filename=None, reporthook=None, data=None)

retrieve(url) returns (filename, headers) for a local object or (tempfilename, headers) for a remote object.

urllib.request.build_opener(*handlers)

Create an opener object from a list of handlers.

The opener will use several default handlers, including support for HTTP, FTP and when applicable HTTPS.

If any of the handlers passed as arguments are subclasses of the default handlers, the corresponding default handlers will not be used.
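The replacement rule can be checked by inspecting the opener's handler list. In this sketch, NoRedirectHandler is a hypothetical subclass written for illustration; it is not part of urllib.request.

```python
import urllib.request


class NoRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Hypothetical handler that declines to follow redirects."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None declines the redirect, so the 3xx response is
        # left to the default error handling.
        return None


# A handler class (not an instance) is instantiated by build_opener.
opener = urllib.request.build_opener(NoRedirectHandler)

# Only the subclass should be present; the stock redirect handler is skipped.
redirect_handlers = [
    h for h in opener.handlers
    if isinstance(h, urllib.request.HTTPRedirectHandler)
]
```

The other defaults, such as HTTPHandler, are still installed; only the default that the argument subclasses is left out.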

urllib.request.getproxies()

Return a dictionary of scheme -> proxy server URL mappings.

Scan the environment for variables named <scheme>_proxy; this seems to be the standard convention. If you need a different way, you can pass a proxies dictionary to the [Fancy]URLopener constructor.
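The environment scan can be tried out safely by patching os.environ for the duration of the call. The proxy address is a placeholder, and passing the result to ProxyHandler is just one possible use of the mapping.

```python
import os
import urllib.request
from unittest import mock

# proxy.example:3128 is a placeholder; patch.dict restores the
# environment when the with-block exits.
with mock.patch.dict(os.environ, {"ftp_proxy": "http://proxy.example:3128"}):
    proxies = urllib.request.getproxies()

# The mapping can also be handed to a ProxyHandler explicitly.
proxy_handler = urllib.request.ProxyHandler(proxies)
opener = urllib.request.build_opener(proxy_handler)
```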

urllib.request.pathname2url(pathname)

OS-specific conversion from a file system path to a relative URL of the ‘file’ scheme; not recommended for general use.

urllib.request.url2pathname(pathname)

OS-specific conversion from a relative URL of the ‘file’ scheme to a file system path; not recommended for general use.
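The two conversions are intended to be inverses of each other. A small round-trip sketch with a made-up relative path shows this; the exact URL form is OS-specific, so no particular output is assumed beyond percent-encoding of the spaces.

```python
import os
import urllib.request

# A made-up relative path with spaces; os.path.join keeps it platform-neutral.
path = os.path.join("some dir", "file name.txt")

# Spaces are percent-encoded in the URL form.
url_path = urllib.request.pathname2url(path)

# Converting back should give the original path.
round_trip = urllib.request.url2pathname(url_path)
```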

urllib.request.urlcleanup()

Clean up temporary files from urlretrieve calls.

urllib.request.urlopen(url, data=None, timeout=<object object>, *, cafile=None, capath=None, cadefault=False, context=None)

Open the URL url, which can be either a string or a Request object.

data must be an object specifying additional data to be sent to the server, or None if no such data is needed. See Request for details.

The urllib.request module uses HTTP/1.1 and includes a "Connection: close" header in its HTTP requests.

The optional timeout parameter specifies a timeout in seconds for blocking operations like the connection attempt (if not specified, the global default timeout setting will be used). This only works for HTTP, HTTPS and FTP connections.

If context is specified, it must be a ssl.SSLContext instance describing the various SSL options. See HTTPSConnection for more details.

The optional cafile and capath parameters specify a set of trusted CA certificates for HTTPS requests. cafile should point to a single file containing a bundle of CA certificates, whereas capath should point to a directory of hashed certificate files. More information can be found in ssl.SSLContext.load_verify_locations().

The cadefault parameter is ignored.

This function always returns an object which can work as a context manager and has the properties url, headers, and status. See urllib.response.addinfourl for more detail on these properties.

For HTTP and HTTPS URLs, this function returns a slightly modified http.client.HTTPResponse object. In addition to the three properties above, the msg attribute contains the same information as the reason attribute (the reason phrase returned by the server) instead of the response headers, as specified in the documentation for HTTPResponse.

For FTP, file, and data URLs and requests explicitly handled by legacy URLopener and FancyURLopener classes, this function returns a urllib.response.addinfourl object.

Note that None may be returned if no handler handles the request (though the default installed global OpenerDirector uses UnknownHandler to ensure this never happens).

In addition, if proxy settings are detected (for example, when a *_proxy environment variable like http_proxy is set), ProxyHandler is installed by default and makes sure the requests are handled through the proxy.
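The timeout and context parameters, together with the context-manager behaviour of the return value, might be combined as in the sketch below. The helper name read_url and the 10-second default are assumptions; a data: URL is used so the example runs offline, where the SSL context is accepted but not exercised.

```python
import ssl
import urllib.request


def read_url(url, timeout=10.0):
    """Return (final_url, content_type, body) for url."""
    # Default context: certificate verification and hostname checking enabled.
    context = ssl.create_default_context()
    with urllib.request.urlopen(url, timeout=timeout, context=context) as response:
        # The response is closed automatically when the with-block exits.
        return response.url, response.headers.get_content_type(), response.read()


final_url, content_type, body = read_url(
    "data:text/plain;charset=utf-8,hello%20world"
)
```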

urllib.request.urlretrieve(url, filename=None, reporthook=None, data=None)

Retrieve a URL into a temporary location on disk.

Requires a URL argument. If a filename is passed, it is used as the temporary file location. The reporthook argument should be a callable that accepts a block number, a read size, and the total file size of the URL target. The data argument should be valid URL encoded data.

If a filename is passed and the URL points to a local resource, the result is a copy from local file to new file.

Returns a tuple containing the path to the newly created data file as well as the resulting HTTPMessage object.
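The local-copy case and the reporthook callback can be illustrated with a temporary file. The helper name and file names below are hypothetical, chosen only for this sketch.

```python
import tempfile
import urllib.request
from pathlib import Path


def copy_with_urlretrieve(payload):
    """Write payload to a temp file, copy it via urlretrieve, return (copy, calls)."""
    progress = []

    def reporthook(block_number, block_size, total_size):
        # Record each progress callback so the caller can inspect it.
        progress.append((block_number, block_size, total_size))

    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "source.bin"
        source.write_bytes(payload)
        target = Path(tmp) / "copy.bin"
        # A file: URL plus an explicit filename yields a local-to-local copy.
        filename, headers = urllib.request.urlretrieve(
            source.as_uri(), filename=str(target), reporthook=reporthook
        )
        return Path(filename).read_bytes(), progress


copied, progress = copy_with_urlretrieve(b"example payload")
```

When no filename is given for a remote URL, the data goes to a temporary file instead, which urlcleanup() can remove afterwards.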