qs_codec package

Subpackages

Submodules

qs_codec.decode module

A query string decoder (parser).

qs_codec.decode.decode(value: str | ~typing.Dict[str, ~typing.Any] | None, options: ~qs_codec.models.decode_options.DecodeOptions = DecodeOptions(allow_dots=False, decode_dot_in_keys=False, allow_empty_lists=False, list_limit=20, charset=<Charset.UTF8: encoding='utf-8'>, charset_sentinel=False, comma=False, delimiter='&', depth=5, parameter_limit=1000, duplicates=<Duplicates.COMBINE: 1>, ignore_query_prefix=False, interpret_numeric_entities=False, parse_lists=True, strict_depth=False, strict_null_handling=False, raise_on_limit_exceeded=False, decoder=<bound method DecodeUtils.decode of <class 'qs_codec.utils.decode_utils.DecodeUtils'>>)) → Dict[str, Any]

Decodes a query string into a Dict[str, Any].

Providing custom DecodeOptions will override the default behavior.

qs_codec.encode module

A query string encoder (stringifier).

qs_codec.encode.encode(value: ~typing.Any, options: ~qs_codec.models.encode_options.EncodeOptions = EncodeOptions(allow_dots=False, add_query_prefix=False, allow_empty_lists=False, indices=None, list_format=<ListFormat.INDICES: list_format_name='INDICES', generator=<function ListFormatGenerator.indices>>, charset=<Charset.UTF8: encoding='utf-8'>, charset_sentinel=False, delimiter='&', encode=True, encode_dot_in_keys=False, encode_values_only=False, format=<Format.RFC3986: format_name='RFC3986', formatter=<function Formatter.rfc3986>>, filter=None, skip_nulls=False, serialize_date=<function EncodeUtils.serialize_date>, encoder=<function EncodeOptions.encoder.<locals>.<lambda>>, strict_null_handling=False, comma_round_trip=None, sort=None)) → str

Encodes an object into a query string.

Providing custom EncodeOptions will override the default behavior.

Module contents

A query string encoding and decoding library for Python. Ported from qs for JavaScript.

class qs_codec.Charset(*values)

Bases: _CharsetDataMixin, Enum

Character set.

LATIN1 = _CharsetDataMixin(encoding='iso-8859-1')

ISO-8859-1 (Latin-1) character encoding.

UTF8 = _CharsetDataMixin(encoding='utf-8')

UTF-8 character encoding.

class qs_codec.DecodeOptions(allow_dots: bool = None, decode_dot_in_keys: bool = None, allow_empty_lists: bool = False, list_limit: int = 20, charset: ~qs_codec.enums.charset.Charset = Charset.UTF8, charset_sentinel: bool = False, comma: bool = False, delimiter: str | ~typing.Pattern[str] = '&', depth: int = 5, parameter_limit: int | float = 1000, duplicates: ~qs_codec.enums.duplicates.Duplicates = Duplicates.COMBINE, ignore_query_prefix: bool = False, interpret_numeric_entities: bool = False, parse_lists: bool = True, strict_depth: bool = False, strict_null_handling: bool = False, raise_on_limit_exceeded: bool = False, decoder: ~typing.Callable[[str | None, ~qs_codec.enums.charset.Charset | None], ~typing.Any] = <bound method DecodeUtils.decode of <class 'qs_codec.utils.decode_utils.DecodeUtils'>>)

Bases: object

Options that configure the output of decode.

allow_dots: bool = None

Set to True to decode dot dict notation in the encoded input.

allow_empty_lists: bool = False

Set to True to allow empty list values inside dicts in the encoded input.

charset: Charset = _CharsetDataMixin(encoding='utf-8')

The character encoding to use when decoding the input.

charset_sentinel: bool = False

Some services add an initial utf8=✓ value to forms so that old Internet Explorer versions are more likely to submit the form as utf-8. Additionally, the server can check the value against wrong encodings of the checkmark character and detect that a query string or application/x-www-form-urlencoded body was not sent as utf-8, e.g. if the form had an accept-charset parameter or the containing page had a different character set.

qs_codec supports this mechanism via the charset_sentinel option. If specified, the utf8 parameter will be omitted from the returned dict. It will be used to switch to LATIN1 or UTF8 mode depending on how the checkmark is encoded.

Important: When you specify both the charset option and the charset_sentinel option, the charset will be overridden when the request contains a utf8 parameter from which the actual charset can be deduced. In that sense the charset will behave as the default charset rather than the authoritative charset.
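The sentinel check described above can be sketched with the standard library alone. This is an illustration of the mechanism, not qs_codec's implementation; detect_charset is a hypothetical helper, and the two encoded values are the ones listed for the Sentinel enum in this reference.

```python
from urllib.parse import quote

# Encoded sentinel parameters, rebuilt here from their raw forms.
UTF8_SENTINEL = "utf8=" + quote("✓")           # UTF-8 bytes of the checkmark
LATIN1_SENTINEL = "utf8=" + quote("&#10003;")  # numeric entity sent by non-UTF-8 forms


def detect_charset(query: str, default: str = "utf-8") -> str:
    """Hypothetical helper: choose a charset from the utf8 sentinel, if present."""
    for part in query.split("&"):
        if part == UTF8_SENTINEL:
            return "utf-8"
        if part == LATIN1_SENTINEL:
            return "iso-8859-1"
    # No sentinel found: the configured charset acts as the default.
    return default
```

When no sentinel is present, the default argument plays the role that the charset option plays in the paragraph above.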

comma: bool = False

Set to True to parse the input as a comma-separated value. Note: nested dicts, such as 'a={b:1},{c:d}', are not supported.

decode_dot_in_keys: bool = None

Set to True to decode dots in keys. Note: it implies allow_dots, so decode raises an error if decode_dot_in_keys is True and allow_dots is False.

classmethod decoder(string: str | None, charset: Charset | None = Charset.UTF8) → str | None

Decode a URL-encoded string.

For non-UTF8 charsets (specifically Charset.LATIN1), it replaces plus signs with spaces and applies a custom unescape for percent-encoded hex sequences. Otherwise, it defers to urllib.parse.unquote.
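To see why the charset matters when unescaping, here is a small standard-library sketch. decode_component is a hypothetical stand-in for a decoder callable and is not the library's own code; in particular, it replaces plus signs for every charset, which is a simplifying assumption.

```python
from urllib.parse import unquote


def decode_component(text: str, encoding: str = "utf-8") -> str:
    """Hypothetical stand-in for a decoder callable (not qs_codec's own code)."""
    # Treat '+' as a space, then unescape %XX sequences in the given encoding.
    return unquote(text.replace("+", " "), encoding=encoding)


# The same letter arrives as one byte in Latin-1 and as two bytes in UTF-8.
latin1_text = decode_component("%F8", encoding="iso-8859-1")
utf8_text = decode_component("%C3%B8")
```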

delimiter: str | Pattern[str] = '&'

The delimiter to use when splitting key-value pairs in the encoded input. Can be a str or a Pattern.
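A plain string and a compiled pattern split the input differently. The following is a minimal sketch of that distinction using only the standard library; split_pairs is a hypothetical name, not part of qs_codec.

```python
import re
from typing import List, Union


def split_pairs(query: str, delimiter: Union[str, re.Pattern] = "&") -> List[str]:
    """Hypothetical helper: split a query string on a str or regex delimiter."""
    if isinstance(delimiter, re.Pattern):
        # A pattern can match several different separator characters.
        return delimiter.split(query)
    return query.split(delimiter)


default_split = split_pairs("a=b&c=d")
regex_split = split_pairs("a=b;c=d,e=f", re.compile(r"[;,]"))
```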

depth: int = 5

By default, when nesting dicts qs_codec will only decode up to 5 children deep. This depth can be overridden by setting the depth. The depth limit helps mitigate abuse when qs_codec is used to parse user input, and it is recommended to keep it a reasonably small number.
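The depth limit can be pictured as splitting a bracketed key into at most depth child segments. The sketch below is not the library's implementation. It assumes, following the upstream qs behaviour, that anything beyond the limit is kept as one literal key, and that the strict_depth option turns that case into an error. split_key is a hypothetical name.

```python
import re
from typing import List


def split_key(key: str, depth: int = 5, strict_depth: bool = False) -> List[str]:
    """Hypothetical helper: split 'a[b][c]' into segments, honouring a depth limit."""
    first = key.find("[")
    if first == -1:
        return [key]
    parent = key[:first]
    brackets = re.findall(r"\[[^\[\]]*\]", key[first:])
    segments = [parent] if parent else []
    # Keep at most `depth` children as real nesting levels.
    segments += [b[1:-1] for b in brackets[:depth]]
    overflow = brackets[depth:]
    if overflow:
        if strict_depth:
            raise ValueError(f"Input depth exceeded the limit of {depth}")
        # Assumption: the remainder becomes a single literal key.
        segments.append("".join(overflow))
    return segments
```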

duplicates: Duplicates = 1

Change the duplicate key handling strategy.

ignore_query_prefix: bool = False

Set to True to ignore the leading question mark query prefix in the encoded input.

interpret_numeric_entities: bool = False

Set to True to interpret HTML numeric entities (&#...;) in the encoded input.
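Interpreting a numeric entity amounts to replacing &#NNNN; with the character at that code point. A minimal standard-library sketch, with a hypothetical function name and no claim about how qs_codec does it internally:

```python
import re


def interpret_numeric_entities(text: str) -> str:
    """Hypothetical helper: turn decimal entities such as '&#10003;' into characters."""
    return re.sub(r"&#(\d+);", lambda match: chr(int(match.group(1))), text)


checkmark = interpret_numeric_entities("&#10003;")
```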

list_limit: int = 20

By default, qs_codec limits list indices to a maximum of 20. Any list member with an index greater than 20 is instead converted to a dict with the index as the key. This guards against inputs such as a[999999999], which would otherwise produce a huge list that takes significant time to iterate over. The limit can be overridden by passing a list_limit option.
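The rule above can be sketched as a choice between two containers. This is an illustration only: build_container is a hypothetical helper, and compacting sparse indices into a dense list is an assumption carried over from the upstream qs behaviour.

```python
from typing import Any, Dict, List, Union


def build_container(
    items: Dict[int, Any], list_limit: int = 20
) -> Union[List[Any], Dict[int, Any]]:
    """Hypothetical helper: pick a list or a dict depending on the largest index."""
    ordered = sorted(items.items())
    if ordered and ordered[-1][0] > list_limit:
        # An index beyond the limit: keep the indices as dict keys instead.
        return dict(ordered)
    # Within the limit: return the values in index order (sparse gaps dropped).
    return [value for _, value in ordered]
```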

parameter_limit: int | float = 1000

For similar reasons, by default qs_codec will only parse up to 1000 parameters. This can be overridden by passing a parameter_limit option.
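A minimal sketch of a parameter cap follows. take_pairs is a hypothetical helper; silently truncating by default and raising only when raise_on_limit_exceeded is set are assumptions about the behaviour, not statements taken from the library source.

```python
from typing import List


def take_pairs(
    query: str, parameter_limit: int = 1000, raise_on_limit_exceeded: bool = False
) -> List[str]:
    """Hypothetical helper: keep at most `parameter_limit` key-value pairs."""
    pairs = query.split("&")
    if len(pairs) > parameter_limit:
        if raise_on_limit_exceeded:
            raise ValueError(f"Parameter limit exceeded: only {parameter_limit} allowed")
        # Assumption: extra parameters are dropped rather than parsed.
        pairs = pairs[:parameter_limit]
    return pairs
```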

parse_lists: bool = True

To disable list parsing entirely, set parse_lists to False.

raise_on_limit_exceeded: bool = False

Set to True to raise an error when the input exceeds a configured limit, such as parameter_limit or list_limit, instead of handling the excess silently.

strict_depth: bool = False

Set to True to raise an error when the input exceeds the depth limit.

strict_null_handling: bool = False

Set to True to decode values without = to None.
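The difference shows up only for pairs that lack an equals sign. A small sketch, using a hypothetical split_pair helper rather than the library itself, and assuming that such keys decode to an empty string when the option is off:

```python
from typing import Optional, Tuple


def split_pair(pair: str, strict_null_handling: bool = False) -> Tuple[str, Optional[str]]:
    """Hypothetical helper: split 'key=value', mapping a bare key to None or ''."""
    if "=" not in pair:
        # No '=' at all: None in strict mode, otherwise an empty string.
        return pair, (None if strict_null_handling else "")
    key, value = pair.split("=", 1)
    return key, value
```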

class qs_codec.Duplicates(*values)

Bases: Enum

An enum of all available duplicate key handling strategies.

COMBINE = 1

Combine duplicate keys into a single key with a list of values.

FIRST = 2

Use the first value for duplicate keys.

LAST = 3

Use the last value for duplicate keys.
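The three strategies can be compared side by side with a standard-library sketch. Strategy and collect are hypothetical names that mirror the enum above; this is not qs_codec's decoding code and it handles flat keys only.

```python
from enum import Enum
from typing import Any, Dict
from urllib.parse import parse_qsl


class Strategy(Enum):
    """Hypothetical mirror of the Duplicates enum, for illustration."""
    COMBINE = 1
    FIRST = 2
    LAST = 3


def collect(query: str, strategy: Strategy = Strategy.COMBINE) -> Dict[str, Any]:
    """Hypothetical helper: merge flat key-value pairs using one strategy."""
    result: Dict[str, Any] = {}
    for key, value in parse_qsl(query, keep_blank_values=True):
        if key not in result:
            result[key] = value
        elif strategy is Strategy.COMBINE:
            existing = result[key]
            result[key] = existing + [value] if isinstance(existing, list) else [existing, value]
        elif strategy is Strategy.LAST:
            result[key] = value
        # Strategy.FIRST: keep the value that is already stored.
    return result
```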

class qs_codec.EncodeOptions(allow_dots: bool = None, add_query_prefix: bool = False, allow_empty_lists: bool = False, indices: bool | None = None, list_format: ~qs_codec.enums.list_format.ListFormat = ListFormat.INDICES, charset: ~qs_codec.enums.charset.Charset = Charset.UTF8, charset_sentinel: bool = False, delimiter: str = '&', encode: bool = True, encode_dot_in_keys: bool = None, encode_values_only: bool = False, format: ~qs_codec.enums.format.Format = Format.RFC3986, filter: ~typing.Callable | ~typing.List[str | int] | None = None, skip_nulls: bool = False, serialize_date: ~typing.Callable[[~datetime.datetime], str | None] = <function EncodeUtils.serialize_date>, encoder: ~typing.Callable[[~typing.Any, ~qs_codec.enums.charset.Charset | None, ~qs_codec.enums.format.Format | None], str] = <property object>, strict_null_handling: bool = False, comma_round_trip: bool | None = None, sort: ~typing.Callable[[~typing.Any, ~typing.Any], int] | None = None)

Bases: object

Options that configure the output of encode.

add_query_prefix: bool = False

Set to True to add a question mark ? prefix to the encoded output.

allow_dots: bool = None

Set to True to use dot dict notation in the encoded output.

allow_empty_lists: bool = False

Set to True to allow empty lists in the encoded output.

charset: Charset = _CharsetDataMixin(encoding='utf-8')

The character encoding to use.

charset_sentinel: bool = False

Set to True to announce the character set by including a utf8=✓ parameter with the proper encoding of the checkmark, similar to what Ruby on Rails and others do when submitting forms.

comma_round_trip: bool | None = None

When list_format is set to ListFormat.COMMA, you can also set the comma_round_trip option to True to append [] to the key of single-item lists, so that they survive a round trip through decode as lists.
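The effect on single-item lists can be sketched as follows. encode_comma_list is a hypothetical helper that skips percent-encoding entirely; it only shows where the [] suffix would appear and is not the library's encoder.

```python
from typing import List


def encode_comma_list(key: str, values: List[str], comma_round_trip: bool = False) -> str:
    """Hypothetical helper: join values with commas, marking single-item lists."""
    # Without the suffix, a one-element list is indistinguishable from a scalar.
    suffix = "[]" if comma_round_trip and len(values) == 1 else ""
    return f"{key}{suffix}=" + ",".join(values)
```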

delimiter: str = '&'

The delimiter to use when joining key-value pairs in the encoded output.

encode: bool = True

Set to False to disable encoding.

encode_dot_in_keys: bool = None

Set encode_dot_in_keys to True to percent-encode literal dots that appear in dict keys. Caveat: when encode_values_only is True as well as encode_dot_in_keys, only dots in keys and nothing else will be encoded.

encode_values_only: bool = False

Encoding can be disabled for keys by setting encode_values_only to True.

property encoder: Callable[[Any, Charset | None, Format | None], str]

Get the encoder function.

filter: Callable | List[str | int] | None = None

Use the filter option to restrict which keys will be included in the encoded output. If you pass a Callable, it will be called for each key to obtain the replacement value. If you pass a list, it will be used to select properties and list indices to be encoded.

format: Format = _FormatDataMixin(format_name='RFC3986', formatter=<function Formatter.rfc3986>)

The encoding format to use. The default is Format.RFC3986, which encodes ' ' as %20 and is backward compatible. You can also set format to Format.RFC1738, which encodes ' ' as +.
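The standard library shows the same two space conventions. The snippet below uses urllib.parse rather than qs_codec, so it illustrates the formats themselves and not the library's formatter functions.

```python
from urllib.parse import quote, quote_plus

# RFC 3986 style: a space becomes %20.
rfc3986_style = quote("a b", safe="")

# RFC 1738 / form style: a space becomes '+'.
rfc1738_style = quote_plus("a b")
```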

indices: bool | None = None

Deprecated: use list_format instead.

list_format: ListFormat = _ListFormatDataMixin(list_format_name='INDICES', generator=<function ListFormatGenerator.indices>)

The list encoding format to use.

serialize_date() → str

Serialize a datetime object to an ISO 8601 string.
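For reference, this is what an ISO 8601 string looks like when produced with the standard library. Whether the default serializer calls datetime.isoformat internally is an assumption; the snippet only shows the format.

```python
from datetime import datetime, timezone

# A timezone-aware datetime and its ISO 8601 representation.
stamp = datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
iso_string = stamp.isoformat()
```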

skip_nulls: bool = False

Set to True to completely skip encoding keys with None values.

sort: Callable[[Any, Any], int] | None = None

Set a Callable to affect the order of parameter keys.
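The option's type, Callable[[Any, Any], int], is a classic comparison function: negative, zero, or positive. A minimal sketch of such a function, tried out with functools.cmp_to_key; alphabetical is a hypothetical name and the sorting call is standard library, not qs_codec.

```python
from functools import cmp_to_key
from typing import Any


def alphabetical(a: Any, b: Any) -> int:
    """Hypothetical comparator: -1, 0, or 1 in ascending order."""
    return (a > b) - (a < b)


ordered_keys = sorted(["b", "a", "c"], key=cmp_to_key(alphabetical))
```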

strict_null_handling: bool = False

Set to True to distinguish between None values and empty strings. With this option, None values are encoded without an = sign.
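The interaction between None values, skip_nulls, and strict_null_handling can be sketched for a flat dict. encode_flat is a hypothetical helper with no percent-encoding; letting skip_nulls take precedence when both flags are set is an assumption.

```python
from typing import Dict, Optional


def encode_flat(
    data: Dict[str, Optional[str]],
    strict_null_handling: bool = False,
    skip_nulls: bool = False,
) -> str:
    """Hypothetical helper: join flat pairs, with three ways to treat None."""
    parts = []
    for key, value in data.items():
        if value is None:
            if skip_nulls:
                continue  # drop the key entirely
            # Strict mode omits '=', otherwise None looks like an empty string.
            parts.append(key if strict_null_handling else f"{key}=")
        else:
            parts.append(f"{key}={value}")
    return "&".join(parts)
```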

class qs_codec.Format(*values)

Bases: _FormatDataMixin, Enum

An enum of all supported URI component encoding formats.

RFC1738 = _FormatDataMixin(format_name='RFC1738', formatter=<function Formatter.rfc1738>)

RFC 1738.

RFC3986 = _FormatDataMixin(format_name='RFC3986', formatter=<function Formatter.rfc3986>)

RFC 3986.

class qs_codec.ListFormat(*values)

Bases: _ListFormatDataMixin, Enum

An enum of all available list format options.

BRACKETS = _ListFormatDataMixin(list_format_name='BRACKETS', generator=<function ListFormatGenerator.brackets>)

Use brackets to represent list items, for example foo[]=123&foo[]=456&foo[]=789

COMMA = _ListFormatDataMixin(list_format_name='COMMA', generator=<function ListFormatGenerator.comma>)

Use commas to represent list items, for example foo=123,456,789

INDICES = _ListFormatDataMixin(list_format_name='INDICES', generator=<function ListFormatGenerator.indices>)

Use indices to represent list items, for example foo[0]=123&foo[1]=456&foo[2]=789

REPEAT = _ListFormatDataMixin(list_format_name='REPEAT', generator=<function ListFormatGenerator.repeat>)

Use a repeat key to represent list items, for example foo=123&foo=456&foo=789
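The four formats above can be reproduced with a few lines of plain Python. format_list is a hypothetical helper that takes the format name as a string and performs no percent-encoding; it is meant to make the four shapes easy to compare, not to stand in for the library's generators.

```python
from typing import List


def format_list(key: str, values: List[str], list_format: str = "INDICES") -> str:
    """Hypothetical helper: render a list in one of the four documented shapes."""
    if list_format == "BRACKETS":
        pairs = [f"{key}[]={value}" for value in values]
    elif list_format == "COMMA":
        pairs = [f"{key}=" + ",".join(values)]
    elif list_format == "INDICES":
        pairs = [f"{key}[{index}]={value}" for index, value in enumerate(values)]
    elif list_format == "REPEAT":
        pairs = [f"{key}={value}" for value in values]
    else:
        raise ValueError(f"Unknown list format: {list_format}")
    return "&".join(pairs)
```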

class qs_codec.Sentinel(*values)

Bases: _SentinelDataMixin, Enum

An enum of all available sentinel values.

CHARSET = _SentinelDataMixin(raw='✓', encoded='utf8=%E2%9C%93')

These are the percent-encoded utf-8 octets representing a checkmark, indicating that the request actually is utf-8 encoded.

ISO = _SentinelDataMixin(raw='&#10003;', encoded='utf8=%26%2310003%3B')

This is what browsers will submit when the character occurs in an application/x-www-form-urlencoded body and the encoding of the page containing the form is iso-8859-1, or when the submitted form has an accept-charset attribute of iso-8859-1. Presumably also with other charsets that do not contain the character, such as us-ascii.

class qs_codec.Undefined

Bases: object

Singleton class to represent undefined values.

qs_codec.decode(value: str | ~typing.Dict[str, ~typing.Any] | None, options: ~qs_codec.models.decode_options.DecodeOptions = DecodeOptions(allow_dots=False, decode_dot_in_keys=False, allow_empty_lists=False, list_limit=20, charset=<Charset.UTF8: encoding='utf-8'>, charset_sentinel=False, comma=False, delimiter='&', depth=5, parameter_limit=1000, duplicates=<Duplicates.COMBINE: 1>, ignore_query_prefix=False, interpret_numeric_entities=False, parse_lists=True, strict_depth=False, strict_null_handling=False, raise_on_limit_exceeded=False, decoder=<bound method DecodeUtils.decode of <class 'qs_codec.utils.decode_utils.DecodeUtils'>>)) → Dict[str, Any]

Decodes a query string into a Dict[str, Any].

Providing custom DecodeOptions will override the default behavior.

qs_codec.encode(value: ~typing.Any, options: ~qs_codec.models.encode_options.EncodeOptions = EncodeOptions(allow_dots=False, add_query_prefix=False, allow_empty_lists=False, indices=None, list_format=<ListFormat.INDICES: list_format_name='INDICES', generator=<function ListFormatGenerator.indices>>, charset=<Charset.UTF8: encoding='utf-8'>, charset_sentinel=False, delimiter='&', encode=True, encode_dot_in_keys=False, encode_values_only=False, format=<Format.RFC3986: format_name='RFC3986', formatter=<function Formatter.rfc3986>>, filter=None, skip_nulls=False, serialize_date=<function EncodeUtils.serialize_date>, encoder=<function EncodeOptions.encoder.<locals>.<lambda>>, strict_null_handling=False, comma_round_trip=None, sort=None)) → str

Encodes an object into a query string.

Providing custom EncodeOptions will override the default behavior.