qs_codec.utils package

Submodules

qs_codec.utils.decode_utils module

Utilities for decoding percent-encoded query strings and splitting composite keys into bracketed path segments.

This mirrors the semantics of the Node qs library:

  • Decoding handles both UTF-8 and Latin-1 code paths.

  • Key splitting keeps bracket groups balanced and optionally treats dots as path separators when allow_dots=True.

  • Top-level dot splitting uses a character-scanner that handles degenerate cases (leading ‘.’ starts a bracket segment; ‘.[’ is skipped; double dots preserve the first; trailing ‘.’ is preserved) and never treats literal percent-encoded sequences (e.g., ‘%2E’) as split points; only actual ‘.’ characters at depth 0 are split.

class qs_codec.utils.decode_utils.DecodeUtils

Bases: object

Decode helpers compiled into a single, importable namespace.

All methods are classmethods so they are easy to stub/patch in tests, and the compiled regular expressions are created once per interpreter session.

HEX2_PATTERN: Pattern[str] = re.compile('%([0-9A-Fa-f]{2})')
UNESCAPE_PATTERN: Pattern[str] = re.compile('%u(?P<unicode>[0-9A-Fa-f]{4})|%(?P<hex>[0-9A-Fa-f]{2})', re.IGNORECASE)
classmethod decode(string: str | None, charset: Charset | None = Charset.UTF8, kind: DecodeKind = DecodeKind.VALUE) → str | None

Decode a URL-encoded scalar.

Notes

The kind parameter is accepted for API compatibility but is currently ignored; keys and values are decoded identically. It may be removed in a future major release.

Behavior:

  • Replace + with a literal space before decoding.

  • If charset is LATIN1, decode only %XX byte sequences (no %uXXXX). %uXXXX sequences are left as-is to mimic older browser/JS behavior.

  • Otherwise (UTF-8), defer to urllib.parse.unquote().

  • Keys and values are decoded identically; whether a literal . acts as a key separator is decided later by the key-splitting logic.

Returns:

None when the input is None.

Return type:

Optional[str]
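The behavior listed above can be approximated with the standard library alone. The following is a minimal sketch, not the library's implementation: decode_sketch and its latin1 flag are hypothetical stand-ins for decode and its charset parameter.

```python
import re
from typing import Optional
from urllib.parse import unquote

# Matches a single %XX byte escape (mirrors the documented HEX2_PATTERN).
_HEX2 = re.compile(r"%([0-9A-Fa-f]{2})")


def decode_sketch(string: Optional[str], latin1: bool = False) -> Optional[str]:
    """Hypothetical stand-in for decode(); latin1 replaces the charset argument."""
    if string is None:
        return None
    # '+' means a space in form-encoded data, so replace it first.
    text = string.replace("+", " ")
    if latin1:
        # Only %XX is decoded, one byte to one code point; %uXXXX is left alone.
        return _HEX2.sub(lambda m: chr(int(m.group(1), 16)), text)
    # UTF-8 path: defer to the standard library.
    return unquote(text)
```

In this sketch, "%u0041" passes through the Latin-1 path unchanged because the character after % is not a hex digit.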

classmethod dot_to_bracket_top_level(s: str) → str

Convert top-level dot segments into bracket groups after percent-decoding.

Notes

  • In the normal decode path, the key has already been percent-decoded by the upstream scanner, so sequences like %2E/%2e are already literal . when this function runs. As a result, with allow_dots=True, any top-level . will be treated as a separator here. This is independent of decode_dot_in_keys (which only affects how encoded dots inside bracket segments are normalized later during object folding).

  • If a custom decoder returns raw tokens (i.e., bypasses percent-decoding), %2E/%2e may still appear here; those percent sequences are preserved verbatim and are not used as separators.

Rules

  • Only dots at depth == 0 split. Dots inside [] are preserved.

  • Degenerate cases:

      • leading . starts a bracket segment (.a behaves like [a])

      • .[ is skipped so a.[b] behaves like a[b]

      • a..b preserves the first dot → a.[b]

      • trailing . is preserved and ignored by the splitter

Examples

‘user.email.name’ -> ‘user[email][name]’
‘a[b].c’ -> ‘a[b][c]’
‘a[.].c’ -> ‘a[.][c]’
‘a%2E[b]’ -> ‘a%2E[b]’ (only if a custom decoder left it encoded)
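The rules and examples above can be reproduced by a small depth-tracking scanner. This is an illustrative sketch written from the documented rules only; dot_to_bracket_sketch is a hypothetical name and the code is not taken from the library source.

```python
def dot_to_bracket_sketch(s: str) -> str:
    """Convert top-level dots to bracket groups, following the documented rules."""
    out = []
    depth = 0  # current bracket nesting depth
    i, n = 0, len(s)
    while i < n:
        ch = s[i]
        if ch == "[":
            depth += 1
            out.append(ch)
            i += 1
        elif ch == "]":
            if depth > 0:
                depth -= 1
            out.append(ch)
            i += 1
        elif ch == "." and depth == 0:
            nxt = s[i + 1] if i + 1 < n else ""
            if nxt == "[":
                i += 1  # ".[": drop the dot, the bracket group follows as-is
            elif nxt in ("", "."):
                out.append(".")  # trailing dot or first of "..": keep it literally
                i += 1
            else:
                # Wrap the token that runs up to the next '.' or '[' in brackets.
                j = i + 1
                while j < n and s[j] not in ".[":
                    j += 1
                out.append("[" + s[i + 1:j] + "]")
                i = j
        else:
            out.append(ch)  # ordinary character, including dots inside brackets
            i += 1
    return "".join(out)
```

Percent sequences such as %2E are ordinary characters to this scanner, so they are never treated as split points.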

classmethod split_key_into_segments(original_key: str, allow_dots: bool, max_depth: int, strict_depth: bool) → List[str]

Split a composite key into balanced bracket segments.

  • If allow_dots is True, convert top-level dots to bracket groups using a character-scanner (a.b[c] → a[b][c]), preserving dots inside brackets and degenerate cases.

  • The parent (non-bracket) prefix becomes the first segment, e.g. "a[b][c]" → ["a", "[b]", "[c]"].

  • Bracket groups are balanced using a counter so nested brackets within a single group (e.g. "[with[inner]]") are treated as one segment.

  • When max_depth <= 0, no splitting occurs; the key is returned as a single segment (qs semantics).

  • If there are more groups beyond max_depth and strict_depth is True, an IndexError is raised. Otherwise, the remainder is added as one final segment (again mirroring qs).

  • Unterminated ‘[’: the remainder after the first unmatched ‘[’ is captured as a single synthetic bracket segment.

Examples

max_depth=2: “a[b][c][d]” -> [“a”, “[b]”, “[c]”, “[[d]]”]
unterminated: “a[b” -> [“a”, “[[b]”]

This runs in O(n) time over the key string.
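The splitting rules can be sketched with a bracket-level counter as follows. This is a simplified illustration, not the library code: split_key_sketch is a hypothetical name, the allow_dots conversion is omitted, and letting an unterminated group bypass strict_depth is an assumption.

```python
from typing import List


def split_key_sketch(key: str, max_depth: int, strict_depth: bool = False) -> List[str]:
    """Split "a[b][c]" into ["a", "[b]", "[c]"] using a bracket-level counter."""
    if max_depth <= 0:
        return [key]  # splitting disabled: the whole key is one segment
    first = key.find("[")
    if first < 0:
        return [key]
    segments: List[str] = [key[:first]] if first > 0 else []
    open_idx = first
    groups = 0
    while open_idx >= 0 and groups < max_depth:
        level = 0
        close = -1
        for j in range(open_idx, len(key)):
            if key[j] == "[":
                level += 1
            elif key[j] == "]":
                level -= 1
                if level == 0:  # the group that opened at open_idx is balanced
                    close = j
                    break
        if close < 0:
            # Unterminated '[': wrap the raw remainder in one synthetic segment.
            segments.append("[" + key[open_idx:] + "]")
            return segments
        segments.append(key[open_idx:close + 1])
        groups += 1
        open_idx = key.find("[", close + 1)
    if open_idx >= 0:
        if strict_depth:
            raise IndexError(f"key depth exceeds max_depth={max_depth}")
        # Beyond max_depth: the remainder becomes one final bracketed segment.
        segments.append("[" + key[open_idx:] + "]")
    return segments
```

Each group is scanned once, so the sketch stays linear in the key length.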

classmethod unescape(string: str) → str

Emulate legacy JavaScript unescape behavior.

Replaces both %XX and %uXXXX escape sequences with the corresponding code points. This function is intentionally permissive and does not validate UTF-8; it is used to model historical behavior in Latin-1 mode.

Examples

>>> DecodeUtils.unescape("%u0041%20%42")
'A B'
>>> DecodeUtils.unescape("%7E")
'~'
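For readers who want to see the mechanics, a pattern of the same shape as UNESCAPE_PATTERN can drive a single substitution pass. This is a minimal sketch under that assumption; unescape_sketch is a hypothetical name.

```python
import re

# Same shape as the documented UNESCAPE_PATTERN: %uXXXX or %XX.
_UNESCAPE = re.compile(r"%u(?P<unicode>[0-9A-Fa-f]{4})|%(?P<hex>[0-9A-Fa-f]{2})")


def unescape_sketch(string: str) -> str:
    """Replace %XX and %uXXXX with the code point they name; no UTF-8 validation."""
    def _replace(match: "re.Match[str]") -> str:
        digits = match.group("unicode") or match.group("hex")
        return chr(int(digits, 16))

    return _UNESCAPE.sub(_replace, string)
```

Malformed sequences such as a lone % do not match the pattern and are left untouched.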

qs_codec.utils.encode_utils module

A collection of encode utility methods used by the library.

class qs_codec.utils.encode_utils.EncodeUtils

Bases: object

A collection of encode utility methods used by the library.

HEX_TABLE: Tuple[str, ...] = ('%00', '%01', '%02', '%03', '%04', '%05', '%06', '%07', '%08', '%09', '%0A', '%0B', '%0C', '%0D', '%0E', '%0F', '%10', '%11', '%12', '%13', '%14', '%15', '%16', '%17', '%18', '%19', '%1A', '%1B', '%1C', '%1D', '%1E', '%1F', '%20', '%21', '%22', '%23', '%24', '%25', '%26', '%27', '%28', '%29', '%2A', '%2B', '%2C', '%2D', '%2E', '%2F', '%30', '%31', '%32', '%33', '%34', '%35', '%36', '%37', '%38', '%39', '%3A', '%3B', '%3C', '%3D', '%3E', '%3F', '%40', '%41', '%42', '%43', '%44', '%45', '%46', '%47', '%48', '%49', '%4A', '%4B', '%4C', '%4D', '%4E', '%4F', '%50', '%51', '%52', '%53', '%54', '%55', '%56', '%57', '%58', '%59', '%5A', '%5B', '%5C', '%5D', '%5E', '%5F', '%60', '%61', '%62', '%63', '%64', '%65', '%66', '%67', '%68', '%69', '%6A', '%6B', '%6C', '%6D', '%6E', '%6F', '%70', '%71', '%72', '%73', '%74', '%75', '%76', '%77', '%78', '%79', '%7A', '%7B', '%7C', '%7D', '%7E', '%7F', '%80', '%81', '%82', '%83', '%84', '%85', '%86', '%87', '%88', '%89', '%8A', '%8B', '%8C', '%8D', '%8E', '%8F', '%90', '%91', '%92', '%93', '%94', '%95', '%96', '%97', '%98', '%99', '%9A', '%9B', '%9C', '%9D', '%9E', '%9F', '%A0', '%A1', '%A2', '%A3', '%A4', '%A5', '%A6', '%A7', '%A8', '%A9', '%AA', '%AB', '%AC', '%AD', '%AE', '%AF', '%B0', '%B1', '%B2', '%B3', '%B4', '%B5', '%B6', '%B7', '%B8', '%B9', '%BA', '%BB', '%BC', '%BD', '%BE', '%BF', '%C0', '%C1', '%C2', '%C3', '%C4', '%C5', '%C6', '%C7', '%C8', '%C9', '%CA', '%CB', '%CC', '%CD', '%CE', '%CF', '%D0', '%D1', '%D2', '%D3', '%D4', '%D5', '%D6', '%D7', '%D8', '%D9', '%DA', '%DB', '%DC', '%DD', '%DE', '%DF', '%E0', '%E1', '%E2', '%E3', '%E4', '%E5', '%E6', '%E7', '%E8', '%E9', '%EA', '%EB', '%EC', '%ED', '%EE', '%EF', '%F0', '%F1', '%F2', '%F3', '%F4', '%F5', '%F6', '%F7', '%F8', '%F9', '%FA', '%FB', '%FC', '%FD', '%FE', '%FF')

Hex table of all 256 characters
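A table with the same contents as the one shown can be generated in one line, which also shows how such a table is typically used: indexing by byte value avoids formatting each byte at encode time. The names below are hypothetical and the snippet is only a sketch.

```python
# Precompute "%00" .. "%FF" once; index by byte value at encode time.
HEX_TABLE_SKETCH = tuple(f"%{i:02X}" for i in range(256))


def percent_encode_bytes(data: bytes) -> str:
    """Percent-encode every byte unconditionally via table lookup."""
    return "".join(HEX_TABLE_SKETCH[b] for b in data)
```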

RFC1738_SAFE_CHARS: Set[int] = {40, 41, 45, 46, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 95, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 126}

0-9, A-Z, a-z, -, ., _, ~, (, )

RFC1738_SAFE_CHARS_ASCII: Tuple[bool, ...] = (False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, True, False, False, False, True, True, False, True, True, True, True, True, True, True, True, True, True, False, False, False, False, False, False, False, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, False, False, False, False, True, False, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, False, False, False, True, False)
RFC1738_SAFE_POINTS: Set[int] = {40, 41, 42, 43, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 95, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122}

0-9, A-Z, a-z, @, *, _, -, +, ., /, (, )

RFC1738_SAFE_POINTS_ASCII: Tuple[bool, ...] = (False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, True, True, True, False, True, True, True, True, True, True, True, True, True, True, True, True, True, False, False, False, False, False, False, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, False, False, False, False, True, False, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, False, False, False, False, False)
SAFE_ALPHA: Set[int] = {48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122}

0-9, A-Z, a-z

SAFE_CHARS: Set[int] = {45, 46, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 95, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 126}

0-9, A-Z, a-z, -, ., _, ~
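The safe sets and their boolean lookup tuples are related in a simple way: the set lists the safe code points, and the tuple answers the same membership question by index. The sketch below rebuilds equivalents from the character lists given above; all names are hypothetical, and the 128-entry table size is an assumption based on the ASCII suffix in the attribute names.

```python
import string

# Alphanumerics, then the RFC 3986 unreserved marks, then the RFC 1738 extras.
SAFE_ALPHA_SKETCH = {ord(c) for c in string.digits + string.ascii_letters}
SAFE_CHARS_SKETCH = SAFE_ALPHA_SKETCH | {ord(c) for c in "-._~"}
RFC1738_SAFE_CHARS_SKETCH = SAFE_CHARS_SKETCH | {ord("("), ord(")")}

# Boolean table indexed by ASCII code point.
SAFE_CHARS_ASCII_SKETCH = tuple(cp in SAFE_CHARS_SKETCH for cp in range(128))


def is_safe(code_point: int) -> bool:
    """Return True when the code point may be emitted without escaping."""
    return code_point < 128 and SAFE_CHARS_ASCII_SKETCH[code_point]
```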

SAFE_CHARS_ASCII: Tuple[bool, ...] = (False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, True, False, True, True, True, True, True, True, True, True, True, True, False, False, False, False, False, False, False, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, False, False, False, False, True, False, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, False, False, False, True, False)
SAFE_POINTS: Set[int] = {42, 43, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 95, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122}

0-9, A-Z, a-z, @, *, _, -, +, ., /

SAFE_POINTS_ASCII: Tuple[bool, ...] = (False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, True, False, True, True, True, True, True, True, True, True, True, True, True, True, True, False, False, False, False, False, False, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, False, False, False, False, True, False, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, False, False, False, False, False)
classmethod encode(value: Any, charset: Charset | None = Charset.UTF8, format: Format | None = Format.RFC3986) → str

Encode a scalar value to a URL-encoded string.

  • Accepts numbers, Decimal, Enum, str, bool, and bytes. Any other type (including None) yields an empty string, matching the Node qs behavior.

  • For Charset.LATIN1, the output mirrors the JS %uXXXX + numeric entity trick so the result can be safely transported as latin-1.

  • Otherwise, values are encoded as UTF-8 using _encode_string.
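The UTF-8 path amounts to percent-encoding every byte outside the safe set. The following is a minimal sketch of that idea for str input only; encode_utf8_sketch and its rfc1738 flag are hypothetical, and the Latin-1 path and the non-string input types are deliberately left out.

```python
import string

# Unreserved characters: 0-9, A-Z, a-z, -, ., _, ~ (as byte values).
_UNRESERVED = frozenset((string.ascii_letters + string.digits + "-._~").encode("ascii"))
_RFC1738_EXTRA = frozenset(b"()")


def encode_utf8_sketch(value: str, rfc1738: bool = False) -> str:
    """Percent-encode the UTF-8 bytes of value, keeping safe characters as-is."""
    safe = (_UNRESERVED | _RFC1738_EXTRA) if rfc1738 else _UNRESERVED
    parts = []
    for byte in value.encode("utf-8"):
        parts.append(chr(byte) if byte in safe else f"%{byte:02X}")
    return "".join(parts)
```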

classmethod escape(string: str, format: Format | None = Format.RFC3986) → str

Emulate the legacy JavaScript escaping behavior.

This function operates on UTF-16 code units to emulate JavaScript’s legacy %uXXXX behavior. Non-BMP code points are first expanded into surrogate pairs via _to_surrogates, then each code unit is processed.

  • Safe set: when format == Format.RFC1738, the characters ( and ) are additionally treated as safe. Otherwise, the RFC3986 safe set is used.

  • ASCII characters in the safe set are emitted unchanged.

  • Code units < 256 are emitted as %XX.

  • Other code units are emitted as %uXXXX.

Reference: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/escape
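The code-unit rules above can be illustrated without the library's _to_surrogates helper by encoding to UTF-16 and walking the result two bytes at a time. This is a sketch under that substitution; escape_sketch and its rfc1738 flag are hypothetical, and the safe set is taken from the SAFE_POINTS listing (0-9, A-Z, a-z, @, *, _, -, +, ., /).

```python
import string

_SAFE_POINTS = {ord(c) for c in string.ascii_letters + string.digits + "@*_-+./"}


def escape_sketch(text: str, rfc1738: bool = False) -> str:
    """Emulate legacy JS escape() by processing UTF-16 code units."""
    safe = _SAFE_POINTS | {ord("("), ord(")")} if rfc1738 else _SAFE_POINTS
    # Big-endian UTF-16 yields two bytes per code unit; non-BMP characters
    # come out as surrogate pairs, and surrogatepass tolerates lone surrogates.
    data = text.encode("utf-16-be", "surrogatepass")
    out = []
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        if unit in safe:
            out.append(chr(unit))
        elif unit < 256:
            out.append(f"%{unit:02X}")
        else:
            out.append(f"%u{unit:04X}")
    return "".join(out)
```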

static serialize_date(dt: datetime) → str

Serialize a datetime to ISO-8601 using datetime.isoformat().
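Because the serialization is plain datetime.isoformat(), the output shape depends on whether the datetime is timezone-aware. A short standard-library illustration:

```python
from datetime import datetime, timezone

naive = datetime(2024, 1, 2, 3, 4, 5)
aware = naive.replace(tzinfo=timezone.utc)

print(naive.isoformat())  # 2024-01-02T03:04:05
print(aware.isoformat())  # 2024-01-02T03:04:05+00:00
```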

qs_codec.utils.utils module

Utility helpers shared across the qs_codec decode/encode internals.

The functions in this module are intentionally small, allocation-aware, and careful about container mutation to match the behavior (and performance characteristics) of the original JavaScript qs library.

Key responsibilities:

  • Merging decoded key/value pairs into nested Python containers (merge)

  • Removing the library’s Undefined sentinel values (compact and helpers)

  • Minimal deep-equality for cycle detection guards (_dicts_are_equal)

  • Small helpers for list/value composition (combine, apply)

  • Primitive checks used by the encoder (is_non_nullish_primitive)

Notes:

  • Undefined marks entries that should be omitted from output structures. We remove these in place where possible to minimize allocations.

  • Many helpers accept both list and tuple; tuples are converted to lists on mutation because Python tuples are immutable.

  • Several routines use an object-identity visited set to avoid infinite recursion when user inputs contain cycles.

class qs_codec.utils.utils.Utils

Bases: object

Namespace container for stateless utility routines.

All methods are static methods (@staticmethod) to keep call sites simple and to make the functions easy to reuse across modules without constructing objects.

static apply(val: List[Any] | Tuple[Any] | Any, fn: Callable) → List[Any] | Any

Map a callable over a value or sequence.

If val is a list/tuple, returns a list of mapped results; otherwise returns the single mapped value.
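The described behavior fits in a few lines. This is a sketch only, with apply_sketch as a hypothetical name:

```python
from typing import Any, Callable


def apply_sketch(val: Any, fn: Callable[[Any], Any]) -> Any:
    """Map fn over a list/tuple (returning a list) or over a single value."""
    if isinstance(val, (list, tuple)):
        return [fn(item) for item in val]
    return fn(val)
```

Note that a tuple input comes back as a list, matching the description above.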

static combine(a: List[Any] | Tuple[Any] | Any, b: List[Any] | Tuple[Any] | Any, options: DecodeOptions | None = None) → List[Any] | Dict[str, Any]

Concatenate two values, treating non-sequences as singletons.

When the combined list exceeds list_limit, it is converted to an OverflowDict (a dict with numeric keys) to prevent unbounded memory growth. The limit is taken from options.list_limit when options is provided; if options is None, the default list_limit from DecodeOptions is used. A negative list_limit is treated as “overflow immediately”: any non-empty combined result is converted to OverflowDict. This helper never raises when the limit is exceeded: even if DecodeOptions has raise_on_limit_exceeded set to True, combine handles overflow only by converting the list to OverflowDict.
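The overflow rule can be sketched as follows. Everything here is illustrative: OverflowDictSketch stands in for the library's OverflowDict, the plain list_limit parameter replaces the options object, its default of 20 is an assumed value, and the use of string indices as keys is an assumption based on the Dict[str, Any] return type.

```python
from typing import Any, Dict, List, Union


class OverflowDictSketch(dict):
    """Hypothetical marker type for a list that outgrew its limit."""


def _as_list(value: Any) -> List[Any]:
    # Non-sequences are treated as one-element lists.
    return list(value) if isinstance(value, (list, tuple)) else [value]


def combine_sketch(a: Any, b: Any, list_limit: int = 20) -> Union[List[Any], Dict[str, Any]]:
    """Concatenate a and b; convert to a dict once the limit is exceeded."""
    result = _as_list(a) + _as_list(b)
    if list_limit < 0:
        overflow = len(result) > 0  # negative limit: overflow immediately
    else:
        overflow = len(result) > list_limit
    if overflow:
        return OverflowDictSketch({str(i): v for i, v in enumerate(result)})
    return result
```

The sketch never raises on overflow, in line with the description above.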

static compact(root: Dict[str, Any]) → Dict[str, Any]

Remove all Undefined sentinels from a nested container in place.

Traversal is iterative (explicit stack) to avoid deep recursion, and a per-object visited set prevents infinite loops on cyclic inputs.

Parameters:

root – Dictionary to clean. It is mutated and also returned.

Returns:

The same root object for chaining.
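An iterative, in-place traversal with an identity-based visited set might look like the sketch below. UNDEFINED is a hypothetical stand-in for the library's Undefined sentinel, and only dict and list containers are handled.

```python
from typing import Any, Dict

UNDEFINED = object()  # hypothetical stand-in for the Undefined sentinel


def compact_sketch(root: Dict[str, Any]) -> Dict[str, Any]:
    """Strip UNDEFINED from nested dicts/lists in place, without recursion."""
    stack = [root]
    visited = {id(root)}  # object identities, so cyclic inputs terminate
    while stack:
        node = stack.pop()
        if isinstance(node, dict):
            for key in [k for k, v in node.items() if v is UNDEFINED]:
                del node[key]
            children = list(node.values())
        else:
            node[:] = [v for v in node if v is not UNDEFINED]  # mutate in place
            children = node
        for child in children:
            if isinstance(child, (dict, list)) and id(child) not in visited:
                visited.add(id(child))
                stack.append(child)
    return root
```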

static is_non_nullish_primitive(val: Any, skip_nulls: bool = False) → bool

Return True if val is considered a primitive for encoding purposes.

Rules:

  • None and Undefined are not primitives.

  • Strings are primitives; if skip_nulls is True, the empty string is not.

  • Numbers, booleans, Enum, datetime, and timedelta are primitives.

  • Any non-container object is treated as primitive.

This mirrors the behavior expected by the original qs encoder.
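The rules translate into a short chain of checks. In this sketch, UNDEFINED is a hypothetical stand-in for the Undefined sentinel, and treating list, tuple, dict, and set as the container types is an assumption.

```python
from typing import Any

UNDEFINED = object()  # hypothetical stand-in for the Undefined sentinel


def is_primitive_sketch(val: Any, skip_nulls: bool = False) -> bool:
    """Return True when val should be encoded as a scalar."""
    if val is None or val is UNDEFINED:
        return False
    if isinstance(val, str):
        # With skip_nulls, the empty string no longer counts as a primitive.
        return val != "" if skip_nulls else True
    if isinstance(val, (list, tuple, dict, set)):  # assumed container types
        return False
    # Numbers, booleans, Enum, datetime, timedelta, and other plain objects.
    return True
```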

static is_overflow(obj: Any) → bool

Check if an object is an OverflowDict.

static merge(target: Mapping[str, Any] | List[Any] | Tuple[Any] | None, source: Mapping[str, Any] | List[Any] | Tuple[Any] | Any | None, options: DecodeOptions | None = None) → Dict[str, Any] | List[Any] | Tuple[Any] | Any

Merge source into target in a qs-compatible way.

This function mirrors how the original JavaScript qs library builds nested structures while parsing query strings. It accepts mappings, sequences (list / tuple), and scalars on either side and returns a merged value.

Rules (high level)

  • If source is None: return target unchanged.

  • If source is not a mapping:

      • target is a sequence → append/extend, skipping Undefined.

      • target is a mapping → write items from the sequence under string indices (“0”, “1”, …).

      • otherwise → return [target, source] (skipping Undefined where applicable).

  • If source is a mapping:

      • target is not a mapping → if target is a sequence, coerce it to an index-keyed dict and merge; otherwise, concatenate as a list [target, source] while skipping Undefined.

      • target is a mapping → deep-merge keys; where keys collide, merge values recursively.

List handling

When a list that already contains Undefined must receive new values and options.parse_lists is False, the list is promoted to a dict with string indices so positions can be addressed deterministically. Otherwise, sentinels are simply removed as we go.
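A heavily simplified sketch of the high-level rules is given below. It ignores Undefined, options, and list promotion entirely, returns new containers instead of mutating, and its handling of a scalar source merged into a mapping target is an assumption; merge_sketch is a hypothetical name.

```python
from typing import Any


def merge_sketch(target: Any, source: Any) -> Any:
    """Simplified qs-style merge without Undefined or options handling."""
    if source is None:
        return target
    if not isinstance(source, dict):
        items = list(source) if isinstance(source, (list, tuple)) else [source]
        if isinstance(target, (list, tuple)):
            return list(target) + items  # sequence target: extend
        if isinstance(target, dict):
            merged = dict(target)
            for index, item in enumerate(items):
                merged[str(index)] = item  # write under string indices
            return merged
        return [target] + items  # scalar target: [target, source]
    # From here on, source is a mapping.
    if isinstance(target, (list, tuple)):
        target = {str(i): v for i, v in enumerate(target)}  # index-keyed dict
    elif not isinstance(target, dict):
        return [target, source]
    merged = dict(target)
    for key, value in source.items():
        # Colliding keys are merged recursively; new keys are copied over.
        merged[key] = merge_sketch(merged[key], value) if key in merged else value
    return merged
```

For example, merging {"a": 1} with {"a": 2} collides on "a" and falls through to the scalar rule, giving {"a": [1, 2]}.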

Parameters:

  • target (mapping | list | tuple | Any | None) – Existing value to merge into.

  • source (mapping | list | tuple | Any | None) – Incoming value.

  • options (DecodeOptions | None) – Options that affect list promotion/handling.

Returns:

The merged structure. May be the original target object when source is None.

Return type:

mapping | list | tuple | Any

static normalize_comma_elem(e: Any) → str

Normalize a value for inclusion in a comma-joined list.

Module contents