qs_codec.utils package

Submodules

qs_codec.utils.decode_utils module

Utilities for decoding percent‑encoded query strings and splitting composite keys into bracketed path segments.

This mirrors the semantics of the Node qs library:

  • Decoding handles both UTF‑8 and Latin‑1 code paths.

  • Key splitting keeps bracket groups balanced and optionally treats dots as path separators when allow_dots=True.

  • Top-level dot splitting uses a character scanner that handles the degenerate cases: a leading ‘.’ starts a bracket segment, ‘.[’ is skipped, a double dot preserves the first dot, and a trailing ‘.’ is preserved. Percent-encoded sequences (e.g., ‘%2E’) are never treated as split points; only literal ‘.’ characters at depth 0 split.

class qs_codec.utils.decode_utils.DecodeUtils

Bases: object

Decode helpers compiled into a single, importable namespace.

All methods are classmethods so they are easy to stub/patch in tests, and the compiled regular expressions are created once per interpreter session.

HEX2_PATTERN: Pattern[str] = re.compile('%([0-9A-Fa-f]{2})')

UNESCAPE_PATTERN: Pattern[str] = re.compile('%u(?P<unicode>[0-9A-Fa-f]{4})|%(?P<hex>[0-9A-Fa-f]{2})', re.IGNORECASE)

classmethod decode(string: str | None, charset: Charset | None = Charset.UTF8, kind: DecodeKind = DecodeKind.VALUE) → str | None

Decode a URL‑encoded scalar.

Notes

The kind parameter is accepted for API compatibility but is currently ignored; keys and values are decoded identically. It may be removed in a future major release.

Behavior:

  • Replace + with a literal space before decoding.

  • If charset is LATIN1, decode only %XX byte sequences (no %uXXXX). %uXXXX sequences are left as-is to mimic older browser/JS behavior.

  • Otherwise (UTF-8), defer to urllib.parse.unquote().

  • Keys and values are decoded identically; whether a literal . acts as a key separator is decided later by the key-splitting logic.

Returns:

The decoded string, or None when the input is None.

Return type:

Optional[str]
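The behavior listed for decode can be approximated with the standard library alone. The following is a minimal sketch under stated assumptions, not the library's implementation: decode_scalar and its latin1 flag are hypothetical stand-ins for decode and its charset parameter.

```python
import re
from typing import Optional
from urllib.parse import unquote

# Same shape as the HEX2_PATTERN constant documented above.
HEX2_PATTERN = re.compile(r"%([0-9A-Fa-f]{2})")


def decode_scalar(string: Optional[str], latin1: bool = False) -> Optional[str]:
    """Hypothetical stand-in for DecodeUtils.decode (illustration only)."""
    if string is None:
        return None
    # '+' means a space in query strings; replace it before percent-decoding.
    text = string.replace("+", " ")
    if latin1:
        # Latin-1 path: each %XX becomes one code point; %uXXXX is left alone.
        return HEX2_PATTERN.sub(lambda m: chr(int(m.group(1), 16)), text)
    # UTF-8 path: defer to the standard library.
    return unquote(text)


print(decode_scalar("a+b%20c"))              # a b c
print(decode_scalar("%C3%A4"))               # ä
print(decode_scalar("%E4", latin1=True))     # ä
print(decode_scalar("%u0041", latin1=True))  # %u0041
```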

classmethod dot_to_bracket_top_level(s: str) → str

Convert top-level dot segments into bracket groups after percent-decoding.

Notes

  • In the normal decode path, the key has already been percent-decoded by the upstream scanner, so sequences like %2E/%2e are already literal . when this function runs. As a result, with allow_dots=True, any top-level . will be treated as a separator here. This is independent of decode_dot_in_keys (which only affects how encoded dots inside bracket segments are normalized later during object folding).

  • If a custom decoder returns raw tokens (i.e., bypasses percent-decoding), %2E/%2e may still appear here; those percent sequences are preserved verbatim and are not used as separators.

Rules

  • Only dots at depth == 0 split. Dots inside [] are preserved.

  • Degenerate cases:

      • leading . starts a bracket segment (.a behaves like [a])

      • .[ is skipped, so a.[b] behaves like a[b]

      • a..b preserves the first dot → a.[b]

      • trailing . is preserved and ignored by the splitter

Examples

'user.email.name' -> 'user[email][name]'
'a[b].c' -> 'a[b][c]'
'a[.].c' -> 'a[.][c]'
'a%2E[b]' -> 'a%2E[b]' (only if a custom decoder left it encoded)
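The rules and examples above can be reproduced by a small depth-tracking scanner. This is a sketch written from the documented rules only; dots_to_brackets is a hypothetical name and the real method may differ in edge cases not listed here.

```python
def dots_to_brackets(s: str) -> str:
    """Hypothetical sketch of top-level dot-to-bracket conversion."""
    out = []
    depth = 0  # current bracket nesting level
    i, n = 0, len(s)
    while i < n:
        ch = s[i]
        if ch == "[":
            depth += 1
            out.append(ch)
            i += 1
        elif ch == "]":
            depth = max(depth - 1, 0)
            out.append(ch)
            i += 1
        elif ch == "." and depth == 0:
            nxt = s[i + 1] if i + 1 < n else ""
            if nxt == "[":
                i += 1  # '.[' : drop the dot, keep the bracket group
            elif nxt in ("", "."):
                out.append(".")  # trailing dot, or first dot of '..'
                i += 1
            else:
                # Wrap the token up to the next '.' or '[' in brackets.
                j = i + 1
                while j < n and s[j] not in ".[":
                    j += 1
                out.append("[" + s[i + 1:j] + "]")
                i = j
        else:
            out.append(ch)  # includes dots inside brackets and literal %2E
            i += 1
    return "".join(out)


print(dots_to_brackets("user.email.name"))  # user[email][name]
print(dots_to_brackets("a[.].c"))           # a[.][c]
print(dots_to_brackets("a..b"))             # a.[b]
```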

classmethod split_key_into_segments(original_key: str, allow_dots: bool, max_depth: int, strict_depth: bool) → List[str]

Split a composite key into balanced bracket segments.

  • If allow_dots is True, convert top-level dots to bracket groups using a character scanner (a.b[c] → a[b][c]), preserving dots inside brackets and degenerate cases.

  • The parent (non-bracket) prefix becomes the first segment, e.g. "a[b][c]" → ["a", "[b]", "[c]"].

  • Bracket groups are balanced using a counter so nested brackets within a single group (e.g. "[with[inner]]") are treated as one segment.

  • When max_depth <= 0, no splitting occurs; the key is returned as a single segment (qs semantics).

  • If there are more groups beyond max_depth and strict_depth is True, an IndexError is raised. Otherwise, the remainder is added as one final segment (again mirroring qs).

  • Unterminated ‘[’: the remainder after the first unmatched ‘[’ is captured as a single synthetic bracket segment.

Examples

max_depth=2: "a[b][c][d]" -> ["a", "[b]", "[c]", "[[d]]"]
unterminated: "a[b" -> ["a", "[[b]"]

This runs in O(n) time over the key string.
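A single left-to-right pass with a bracket counter is enough to reproduce the examples above. The sketch below is an illustration only: split_segments is a hypothetical name, it omits the allow_dots step, and it does not claim to match the library for inputs outside the documented rules.

```python
from typing import List


def split_segments(key: str, max_depth: int, strict_depth: bool = False) -> List[str]:
    """Hypothetical sketch of balanced bracket splitting (no dot handling)."""
    if max_depth <= 0:
        return [key]  # no splitting at all
    first = key.find("[")
    if first < 0:
        return [key]
    segments = [key[:first]] if first > 0 else []
    i, n, groups = first, len(key), 0
    while i < n and groups < max_depth and key[i] == "[":
        depth, close = 0, -1
        for j in range(i, n):
            if key[j] == "[":
                depth += 1
            elif key[j] == "]":
                depth -= 1
                if depth == 0:  # matching close of the group opened at i
                    close = j
                    break
        if close < 0:
            break  # unterminated '[': handled as remainder below
        segments.append(key[i:close + 1])
        groups += 1
        i = close + 1
    if i < n:
        if strict_depth and groups >= max_depth and key[i] == "[":
            raise IndexError(f"key nesting exceeds max_depth={max_depth}")
        # Remainder becomes one synthetic bracket segment.
        segments.append("[" + key[i:] + "]")
    return segments


print(split_segments("a[b][c][d]", 2))     # ['a', '[b]', '[c]', '[[d]]']
print(split_segments("a[b", 5))            # ['a', '[[b]']
print(split_segments("a[with[inner]]", 5)) # ['a', '[with[inner]]']
```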

classmethod unescape(string: str) → str

Emulate legacy JavaScript unescape behavior.

Replaces both %XX and %uXXXX escape sequences with the corresponding code points. This function is intentionally permissive and does not validate UTF‑8; it is used to model historical behavior in Latin‑1 mode.

Examples

>>> DecodeUtils.unescape("%u0041%20%42")
'A B'
>>> DecodeUtils.unescape("%7E")
'~'
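One way to obtain this behavior is a single regex substitution using the UNESCAPE_PATTERN shown earlier. The function below is a sketch, not the library source; js_unescape is a hypothetical name.

```python
import re

# Same expression as the documented UNESCAPE_PATTERN constant.
UNESCAPE_PATTERN = re.compile(
    r"%u(?P<unicode>[0-9A-Fa-f]{4})|%(?P<hex>[0-9A-Fa-f]{2})",
    re.IGNORECASE,
)


def js_unescape(string: str) -> str:
    """Hypothetical sketch: replace %XX and %uXXXX with their code points."""
    def replace(match: "re.Match[str]") -> str:
        # Exactly one of the two named groups matches per hit.
        digits = match.group("unicode") or match.group("hex")
        return chr(int(digits, 16))

    return UNESCAPE_PATTERN.sub(replace, string)


print(js_unescape("%u0041%20%42"))  # A B
print(js_unescape("%7E"))           # ~
```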

qs_codec.utils.encode_utils module

A collection of encode utility methods used by the library.

class qs_codec.utils.encode_utils.EncodeUtils

Bases: object

A collection of encode utility methods used by the library.

HEX_TABLE: Tuple[str, ...] = ('%00', '%01', '%02', '%03', '%04', '%05', '%06', '%07', '%08', '%09', '%0A', '%0B', '%0C', '%0D', '%0E', '%0F', '%10', '%11', '%12', '%13', '%14', '%15', '%16', '%17', '%18', '%19', '%1A', '%1B', '%1C', '%1D', '%1E', '%1F', '%20', '%21', '%22', '%23', '%24', '%25', '%26', '%27', '%28', '%29', '%2A', '%2B', '%2C', '%2D', '%2E', '%2F', '%30', '%31', '%32', '%33', '%34', '%35', '%36', '%37', '%38', '%39', '%3A', '%3B', '%3C', '%3D', '%3E', '%3F', '%40', '%41', '%42', '%43', '%44', '%45', '%46', '%47', '%48', '%49', '%4A', '%4B', '%4C', '%4D', '%4E', '%4F', '%50', '%51', '%52', '%53', '%54', '%55', '%56', '%57', '%58', '%59', '%5A', '%5B', '%5C', '%5D', '%5E', '%5F', '%60', '%61', '%62', '%63', '%64', '%65', '%66', '%67', '%68', '%69', '%6A', '%6B', '%6C', '%6D', '%6E', '%6F', '%70', '%71', '%72', '%73', '%74', '%75', '%76', '%77', '%78', '%79', '%7A', '%7B', '%7C', '%7D', '%7E', '%7F', '%80', '%81', '%82', '%83', '%84', '%85', '%86', '%87', '%88', '%89', '%8A', '%8B', '%8C', '%8D', '%8E', '%8F', '%90', '%91', '%92', '%93', '%94', '%95', '%96', '%97', '%98', '%99', '%9A', '%9B', '%9C', '%9D', '%9E', '%9F', '%A0', '%A1', '%A2', '%A3', '%A4', '%A5', '%A6', '%A7', '%A8', '%A9', '%AA', '%AB', '%AC', '%AD', '%AE', '%AF', '%B0', '%B1', '%B2', '%B3', '%B4', '%B5', '%B6', '%B7', '%B8', '%B9', '%BA', '%BB', '%BC', '%BD', '%BE', '%BF', '%C0', '%C1', '%C2', '%C3', '%C4', '%C5', '%C6', '%C7', '%C8', '%C9', '%CA', '%CB', '%CC', '%CD', '%CE', '%CF', '%D0', '%D1', '%D2', '%D3', '%D4', '%D5', '%D6', '%D7', '%D8', '%D9', '%DA', '%DB', '%DC', '%DD', '%DE', '%DF', '%E0', '%E1', '%E2', '%E3', '%E4', '%E5', '%E6', '%E7', '%E8', '%E9', '%EA', '%EB', '%EC', '%ED', '%EE', '%EF', '%F0', '%F1', '%F2', '%F3', '%F4', '%F5', '%F6', '%F7', '%F8', '%F9', '%FA', '%FB', '%FC', '%FD', '%FE', '%FF')

Hex table of percent-encoded strings for all 256 byte values (%00 through %FF).

RFC1738_SAFE_CHARS: Set[int] = {40, 41, 45, 46, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 95, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 126}

0-9, A-Z, a-z, -, ., _, ~, (, )

RFC1738_SAFE_POINTS: Set[int] = {40, 41, 42, 43, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 95, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122}

0-9, A-Z, a-z, @, *, _, -, +, ., /, (, )

SAFE_ALPHA: Set[int] = {48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122}

0-9, A-Z, a-z

SAFE_CHARS: Set[int] = {45, 46, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 95, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 126}

0-9, A-Z, a-z, -, ., _, ~

SAFE_POINTS: Set[int] = {42, 43, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 95, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122}

0-9, A-Z, a-z, @, *, _, -, +, ., /
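The literal values of these constants follow directly from their one-line descriptions. As a sketch (how the library actually builds them is not stated here), the same tuple and sets can be derived like this:

```python
import string

# 256 percent-encoded strings, '%00' through '%FF'.
HEX_TABLE = tuple(f"%{i:02X}" for i in range(256))

# Code points of 0-9, A-Z, a-z.
SAFE_ALPHA = {ord(c) for c in string.digits + string.ascii_letters}

# Unreserved characters, plus '(' and ')' for the RFC 1738 variant.
SAFE_CHARS = SAFE_ALPHA | {ord(c) for c in "-._~"}
RFC1738_SAFE_CHARS = SAFE_CHARS | {ord("("), ord(")")}

# Characters left alone by legacy JavaScript escape(), plus the RFC 1738 variant.
SAFE_POINTS = SAFE_ALPHA | {ord(c) for c in "@*_-+./"}
RFC1738_SAFE_POINTS = SAFE_POINTS | {ord("("), ord(")")}

print(HEX_TABLE[0x2E])     # %2E
print(sorted(SAFE_CHARS - SAFE_ALPHA))   # [45, 46, 95, 126]
print(sorted(SAFE_POINTS - SAFE_ALPHA))  # [42, 43, 45, 46, 47, 64, 95]
```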

classmethod encode(value: Any, charset: Charset | None = Charset.UTF8, format: Format | None = Format.RFC3986) → str

Encode a scalar value to a URL‑encoded string.

  • Accepts numbers, Decimal, Enum, str, bool, and bytes. Any other type (including None) yields an empty string, matching the Node qs behavior.

  • For Charset.LATIN1, the output mirrors the JS %uXXXX + numeric entity trick so the result can be safely transported as latin‑1.

  • Otherwise, values are encoded as UTF‑8 using _encode_string.
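The "numeric entity trick" mentioned for Charset.LATIN1 rewrites each %uXXXX escape as a percent-encoded HTML numeric entity (&#NNNN;). The sketch below shows only that rewriting step on an already escaped string; to_numeric_entities is a hypothetical helper name and the surrounding encode logic is omitted.

```python
import re


def to_numeric_entities(escaped: str) -> str:
    """Hypothetical sketch: turn %uXXXX into percent-encoded '&#NNNN;'."""
    return re.sub(
        r"%u([0-9A-Fa-f]{4})",
        # %26 is '&', %23 is '#', %3B is ';'
        lambda m: f"%26%23{int(m.group(1), 16)}%3B",
        escaped,
    )


print(to_numeric_entities("%u20AC"))  # %26%238364%3B
print(to_numeric_entities("a%20b"))   # a%20b
```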

classmethod escape(string: str, format: Format | None = Format.RFC3986) → str

Emulate the legacy JavaScript escaping behavior.

This function operates on UTF‑16 code units to emulate JavaScript’s legacy %uXXXX behavior. Non‑BMP code points are first expanded into surrogate pairs via _to_surrogates, then each code unit is processed.

  • Safe set: when format == Format.RFC1738, the characters ( and ) are additionally treated as safe. Otherwise, the RFC3986 safe set is used.

  • ASCII characters in the safe set are emitted unchanged.

  • Code units < 256 are emitted as %XX.

  • Other code units are emitted as %uXXXX.

Reference: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/escape
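The per-code-unit rules can be sketched as follows. This is an illustration under assumptions: js_escape and its rfc1738 flag are hypothetical, the default safe set is assumed to be the characters legacy JavaScript escape() leaves untouched, and surrogate expansion is done here by encoding to UTF-16 rather than by the _to_surrogates helper.

```python
import string

# Assumed default safe set: 0-9, A-Z, a-z, @, *, _, -, +, ., /
_SAFE = {ord(c) for c in string.digits + string.ascii_letters + "@*_-+./"}


def js_escape(text: str, rfc1738: bool = False) -> str:
    """Hypothetical sketch of legacy JavaScript escape() semantics."""
    safe = _SAFE | {ord("("), ord(")")} if rfc1738 else _SAFE
    # Expand to UTF-16 code units so non-BMP characters become surrogate pairs.
    raw = text.encode("utf-16-be", "surrogatepass")
    units = [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]
    out = []
    for unit in units:
        if unit in safe:
            out.append(chr(unit))        # safe ASCII: unchanged
        elif unit < 256:
            out.append(f"%{unit:02X}")   # one byte: %XX
        else:
            out.append(f"%u{unit:04X}")  # everything else: %uXXXX
    return "".join(out)


print(js_escape("a b"))           # a%20b
print(js_escape("\u20ac"))        # %u20AC
print(js_escape("\U0001F600"))    # %uD83D%uDE00
print(js_escape("(", rfc1738=True))  # (
```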

static serialize_date(dt: datetime) → str

Serialize a datetime to ISO‑8601 using datetime.isoformat().
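For reference, this is what datetime.isoformat() produces for naive and timezone-aware values; the example uses only the standard library and does not call serialize_date itself.

```python
from datetime import datetime, timezone

naive = datetime(2024, 1, 2, 3, 4, 5)
aware = naive.replace(tzinfo=timezone.utc)

naive_text = naive.isoformat()
aware_text = aware.isoformat()

print(naive_text)  # 2024-01-02T03:04:05
print(aware_text)  # 2024-01-02T03:04:05+00:00
```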

qs_codec.utils.utils module

Utility helpers shared across the qs_codec decode/encode internals.

The functions in this module are intentionally small, allocation‑aware, and careful about container mutation to match the behavior (and performance characteristics) of the original JavaScript qs library.

Key responsibilities:

  • Merging decoded key/value pairs into nested Python containers (merge)

  • Removing the library’s Undefined sentinel values (compact and helpers)

  • Minimal deep-equality for cycle detection guards (_dicts_are_equal)

  • Small helpers for list/value composition (combine, apply)

  • Primitive checks used by the encoder (is_non_nullish_primitive)

Notes:

  • Undefined marks entries that should be omitted from output structures. We remove these in place where possible to minimize allocations.

  • Many helpers accept both list and tuple; tuples are converted to lists on mutation because Python tuples are immutable.

  • Several routines use an object-identity visited set to avoid infinite recursion when user inputs contain cycles.

class qs_codec.utils.utils.Utils

Bases: object

Namespace container for stateless utility routines.

All methods are static methods (@staticmethod) to keep call sites simple and to make the functions easy to reuse across modules without constructing objects.

static apply(val: List[Any] | Tuple[Any] | Any, fn: Callable) → List[Any] | Any

Map a callable over a value or sequence.

If val is a list/tuple, returns a list of mapped results; otherwise returns the single mapped value.

static combine(a: List[Any] | Tuple[Any] | Any, b: List[Any] | Tuple[Any] | Any) → List[Any]

Concatenate two values, treating non‑sequences as singletons.

Returns a new list; tuples are expanded but not preserved as tuples.
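Both apply and combine are small enough to sketch in a few lines. The functions below are written from the descriptions above and are not the library source; the _sketch suffix marks them as illustrative.

```python
from typing import Any, Callable, List


def apply_sketch(val: Any, fn: Callable[[Any], Any]) -> Any:
    """Map fn over a list/tuple (returning a list) or over a single value."""
    if isinstance(val, (list, tuple)):
        return [fn(item) for item in val]
    return fn(val)


def combine_sketch(a: Any, b: Any) -> List[Any]:
    """Concatenate a and b into a new list; scalars count as one item."""
    left = list(a) if isinstance(a, (list, tuple)) else [a]
    right = list(b) if isinstance(b, (list, tuple)) else [b]
    return left + right


print(apply_sketch((1, 2), lambda x: x * 2))  # [2, 4]
print(apply_sketch(3, lambda x: x * 2))       # 6
print(combine_sketch(1, [2, 3]))              # [1, 2, 3]
print(combine_sketch((1, 2), 3))              # [1, 2, 3]
```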

static compact(root: Dict[str, Any]) → Dict[str, Any]

Remove all Undefined sentinels from a nested container in place.

Traversal is iterative (explicit stack) to avoid deep recursion, and a per‑object visited set prevents infinite loops on cyclic inputs.

Parameters:

root – Dictionary to clean. It is mutated and also returned.

Returns:

The same root object for chaining.
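An iterative, cycle-safe cleanup of this kind can be sketched with an explicit stack and a set of visited object ids. In the sketch below, UNDEFINED is a hypothetical stand-in for the library's Undefined sentinel and compact_sketch is not the library function.

```python
from typing import Any, Dict

UNDEFINED = object()  # hypothetical stand-in for the Undefined sentinel


def compact_sketch(root: Dict[str, Any]) -> Dict[str, Any]:
    """Remove UNDEFINED entries from nested dicts/lists in place."""
    stack = [root]
    visited = {id(root)}  # identity-based, so cycles are visited once
    while stack:
        node = stack.pop()
        if isinstance(node, dict):
            for key in [k for k, v in node.items() if v is UNDEFINED]:
                del node[key]
            children = list(node.values())
        else:
            # Slice assignment keeps the same list object.
            node[:] = [v for v in node if v is not UNDEFINED]
            children = list(node)
        for child in children:
            if isinstance(child, (dict, list)) and id(child) not in visited:
                visited.add(id(child))
                stack.append(child)
    return root


data = {"a": UNDEFINED, "b": [1, UNDEFINED, {"c": UNDEFINED, "d": 2}]}
data["self"] = data  # a cycle; the visited set prevents looping forever
result = compact_sketch(data)
print(result is data)  # True
print(data["b"])       # [1, {'d': 2}]
```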

static is_non_nullish_primitive(val: Any, skip_nulls: bool = False) → bool

Return True if val is considered a primitive for encoding purposes.

Rules:

  • None and Undefined are not primitives.

  • Strings are primitives; if skip_nulls is True, the empty string is not.

  • Numbers, booleans, Enum, datetime, and timedelta are primitives.

  • Any non-container object is treated as primitive.

This mirrors the behavior expected by the original qs encoder.
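The rules translate into a short chain of isinstance checks. This sketch makes two assumptions that are not confirmed by the text: UNDEFINED stands in for the library's Undefined sentinel, and "container" is taken to mean dict, list, or tuple.

```python
from datetime import datetime, timedelta
from decimal import Decimal
from enum import Enum
from typing import Any

UNDEFINED = object()  # hypothetical stand-in for the Undefined sentinel


def is_primitive_sketch(val: Any, skip_nulls: bool = False) -> bool:
    """Hypothetical sketch of the documented primitive check."""
    if val is None or val is UNDEFINED:
        return False
    if isinstance(val, str):
        # The empty string only counts when skip_nulls is off.
        return val != "" if skip_nulls else True
    if isinstance(val, (int, float, Decimal, Enum, datetime, timedelta)):
        return True  # bool is a subclass of int, so it is covered here
    # Assumption: containers are dict/list/tuple; anything else is primitive.
    return not isinstance(val, (dict, list, tuple))


print(is_primitive_sketch(""))                   # True
print(is_primitive_sketch("", skip_nulls=True))  # False
print(is_primitive_sketch([1, 2]))               # False
```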

static merge(target: Mapping[str, Any] | List[Any] | Tuple[Any] | None, source: Mapping[str, Any] | List[Any] | Tuple[Any] | Any | None, options: DecodeOptions | None = None) → Dict[str, Any] | List[Any] | Tuple[Any] | Any

Merge source into target in a qs‑compatible way.

This function mirrors how the original JavaScript qs library builds nested structures while parsing query strings. It accepts mappings, sequences (list / tuple), and scalars on either side and returns a merged value.

Rules (high level)

  • If source is None: return target unchanged.

  • If source is not a mapping:

      • target is a sequence → append/extend, skipping Undefined.

      • target is a mapping → write items from the sequence under string indices ("0", "1", …).

      • otherwise → return [target, source] (skipping Undefined where applicable).

  • If source is a mapping:

      • target is not a mapping → if target is a sequence, coerce it to an index-keyed dict and merge; otherwise, concatenate as a list [target, source] while skipping Undefined.

      • target is a mapping → deep-merge keys; where keys collide, merge values recursively.

List handling

When a list that already contains Undefined must receive new values and options.parse_lists is False, the list is promoted to a dict with string indices so positions can be addressed deterministically. Otherwise, sentinels are simply removed as we go.

Parameters:

  • target (mapping | list | tuple | Any | None) – Existing value to merge into.

  • source (mapping | list | tuple | Any | None) – Incoming value.

  • options (DecodeOptions | None) – Options that affect list promotion/handling.

Returns:

The merged structure. May be the original target object when source is None.

Return type:

mapping | list | tuple | Any
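The high-level merge rules can be illustrated with a much-reduced sketch. It is not the library implementation: merge_sketch is a hypothetical name, it ignores the Undefined sentinel and the options argument entirely, it does not special-case a None target, and it always returns new containers instead of mutating target.

```python
from typing import Any


def merge_sketch(target: Any, source: Any) -> Any:
    """Hypothetical, simplified version of the documented merge rules."""
    if source is None:
        return target

    if not isinstance(source, dict):
        extra = list(source) if isinstance(source, (list, tuple)) else [source]
        if isinstance(target, (list, tuple)):
            return [*target, *extra]  # sequence target: append/extend
        if isinstance(target, dict) and isinstance(source, (list, tuple)):
            # Mapping target: sequence items land under "0", "1", ...
            return {**target, **{str(i): v for i, v in enumerate(extra)}}
        return [target, *extra]  # anything else: build a list

    # From here on, source is a mapping.
    if isinstance(target, (list, tuple)):
        target = {str(i): v for i, v in enumerate(target)}  # index-keyed dict
    elif not isinstance(target, dict):
        return [target, source]

    merged = dict(target)
    for key, value in source.items():
        # Colliding keys are merged recursively; new keys are copied over.
        merged[key] = merge_sketch(merged[key], value) if key in merged else value
    return merged


print(merge_sketch({"a": {"b": 1}}, {"a": {"c": 2}}))  # {'a': {'b': 1, 'c': 2}}
print(merge_sketch({"a": 1}, {"a": 2}))                # {'a': [1, 2]}
print(merge_sketch(["x"], {"k": "v"}))                 # {'0': 'x', 'k': 'v'}
```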

Module contents