qs_codec.utils package
Submodules
qs_codec.utils.decode_utils module
Utilities for decoding percent‑encoded query strings and splitting composite keys into bracketed path segments.
This mirrors the semantics of the Node qs library:
- Decoding handles both UTF‑8 and Latin‑1 code paths.
- Key splitting keeps bracket groups balanced and optionally treats dots as path separators when allow_dots=True.
- Top‑level dot splitting uses a character scanner that handles degenerate cases (leading ‘.’ starts a bracket segment; ‘.[’ is skipped; double dots preserve the first; trailing ‘.’ is preserved) and never treats percent‑encoded sequences (e.g., ‘%2E’) as split points; only literal ‘.’ characters at depth 0 are split.
- class qs_codec.utils.decode_utils.DecodeUtils
Bases: object
Decode helpers compiled into a single, importable namespace.
All methods are classmethods so they are easy to stub/patch in tests, and the compiled regular expressions are created once per interpreter session.
- HEX2_PATTERN: Pattern[str] = re.compile('%([0-9A-Fa-f]{2})')¶
- UNESCAPE_PATTERN: Pattern[str] = re.compile('%u(?P<unicode>[0-9A-Fa-f]{4})|%(?P<hex>[0-9A-Fa-f]{2})', re.IGNORECASE)¶
- classmethod decode(string: str | None, charset: Charset | None = Charset.UTF8, kind: DecodeKind = DecodeKind.VALUE) → str | None
Decode a URL‑encoded scalar.
Notes
The kind parameter is accepted for API compatibility but is currently ignored; keys and values are decoded identically. It may be removed in a future major release.
Behavior:
- Replace + with a literal space before decoding.
- If charset is LATIN1, decode only %XX byte sequences (no %uXXXX). %uXXXX sequences are left as‑is to mimic older browser/JS behavior.
- Otherwise (UTF‑8), defer to urllib.parse.unquote().
- Keys and values are decoded identically; whether a literal ‘.’ acts as a key separator is decided later by the key‑splitting logic.
- Returns:
None when the input is None.
- Return type:
Optional[str]
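The behavior listed above can be approximated with the standard library alone. The following is a minimal sketch, not the library's implementation: the function name decode_scalar and the boolean latin1 flag are hypothetical stand‑ins for DecodeUtils.decode and its charset parameter.

```python
import re
from typing import Optional
from urllib.parse import unquote

# Same shape as the documented HEX2_PATTERN: exactly two hex digits after '%'.
HEX2_PATTERN = re.compile(r"%([0-9A-Fa-f]{2})")


def decode_scalar(string: Optional[str], latin1: bool = False) -> Optional[str]:
    """Hypothetical stand-in for the documented decode behavior."""
    if string is None:
        return None
    # '+' becomes a space before any percent-decoding happens.
    spaced = string.replace("+", " ")
    if latin1:
        # Each %XX maps to one code point below 256; %uXXXX never matches
        # because 'u' is not a hex digit, so it is left untouched.
        return HEX2_PATTERN.sub(lambda m: chr(int(m.group(1), 16)), spaced)
    # UTF-8 path: urllib.parse.unquote decodes %XX runs as UTF-8 bytes.
    return unquote(spaced)
```

Because the + replacement runs first, an encoded plus sign (%2B) still decodes to a literal + rather than a space.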
- classmethod dot_to_bracket_top_level(s: str) → str
Convert top-level dot segments into bracket groups after percent-decoding.
Notes
In the normal decode path, the key has already been percent-decoded by the upstream scanner, so sequences like %2E/%2e are already a literal ‘.’ when this function runs. As a result, with allow_dots=True, any top-level ‘.’ will be treated as a separator here. This is independent of decode_dot_in_keys (which only affects how encoded dots inside bracket segments are normalized later during object folding).
If a custom decoder returns raw tokens (i.e., bypasses percent-decoding), %2E/%2e may still appear here; those percent sequences are preserved verbatim and are not used as separators.
Rules
- Only dots at depth == 0 split. Dots inside [] are preserved.
- Degenerate cases:
  * leading ‘.’ starts a bracket segment (.a behaves like [a])
  * ‘.[’ is skipped so a.[b] behaves like a[b]
  * a..b preserves the first dot → a.[b]
  * trailing ‘.’ is preserved and ignored by the splitter
Examples
'user.email.name' -> 'user[email][name]'
'a[b].c' -> 'a[b][c]'
'a[.].c' -> 'a[.][c]'
'a%2E[b]' -> 'a%2E[b]' (only if a custom decoder left it encoded)
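To make the rules and examples above concrete, here is a standalone character scanner that follows them. It is a sketch written from the documented rules, not a copy of the library's code, and the function name dots_to_brackets is hypothetical.

```python
def dots_to_brackets(s: str) -> str:
    """Rewrite top-level '.name' parts as '[name]' (illustrative sketch)."""
    out = []
    depth = 0  # current bracket nesting level
    i, n = 0, len(s)
    while i < n:
        ch = s[i]
        if ch == "[":
            depth += 1
            out.append(ch)
            i += 1
        elif ch == "]":
            depth = max(depth - 1, 0)
            out.append(ch)
            i += 1
        elif ch == "." and depth == 0:
            nxt = s[i + 1] if i + 1 < n else ""
            if nxt == "[":
                i += 1  # '.[' : drop the dot, the bracket group follows
            elif nxt in ("", "."):
                out.append(".")  # trailing dot or first of a double dot
                i += 1
            else:
                # Consume the name up to the next '.' or '[' and wrap it.
                j = i + 1
                while j < n and s[j] not in ".[":
                    j += 1
                out.append("[" + s[i + 1:j] + "]")
                i = j
        else:
            out.append(ch)  # includes dots inside brackets and '%2E' text
            i += 1
    return "".join(out)
```

Percent sequences such as %2E contain no literal dot, so the scanner passes them through without any special handling.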
- classmethod split_key_into_segments(original_key: str, allow_dots: bool, max_depth: int, strict_depth: bool) → List[str]
Split a composite key into balanced bracket segments.
- If allow_dots is True, convert top‑level dots to bracket groups using a character scanner (a.b[c] → a[b][c]), preserving dots inside brackets and degenerate cases.
- The parent (non‑bracket) prefix becomes the first segment, e.g. "a[b][c]" → ["a", "[b]", "[c]"].
- Bracket groups are balanced using a counter so nested brackets within a single group (e.g. "[with[inner]]") are treated as one segment.
- When max_depth <= 0, no splitting occurs; the key is returned as a single segment (qs semantics).
- If there are more groups beyond max_depth and strict_depth is True, an IndexError is raised. Otherwise, the remainder is added as one final segment (again mirroring qs).
- Unterminated ‘[’: the remainder after the first unmatched ‘[’ is captured as a single synthetic bracket segment.
Examples
max_depth=2: "a[b][c][d]" -> ["a", "[b]", "[c]", "[[d]]"]
unterminated: "a[b" -> ["a", "[[b]"]
This runs in O(n) time over the key string.
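The splitting rules above can be sketched as a small standalone function. This is an illustration under simplifying assumptions, not the library's code: the name split_segments is hypothetical, dot conversion is assumed to have happened already, and any stray text between two bracket groups is simply skipped.

```python
from typing import List


def split_segments(key: str, max_depth: int, strict_depth: bool = False) -> List[str]:
    """Split 'a[b][c]' into ['a', '[b]', '[c]'] (illustrative sketch)."""
    if max_depth <= 0:
        return [key]  # no splitting at all
    first = key.find("[")
    parent = key[:first] if first >= 0 else key
    segments = [parent] if parent else []
    open_idx, depth, unterminated = first, 0, False
    while open_idx >= 0 and depth < max_depth:
        level, close = 1, -1
        # Walk forward with a counter so nested brackets stay in one group.
        for i in range(open_idx + 1, len(key)):
            if key[i] == "[":
                level += 1
            elif key[i] == "]":
                level -= 1
                if level == 0:
                    close = i
                    break
        if close < 0:
            unterminated = True  # no matching ']' for this '['
            break
        segments.append(key[open_idx:close + 1])
        depth += 1
        open_idx = key.find("[", close + 1)
    if open_idx >= 0:
        if strict_depth and not unterminated:
            raise IndexError(f"key depth exceeds max_depth={max_depth}")
        # The remainder becomes one synthetic bracket segment.
        segments.append("[" + key[open_idx:] + "]")
    return segments
```

Wrapping the remainder in an extra pair of brackets is what produces the "[[d]]" and "[[b]" shapes shown in the examples.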
- classmethod unescape(string: str) → str
Emulate legacy JavaScript unescape behavior.
Replaces both %XX and %uXXXX escape sequences with the corresponding code points. This function is intentionally permissive and does not validate UTF‑8; it is used to model historical behavior in Latin‑1 mode.
Examples
>>> DecodeUtils.unescape("%u0041%20%42")
'A B'
>>> DecodeUtils.unescape("%7E")
'~'
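A single regular-expression substitution is enough to reproduce this behavior. The sketch below reuses the documented UNESCAPE_PATTERN; the module-level function name js_unescape is hypothetical, and the body is an assumption about how such a replacement could be written rather than the library's source.

```python
import re

# Same pattern as the documented UNESCAPE_PATTERN attribute.
UNESCAPE_PATTERN = re.compile(
    r"%u(?P<unicode>[0-9A-Fa-f]{4})|%(?P<hex>[0-9A-Fa-f]{2})",
    re.IGNORECASE,
)


def js_unescape(string: str) -> str:
    """Replace %XX and %uXXXX with their code points (illustrative sketch)."""

    def replace(match: "re.Match[str]") -> str:
        # Exactly one of the two named groups is set for any match.
        digits = match.group("unicode") or match.group("hex")
        return chr(int(digits, 16))

    return UNESCAPE_PATTERN.sub(replace, string)
```

Each %XX is treated as one code point rather than one byte of a UTF‑8 sequence, which is why the result suits Latin‑1 mode but does not decode multi-byte UTF‑8 input.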
qs_codec.utils.encode_utils module
A collection of encode utility methods used by the library.
- class qs_codec.utils.encode_utils.EncodeUtils
Bases: object
A collection of encode utility methods used by the library.
- HEX_TABLE: Tuple[str, ...] = ('%00', '%01', '%02', '%03', '%04', '%05', '%06', '%07', '%08', '%09', '%0A', '%0B', '%0C', '%0D', '%0E', '%0F', '%10', '%11', '%12', '%13', '%14', '%15', '%16', '%17', '%18', '%19', '%1A', '%1B', '%1C', '%1D', '%1E', '%1F', '%20', '%21', '%22', '%23', '%24', '%25', '%26', '%27', '%28', '%29', '%2A', '%2B', '%2C', '%2D', '%2E', '%2F', '%30', '%31', '%32', '%33', '%34', '%35', '%36', '%37', '%38', '%39', '%3A', '%3B', '%3C', '%3D', '%3E', '%3F', '%40', '%41', '%42', '%43', '%44', '%45', '%46', '%47', '%48', '%49', '%4A', '%4B', '%4C', '%4D', '%4E', '%4F', '%50', '%51', '%52', '%53', '%54', '%55', '%56', '%57', '%58', '%59', '%5A', '%5B', '%5C', '%5D', '%5E', '%5F', '%60', '%61', '%62', '%63', '%64', '%65', '%66', '%67', '%68', '%69', '%6A', '%6B', '%6C', '%6D', '%6E', '%6F', '%70', '%71', '%72', '%73', '%74', '%75', '%76', '%77', '%78', '%79', '%7A', '%7B', '%7C', '%7D', '%7E', '%7F', '%80', '%81', '%82', '%83', '%84', '%85', '%86', '%87', '%88', '%89', '%8A', '%8B', '%8C', '%8D', '%8E', '%8F', '%90', '%91', '%92', '%93', '%94', '%95', '%96', '%97', '%98', '%99', '%9A', '%9B', '%9C', '%9D', '%9E', '%9F', '%A0', '%A1', '%A2', '%A3', '%A4', '%A5', '%A6', '%A7', '%A8', '%A9', '%AA', '%AB', '%AC', '%AD', '%AE', '%AF', '%B0', '%B1', '%B2', '%B3', '%B4', '%B5', '%B6', '%B7', '%B8', '%B9', '%BA', '%BB', '%BC', '%BD', '%BE', '%BF', '%C0', '%C1', '%C2', '%C3', '%C4', '%C5', '%C6', '%C7', '%C8', '%C9', '%CA', '%CB', '%CC', '%CD', '%CE', '%CF', '%D0', '%D1', '%D2', '%D3', '%D4', '%D5', '%D6', '%D7', '%D8', '%D9', '%DA', '%DB', '%DC', '%DD', '%DE', '%DF', '%E0', '%E1', '%E2', '%E3', '%E4', '%E5', '%E6', '%E7', '%E8', '%E9', '%EA', '%EB', '%EC', '%ED', '%EE', '%EF', '%F0', '%F1', '%F2', '%F3', '%F4', '%F5', '%F6', '%F7', '%F8', '%F9', '%FA', '%FB', '%FC', '%FD', '%FE', '%FF')¶
Percent‑encoded form of each of the 256 byte values, indexed by byte value ('%00' through '%FF').
- RFC1738_SAFE_CHARS: Set[int] = {40, 41, 45, 46, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 95, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 126}¶
0-9, A-Z, a-z, -, ., _, ~, (, )
- RFC1738_SAFE_POINTS: Set[int] = {40, 41, 42, 43, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 95, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122}¶
0-9, A-Z, a-z, @, *, _, -, +, ., /, (, )
- SAFE_ALPHA: Set[int] = {48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122}¶
0-9, A-Z, a-z
- SAFE_CHARS: Set[int] = {45, 46, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 95, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 126}¶
0-9, A-Z, a-z, -, ., _, ~
- SAFE_POINTS: Set[int] = {42, 43, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 95, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122}¶
0-9, A-Z, a-z, @, *, _, -, +, ., /
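The safe sets above are stored as integer code points, which makes them hard to read at a glance. The sketch below rebuilds three of them from their character descriptions; the lowercase variable names are local to this example and are not library attributes.

```python
import string

# 0-9, A-Z, a-z as code points (the SAFE_ALPHA description).
safe_alpha = {ord(c) for c in string.ascii_letters + string.digits}

# SAFE_CHARS description: alphanumerics plus - . _ ~
safe_chars = safe_alpha | {ord(c) for c in "-._~"}

# SAFE_POINTS description: alphanumerics plus @ * _ - + . /
safe_points = safe_alpha | {ord(c) for c in "@*_-+./"}

# The RFC1738 variants additionally allow '(' and ')'.
rfc1738_safe_chars = safe_chars | {ord("("), ord(")")}
rfc1738_safe_points = safe_points | {ord("("), ord(")")}

# Membership tests then work directly on ord() values.
print(ord("~") in safe_chars, ord("~") in safe_points)
```

Storing code points rather than characters lets an encoder test ord(ch) in SAFE_CHARS without building intermediate strings.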
- classmethod encode(value: Any, charset: Charset | None = Charset.UTF8, format: Format | None = Format.RFC3986) → str
Encode a scalar value to a URL‑encoded string.
Accepts numbers, Decimal, Enum, str, bool, and bytes. Any other type (including None) yields an empty string, matching the Node qs behavior.
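For Charset.LATIN1, the value is first escaped JavaScript‑style, and each resulting %uXXXX sequence is then rewritten as a percent‑encoded numeric character reference, as Node qs does, so the result can be safely transported as Latin‑1.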
Otherwise, values are encoded as UTF‑8 using _encode_string.
- classmethod escape(string: str, format: Format | None = Format.RFC3986) → str
Emulate the legacy JavaScript escaping behavior.
This function operates on UTF‑16 code units to emulate JavaScript’s legacy %uXXXX behavior. Non‑BMP code points are first expanded into surrogate pairs via _to_surrogates, then each code unit is processed.
Safe set: when format == Format.RFC1738, the characters ( and ) are additionally treated as safe. Otherwise, the RFC3986 safe set is used.
ASCII characters in the safe set are emitted unchanged.
Code units < 256 are emitted as %XX.
Other code units are emitted as %uXXXX.
Reference: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/escape
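The per-code-unit rules above can be sketched without the library. This example assumes the safe set is the one described for SAFE_POINTS (alphanumerics plus @ * _ - + . /), which matches JavaScript's escape; the function name js_escape and the boolean rfc1738 flag are hypothetical stand-ins for the documented method and its format parameter.

```python
import string

# Assumed safe set: the characters JavaScript's escape() leaves unchanged.
SAFE_POINTS = {ord(c) for c in string.ascii_letters + string.digits + "@*_-+./"}


def js_escape(text: str, rfc1738: bool = False) -> str:
    """Escape text per UTF-16 code unit (illustrative sketch)."""
    safe = SAFE_POINTS | {0x28, 0x29} if rfc1738 else SAFE_POINTS
    # Encoding to UTF-16 expands non-BMP code points into surrogate pairs;
    # 'surrogatepass' also tolerates lone surrogates already in the string.
    data = text.encode("utf-16-be", "surrogatepass")
    out = []
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        if unit in safe:
            out.append(chr(unit))          # safe ASCII: unchanged
        elif unit < 256:
            out.append(f"%{unit:02X}")     # one byte: %XX
        else:
            out.append(f"%u{unit:04X}")    # anything else: %uXXXX
    return "".join(out)
```

Going through UTF‑16 is what makes a character outside the Basic Multilingual Plane come out as two %uXXXX sequences, one per surrogate.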
qs_codec.utils.utils module
Utility helpers shared across the qs_codec decode/encode internals.
The functions in this module are intentionally small, allocation‑aware, and careful about container mutation to match the behavior (and performance characteristics) of the original JavaScript qs library.
Key responsibilities:
- Merging decoded key/value pairs into nested Python containers (merge)
- Removing the library’s Undefined sentinel values (compact and helpers)
- Minimal deep‑equality for cycle detection guards (_dicts_are_equal)
- Small helpers for list/value composition (combine, apply)
- Primitive checks used by the encoder (is_non_nullish_primitive)
Notes:
- Undefined marks entries that should be omitted from output structures. We remove these in place where possible to minimize allocations.
- Many helpers accept both list and tuple; tuples are converted to lists on mutation because Python tuples are immutable.
- Several routines use an object‑identity visited set to avoid infinite recursion when user inputs contain cycles.
- class qs_codec.utils.utils.Utils
Bases: object
Namespace container for stateless utility routines.
All methods are `@staticmethod`s to keep call sites simple and to make the functions easy to reuse across modules without constructing objects.
- static apply(val: List[Any] | Tuple[Any] | Any, fn: Callable) → List[Any] | Any
Map a callable over a value or sequence.
If val is a list/tuple, returns a list of mapped results; otherwise returns the single mapped value.
- static combine(a: List[Any] | Tuple[Any] | Any, b: List[Any] | Tuple[Any] | Any) → List[Any]
Concatenate two values, treating non‑sequences as singletons.
Returns a new list; tuples are expanded but not preserved as tuples.
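The apply and combine helpers described above are small enough to sketch in full. The versions below are written from the descriptions only and are module-level functions for brevity; they should behave like the documented static methods for the cases shown, but they are not the library's source.

```python
from typing import Any, Callable, List, Union


def apply(val: Any, fn: Callable[[Any], Any]) -> Union[List[Any], Any]:
    """Map fn over a list/tuple, or call it once on a single value."""
    if isinstance(val, (list, tuple)):
        return [fn(item) for item in val]
    return fn(val)


def combine(a: Any, b: Any) -> List[Any]:
    """Concatenate a and b, wrapping non-sequences as one-item lists."""
    left = list(a) if isinstance(a, (list, tuple)) else [a]
    right = list(b) if isinstance(b, (list, tuple)) else [b]
    return left + right  # always a new list, never a tuple
```

Strings are deliberately not treated as sequences here, so combine("ab", ["c"]) keeps "ab" as one item instead of splitting it into characters.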
- static compact(root: Dict[str, Any]) → Dict[str, Any]
Remove all Undefined sentinels from a nested container in place.
Traversal is iterative (explicit stack) to avoid deep recursion, and a per‑object visited set prevents infinite loops on cyclic inputs.
- Parameters:
root – Dictionary to clean. It is mutated and also returned.
- Returns:
The same root object for chaining.
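The traversal strategy described for compact (explicit stack plus an identity-based visited set) can be sketched as follows. The UNDEFINED object is a local stand-in for the library's Undefined sentinel, the function name compact_sketch is hypothetical, and tuples are ignored here for brevity.

```python
from typing import Any, Dict


class _Undefined:
    """Local stand-in for the library's Undefined sentinel."""


UNDEFINED = _Undefined()


def compact_sketch(root: Dict[str, Any]) -> Dict[str, Any]:
    """Remove UNDEFINED from nested dicts/lists in place (illustrative)."""
    stack = [root]
    visited = {id(root)}  # identity, not equality, so cycles terminate
    while stack:
        node = stack.pop()
        if isinstance(node, dict):
            for key in [k for k, v in node.items() if v is UNDEFINED]:
                del node[key]
            children = list(node.values())
        else:
            # Slice assignment filters the same list object in place.
            node[:] = [v for v in node if v is not UNDEFINED]
            children = list(node)
        for child in children:
            if isinstance(child, (dict, list)) and id(child) not in visited:
                visited.add(id(child))
                stack.append(child)
    return root
```

Tracking id() values rather than the objects themselves avoids hashing unhashable containers and keeps a self-referencing input from looping forever.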
- static is_non_nullish_primitive(val: Any, skip_nulls: bool = False) → bool
Return True if val is considered a primitive for encoding purposes.
Rules:
- None and Undefined are not primitives.
- Strings are primitives; if skip_nulls is True, the empty string is not.
- Numbers, booleans, Enum, datetime, and timedelta are primitives.
- Any non‑container object is treated as primitive.
This mirrors the behavior expected by the original qs encoder.
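The rules above translate into a short chain of type checks. In this sketch the UNDEFINED object stands in for the library's Undefined sentinel, the function name is hypothetical, and the set of types counted as containers (dict, list, tuple, set, frozenset) is an assumption, since the documentation does not enumerate them.

```python
from datetime import datetime, timedelta
from decimal import Decimal
from enum import Enum
from typing import Any


class _Undefined:
    """Local stand-in for the library's Undefined sentinel."""


UNDEFINED = _Undefined()


def is_primitive_sketch(val: Any, skip_nulls: bool = False) -> bool:
    """Apply the documented primitive rules (illustrative sketch)."""
    if val is None or val is UNDEFINED:
        return False
    if isinstance(val, str):
        # The empty string only counts when skip_nulls is off.
        return val != "" if skip_nulls else True
    # bool is a subclass of int, so booleans are covered here too.
    if isinstance(val, (int, float, Decimal, Enum, datetime, timedelta)):
        return True
    # Assumed container types; everything else is treated as primitive.
    if isinstance(val, (dict, list, tuple, set, frozenset)):
        return False
    return True
```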
- static merge(target: Mapping[str, Any] | List[Any] | Tuple[Any] | None, source: Mapping[str, Any] | List[Any] | Tuple[Any] | Any | None, options: DecodeOptions | None = None) → Dict[str, Any] | List[Any] | Tuple[Any] | Any
Merge source into target in a qs‑compatible way.
This function mirrors how the original JavaScript qs library builds nested structures while parsing query strings. It accepts mappings, sequences (list/tuple), and scalars on either side and returns a merged value.
Rules (high level)
- If source is None: return target unchanged.
- If source is not a mapping:
  * target is a sequence → append/extend, skipping Undefined.
  * target is a mapping → write items from the sequence under string indices ("0", "1", …).
  * otherwise → return [target, source] (skipping Undefined where applicable).
- If source is a mapping:
  * target is not a mapping → if target is a sequence, coerce it to an index‑keyed dict and merge; otherwise, concatenate as a list [target, source] while skipping Undefined.
  * target is a mapping → deep‑merge keys; where keys collide, merge values recursively.
List handling
When a list that already contains Undefined must receive new values and options.parse_lists is False, the list is promoted to a dict with string indices so positions can be addressed deterministically. Otherwise, sentinels are simply removed as we go.
- Parameters:
target (mapping | list | tuple | Any | None) – Existing value to merge into.
source (mapping | list | tuple | Any | None) – Incoming value.
options (DecodeOptions | None) – Options that affect list promotion/handling.
- Returns:
The merged structure. May be the original target object when source is None.
- Return type:
mapping | list | tuple | Any
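The high-level merge rules can be illustrated with a deliberately reduced sketch. It ignores the Undefined sentinel and the options argument entirely, returns new containers instead of mutating, and treats a scalar source merged into a mapping as a one-item sequence; that last choice and the overwrite-on-index-collision behavior are assumptions of this example, and the library may behave differently. The name merge_sketch is hypothetical.

```python
from collections.abc import Mapping
from typing import Any


def merge_sketch(target: Any, source: Any) -> Any:
    """Reduced version of the documented merge rules (illustrative)."""
    if source is None:
        return target
    if isinstance(source, Mapping):
        if isinstance(target, (list, tuple)):
            # Coerce the sequence to an index-keyed dict before merging.
            target = {str(i): item for i, item in enumerate(target)}
        if not isinstance(target, Mapping):
            return [target, source]
        merged = dict(target)
        for key, value in source.items():
            # Colliding keys are merged recursively.
            merged[key] = merge_sketch(merged[key], value) if key in merged else value
        return merged
    # source is a sequence or a scalar from here on
    items = list(source) if isinstance(source, (list, tuple)) else [source]
    if isinstance(target, (list, tuple)):
        return list(target) + items
    if isinstance(target, Mapping):
        merged = dict(target)
        for index, item in enumerate(items):
            merged[str(index)] = item  # assumption: overwrite on collision
        return merged
    return [target, source]
```

The recursive call on colliding keys is what turns two scalar values for the same key into a two-item list, which matches how repeated query parameters are commonly collected.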