Normalizer catalog

Normalization is the automated structural‑cleanup stage of the Jane pipeline. It runs before parsing and before validation, and it is driven entirely by the structural type of the value being processed.

Normalization rules are:

- Automatically selected based on the detected structural type.
- Executed in order, exactly as defined in the registry.
- Pure (no mutation of the original value).
- Event‑emitting, so every change is recorded.
- Mode‑aware, allowing lossy rules only in lax mode.

Normalization never interprets meaning. It only ensures that the shape of the value is clean, predictable, and safe for parsing and validation.

How Normalization Works

1. Structural type is detected: Jane determines the structural type (string, number, array, object, and so on) using its internal type taxonomy.
2. The corresponding rule list is retrieved from the registry: normalizationRuleRegistry[structuralType].
3. Rules run in order.

   Each rule receives:
   - The current value.
   - The current mode.
   - The current path.

   A rule may:
   - Return no changes ([]).
   - Return one or more NormalizationResult entries.
   - Mark a change as lossy (allowed only in lax mode).
   - Emit one or more normalization events.
4. The final normalized value becomes the input to parsing: Normalization prepares the value for parsing. Parsing interprets meaning. Validation enforces rules.

This ordering is intentional and foundational to Jane’s clarity‑first design.
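The steps above can be sketched as a small rule runner. This is an illustration only, not Jane's implementation: the type shapes, the `'strict'` mode name, the `'lossless'` marker, the event codes, and the `normalize`, `detectStructuralType`, and `registry` names are all assumptions made for the example. Only `'lax'`, `'lossy'`, and the `(value, mode, path)` rule signature come from this page.

```typescript
// Hypothetical shapes inferred from the prose; Jane's real types may differ.
type Mode = 'strict' | 'lax';              // 'strict' is an assumed name for the non-lax mode
type Path = readonly (string | number)[];  // assumed path representation

interface NormalizationEvent {
    code: string;
    path: Path;
}

interface NormalizationResult {
    path: Path;
    nextValue: unknown;
    lossy: 'lossy' | 'lossless';           // 'lossless' is an assumed counterpart to 'lossy'
    events: NormalizationEvent[];
}

type NormalizationRule = (value: unknown, mode: Mode, path: Path) => NormalizationResult[];

// Toy structural-type detection; stands in for Jane's internal taxonomy.
function detectStructuralType(value: unknown): string {
    if (Array.isArray(value)) return 'array';
    if (value === null) return 'null';
    return typeof value;
}

// Toy non-lossy rule: trim surrounding whitespace.
const trim: NormalizationRule = (value, _mode, path) => {
    if (typeof value !== 'string') return [];
    const next = value.trim();
    if (next === value) return [];
    return [{ path, nextValue: next, lossy: 'lossless', events: [{ code: 'string.trimmed', path }] }];
};

// Toy lossy rule: empty string becomes undefined.
const emptyToUndefined: NormalizationRule = (value, _mode, path) => {
    if (value !== '') return [];
    return [{ path, nextValue: undefined, lossy: 'lossy', events: [{ code: 'string.emptied', path }] }];
};

// Stand-in for normalizationRuleRegistry: one ordered rule list per structural type.
const registry: Record<string, NormalizationRule[]> = {
    string: [trim, emptyToUndefined],
};

function normalize(value: unknown, mode: Mode, path: Path = []) {
    const rules = registry[detectStructuralType(value)] ?? [];
    let current = value;
    const events: NormalizationEvent[] = [];
    for (const rule of rules) {
        for (const result of rule(current, mode, path)) {
            // Lossy changes are applied only in lax mode.
            if (result.lossy === 'lossy' && mode !== 'lax') continue;
            current = result.nextValue;
            events.push(...result.events);
        }
    }
    return { value: current, events };
}
```

In this sketch, `normalize('   ', 'lax')` yields `undefined` with two events, while `normalize('   ', 'strict')` stops at `''` with one event because the lossy result is skipped. Whether Jane filters lossy results in the runner or inside each rule is not stated here; the runner-side filter is a choice made for the example.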

Normalization Rules by Structural Type

Each subsection below describes the rules applied to that type, in order, and links to relevant validators that operate on the same type. This helps developers understand how normalization and validation complement each other.

String Normalization

Rules (in order):

These rules:

- Remove leading and trailing whitespace.
- Collapse internal whitespace sequences.
- Convert empty strings to undefined (lossy, so applied only in lax mode).
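As a rough sketch of these three transformations, written as plain functions with the result and event wrapping left out for brevity: the helper names are hypothetical, `'strict'` is an assumed name for the non-lax mode, and collapsing each whitespace run to a single space is an assumption about what "collapse" means here.

```typescript
// Hypothetical helper names; the registry's real rule identifiers are not shown here.
const trimEnds = (s: string): string => s.trim();

// Assumption: each run of whitespace collapses to one space.
const collapseWhitespace = (s: string): string => s.replace(/\s+/g, ' ');

const emptyToUndefined = (s: string): string | undefined => (s === '' ? undefined : s);

function normalizeString(input: string, mode: 'strict' | 'lax'): string | undefined {
    const cleaned = collapseWhitespace(trimEnds(input));
    // The empty-string conversion is lossy, so it runs only in lax mode.
    return mode === 'lax' ? emptyToUndefined(cleaned) : cleaned;
}
```

Order matters: trimming first lets a whitespace-only string become `''`, which the last step can then convert in lax mode.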

Number Normalization

Rules (in order):

These rules:

- Convert -0 to 0.
- Convert NaN to undefined.
- Convert Infinity and -Infinity to undefined.
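A minimal sketch of these checks as one plain function (the name `normalizeNumber` is hypothetical, and the event and result wrapping is omitted). It uses `Object.is` because `-0 === 0` evaluates to `true` in JavaScript, so strict equality cannot detect negative zero.

```typescript
// Hypothetical helper; applies the three checks in the order listed above.
function normalizeNumber(n: number): number | undefined {
    if (Object.is(n, -0)) return 0;                            // -0 becomes 0
    if (Number.isNaN(n)) return undefined;                     // NaN becomes undefined
    if (n === Infinity || n === -Infinity) return undefined;   // infinities become undefined
    return n;
}
```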

Boolean Normalization

Rules: (none)

Booleans are already structurally stable.

Array Normalization

Rules (in order):

These rules:

- Remove holes.
- Flatten nested arrays by one level.
- Remove empty strings, null, and undefined (lossy, so applied only in lax mode).
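The difference between a hole and an explicit `undefined` is easy to miss, so here is a small sketch of the first and third transformations. The function names are hypothetical, and the event and result wrapping is omitted. A hole is an index that does not exist on the array at all, which the `in` operator can detect.

```typescript
// Hypothetical helper: copy only indices that actually exist on the array.
function removeHoles<T>(arr: readonly T[]): T[] {
    const out: T[] = [];
    for (let i = 0; i < arr.length; i++) {
        if (i in arr) out.push(arr[i] as T);   // explicit undefined is kept, holes are skipped
    }
    return out;
}

// Hypothetical helper: drop '', null, and undefined; keep other falsy values such as 0 and false.
function removeEmptyItems(arr: readonly unknown[]): unknown[] {
    return arr.filter((item) => item !== '' && item !== null && item !== undefined);
}
```

Both helpers return a new array and leave the input untouched, in keeping with the purity requirement.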

Object Normalization

Rules (in order):

These rules:

- Ensure the value is a plain object.
- Remove keys with empty or nullish values (lossy, so applied only in lax mode).
- Remove empty arrays and empty objects.
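A sketch of how these checks might look, under several assumptions: "plain object" is taken to mean a prototype of `Object.prototype` or `null`, "empty" is read as the empty string, and all names are hypothetical. What Jane does when a value is not a plain object is not described here, so the sketch shows only the predicate.

```typescript
type PlainObject = Record<string, unknown>;

// Assumed definition of "plain": created by {} / new Object() or Object.create(null).
function isPlainObject(value: unknown): value is PlainObject {
    if (typeof value !== 'object' || value === null) return false;
    const proto: unknown = Object.getPrototypeOf(value);
    return proto === Object.prototype || proto === null;
}

// Build a new object without the keys whose values match `drop`; the input is not mutated.
function omitWhere(obj: PlainObject, drop: (value: unknown) => boolean): PlainObject {
    const next: PlainObject = {};
    for (const key of Object.keys(obj)) {
        if (!drop(obj[key])) next[key] = obj[key];
    }
    return next;
}

// Assumed reading of "empty or nullish": '', null, or undefined.
const isEmptyOrNullish = (v: unknown): boolean => v === '' || v === null || v === undefined;

const isEmptyContainer = (v: unknown): boolean =>
    (Array.isArray(v) && v.length === 0) ||
    (isPlainObject(v) && Object.keys(v).length === 0);
```

For example, `omitWhere({ a: 1, b: '' }, isEmptyOrNullish)` keeps only `a`, and `omitWhere({ a: [], c: [1] }, isEmptyContainer)` keeps only `c`.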

Date Normalization

Rules (in order):

invalidDateToUndefined

This rule:

Converts invalid Date objects to undefined.
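For readers unfamiliar with invalid dates in JavaScript: a `Date` built from unparseable input is still a `Date` object, but its `getTime()` returns `NaN`. The sketch below shows that check as a plain function. It is not the body of `invalidDateToUndefined`, which is not shown on this page, and the function name is hypothetical.

```typescript
// Hypothetical sketch of the check; the event and result wrapping is omitted.
function dropInvalidDate(value: Date): Date | undefined {
    // An invalid Date reports NaN as its time value.
    return Number.isNaN(value.getTime()) ? undefined : value;
}
```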

A representative normalization rule looks like this:

export const flattenOneLevel: NormalizationRule<unknown> = (value, _mode, path) => {
    // Not applicable to non-arrays: report no changes.
    if (!Array.isArray(value)) return [];

    let changed = false;
    // Build a new array so the original value is never mutated.
    const flattened: unknown[] = [];

    for (const item of value) {
        if (Array.isArray(item)) {
            changed = true;
            // Spread one level only; deeper nesting is preserved.
            flattened.push(...item);
        } else {
            flattened.push(item);
        }
    }

    // No nested arrays were found: report no changes.
    if (!changed) return [];

    // Arguments: severity, code, path, developer message, user message, metadata.
    const event = normalizationEvent(
        'info',
        'array.now.flattened',
        path,
        'Flattened nested arrays by one level.',
        'Nested items were simplified.',
        { before: value, after: flattened },
    );

    return [
        {
            path,
            nextValue: flattened,
            lossy: 'lossy',
            events: [event],
        },
    ];
};

This example illustrates the normalization contract:

- Detect applicability: The rule checks whether the value is an array and whether any nested arrays exist.
- Compute the next value: A new array is constructed without mutating the original.
- Emit a structured event: The event includes severity, code, path, developer message, user message, and metadata.
- Return a NormalizationResult entry: The result describes the new value, whether the change is lossy, and the events produced.

Every normalization rule in Jane follows this same pattern.

Summary

Normalization is the structural hygiene stage of the Jane pipeline. It prepares values for parsing and validation by applying a deterministic, type‑driven sequence of cleanup rules.

Normalization is:

- Automatic: Selected based on structural type.
- Ordered: Rules run exactly as defined in the registry.
- Pure: No mutation of the original value.
- Event‑emitting: Every change is recorded.
- Mode‑aware: Lossy rules run only in lax mode.
- Non‑semantic: Normalization never interprets meaning.

Normalization ensures that every pipeline begins with a clean, predictable structure. It removes noise, enforces structural invariants, and produces a transparent record of every change through structured events.