Normalize

Structural hygiene that makes everything else predictable.

Normalization is the second stage of the Jane pipeline. It runs after scan and before parsing, and its job is to clean up structural noise so the rest of the pipeline doesn’t have to. Normalization is not interpretation, not validation, and not coercion.

It’s structural hygiene: the kind of cleanup every developer ends up doing manually in every project, except now it’s automatic, predictable, and mode‑aware. By the time your validators run, the value is already in a shape you can trust.

Why Normalization Exists

Most real‑world data is messy:

Strings with leading/trailing whitespace.
Arrays with empty items.
Objects with undefined keys.
Numbers that aren’t really numbers.
Dates that aren’t valid.
Sparse arrays.
Nested structures with noise.

Every developer ends up writing the same cleanup code over and over.

Normalization exists so you never have to.

It gives you a predictable, structurally clean value before any parsing or validation happens.

What Normalization Does

Normalization applies type‑selected rules to the value:

Strings: trim, collapse whitespace, convert empty to undefined.
Numbers: remove NaN and Infinity, normalize negative zero.
Arrays: compact, flatten, remove empty/null/undefined items.
Objects: remove empty keys, convert to plain objects.
Dates: convert invalid dates to undefined.

These rules are structural, not semantic. They never interpret meaning. They never enforce business rules.

Normalization produces a new value. It never mutates the original.
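The type-selected rules above can be sketched in plain JavaScript. This is a minimal illustration, not Jane's implementation: every function name here is hypothetical, and a few details are assumptions (NaN and Infinity map to undefined, arrays flatten one level, and "empty keys" means keys whose value is undefined).

```javascript
// Hypothetical sketch: none of these helpers are part of Jane's API.

// Strings: trim, collapse whitespace runs, map empty to undefined.
function normalizeString(value) {
  const cleaned = value.trim().replace(/\s+/g, " ");
  return cleaned === "" ? undefined : cleaned;
}

// Numbers: drop NaN and +/-Infinity (assumed: to undefined), turn -0 into 0.
function normalizeNumber(value) {
  if (!Number.isFinite(value)) return undefined;
  return Object.is(value, -0) ? 0 : value;
}

// Dates: an invalid Date has a NaN timestamp. Valid dates are copied.
function normalizeDate(value) {
  return Number.isNaN(value.getTime()) ? undefined : new Date(value.getTime());
}

// Arrays: flatten one level (assumption), which also drops sparse holes,
// then remove empty-string, null, and undefined items.
function normalizeArray(value) {
  return value
    .flat()
    .filter((item) => item !== null && item !== undefined && item !== "");
}

// Objects: copy defined entries into a fresh plain object.
function normalizeObject(value) {
  const plain = {};
  for (const [key, item] of Object.entries(value)) {
    if (item !== undefined) plain[key] = item;
  }
  return plain;
}

// Select a rule set by the value's type. Every branch returns a new
// value or a primitive, so the input is never mutated.
function normalize(value) {
  if (typeof value === "string") return normalizeString(value);
  if (typeof value === "number") return normalizeNumber(value);
  if (value instanceof Date) return normalizeDate(value);
  if (Array.isArray(value)) return normalizeArray(value);
  if (value !== null && typeof value === "object") return normalizeObject(value);
  return value;
}

console.log(normalize("  hello   world ")); // hello world
console.log(normalize(["a", , null, ["b", ""], undefined])); // [ 'a', 'b' ]
```

Note that the sketch only reshapes values: the string "42" stays a string, and nothing is parsed or validated.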

Mode‑Aware Behavior

Normalization is the only stage that behaves differently depending on the pipeline mode:

Strict mode: No normalization runs. You get the raw value (after containment when scan is enabled), untouched.
Moderate mode: Only non‑lossy normalization runs. This means structural cleanup that doesn’t throw away meaningful data.
Lax mode: All normalization rules run, including lossy ones. This is the developer‑friendly mode where Jane cleans aggressively.

This mode system is one of the reasons Jane feels predictable: you always know how much cleanup is happening.
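One way to picture the three modes is a rule table in which each rule carries a lossy flag, with the mode deciding which rules run. The sketch below is illustrative only: the rule names, the `normalizeForMode` function, and the choice of which rule counts as lossy are assumptions, not Jane's actual classification.

```javascript
// Hypothetical rule table; the lossy flags are illustrative assumptions.
const exampleStringRules = [
  { name: "trim", lossy: false, apply: (s) => s.trim() },
  { name: "collapse-whitespace", lossy: true, apply: (s) => s.replace(/\s+/g, " ") },
];

function normalizeForMode(value, mode) {
  // Strict: no normalization runs, the value passes through untouched.
  if (mode === "strict") return value;
  return exampleStringRules
    // Moderate keeps only non-lossy rules; lax keeps every rule.
    .filter((rule) => mode === "lax" || !rule.lossy)
    .reduce((current, rule) => rule.apply(current), value);
}

console.log(JSON.stringify(normalizeForMode("  a   b  ", "strict")));   // "  a   b  "
console.log(JSON.stringify(normalizeForMode("  a   b  ", "moderate"))); // "a   b"
console.log(JSON.stringify(normalizeForMode("  a   b  ", "lax")));      // "a b"
```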

What Normalization Never Does

Normalization is intentionally limited. It does not:

Parse strings.
Coerce types.
Enforce business rules.
Validate.
Interpret.
Walk recursively with rules.
Change the meaning of the value.

Normalization is structural hygiene. Nothing more.

When Normalization Runs

Normalization runs automatically unless strict mode is active:

// Lax mode → all normalization rules run, including lossy ones
jane.value(input).lax().nonEmpty().run();

// Moderate mode (no mode call) → only non‑lossy normalization runs
jane.value(input).nonEmpty().run();

// Strict mode → no normalization
jane.value(input).strict().nonEmpty().run();

You never have to opt in. You only opt out by choosing strict mode.

How Normalization Affects Decisions

Normalization emits events when:

A lossy transformation is skipped (moderate mode).
A lossy transformation is applied (lax mode).
A rule detects something unusual.

These events feed into the policy system, which decides whether they matter.

Normalization never rejects on its own. It only reports what it did.
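The split between reporting and deciding can be sketched as follows. The event shape (`kind`, `rule`), the `collapseWithEvents` rule, and the `rejectOnLossyChange` policy are all hypothetical names chosen for illustration; Jane's real event and policy interfaces may look quite different.

```javascript
// Hypothetical event shapes and policy; not Jane's actual API.

// A lossy rule that reports what it did (or skipped) instead of rejecting.
function collapseWithEvents(value, mode, events) {
  const collapsed = value.replace(/\s+/g, " ");
  if (collapsed === value) return value; // nothing would change, nothing to report
  if (mode === "lax") {
    events.push({ kind: "lossy-applied", rule: "collapse-whitespace" });
    return collapsed;
  }
  if (mode === "moderate") {
    events.push({ kind: "lossy-skipped", rule: "collapse-whitespace" });
  }
  return value; // moderate and strict leave the value as it was
}

// The policy, not the normalizer, decides whether an event matters.
function rejectOnLossyChange(events) {
  return events.some((event) => event.kind === "lossy-applied") ? "reject" : "accept";
}

const laxEvents = [];
console.log(collapseWithEvents("a   b", "lax", laxEvents)); // a b
console.log(rejectOnLossyChange(laxEvents));                // reject

const moderateEvents = [];
console.log(collapseWithEvents("a   b", "moderate", moderateEvents)); // a   b
console.log(rejectOnLossyChange(moderateEvents));                     // accept
```

A different policy could accept both cases; the point of the design is that the normalizer's output is the same either way.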

Why Normalization Matters

Normalization is the quiet workhorse of the pipeline. It ensures:

Parsers receive clean, predictable input.
Validators don’t have to handle garbage.
Contributors don’t have to write defensive code.
Consumers get consistent behavior across all pipelines.
Diff/explain/replay have a stable foundation.

Normalization is the reason Jane feels “safe” and “clean” without being strict or brittle.