Diff

Seeing exactly how your data changed during normalization.

Diff is part of Jane’s analysis layer, the set of optional tools that help you understand how and why your data changed as it moved through the pipeline.

Diff focuses on one thing only: structural changes made during normalization. It compares the safe version of your input (after containment) with the normalized version (after normalization rules) and produces a clean, structured list of differences.

Diff is not about validation, parsing, or semantics. It’s about structure. It answers the question: "What changed between what I gave you and what you ended up validating?"

This is invaluable for debugging, audits, compliance, and transparency.

Why Diff Exists

Normalization and parsing are powerful. They trim strings, remove empty items, compact arrays, strip undefined keys, and more. These changes are predictable and structural, but they can still be surprising if you’re not expecting them.

Diff exists to make normalization transparent.

It lets you see:

What was added.
What was removed.
What was changed.
Where it happened.
The before/after values.

This is especially important in moderate and lax modes, where normalization can be lossy. Diff gives you confidence that nothing happened behind your back.

What Diff Does

Diff walks the before and after structures and records every structural change:

Added: A value exists in the normalized version but not in the safe version.
Removed: A value existed in the safe version but was removed during normalization.
Changed: A value exists in both, but the normalized version is different.

Each diff entry includes:

The FieldPath where the change occurred.
The kind of change (added, removed, changed).
The before value.
The after value.

Diff is deterministic, stable, and easy to consume.
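
As an illustration of how such entries might be consumed, here is a minimal TypeScript sketch. The `DiffEntry` type, the `countByKind` helper, and the nested path strings (`$.name`, `$.tags[1]`) are assumptions made for this example. This page only confirms the four fields and the root path `"$"`.

```typescript
// Hypothetical entry shape; field names follow the example output on this page.
type DiffKind = "added" | "removed" | "changed";

interface DiffEntry {
    path: string;      // FieldPath rendered as a string, "$" being the root
    kind: DiffKind;
    before?: unknown;  // assumed to be absent for "added" entries
    after?: unknown;   // assumed to be absent for "removed" entries
}

// Count entries per kind, for example to flag lossy normalization in a log line.
function countByKind(entries: readonly DiffEntry[]): Record<DiffKind, number> {
    const counts: Record<DiffKind, number> = { added: 0, removed: 0, changed: 0 };
    for (const entry of entries) {
        counts[entry.kind] += 1;
    }
    return counts;
}

// Hand-written sample data (not produced by Jane).
const sample: DiffEntry[] = [
    { path: "$.name", kind: "changed", before: " Ada ", after: "Ada" },
    { path: "$.tags[1]", kind: "removed", before: "" },
];

const summary = countByKind(sample);
console.log(summary);
```

Because each entry is a flat record, filtering or grouping needs no knowledge of the original data's shape.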

When Diff Runs

Diff is optional and lazy.

It only runs if:

The policy enables diff analysis.
You explicitly request it with withDiff().

Example:

const diff = await jane.value(input).nonEmpty().withDiff();

If you don’t ask for diff, Jane doesn’t compute it. This keeps the pipeline fast by default.
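
The general idea of lazy computation can be sketched in a few lines of TypeScript. This is not Jane's implementation. The `lazy` helper and the stand-in diff result are hypothetical and only show how work can be deferred until the first request and then reused.

```typescript
// Hypothetical helper: runs `compute` at most once, and only on first access.
function lazy<T>(compute: () => T): () => T {
    let cached: { value: T } | undefined;
    return () => {
        if (cached === undefined) {
            cached = { value: compute() };
        }
        return cached.value;
    };
}

let diffRuns = 0;

// Stand-in for an expensive diff computation (not Jane's actual code).
const getDiff = lazy(() => {
    diffRuns += 1;
    return [{ path: "$", kind: "changed", before: " hello ", after: "hello" }];
});

// Nothing has been computed at this point.
const runsBeforeRequest = diffRuns;

// The first request triggers the computation; the second reuses the cached result.
const first = getDiff();
const second = getDiff();
console.log(runsBeforeRequest, diffRuns, first === second);
```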

What Diff Never Does

Diff is intentionally limited. It does not:

Validate.
Parse.
Normalize.
Mutate values.
Interpret semantics.
Affect the pipeline decision.
Change the final value.

Diff is purely observational. It tells you what happened, not what it means.

How Diff Works (Conceptually)

Diff compares:

safe (after containment).
normalized (after normalization rules).

It walks both structures:

Arrays: Index by index.
Objects: Key by key.
Primitives: Direct comparison.

If two values are identical (by reference or value), diff skips them. If they differ, diff records the change.

This produces a clean, minimal list of structural differences.
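
The walk described above can be sketched as a small recursive function. This is a simplified illustration, not Jane's actual algorithm. The name `structuralDiff`, the path syntax (`$.key`, `$[0]`), the sorted key order, and the use of `===` for the identity check are all assumptions made for this example.

```typescript
type DiffKind = "added" | "removed" | "changed";

interface DiffEntry {
    path: string;
    kind: DiffKind;
    before?: unknown;
    after?: unknown;
}

function isPlainObject(value: unknown): value is Record<string, unknown> {
    return typeof value === "object" && value !== null && !Array.isArray(value);
}

function hasOwn(obj: Record<string, unknown>, key: string): boolean {
    return Object.prototype.hasOwnProperty.call(obj, key);
}

function structuralDiff(
    safe: unknown,
    normalized: unknown,
    path: string = "$",
    out: DiffEntry[] = []
): DiffEntry[] {
    // Identical by reference or by primitive value: nothing to record.
    if (safe === normalized) return out;

    if (Array.isArray(safe) && Array.isArray(normalized)) {
        // Arrays: index by index.
        const length = Math.max(safe.length, normalized.length);
        for (let i = 0; i < length; i++) {
            const childPath = `${path}[${i}]`;
            if (i >= safe.length) {
                out.push({ path: childPath, kind: "added", after: normalized[i] });
            } else if (i >= normalized.length) {
                out.push({ path: childPath, kind: "removed", before: safe[i] });
            } else {
                structuralDiff(safe[i], normalized[i], childPath, out);
            }
        }
        return out;
    }

    if (isPlainObject(safe) && isPlainObject(normalized)) {
        // Objects: key by key, sorted so the output order is deterministic.
        const keys = Object.keys(safe);
        for (const key of Object.keys(normalized)) {
            if (keys.indexOf(key) === -1) keys.push(key);
        }
        keys.sort();
        for (const key of keys) {
            const childPath = `${path}.${key}`;
            if (!hasOwn(safe, key)) {
                out.push({ path: childPath, kind: "added", after: normalized[key] });
            } else if (!hasOwn(normalized, key)) {
                out.push({ path: childPath, kind: "removed", before: safe[key] });
            } else {
                structuralDiff(safe[key], normalized[key], childPath, out);
            }
        }
        return out;
    }

    // Primitives (or mismatched shapes) that are not identical: record a change.
    out.push({ path, kind: "changed", before: safe, after: normalized });
    return out;
}

// Hand-written before/after values that mimic trimming and array compaction.
const entries = structuralDiff(
    { name: " Ada ", tags: ["a", "", "b"], note: undefined },
    { name: "Ada", tags: ["a", "b"] }
);
console.log(entries);
```

In this sketch, comparing arrays index by index means a compacted array shows up as a change at the index where the shift begins plus a removal at the tail, rather than as a single removal of the empty item.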

Example Diff Output

Imagine this input:

" hello "

In moderate mode, normalization trims it:

safe: " hello "
normalized: "hello"

Diff would produce:

[
    {
        path: "$",
        kind: "changed",
        before: " hello ",
        after: "hello"
    }
]

Simple, predictable, and easy to understand.

Why Diff Matters

Diff is the foundation for:

Explain (human-readable narrative).
Replay (step-by-step reconstruction).
Telemetry (structured event logs).
Audits (what changed and why).
Debugging (understanding lossy normalization).

It gives you visibility into the pipeline that most validation libraries simply don’t offer. Diff turns normalization from a black box into a transparent, inspectable process.