String has unsafe Unicode
stringHasUnsafeUnicode is a built‑in scan rule that detects Unicode bidirectional‑control characters.
These invisible characters can alter how text is rendered, obscure intent, or introduce security‑relevant ambiguity. When any bidi control character is present, the rule emits a warn‑level string.has.unsafe-unicode scan event. If the value is not a string or contains no unsafe characters, no events are emitted.
Signature
export const stringHasUnsafeUnicode: ScanRule // (raw: unknown, path: FieldPath) => JaneEvent[]
Events
| Event code | Description |
|---|---|
| string.has.unsafe-unicode | String contains Unicode bidi control characters. |
Design rationale
- Detects Unicode bidirectional‑control characters that can reorder visual text presentation.
- Surfaces invisible characters that may obscure meaning, mislead reviewers, or enable spoofing.
- Uses a fixed, explicit character class — no heuristics or inference.
- Emits a warning when any unsafe character is present.
- Gives downstream layers a signal to reject or review misleading or adversarial text before operating on it.
- Performs no mutation, normalization, or interpretation of the input.
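As a rough illustration of the "fixed, explicit character class" approach, the check can be sketched with a single regular expression. This is not the library's implementation: the exact set of characters the real rule matches is not listed here, so the class below (ALM, LRM, RLM, the embedding and override controls, and the isolate controls) is an assumption, and `SketchEvent`, `BIDI_CONTROLS`, and `sketchHasUnsafeUnicode` are hypothetical names.

```typescript
// Hypothetical stand-ins for the library's types; the real JaneEvent and
// FieldPath shapes are not shown in this document.
type FieldPath = string;

interface SketchEvent {
  kind: "warn";
  code: "string.has.unsafe-unicode";
  path: FieldPath;
}

// Assumed character class: U+061C (ALM), U+200E (LRM), U+200F (RLM),
// U+202A..U+202E (embeddings and overrides), U+2066..U+2069 (isolates).
// No "g" flag, so test() keeps no state between calls.
const BIDI_CONTROLS = /[\u061C\u200E\u200F\u202A-\u202E\u2066-\u2069]/;

function sketchHasUnsafeUnicode(raw: unknown, path: FieldPath): SketchEvent[] {
  // Non-string values are outside the rule's scope: no events.
  if (typeof raw !== "string") return [];
  // Strings without any bidi control character are considered safe.
  if (!BIDI_CONTROLS.test(raw)) return [];
  // Report only; the input is never mutated or normalized.
  return [{ kind: "warn", code: "string.has.unsafe-unicode", path }];
}
```

Because the function only reads the input and returns events, it stays consistent with the "no mutation, normalization, or interpretation" point above.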
Invoke
stringHasUnsafeUnicode runs automatically whenever the scan stage is enabled.
Activation methods:
- Enable scan explicitly: jane.value(input).scan().
- Use a mode that enables scan: strict() enables scan by default. moderate() and lax() do not enable scan unless .scan() is called.
- Enable scan with policy: jane.value(input).withPolicy({ mode: 'strict' }).
If scan is not enabled, stringHasUnsafeUnicode does not run. If the value is not a string, the rule emits no events. In both cases no unsafe‑Unicode detection occurs.
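The activation rules above can be summarized as a small decision function. This is only a model of the documented behavior, not code from the library: `Mode`, `scanEnabled`, and `ruleCanEmit` are hypothetical names, and the sketch assumes the mode is always known.

```typescript
// Hypothetical model of the documented activation rules.
type Mode = "strict" | "moderate" | "lax";

// strict enables scan by default; moderate and lax need an explicit .scan().
function scanEnabled(mode: Mode, scanCalled: boolean): boolean {
  return scanCalled || mode === "strict";
}

// The rule can only emit an event when scan is enabled and the value is a string.
function ruleCanEmit(mode: Mode, scanCalled: boolean, value: unknown): boolean {
  return scanEnabled(mode, scanCalled) && typeof value === "string";
}
```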
Examples
Unsafe Unicode detected
const result = stringHasUnsafeUnicode("abc\u202Edef", "$");
// → [ JaneEvent{ kind: "warn", code: "string.has.unsafe-unicode", ... } ]
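Since the offending characters are invisible, a caller that receives this warning may want to show a reviewer where they occur. The helper below is a sketch of one way to do that; it is not part of the library, `listBidiControls` and `BidiHit` are hypothetical names, and the character class is an assumed set of bidi controls rather than the rule's confirmed list.

```typescript
// Assumed bidi control set (all in the Basic Multilingual Plane, so one
// UTF-16 code unit each): ALM, LRM, RLM, U+202A..U+202E, U+2066..U+2069.
const BIDI_CONTROL_CHAR = /[\u061C\u200E\u200F\u202A-\u202E\u2066-\u2069]/;

interface BidiHit {
  index: number;     // UTF-16 offset of the character in the string
  codePoint: string; // printable label such as "U+202E"
}

function listBidiControls(text: string): BidiHit[] {
  const hits: BidiHit[] = [];
  for (let i = 0; i < text.length; i++) {
    if (BIDI_CONTROL_CHAR.test(text.charAt(i))) {
      // Left-pad the hex value to four digits, e.g. 61C becomes 061C.
      const hex = ("0000" + text.charCodeAt(i).toString(16).toUpperCase()).slice(-4);
      hits.push({ index: i, codePoint: "U+" + hex });
    }
  }
  return hits;
}

const hits = listBidiControls("abc\u202Edef");
// → [ { index: 3, codePoint: "U+202E" } ]
```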
Safe string
const result = stringHasUnsafeUnicode("hello world", "$");
// → []
Non‑string value
const result = stringHasUnsafeUnicode(123, "$");
// → []