AI governance & audit

Prove what your AI actually decided.

When a regulator or a court asks you to justify an automated decision two years later, a log that says “the model said no” isn’t proof. The AI Evidence Capsule is a notarized, reproducible receipt: re-run the exact inference offline and get byte-identical output, with the model’s integrity cryptographically sealed. Think git + a notary public for AI decisions.

10/10
byte-identical cold-process replay (real Pythia-70m)
~0.5 KB
evidence per inference (weights shared once)
caught
1-byte weight tamper detected on verify
offline
self-contained — no live re-execution or model call
A tiny receipt, not a copy of the model

Each inference keeps a ~0.5 KB capsule — exact input, seed/decode policy, the pinned execution manifest, and SHA-256 of the model store + output. The weights live once in a shared, verified store; the per-decision evidence is kilobytes.

Re-run it yourself — byte for byte

An auditor rebuilds the model from the verified weight store + the pinned manifest, replays the capsule's exact input from a cold process, and gets byte-identical output. “We logged it” becomes “re-run it and check the bytes.”

Tamper detected and located

The weight store carries per-tensor SHA-256. Flip a single byte and the integrity check fails — so you can prove the model that made the decision was the untouched original, not something altered after the fact.

How it works

seal:    weights  ->  one verified AT-1 store (lossless, per-tensor SHA-256)
record:  each inference -> ~0.5 KB capsule {input, seed, manifest, output-hash}
replay:  auditor rebuilds the model from the verified store + pinned manifest,
         re-runs the capsule input  ->  asserts BYTE-IDENTICAL output
tamper:  flip one weight byte  ->  integrity check fails (detected + located)

The one requirement is discipline, not magic: the manifest pins the execution recipe(framework + version + dtype + device + kernel). With it pinned, replay is exact; that’s the same “pin the kernel/SKU” rule any reproducible system follows.

Who needs it

  • EU AI Act high-risk systems — record-keeping that lets a decision be reconstructed (Art. 12).
  • Credit, hiring, insurance, healthcare AI — defensible, reproducible decision records.
  • Model providers — prove which exact weights served a request, untampered.
  • AI assurance / Big-4 — an evidence layer beneath governance platforms.
What it is not. It is not model compression for shipping (weights stay roughly full size — lossless only). It is nota model-extraction or reverse-engineering tool — it’s the model ownerproving provenance over weights they already hold, and an auditor need never receive the raw weights (verify the sealed hash, or replay in a controlled environment). It rides AT-1’s addressable + streaming + verified weights (patent filed).