Compressed weights that are still randomly addressable.
Models ship essentially uncompressed because gzip/xz forfeit the random access that loading a model needs — one stream, all-or-nothing. AT-1 compresses each tensor independently behind an index, so the file is ~32–42% smaller than raw bf16 and yet any single tensor is fetched, decompressed, and SHA-256-verified on its own. A drop-in for safetensors.
- 32–42% smaller than raw bf16 (the format models ship in)
- Per-tensor random access: fetch one tensor, not the file
- SHA-256 integrity verified on every tensor read
- Byte-exact: 94/94 Pythia tensors reconstruct identically
What a whole-file compressor can’t do
Fetch one tensor, not the whole file
gzip/xz compress a model as one stream — to read any tensor you must decompress all of it. AT-1 compresses tensor-by-tensor behind an index, so a single layer, one mixture-of-experts expert, or a LoRA adapter is fetched and decompressed without touching the rest. That's the capability a whole-file compressor structurally cannot offer.
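As a rough illustration of the idea, and not AT-1's actual on-disk layout, the sketch below compresses each tensor separately and stores a small index of offsets in front of the data. The function names `pack` and `read_tensor`, the JSON index, and the use of zlib as the codec are all assumptions made for the example.

```python
import io
import json
import struct
import zlib
from typing import BinaryIO


def pack(tensors: dict[str, bytes]) -> bytes:
    """Compress each tensor on its own and prepend a JSON index (hypothetical layout)."""
    index = {}
    body = io.BytesIO()
    for name, raw in tensors.items():
        blob = zlib.compress(raw, 9)  # zlib stands in for the real codec
        # Record where this tensor's compressed bytes live inside the body.
        index[name] = {"offset": body.tell(), "length": len(blob)}
        body.write(blob)
    header = json.dumps(index).encode("utf-8")
    # Layout: 8-byte little-endian header length, JSON index, compressed body.
    return struct.pack("<Q", len(header)) + header + body.getvalue()


def read_tensor(container: BinaryIO, name: str) -> bytes:
    """Decompress one tensor by seeking to it; other tensors are never read."""
    container.seek(0)
    (header_len,) = struct.unpack("<Q", container.read(8))
    index = json.loads(container.read(header_len))
    entry = index[name]
    container.seek(8 + header_len + entry["offset"])
    return zlib.decompress(container.read(entry["length"]))


# Usage: fetch a single tensor from a two-tensor container.
tensors = {"layer0.weight": bytes(4096), "layer1.weight": bytes(range(256)) * 16}
container = io.BytesIO(pack(tensors))
assert read_tensor(container, "layer1.weight") == tensors["layer1.weight"]
```

Because the index stores byte ranges, the same lookup could in principle be served by a ranged read against remote storage instead of a local seek.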
Every weight is integrity-checked
Each tensor carries its own SHA-256, verified the moment it's read. After the model-poisoning and tampered-checkpoint scares, a registry that proves each downloaded weight is exactly the published bytes — per tensor, not just per file — is its own selling point.
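A minimal sketch of a per-tensor check, assuming the digest is taken over the uncompressed tensor bytes and stored next to the compressed blob. The helper names `seal` and `open_verified` are hypothetical, and zlib again stands in for the real codec.

```python
import hashlib
import hmac
import zlib


def seal(raw: bytes) -> dict:
    """Compress a tensor and record the SHA-256 of its uncompressed bytes."""
    return {"sha256": hashlib.sha256(raw).hexdigest(), "blob": zlib.compress(raw)}


def open_verified(entry: dict) -> bytes:
    """Decompress a tensor and refuse to return it if the digest does not match."""
    raw = zlib.decompress(entry["blob"])
    actual = hashlib.sha256(raw).hexdigest()
    if not hmac.compare_digest(actual, entry["sha256"]):
        raise ValueError("tensor failed SHA-256 check")
    return raw


# Usage: an untouched entry round-trips; a swapped payload is rejected.
entry = seal(b"published weights")
assert open_verified(entry) == b"published weights"

tampered = dict(entry, blob=zlib.compress(b"poisoned weights"))
try:
    open_verified(tampered)
except ValueError as err:
    print("rejected:", err)
```

Checking on read, rather than once per file at download time, means a single altered tensor is caught even when the rest of the file is never loaded.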
A drop-in for safetensors
save_file and load_file mirror the safetensors API, and load_tensor adds selective fetch. Point a hub's download path at .at1w and callers don't change; they get ~32–42% smaller storage and downloads, plus selective loading.
Validated on real transformer weights
Real trained models were packed and read back byte-for-byte. bf16, the format modern large models ship in, compresses best because it carries the fewest near-random mantissa bits: 7 per value, against 10 in fp16 and 23 in fp32.
31% smaller, byte-exact on all 94 tensors, a single tensor fetched without decompressing the rest, integrity verified per tensor.
The byte-plane (exponent/mantissa) split improves the ratio by ~5–10% over a general-purpose compressor, but the decisive, unique property is that the file stays randomly addressable and integrity-verified.
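To make the byte-plane idea concrete, here is a small sketch under stated assumptions: little-endian bf16, synthetic Gaussian values in place of real weights, float32 truncated (not rounded) to bf16, and zlib as the general-purpose compressor. In each bf16 value the high byte holds the sign and the top seven exponent bits, while the low byte holds the last exponent bit and the seven mantissa bits, so separating the two groups similar bytes together. The helper names are hypothetical and this is not AT-1's actual codec.

```python
import random
import struct
import zlib


def to_bf16_bytes(values) -> bytes:
    """Truncate float32 values to bf16 (keep the top 16 bits), little-endian."""
    out = bytearray()
    for v in values:
        (bits,) = struct.unpack("<I", struct.pack("<f", v))
        out += struct.pack("<H", bits >> 16)
    return bytes(out)


def split_planes(raw: bytes) -> tuple[bytes, bytes]:
    """Separate low bytes (mostly mantissa) from high bytes (sign + exponent)."""
    return raw[0::2], raw[1::2]


def join_planes(low: bytes, high: bytes) -> bytes:
    """Inverse of split_planes: re-interleave the two planes."""
    out = bytearray(len(low) + len(high))
    out[0::2] = low
    out[1::2] = high
    return bytes(out)


random.seed(0)
raw = to_bf16_bytes(random.gauss(0.0, 0.02) for _ in range(50_000))
low, high = split_planes(raw)

# Compare compressing the interleaved bytes with compressing each plane alone.
interleaved = len(zlib.compress(raw, 9))
split = len(zlib.compress(low, 9)) + len(zlib.compress(high, 9))
print(f"raw={len(raw)} interleaved={interleaved} split={split}")

assert join_planes(low, high) == raw  # the split itself is lossless
```

The printed sizes depend on the data and the compressor, so treat them as something to measure on real weights, not as a reproduction of the figures quoted above.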
Where we’re honest about the boundary
~32–42% smaller than raw bf16 while staying randomly addressable and integrity-verified per tensor.
We do not beat xz on ratio by much: only ~5–10%. If you just want maximum ratio on a cold model, use xz; the moat here is addressability and integrity, not the byte count.
You cannot run inference on compressed bytes; each tensor is decompressed on load. The win is storage, download bandwidth, selective loading, and integrity, not compute.
How a registry integrates it
Pack
Convert a .safetensors repo to .at1w: lossless, tensor-by-tensor, with a per-tensor index and hash. ~32–42% smaller at rest.
Serve
Put .at1w behind the registry's download path. A drop-in loader decompresses per tensor on load; selective fetch pulls only the layers/experts/adapters requested.
Verify
Every served tensor is SHA-256-checked against the published value, giving supply-chain integrity on every download, per tensor and not just per file.
Bill
Usage is metered as TB under management, on the same meter as AT-1 storage. Decompression and reads are not billed.
Measure it on your own corpus
We’ll pack a sample of your model registry and report the size reduction, the byte-exact check, and selective-fetch latency, all measured on your own weights.
Patent pending (US provisional filed 2026). Billed as TB under management, like AT-1 storage.