Proof on real data

Measured, not promised

Every row is a real public dataset, accepted only after the encoder's gate confirmed a byte-exact reconstruction. “AT-1 ratio” is how many times smaller the output is than the raw input; the last column is the margin over the format the industry actually uses today.

| Data type | Dataset | AT-1 ratio | vs the format you use today |
| --- | --- | --- | --- |
| Server logs | Apache access · SSH auth · HDFS | 19–37× | 1.1–1.4× smaller than xz-9 |
| Genomics | 1000-Genomes VCF, chr22 | 209× | 2.48× smaller than native BCF (2.95× vs .vcf.gz) |
| Telemetry / IoT | UCI smart-meter power | 22× | 1.9× smaller than xz-9 · ~3× vs Parquet-zstd |
| Neurophysiology / EEG | PhysioNet CHB-MIT scalp EEG | 5.1× | 1.6× smaller than xz-9, lossless |
| Financial ticks (QUERYABLE) | Binance BTCUSDT aggTrades | 18.7× | ~3× smaller than Parquet-zstd · 54× less query I/O |
| Event JSON / NDJSON | GitHub Archive events | 21× | 1.15× smaller than xz-9 |
| Database exports (QUERYABLE) | Mongo / Elasticsearch NDJSON | ~28× | ~2× smaller than xz-9, and queryable |
| Lakehouse tabular (QUERYABLE) | NYC-taxi, Parquet round-trip | −27% | smaller than Parquet-zstd, still block-addressable |
| Map / geo | OpenStreetMap, Luxembourg | 14× | 1.32× smaller than PBF, 1.47× vs xz |
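
How to read the two numeric columns, as a minimal sketch. The byte counts below are hypothetical and chosen only to make the arithmetic easy to follow; they are not measurements from any dataset on this page. The helper names are ours for illustration, not part of any AT-1 tooling.

```python
def times_smaller(reference_bytes: int, compressed_bytes: int) -> float:
    """Return how many times smaller the compressed output is than a reference size."""
    if compressed_bytes <= 0:
        raise ValueError("compressed size must be positive")
    return reference_bytes / compressed_bytes


def percent_smaller_to_ratio(percent_smaller: float) -> float:
    """Convert 'N% smaller than the baseline' into an 'X times smaller' factor."""
    if not 0 <= percent_smaller < 100:
        raise ValueError("percent_smaller must be in [0, 100)")
    return 1.0 / (1.0 - percent_smaller / 100.0)


# Hypothetical sizes, for illustration only.
raw_bytes = 1_000_000       # uncompressed input
at1_bytes = 50_000          # output of the codec under test
baseline_bytes = 60_000     # output of the incumbent format (e.g. xz-9)

# "AT-1 ratio" column: times smaller than raw.
print(f"{times_smaller(raw_bytes, at1_bytes):.1f}x vs raw")             # 20.0x vs raw

# Last column: margin over the incumbent format.
print(f"{times_smaller(baseline_bytes, at1_bytes):.2f}x vs baseline")   # 1.20x vs baseline

# A row quoted as a percentage (27% smaller) expressed as a factor.
print(f"{percent_smaller_to_ratio(27):.2f}x")                           # 1.37x
```

The last line shows why a row quoted as a percentage is not directly comparable to the "×" rows until it is converted: 27% smaller corresponds to roughly 1.37× smaller.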

Every row is reproduced from a real public dataset, and every codec reports a byte-for-byte lossless check — we cite no result that isn't verified lossless.

Radical honesty

Where we lose — and what we don't claim

Every compression vendor publishes the wins. We publish the losses too, because the advantage is the capability (query + byte-exact) and the economics, not a ratio number that changes quarter to quarter. Here is exactly where AT-1 is the wrong tool.

Decode speed is xz-class

The decoder is fast enough for archival reads, not for hot paths: it costs ~2.2× the CPU per byte of xz and ~12× that of zstd. AT-1 is a cold / archival tier, not your hot storage.

Already-compressed media

JPEG, H.264, and other entropy-saturated media gain almost nothing from any compressor, AT-1 included. We don't pretend otherwise.

Monochrome DICOM under JPEG-LS

On monochrome pixel data already under JPEG 2000 / JPEG-LS, the image-domain codec wins. AT-1's imaging win is limited to uncompressed, RLE, and color DICOM.

Numeric-heavy tabular

On dense numeric SMART/sensor columns, a trained OpenZL graph edges us out by ~1.06×, at the cost of minutes of per-format training. We're zero-config.

High-entropy network data

On NetFlow and Zeek conn logs, the data is near its entropy floor; AT-1 ties xz and the non-inferiority fallback correctly kicks in. No structural win to claim.
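
A non-inferiority fallback of the kind described above can be sketched as follows. This is an illustration under stated assumptions, not AT-1's actual implementation: `candidate_encode` is a hypothetical stand-in (zlib here) for a structure-aware codec, xz at preset 9 is the baseline, and the one-byte tag format is invented for the example.

```python
import lzma
import os
import zlib

TAG_BASELINE = b"\x00"   # payload is plain xz (hypothetical tag value)
TAG_CANDIDATE = b"\x01"  # payload came from the candidate codec (hypothetical tag value)


def candidate_encode(data: bytes) -> bytes:
    """Stand-in for a structure-aware encoder; zlib is used only so the sketch runs."""
    return zlib.compress(data, 9)


def candidate_decode(blob: bytes) -> bytes:
    return zlib.decompress(blob)


def encode_with_fallback(data: bytes) -> bytes:
    """Keep the candidate output only if it is strictly smaller than the xz baseline."""
    baseline = lzma.compress(data, preset=9)
    candidate = candidate_encode(data)
    if len(candidate) < len(baseline):
        return TAG_CANDIDATE + candidate
    return TAG_BASELINE + baseline


def decode(blob: bytes) -> bytes:
    """Dispatch on the tag byte written by encode_with_fallback."""
    tag, payload = blob[:1], blob[1:]
    if tag == TAG_CANDIDATE:
        return candidate_decode(payload)
    if tag == TAG_BASELINE:
        return lzma.decompress(payload)
    raise ValueError("unknown container tag")


# High-entropy input: neither codec can find structure to exploit.
noise = os.urandom(4096)
packed = encode_with_fallback(noise)
assert decode(packed) == noise
# The output never exceeds the baseline by more than the one tag byte.
assert len(packed) <= len(lzma.compress(noise, preset=9)) + 1
```

The design point is that the guarantee is structural: whichever branch is taken, the result is bounded by the baseline size plus the tag, so data near its entropy floor ends in a tie, not a loss.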

“Best ratio everywhere”

Ratio leadership is contested and unprovable, so we never claim it. We claim only what the verification gate measured on real data, per domain.

Reproduce any number on this page: every codec prints “LOSSLESS (byte-for-byte): True” or “False”, and we cite no result that prints False. Sources: comparison.html, VALIDATION_RESULTS.md, BENCHMARKS_OPENZL_AND_SPEED.md.
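
The byte-for-byte check is simple enough to sketch. The version below is a minimal illustration, assuming a codec is exposed as a pair of encode/decode callables; the function name `lossless_gate` and the xz stand-in codec are ours, not the project's actual harness, and only the printed status line follows the wording quoted above.

```python
import lzma
from typing import Callable


def lossless_gate(
    original: bytes,
    encode: Callable[[bytes], bytes],
    decode: Callable[[bytes], bytes],
) -> bool:
    """Round-trip the input and report whether the output is byte-identical."""
    restored = decode(encode(original))
    ok = restored == original
    print(f"LOSSLESS (byte-for-byte): {ok}")
    return ok


def xz9_encode(data: bytes) -> bytes:
    """Stand-in codec: xz at preset 9 from the standard library."""
    return lzma.compress(data, preset=9)


sample = b"GET /index.html 200\n" * 1000

# A correct codec passes the gate.
lossless_gate(sample, xz9_encode, lzma.decompress)

# A deliberately broken decoder (drops the last byte) fails it.
lossless_gate(sample, xz9_encode, lambda blob: lzma.decompress(blob)[:-1])
```

Run as written, the first call prints True and the second prints False; a result of the second kind is the sort that would be excluded from the table.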