Small. Queryable. Byte-exact. Pick three.
Every other tool forces a choice. General compressors are exact but opaque. Columnar formats are queryable but they re-encode and discard your original. AT-1 is the only one that is all three at once — on every data type.
gzip / zstd / xz
Small and byte-exact — but opaque. To read one record you decompress the whole file. No query, no skipping.
Parquet / ORC
Queryable — but they re-encode your data. You can never recover the exact original file. A dealbreaker for audit, compliance and regulated science.
AT-1
Small, queryable in place, and byte-for-byte recoverable from the same file — with an embedded SHA-256 that the decoder checks on every read.
Three things no compressor and no columnar format can do together
A recurring drop in your storage bill
Cold partitions shrink 2–37×, depending on the data, and stay byte-for-byte lossless. Encoding is a one-time cost of a fraction of a cent per GB; the saving repeats every month.
~$170 / TB / year saved at a conservative 2.5× on logs · payback < 1 day
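As a rough check on a figure like this, the arithmetic can be sketched in a few lines. The $0.023 per GB-month price is an assumption (a common list price for standard object storage), not a number taken from this page; the function name and the 1,024 GB-per-TB convention are also illustrative.

```python
def yearly_saving_per_tb(price_per_gb_month: float, ratio: float,
                         gb_per_tb: int = 1024) -> float:
    """Dollars saved per TB per year when stored data shrinks by `ratio`."""
    if ratio < 1:
        raise ValueError("ratio must be >= 1")
    # A 2.5x ratio means 1/2.5 = 40% of the bytes remain, so 60% are saved.
    saved_fraction = 1 - 1 / ratio
    return price_per_gb_month * gb_per_tb * 12 * saved_fraction

# Assumed price: $0.023 per GB-month (not stated on this page).
saving = yearly_saving_per_tb(price_per_gb_month=0.023, ratio=2.5)
print(f"${saving:.0f} / TB / year")  # prints "$170 / TB / year"
```

Under a different storage price or ratio the same function gives the corresponding figure; the payback period additionally depends on the encode cost, which is not modelled here.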
Query the archive without rehydrating it
Per-block min/max zone maps let a query skip whole row-groups that cannot match its predicate and decode only the columns it touches. Over object storage, it range-GETs just those byte ranges.
A selective time-range query reads 1/54 of the file (54× less I/O) and returns exactly the matching rows
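The zone-map idea can be illustrated with a minimal in-memory sketch. This is not the AT-1 format or its API: the `Block` class, `make_blocks`, and `query_range` are hypothetical names, and the toy data is arranged so that a narrow time range falls inside one of 54 blocks. A real reader would skip fetching the pruned byte ranges rather than skip a Python object.

```python
from dataclasses import dataclass

@dataclass
class Block:
    """Hypothetical row-group: min/max of the timestamp column plus its rows."""
    ts_min: int
    ts_max: int
    rows: list  # list of (timestamp, payload) tuples

def make_blocks(rows, block_size):
    """Split rows into fixed-size blocks and record each block's zone map."""
    blocks = []
    for i in range(0, len(rows), block_size):
        chunk = rows[i:i + block_size]
        ts = [r[0] for r in chunk]
        blocks.append(Block(min(ts), max(ts), chunk))
    return blocks

def query_range(blocks, lo, hi):
    """Return rows with lo <= timestamp <= hi and the number of blocks decoded."""
    hits, decoded = [], 0
    for b in blocks:
        # Zone-map check: a block whose [min, max] cannot overlap [lo, hi]
        # is skipped without looking at its rows.
        if b.ts_max < lo or b.ts_min > hi:
            continue
        decoded += 1
        hits.extend(r for r in b.rows if lo <= r[0] <= hi)
    return hits, decoded

rows = [(t, f"event-{t}") for t in range(5400)]   # toy, time-sorted data
blocks = make_blocks(rows, block_size=100)        # 54 blocks of 100 rows
hits, decoded = query_range(blocks, 1200, 1299)   # range sits in one block
print(len(blocks), decoded, len(hits))            # prints "54 1 100"
```

Pruning of this kind only pays off when the filtered column is clustered, as timestamps usually are in logs; on randomly ordered values most blocks would overlap the predicate and still need decoding.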
Prove it's the same bytes — anywhere
Every encode is byte-compared to the original before it ships. Each file embeds a SHA-256 digest that the decoder re-checks on read, refusing to return data on a mismatch. A non-inferiority fallback means AT-1 is never larger than plain xz/zstd.
0 crashes in 10,000+ fuzz iterations · correct codec on 11/11 real datasets
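The three guarantees above (verify before shipping, check the digest on decode, fall back to a general compressor) can be sketched with the Python standard library. This is an illustration under stated assumptions, not the AT-1 container: the one-byte tag plus 32-byte digest layout, the `CODECS` table, and `toy_codec_encode` (which simply wraps zlib) are all hypothetical.

```python
import hashlib
import lzma
import zlib

def toy_codec_encode(data: bytes) -> bytes:
    """Stand-in for a specialised codec (hypothetical; here it is just zlib)."""
    return zlib.compress(data, 9)

# tag -> (encoder, decoder); "X" is the plain xz baseline.
CODECS = {
    b"T": (toy_codec_encode, zlib.decompress),
    b"X": (lzma.compress, lzma.decompress),
}

def decode(blob: bytes) -> bytes:
    """Decode, then refuse to return data whose SHA-256 does not match."""
    tag, digest, payload = blob[:1], blob[1:33], blob[33:]
    data = CODECS[tag][1](payload)
    if hashlib.sha256(data).digest() != digest:
        raise ValueError("SHA-256 mismatch: refusing to return data")
    return data

def encode(data: bytes) -> bytes:
    """Keep the smallest candidate payload and round-trip it before returning."""
    digest = hashlib.sha256(data).digest()
    # Fallback: try every codec, including the baseline, and keep the smallest
    # payload, so the payload is never larger than the baseline's.
    payloads = {tag: enc(data) for tag, (enc, _) in CODECS.items()}
    tag = min(payloads, key=lambda t: len(payloads[t]))
    blob = tag + digest + payloads[tag]
    # Byte-compare the decoded result against the original before it ships.
    if decode(blob) != data:
        raise RuntimeError("round-trip mismatch: not shipping this blob")
    return blob

original = b"2024-01-01T00:00:00Z GET /index.html 200\n" * 1000
blob = encode(original)
assert decode(blob) == original

# Corrupt the stored digest: the decoder must refuse rather than return data.
tampered = blob[:1] + bytes([blob[1] ^ 0xFF]) + blob[2:]
try:
    decode(tampered)
    refused = False
except ValueError:
    refused = True
print(refused)  # prints "True"
```

In this toy layout the 33-byte header sits on top of the chosen payload, so only the payload is guaranteed to be no larger than the baseline; how a real container accounts for its header is not covered by the sketch.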
Not just smaller files — a verified-lossless data engine
The same byte-exact container can be queried in place, streams live, keeps its own tamper-evident history, ships as one openable HTML file, and decodes from any language. Every capability ships today, and each one links to its docs.
Patent-pending tiers
Trust & integrity
Query in place
Works everywhere
Operate at scale
One decode core. Eleven engines, verified live.
There is exactly one piece of hard technology — a ~260-line, fuzz-hardened, zone-mapped block decoder. Every engine adapter is a thin layer over it, through native C, Apache Arrow, or Postgres/JDBC federation. We've run real SQL over AT-1 data on all eleven.
Adding an engine takes ~300 lines of glue, or zero for anything Arrow-native. The same .at1 file still reconstructs the original byte-for-byte through the non-querying decoder: querying is additive, never a re-encode. The full matrix, with connection strings and reproduce commands, is in the engines guide.