The trade-off everyone else makes

Small. Queryable. Byte-exact. Pick three.

Every other tool forces a choice. General compressors are exact but opaque. Columnar formats are queryable but they re-encode and discard your original. AT-1 is the only one that is all three at once — on every data type.

gzip / zstd / xz

Small and byte-exact — but opaque. To read one record you decompress the whole file. No query, no skipping.

Parquet / ORC

Queryable — but they re-encode your data. You can never recover the exact original file. A dealbreaker for audit, compliance and regulated science.

AT-1

Small, queryable in place, and byte-for-byte recoverable from the same file — with an embedded SHA-256 that the decoder checks on every read.

Why teams adopt it

Three things no compressor and no columnar format can do together

01 / SAVE

A recurring drop in your storage bill

Cold partitions shrink 2–37× depending on the data, byte-for-byte lossless. Encoding is a one-time cost of a fraction of a cent per GB; the saving repeats every month.

~$170 / TB / year saved at a conservative 2.5× on logs · payback < 1 day
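The arithmetic behind a figure like this is short: a compression ratio r removes 1 - 1/r of the stored bytes, so the saving is that fraction of the baseline bill. The sketch below is illustrative only. The baseline price of $0.023 per GB-month and the 1024 GB per TB convention are assumptions, not numbers published here; substitute your own storage rate.

```python
def annual_saving_per_tb(ratio, price_per_gb_month=0.023, gb_per_tb=1024):
    """Yearly storage saving for 1 TB of original data at a given compression ratio.

    price_per_gb_month and gb_per_tb are assumed defaults, not AT-1 figures.
    """
    baseline_per_year = price_per_gb_month * gb_per_tb * 12  # uncompressed bill
    fraction_removed = 1 - 1 / ratio                          # 2.5x removes 60%
    return baseline_per_year * fraction_removed


for ratio in (2.5, 5, 37):
    print(f"{ratio}x -> ${annual_saving_per_tb(ratio):.2f} / TB / year")
```

Under those assumed prices, a 2.5× ratio works out to roughly $170 per TB per year; higher ratios approach but never exceed the full baseline bill.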

02 / QUERY

Query the archive without rehydrating it

Per-block min/max zone maps let a predicate skip whole row-groups and decode only the columns it touches — over object storage, range-GET-ing just those byte ranges.

A selective time query reads 1/54 of the file (54× less I/O) and still returns exact rows
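To make the zone-map idea concrete, here is a minimal sketch of block skipping with per-block min/max values. It is not the AT-1 decoder: the `Block` class, `build_blocks` and `query_range` are hypothetical names, blocks are held in memory rather than fetched with range-GETs, and the data is synthetic. The block count is chosen only so that the example touches one block in 54; it does not reproduce the benchmark above.

```python
from dataclasses import dataclass


@dataclass
class Block:
    min_ts: int   # smallest timestamp in the block (the zone map)
    max_ts: int   # largest timestamp in the block
    rows: list    # (timestamp, payload) tuples; compressed bytes in a real format


def build_blocks(rows, block_size):
    """Split rows into fixed-size blocks and record each block's min/max timestamp."""
    blocks = []
    for start in range(0, len(rows), block_size):
        chunk = rows[start:start + block_size]
        stamps = [ts for ts, _ in chunk]
        blocks.append(Block(min(stamps), max(stamps), chunk))
    return blocks


def query_range(blocks, lo, hi):
    """Return rows with lo <= ts <= hi, plus how many blocks had to be decoded."""
    hits, decoded = [], 0
    for block in blocks:
        # The zone map proves no row can match, so the block is never read.
        if block.max_ts < lo or block.min_ts > hi:
            continue
        decoded += 1
        hits.extend(row for row in block.rows if lo <= row[0] <= hi)
    return hits, decoded


rows = [(ts, f"event-{ts}") for ts in range(5400)]
blocks = build_blocks(rows, block_size=100)
hits, decoded = query_range(blocks, lo=1200, hi=1250)
print(f"decoded {decoded} of {len(blocks)} blocks, {len(hits)} matching rows")
```

Skipping is only this effective when the filtered column is roughly sorted within the file, as timestamps in logs usually are; on shuffled data most blocks would overlap the predicate and have to be decoded.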

03 / TRUST

Prove it's the same bytes — anywhere

Every encode is decoded and byte-compared to the original before it ships. Each file embeds a SHA-256 digest that the decoder re-checks on read, refusing to return data on a mismatch. A non-inferiority fallback means AT-1 is never larger than plain xz/zstd.

0 crashes in 10,000+ fuzz iterations · correct codec on 11/11 real datasets
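The three guarantees can be sketched in a few lines. This is a toy container, not the AT-1 format: the one-byte tag plus 32-byte digest header is invented for illustration, and the two standard-library codecs are stand-ins (`lzma` plays the general-purpose fallback, `zlib` stands in for any competing codec). What it shows is the shape of the logic: keep the smallest candidate, verify the round trip before shipping, and refuse to decode on a digest mismatch.

```python
import hashlib
import lzma
import zlib

# Hypothetical one-byte codec tags for this toy container.
CODECS = {
    b"Z": (lambda data: zlib.compress(data, 9), zlib.decompress),
    b"X": (lzma.compress, lzma.decompress),
}


def decode(blob: bytes) -> bytes:
    """Decompress the payload and refuse to return it unless the SHA-256 matches."""
    tag, digest, payload = blob[:1], blob[1:33], blob[33:]
    if tag not in CODECS:
        raise ValueError("unknown codec tag")
    data = CODECS[tag][1](payload)
    if hashlib.sha256(data).digest() != digest:
        raise ValueError("SHA-256 mismatch: refusing to return data")
    return data


def encode(original: bytes) -> bytes:
    """Keep the smallest candidate payload, embed a digest, verify before returning."""
    candidates = {tag: compress(original) for tag, (compress, _) in CODECS.items()}
    # Fallback rule: the payload is never larger than the best general-purpose one.
    tag, payload = min(candidates.items(), key=lambda item: len(item[1]))
    blob = tag + hashlib.sha256(original).digest() + payload
    if decode(blob) != original:  # byte-compare the round trip before shipping
        raise RuntimeError("round-trip verification failed")
    return blob


data = b"2024-01-01T00:00:00Z INFO request served\n" * 500
blob = encode(data)
print(len(data), "->", len(blob), "bytes; round trip ok:", decode(blob) == data)
```

In this toy the 33-byte header is overhead on top of the chosen payload, so the size guarantee here applies to the payload only.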

The whole product

Not just smaller files — a verified-lossless data engine

The same byte-exact container queries in place, streams live, remembers its own tamper-evident history, ships as one openable HTML file, and decodes from any language. Every capability ships today — each one links to its docs.

Query it from the engine you already run

One decode core. Eleven engines, verified live.

There is exactly one piece of hard technology — a ~260-line, fuzz-hardened, zone-mapped block decoder. Every engine adapter is a thin layer over it, through native C, Apache Arrow, or Postgres/JDBC federation. We've run real SQL over AT-1 data on all eleven.

DuckDB · native C ext
SQLite · C vtable
PostgreSQL · C FDW
ClickHouse · Arrow
Spark · Arrow
Trino · FDW federation
Presto · FDW federation
Flink · JDBC federation
Polars · Arrow
pandas · Arrow
Dask · Arrow

Adding an engine is ~300 lines of glue, or zero for anything Arrow-native. The same .at1 file still reconstructs the original byte-for-byte through the non-querying decoder: querying is additive, never a re-encode. The full matrix, with connection strings and reproduce commands, is in the engines guide.