Features — AT-1

The trade-off everyone else makes

Small. Queryable. Byte-exact. Pick three.

Every other tool forces a choice. General compressors are exact but opaque. Columnar formats are queryable but they re-encode and discard your original. AT-1 is the only one that is all three at once — on every data type.

gzip / zstd / xz

Small and byte-exact — but opaque. To read one record you decompress the whole file. No query, no skipping.

Parquet / ORC

Queryable — but they re-encode your data. You can never recover the exact original file. A dealbreaker for audit, compliance and regulated science.

AT-1

Small, queryable in place, and byte-for-byte recoverable from the same file — with an embedded SHA-256 that the decoder checks on every read.

Why teams adopt it

Three things no compressor and no columnar format can do together

01 / SAVE

A storage bill that drops, recurring

Cold partitions shrink 2–37× depending on data, byte-for-byte lossless. Encode is a one-time fraction of a cent per GB; the saving repeats every month.

~$170 / TB / year saved at a conservative 2.5× on logs · payback < 1 day
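The ~$170 figure can be reproduced with simple arithmetic. The sketch below assumes a storage price of $0.023 per GB-month and 1 TB = 1,024 GB. Neither number is stated on this page, so treat both as illustrative assumptions and substitute your own rates.

```python
def annual_saving_per_tb(price_per_gb_month: float, ratio: float,
                         gb_per_tb: int = 1024) -> float:
    """Dollars saved per TB of original data per year at a given compression ratio."""
    before = price_per_gb_month * gb_per_tb * 12   # yearly cost, uncompressed
    after = before / ratio                         # yearly cost after shrinking by `ratio`
    return before - after

# Assumed price: $0.023 per GB-month (an assumption, not a figure from this page).
saving = annual_saving_per_tb(0.023, 2.5)
print(f"${saving:.0f} per TB per year")   # prints "$170 per TB per year"
```

The payback claim is not modelled here because the page gives no exact encode cost per GB.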

02 / QUERY

Query the archive without rehydrating it

Per-block min/max zone maps let a predicate skip whole row-groups and decode only the columns it touches — over object storage, range-GET-ing just those byte ranges.

A selective time query reads 1/54 of the file (54× less I/O), exact rows
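A toy version of zone-map pruning might look like the sketch below. It is not AT-1's implementation: the `Block` class, the function names and the 54-block layout are all hypothetical, and the block count is chosen only so the toy echoes the 1/54 figure above rather than reproducing a benchmark.

```python
from dataclasses import dataclass

@dataclass
class Block:
    ts_min: int   # smallest timestamp in the block (the zone map)
    ts_max: int   # largest timestamp in the block
    rows: list    # (timestamp, message) tuples; stands in for compressed bytes

def build_blocks(rows, block_size):
    """Split time-ordered rows into blocks and record min/max per block."""
    blocks = []
    for i in range(0, len(rows), block_size):
        chunk = rows[i:i + block_size]
        ts = [r[0] for r in chunk]
        blocks.append(Block(min(ts), max(ts), chunk))
    return blocks

def query_time_range(blocks, lo, hi):
    """Return rows with lo <= ts <= hi, decoding only blocks that may match."""
    hits, decoded = [], 0
    for b in blocks:
        if b.ts_max < lo or b.ts_min > hi:
            continue   # zone map proves no row can match, so skip the block
        decoded += 1
        hits.extend(r for r in b.rows if lo <= r[0] <= hi)
    return hits, decoded

rows = [(t, f"event {t}") for t in range(5400)]
blocks = build_blocks(rows, block_size=100)   # 54 blocks of 100 rows
hits, decoded = query_time_range(blocks, 1200, 1250)
print(len(hits), "rows from", decoded, "of", len(blocks), "blocks")
# prints "51 rows from 1 of 54 blocks"
```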

03 / TRUST

Prove it's the same bytes — anywhere

Every encode is byte-compared to the original before it ships. Each file embeds a SHA-256 the decoder re-checks and refuses on mismatch. A non-inferiority fallback means AT-1 is never larger than plain xz/zstd.

0 crashes in 10,000+ fuzz iterations · correct codec on 11/11 real datasets
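The verification gate and the non-inferiority fallback can be illustrated with standard-library codecs. This is a minimal sketch under stated assumptions: the one-byte tag plus 32-byte digest header is a made-up layout, `zlib` merely stands in for a domain codec, and nothing here reflects the real .at1 format.

```python
import hashlib
import lzma
import zlib

def decode(blob: bytes) -> bytes:
    """Decode the hypothetical container and refuse on a digest mismatch."""
    tag, digest, payload = blob[:1], blob[1:33], blob[33:]
    data = zlib.decompress(payload) if tag == b"C" else lzma.decompress(payload)
    if hashlib.sha256(data).digest() != digest:
        raise ValueError("SHA-256 mismatch; refusing to return data")
    return data

def encode(original: bytes) -> bytes:
    """Hypothetical container: 1 tag byte + 32-byte SHA-256 + payload."""
    baseline = lzma.compress(original)   # plain xz is the size baseline
    payload, tag = zlib.compress(original), b"C"
    # Non-inferiority fallback: keep the candidate only if it is strictly smaller.
    if len(payload) >= len(baseline):
        payload, tag = baseline, b"X"
    blob = tag + hashlib.sha256(original).digest() + payload
    # Verification gate: decode and byte-compare before the file "ships".
    if decode(blob) != original:
        raise ValueError("round-trip mismatch; refusing to ship")
    return blob

data = b"2024-01-01 INFO request ok\n" * 1000
blob = encode(data)
assert decode(blob) == data
```

In this sketch the payload is never larger than the xz baseline; the fixed 33-byte header is extra and is an artifact of the toy layout.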

The whole product

Not just smaller files — a verified-lossless data engine

The same byte-exact container queries in place, streams live, remembers its own tamper-evident history, ships as one openable HTML file, and decodes from any language. Every capability ships today — each one links to its docs.

Patent-pending tiers

Generative compression

For industrial time-series, discover the model behind a signal and store the model plus a tiny residual — byte-exact, provably never larger than xz, and queryable straight from the model. 1.3–1.5× smaller than xz on real sensor data.
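The idea of storing a model plus a residual can be sketched with a deliberately simple stand-in. The code below does not discover a model: it uses a fixed second-order linear predictor (each sample predicted as `2*x[n-1] - x[n-2]`), which is an assumption made for illustration, and falls back to plain xz whenever the residual stream is not smaller. The tag bytes and function names are hypothetical.

```python
import lzma
import math
import struct

def to_residuals(samples):
    """Residual of a fixed predictor: x[n] is predicted as 2*x[n-1] - x[n-2]."""
    res = []
    for n, x in enumerate(samples):
        p1 = samples[n - 1] if n >= 1 else 0
        p2 = samples[n - 2] if n >= 2 else 0
        res.append(x - (2 * p1 - p2))
    return res

def from_residuals(res):
    """Invert to_residuals exactly using integer arithmetic."""
    out = []
    for n, r in enumerate(res):
        p1 = out[n - 1] if n >= 1 else 0
        p2 = out[n - 2] if n >= 2 else 0
        out.append(r + 2 * p1 - p2)
    return out

def encode(samples):
    """int16 samples in, tagged blob out; never larger than plain xz plus 1 tag byte."""
    plain = lzma.compress(struct.pack(f"<{len(samples)}h", *samples))
    residuals = struct.pack(f"<{len(samples)}i", *to_residuals(samples))
    modeled = lzma.compress(residuals)
    return b"M" + modeled if len(modeled) < len(plain) else b"X" + plain

def decode(blob):
    body = lzma.decompress(blob[1:])
    if blob[:1] == b"M":
        return from_residuals(struct.unpack(f"<{len(body) // 4}i", body))
    return list(struct.unpack(f"<{len(body) // 2}h", body))

# Synthetic signal for the demo only; not real sensor data.
samples = [round(1000 * math.sin(n / 50)) + (n % 7) for n in range(4000)]
assert decode(encode(samples)) == samples
```

Because every step uses integer arithmetic, reconstruction is exact regardless of how well the predictor fits; a poor fit only means the fallback branch is taken.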

Generative tier

Condition monitoring

The residual the compressor already stores IS a bearing-fault diagnostic — detect and localize a developing defect and trend its severity, label-free, no extra sensors. Validated on real CWRU data: 78% detection, 0% false alarm.

Diagnostics

Addressable weights

Store model weights ~32–42% smaller than raw bf16 while keeping per-tensor random access and a SHA-256 check on every read, which whole-file gzip/xz structurally cannot offer. A safetensors drop-in.
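Per-tensor random access comes from compressing each tensor independently and keeping an index of where each one lives. The sketch below shows that general technique only; the layout (blobs, then a JSON index, then an 8-byte index length), the use of `zlib`, and the function names are assumptions, not the AT-1 or safetensors format.

```python
import hashlib
import json
import struct
import zlib

def pack_tensors(tensors: dict) -> bytes:
    """Hypothetical layout: compressed blobs, JSON index, 8-byte index length."""
    body, index = b"", {}
    for name, raw in tensors.items():
        comp = zlib.compress(raw)   # each tensor is compressed on its own
        index[name] = {"off": len(body), "len": len(comp),
                       "sha256": hashlib.sha256(raw).hexdigest()}
        body += comp
    idx = json.dumps(index).encode()
    return body + idx + struct.pack("<Q", len(idx))

def read_tensor(blob: bytes, name: str) -> bytes:
    """Decompress one tensor by name without touching the others."""
    (idx_len,) = struct.unpack("<Q", blob[-8:])
    entry = json.loads(blob[-8 - idx_len:-8])[name]
    raw = zlib.decompress(blob[entry["off"]:entry["off"] + entry["len"]])
    if hashlib.sha256(raw).hexdigest() != entry["sha256"]:
        raise ValueError(f"SHA-256 mismatch for {name}")
    return raw

tensors = {"layer0.weight": bytes(range(256)) * 16, "layer0.bias": b"\x00\x01" * 64}
packed = pack_tensors(tensors)
assert read_tensor(packed, "layer0.bias") == tensors["layer0.bias"]
```

A whole-file compressor has no such index, so reading one tensor means decompressing everything before it.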

Addressable weights

Trust & integrity

Verified-lossless, never larger

Every encode is byte-compared to the original and ships an embedded SHA-256 the decoder re-checks. A non-inferiority fallback means AT-1 is never larger than plain xz/zstd.

How it works

Appendable tables + time-travel

Append-only immutable segments behind a tamper-evident hash-chained log. Reconstruct and query the table as it was at any past moment — provably un-edited.
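A hash-chained, append-only log with an as-of query can be sketched in a few lines. The entry fields, the JSON serialization and the function names below are hypothetical and chosen for brevity; they are not the AT-1 ledger format.

```python
import hashlib
import json

GENESIS = "0" * 64   # placeholder "previous hash" for the first entry

def _digest(ts, rows, prev):
    body = json.dumps({"ts": ts, "rows": rows, "prev": prev}, sort_keys=True)
    return hashlib.sha256(body.encode()).hexdigest()

def append_segment(log: list, ts: int, rows: list) -> None:
    """Append a segment whose hash also covers the previous entry's hash."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"ts": ts, "rows": rows, "prev": prev,
                "hash": _digest(ts, rows, prev)})

def verify(log: list) -> bool:
    """Recompute every hash; any edit to an earlier segment breaks the chain."""
    prev = GENESIS
    for e in log:
        if e["prev"] != prev or _digest(e["ts"], e["rows"], prev) != e["hash"]:
            return False
        prev = e["hash"]
    return True

def table_as_of(log: list, ts: int) -> list:
    """Rows of every segment appended at or before `ts`."""
    return [row for e in log if e["ts"] <= ts for row in e["rows"]]

log = []
append_segment(log, 100, [["a", 1]])
append_segment(log, 200, [["b", 2]])
append_segment(log, 300, [["c", 3]])
print(table_as_of(log, 200))   # prints [['a', 1], ['b', 2]]
print(verify(log))             # prints True
```

Editing any stored row afterwards makes `verify` return `False`, which is what makes the history tamper-evident rather than tamper-proof.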

Tables & time-travel

Operational tools

at1-doctor scans for savings, at1-watch auto-tiers cold files through the full gated pipeline, and at1-attest produces signed attestations — every action lands in a hash-chained ledger.

Operational tools

Query in place

Queryable in place

SQL predicate and projection pushdown use per-block zone maps to skip whole row-groups: selective queries read under 1% of a file, while the same bytes still reconstruct exactly.

Query & SQL

Query over the wire

Point a reader at a URL: it fetches a few KB of footer, then range-GETs only the blocks a predicate can't exclude from a cold .at1 in S3/HTTPS — reading under 1%, no server.
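The footer-then-blocks access pattern can be simulated without a network. In the sketch below, `RangeSource` stands in for a server that honours HTTP Range requests, and the file layout (compressed blocks, a JSON footer, a 4-byte footer length) is a hypothetical simplification, not the .at1 layout.

```python
import json
import struct
import zlib

class RangeSource:
    """Stands in for a remote object read via byte-range requests."""
    def __init__(self, blob: bytes):
        self.blob, self.bytes_read = blob, 0
    def size(self) -> int:
        return len(self.blob)
    def get(self, start: int, length: int) -> bytes:
        self.bytes_read += length   # track how much crossed "the wire"
        return self.blob[start:start + length]

def write_file(values, block_size):
    """Compressed blocks, then a footer of offsets and min/max, then footer length."""
    body, footer = b"", []
    for i in range(0, len(values), block_size):
        chunk = values[i:i + block_size]
        comp = zlib.compress(json.dumps(chunk).encode())
        footer.append({"off": len(body), "len": len(comp),
                       "min": min(chunk), "max": max(chunk)})
        body += comp
    f = json.dumps(footer).encode()
    return body + f + struct.pack("<I", len(f))

def remote_query(src, lo, hi):
    """Fetch the footer first, then only the blocks the predicate cannot exclude."""
    (flen,) = struct.unpack("<I", src.get(src.size() - 4, 4))
    footer = json.loads(src.get(src.size() - 4 - flen, flen))
    out = []
    for b in footer:
        if b["max"] < lo or b["min"] > hi:
            continue   # excluded by the footer; no request is made for this block
        chunk = json.loads(zlib.decompress(src.get(b["off"], b["len"])))
        out.extend(v for v in chunk if lo <= v <= hi)
    return out

blob = write_file(list(range(100_000)), block_size=1000)
src = RangeSource(blob)
result = remote_query(src, 42_000, 42_010)
print(len(result), "values;", src.bytes_read, "of", len(blob), "bytes fetched")
```

Against a real object store, each `get` call would become a request carrying a `Range: bytes=start-end` header; the fraction fetched depends on block size and data, so the sketch makes no claim about percentages.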

Remote query

Living Database™

Emit a single self-contained, searchable .html that IS the database; it opens on any phone, offline. Bundles also support full-text search inside PDFs and Word docs without unpacking.

Living Database

Works everywhere

Works with your engine

DuckDB, SQLite, Postgres, Spark, Trino, Presto, Flink, ClickHouse, Polars, pandas and Dask all query .at1 live — over one ~260-line decode core, via native C, Arrow, or federation.

Query from your engine

SDKs & bindings

Decode anywhere: Python, C ABI, WASM, Go, Rust and Node bindings over the same portable core. Apache-2.0 — anyone can open an AT-1 file, forever.

SDK & bindings

MCP server for AI agents

Let Claude, Cursor or VS Code compress and query .at1 directly through the Model Context Protocol — agents read your archives without rehydrating them.

MCP server

Desktop app

Drag-drop compress and query on Windows, macOS and Linux, with auto-update — the same verified-lossless core, no terminal required.

Download

Domain codecs

Structure-aware encoders for logs, CSV/tabular, JSON/NDJSON, genomics (VCF), DICOM medical imaging, OpenStreetMap geo and ML embeddings — auto-selected per file.

See the benchmarks

Operate at scale

Streaming & live ingest

Bounded-memory streaming compress, and query a stream while it lands: each batch is sealed through the verification gate so readers see only whole, byte-exact rows.

Streaming

Managed cloud

An S3-compatible service: write ordinary CSV/JSON, it lands compressed + verified, reads come back byte-exact, and a SQL REST endpoint queries it in place — drop-in for an S3 prefix.

Managed cloud

Lakehouse cold-tier

Store Iceberg and Delta Lake data files as AT-1 — 2–7× smaller at rest — while Spark, Trino, DuckDB and pyiceberg read them back value-identical with pushdown. The catalog never changes.

Lakehouse & engines

Auto-tier your database

Point the agent at S3 or a Postgres/MySQL table: cold objects and append-only history move to verified, queryable .at1 on a schedule — deleted from the source only after a byte-exact check.

Tiering agents

Runs where you run

A Kubernetes operator + Helm chart schedule the tiering agents declaratively, and the same ~84 KB decode core runs in the browser via WASM — your storage shrinks across every environment.

Operational tools

Query it from the engine you already run

One decode core. Eleven engines, verified live.

There is exactly one piece of hard technology — a ~260-line, fuzz-hardened, zone-mapped block decoder. Every engine adapter is a thin layer over it, through native C, Apache Arrow, or Postgres/JDBC federation. We've run real SQL over AT-1 data on all eleven.

DuckDB · native C ext
SQLite · C vtable
PostgreSQL · C FDW
ClickHouse · Arrow
Spark · Arrow
Trino · FDW federation
Presto · FDW federation
Flink · JDBC federation
Polars · Arrow
pandas · Arrow
Dask · Arrow

Adding an engine is ~300 lines of glue, or zero for anything Arrow-native. The same .at1 still reconstructs the original byte-for-byte through the non-querying decoder — querying is additive, never a re-encode. Full matrix, connection strings + reproduce commands: the engines guide.