Derived-column eliminator

Stop storing columns you can compute.

Analytic and telemetry tables are full of columns that are exact functions of their neighbours: total = a + b, running sums, offsets, rolled-up counters. AT-1 discovers the relation automatically, stores a tiny rule plus an all-zero residual instead of the whole column, and asserts a byte-exact rebuild. A per-column compressor cannot see across columns by construction, so this is a saving the standard path leaves on the table.

8.27% smaller whole-file vs plain qcolumnar (real NetFlow)
3 exact derived columns auto-discovered & eliminated
≈1 byte each, where a column is an exact function of others
byte-exact round-trip, never-worse fallback

A real example: NetFlow counters

On a real CTU-13 NetFlow capture, AT-1 found three byte-columns that are exact additive functions of their siblings: total packets, total bytes and total application bytes, each the sum of a source and a destination counter. It stored each as a relation tag plus an all-zero residual, collapsing roughly a third of the numeric columns to almost nothing. The result: 10,771,454 B → 9,880,331 B, an 8.27% whole-file saving on top of qcolumnar, byte-for-byte lossless.

# discover exact cross-column relations and store the rule, not the column
at1 derived compress flows.csv flows.at1fd     # 8.27% smaller, byte-exact
# AT-1 finds e.g.  TotPkts = SrcPkts + DstPkts  and stores a tag + an all-zero residual

at1 derived decompress flows.at1fd flows.csv   # every original byte rebuilt
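
The discovery step can be illustrated with a minimal Python sketch. It is not AT-1's implementation: the function name `find_additive_relations` and the brute-force search over column pairs are assumptions made for illustration, and only the column names `TotPkts`, `SrcPkts` and `DstPkts` come from the example above. A relation is accepted only when the residual is zero on every row.

```python
import csv
import io
from itertools import combinations

def find_additive_relations(header, rows):
    """Return (target, a, b) name triples where target == a + b on every row.

    Hypothetical helper for illustration; not part of the at1 CLI.
    """
    # Keep only columns in which every cell parses as an integer.
    cols = {}
    for i, name in enumerate(header):
        try:
            cols[name] = [int(r[i]) for r in rows]
        except ValueError:
            continue

    relations = []
    for target, values in cols.items():
        others = [n for n in cols if n != target]
        for a, b in combinations(others, 2):
            # Residual = what is left after applying the candidate rule.
            residual = [t - x - y for t, x, y in zip(values, cols[a], cols[b])]
            if not any(residual):  # all-zero residual: the rule is exact
                relations.append((target, a, b))
    return relations

SAMPLE = """SrcPkts,DstPkts,TotPkts,Proto
3,4,7,tcp
10,0,10,udp
1,1,2,tcp
"""

reader = csv.reader(io.StringIO(SAMPLE))
header = next(reader)
rows = list(reader)
relations = find_additive_relations(header, rows)
print(relations)  # [('TotPkts', 'SrcPkts', 'DstPkts')]
```

The search here is quadratic in the number of numeric columns per target, which is fine for a sketch; a production tool would presumably prune candidates before checking every row.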

Byte-exact, verified, never-worse

A column is only eliminated when the relation rebuilds it exactly: the residual is verified zero and the whole-file round-trip is asserted byte-exact. If a candidate relation isn't byte-safe for a file (quoting, formatting), AT-1 falls back to plain columnar, so you are never worse off. The win is real where derived/rolled-up counters are pervasive, such as network, observability and analytic tables; a table with no cross-column relations simply sees no change.
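
The verify-then-fall-back behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not AT-1's container format: `encode`, `decode`, the dictionary layout and the `"plain"` fallback (which here simply keeps the input text rather than invoking a columnar codec) are all hypothetical. The point is that the derived form is kept only if decoding it reproduces the input exactly.

```python
import csv
import io

def write_csv(rows):
    # Serialise with "\n" line endings and the csv module's minimal quoting.
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()

def decode(blob):
    """Rebuild the original text from either stored form."""
    if blob["mode"] == "plain":
        return blob["text"]
    target, pos, a, b = blob["rule"]
    header, *body = blob["rows"]
    ia, ib = header.index(a), header.index(b)
    out = [header[:pos] + [target] + header[pos:]]
    for r in body:
        total = str(int(r[ia]) + int(r[ib]))  # recompute the dropped cell
        out.append(r[:pos] + [total] + r[pos:])
    return write_csv(out)

def encode(text, target, a, b):
    """Drop `target` only if target = a + b rebuilds the input exactly."""
    try:
        rows = list(csv.reader(io.StringIO(text)))
        pos = rows[0].index(target)
        reduced = [r[:pos] + r[pos + 1:] for r in rows]
        candidate = {"mode": "derived", "rule": (target, pos, a, b), "rows": reduced}
        if decode(candidate) == text:  # whole-file round-trip check
            return candidate
    except (ValueError, IndexError):
        pass  # missing column, non-integer cell, or ragged row
    return {"mode": "plain", "text": text}  # fallback: keep the input unchanged

clean = "SrcPkts,DstPkts,TotPkts\n3,4,7\n10,0,10\n"
padded = "SrcPkts,DstPkts,TotPkts\n3,4,007\n"  # sum holds, formatting does not

good = encode(clean, "TotPkts", "SrcPkts", "DstPkts")
bad = encode(padded, "TotPkts", "SrcPkts", "DstPkts")
print(good["mode"], bad["mode"])
```

In the `padded` input the arithmetic relation is true but the zero-padded cell cannot be regenerated from the sum alone, so the sketch takes the fallback path; quoted cells or `\r\n` line endings would be rejected the same way.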

Built for

NetFlow & observability tables · lakehouse / Fabric analytic partitions · financial and IoT exports with totals and rolling aggregates — anywhere a schema carries columns derived from other columns.