Query .at1 from your engine

AT-1's queryable container decodes through a single engine-agnostic core (at1_block.c) — directly in C, or through Apache Arrow. So the engine you already run queries an .at1 archive in place, no rip-and-replace, with the same byte-exact original recoverable underneath by a non-querying decoder.

EngineMethodHow you reach it
DuckDBNative C extensionread_at1('a.at1') table function; projection + filter pushdown
SQLiteNative C virtual tableCREATE VIRTUAL TABLE … USING at1('a.at1', …)
PostgreSQLNative C foreign data wrapperCREATE FOREIGN TABLE … SERVER at1 OPTIONS (filename '…')
Polars / pandas / DaskArrow export (at1_arrow)to_polars / to_pandas / to_arrow — zero new code
ClickHouse / SparkArrow IPC exportwrite_ipc → Arrow stream the engine ingests
Trino / Presto / FlinkPostgres/JDBC bridgeengine's Postgres connector → at1_fdw → at1_block.c

Predicate and zone-map pushdown rides through every adapter. Each query consults the per-block min/max zone maps in the footer, skips the row-groups it can rule out, and decompresses only the projected columns of the survivors — so a selective query still reads only the touched blocks, no matter which engine drove it.


DuckDB — native C extension

The extension (duckdb_at1/) exposes a read_at1('a.at1') table function that reads an .at1 as a table, with projection + filter pushdown (zone-map row-group skipping) filling DuckDB vectors per block.

LOAD 'at1';
SELECT symbol, count(*), avg(price)
FROM read_at1('trades-2026-01.at1')
WHERE ts BETWEEN 1704067200000 AND 1704153600000   -- zone-map skips non-matching row-groups
GROUP BY symbol;

The C core (at1_block.c) is verified byte-exact against the Python reference on every cell, and fuzzed with 0 crashes. A Python adapter (at1_duckdb.attach_at1(con, "t", "a.at1")) works today with no build.

SQLite — native virtual table

A loadable extension (sqlite_at1/) — a thin adapter over the same decode core. Create a virtual table over the archive and query it with ordinary SQLite SQL:

.load ./at1_vtab sqlite3_at1_init
CREATE VIRTUAL TABLE trades USING at1('ticks.at1', 'agg_id','price','qty','ts_ms', ...);
SELECT count(*), avg(price) FROM trades WHERE agg_id BETWEEN 100 AND 200;  -- zone-map skips groups

xBestIndex/xFilter use the int zone maps to keep only the row-groups that can satisfy the predicate. Column types map int → INTEGER, decimal → REAL, text → TEXT. Built and verified live against the stock sqlite3 CLI.

PostgreSQL — foreign data wrapper

at1_fdw (postgres_at1/) exposes an .at1 archive as a Postgres foreign table:

CREATE EXTENSION at1_fdw;
CREATE SERVER at1 FOREIGN DATA WRAPPER at1_fdw;
CREATE FOREIGN TABLE trades (
    agg_id bigint, price float8, qty float8,
    first_id bigint, last_id bigint, ts_ms bigint,
    is_buyer_maker text, is_best_match text
) SERVER at1 OPTIONS (filename '/data/ticks.at1');

SELECT count(*), avg(price) FROM trades WHERE agg_id BETWEEN 100 AND 200;  -- zone-map skips groups

BeginForeignScan extracts int Var <op> Const quals and uses the zone-map footer to compute the surviving row-groups; Postgres re-checks the clauses, so results are always correct. Verified end-to-end against real PostgreSQL 16 (EXPLAIN shows a pushed-down Foreign Scan), ~350 lines of glue, zero new decode logic.

Arrow / Feather export — Polars, pandas, Dask

at1_arrow decodes an archive to an Apache Arrow table — the hub every other binding converts from. Projection and filter pushdown ride through columns and where:

import at1_arrow
df  = at1_arrow.to_polars("trades.at1", columns=["agg_id","price"], where={"agg_id": (lo, hi)})
df2 = at1_arrow.to_pandas("trades.at1")             # -> pandas (Dask: dd.from_pandas(df2))
at1_arrow.write_ipc("trades.at1", "trades.arrow")   # universal Arrow IPC handoff to any engine

to_polars / to_pandas / to_arrow are zero-copy where dtypes allow; write_ipc / write_feather export an Arrow IPC stream or Feather file any Arrow-native tool ingests. Dask wraps the pandas frame with dd.from_pandas(...).

The Arrow / Postgres-FDW bridge — ClickHouse, Spark, Trino, Presto, Flink

Engines without a native AT-1 plugin reach the same core two ways. ClickHouse and Apache Spark ingest the at1_arrow.write_ipc export (ClickHouse Arrow format, spark.read.format('arrow')). Trino, Presto, and Flink all have first-class PostgreSQL / JDBC connectors, so they query the FDW foreign table — and thus the .at1 — with zero custom plugin code:

Trino / Presto / Flink / Spark ──(Postgres or JDBC connector)──▶ at1-postgres ──at1_fdw──▶ at1_block.c ──▶ trades.at1

Verified live: Trino, Presto, and Flink each return results identical to the AT-1 reference over the federation in connectors/federation/ (Spark reaches the same FDW via JDBC). The Postgres hop is the proven path today; a native per-engine Arrow-batch connector (one split per row-group) is the max-throughput path. Either way, the predicate/zone-map pushdown rides through.


One format, one decode core (at1_block.c), thin glue per engine — the same byte-exact archive queried from whatever a customer already runs. Full matrix + reproduce commands: docs/ADAPTERS.md in the repo.