Query over the wire

The AT-1 container is header-first: the small footer streams (meta, index/zone maps, bloom) come before the large blocks payload, and the blocks are stored uncompressed so every block's (offset, length) is computable from the index. A reader holding only a URL fetches a few KB of footer, learns the layout, then range-GETs only the block byte ranges a query touches. This is the same predicate/projection pushdown the local engine does, carried over the wire. No gateway, no decode server, no infrastructure: a cold .at1 in object storage can be queried while reading well under 1% of it.
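
The block-selection step described above can be sketched as zone-map pruning. This is a minimal illustration, not the AT-1 reader itself: the BlockEntry layout, the touched_blocks helper, and the inclusive bounds are all assumptions made for the example.

```python
from typing import NamedTuple

class BlockEntry(NamedTuple):
    """Hypothetical index entry: one block's zone map plus its byte extent."""
    col: int      # column the block belongs to
    lo: int       # minimum value stored in the block
    hi: int       # maximum value stored in the block
    offset: int   # byte offset of the block inside the blocks payload
    length: int   # byte length of the block

def touched_blocks(index, col, lo=None, hi=None):
    """Keep the blocks of `col` whose [lo, hi] zone map overlaps the predicate.

    None means an open bound; bounds are treated as inclusive (an assumption).
    """
    out = []
    for e in index:
        if e.col != col:
            continue
        if lo is not None and e.hi < lo:   # block lies entirely below the range
            continue
        if hi is not None and e.lo > hi:   # block lies entirely above the range
            continue
        out.append(e)
    return out

index = [
    BlockEntry(0, 0, 9_999, 0, 4096),
    BlockEntry(0, 10_000, 19_999, 4096, 4096),
    BlockEntry(0, 20_000, 29_999, 8192, 4096),
    BlockEntry(0, 30_000, 39_999, 12288, 4096),
]
hits = touched_blocks(index, col=0, lo=29_500)
byte_ranges = [(e.offset, e.length) for e in hits]   # what would be range-GET
```

Only the surviving (offset, length) pairs need to be fetched; the pruned blocks are never requested.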

Just a URL — RemoteAT1

Point it at any HTTP/S3 object that honors RFC 7233 Range requests (S3, GCS, MinIO, nginx all do). It fetches a 64 KB header chunk (growing on demand), parses every small stream, records where the blocks payload starts without downloading it, then range-GETs only the touched blocks per query.

from at1remote import RemoteAT1
r = RemoteAT1("https://bucket.example.com/trades.at1")          # presigned URL or bearer header
rows, stats = r.query({0: (29500, None)}, select=[0, 2])         # range pushdown + projection
print(stats["http_bytes_fetched"], "B over", stats["http_requests"],
      "range-GETs, of", stats["file_bytes"])
  • Predicates & projection: same forms as the local engine — {col: (lo, hi)} (None = open bound) or {col: ('=', value)}.
  • Access: a signed URL needs no extra auth; for a bearer token pass headers={"Authorization": ...}. SigV4-signed buckets are reached with a presigned URL.
  • Observable: stats reports http_bytes_fetched, http_requests, and file_bytes.
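
For readers unfamiliar with HTTP Range, a single range-GET with optional bearer auth might look like the sketch below. It uses only the Python standard library; range_request and fetch_range are hypothetical helper names for illustration and are not part of at1remote.

```python
import urllib.request

def range_request(url, offset, length, headers=None):
    """Build a GET request for bytes [offset, offset + length) of `url`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    req = urllib.request.Request(url, headers=dict(headers or {}))
    # HTTP byte ranges are inclusive on both ends, hence the -1.
    req.add_header("Range", f"bytes={offset}-{offset + length - 1}")
    return req

def fetch_range(url, offset, length, headers=None):
    """Perform the range-GET and return the body, insisting on 206 Partial Content."""
    with urllib.request.urlopen(range_request(url, offset, length, headers)) as resp:
        if resp.status != 206:
            raise OSError(f"server did not honor Range (status {resp.status})")
        return resp.read()

# Building the request does not touch the network.
req = range_request(
    "https://bucket.example.com/trades.at1", 0, 64 * 1024,
    headers={"Authorization": "Bearer TOKEN"},   # placeholder token
)
```

A server that ignores Range answers 200 with the whole object, so checking for 206 guards against silently downloading the full file.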

Only queryable codecs (qcolumnar, qjson) are remote-queryable; the reader rejects anything else. Block ranges from the (untrusted) footer index are clamped to the declared blocks region, so a malformed file can't read out of bounds.
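
The clamping rule can be sketched as follows. This is an illustration under assumptions: clamp_range is a hypothetical helper, offsets are taken to be absolute file offsets, and whether the real reader truncates or rejects an out-of-region range is not stated here.

```python
def clamp_range(offset, length, region_start, region_end):
    """Clamp [offset, offset + length) to the blocks region [region_start, region_end).

    Returns (start, length); length is 0 when nothing valid remains.
    """
    start = min(max(offset, region_start), region_end)
    end = min(max(offset + length, start), region_end)
    return start, end - start

inside = clamp_range(100, 50, 64, 1000)      # fully inside: unchanged
overrun = clamp_range(990, 50, 64, 1000)     # runs past the end: truncated
outside = clamp_range(5000, 10, 64, 1000)    # entirely outside: empty
```

With this rule, no index entry can cause a read outside the declared region, whatever the footer claims.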

The mechanism, proven locally — RangeReader

RangeReader is the same idea where a seek on a file stands in for a range-GET on an object: it reads the directory + small streams, then per query reads only the byte ranges of the touched column blocks. The reported file_bytes_read (header + touched blocks) is what an object store would fetch over HTTP Range.

from at1_rangereader import RangeReader
r = RangeReader("trades.at1")
# lo, hi: caller-supplied bounds of the ts window
rows, stats = r.scan(where={"ts": (lo, hi)}, select=["aggId", "ts"])
# stats['file_bytes_read'] << file size   (header + only the touched block ranges)

A selective query typically reads a small single-digit percentage of the file, and a tight one far less: one measured run fetched ~120 KB of a 48 MB archive (about 0.25%) for a tight time window, and the rest never left object storage.
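
The "seek stands in for a range-GET" idea can be sketched with a small byte-counting wrapper. CountingFile is a hypothetical class for illustration and does not parse the AT-1 format; the file below is dummy data, not a real archive.

```python
import os
import tempfile

class CountingFile:
    """Read byte ranges from a local file and count how many bytes were read."""

    def __init__(self, path):
        self._f = open(path, "rb")
        self.bytes_read = 0

    def read_range(self, offset, length):
        # seek + read is the local equivalent of one HTTP range-GET
        self._f.seek(offset)
        data = self._f.read(length)
        self.bytes_read += len(data)
        return data

    def close(self):
        self._f.close()

# Demo on a 1 MiB dummy file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(range(256)) * 4096)
    path = tmp.name

f = CountingFile(path)
header = f.read_range(0, 1024)           # stands in for the small header streams
block = f.read_range(512 * 1024, 4096)   # stands in for one touched block
f.close()
fraction = f.bytes_read / os.path.getsize(path)
os.remove(path)
```

The counter plays the role of file_bytes_read: it is the byte total an object store would have to serve for the same access pattern.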

Many daily files — catalog / partition skipping

Real deployments partition into many .at1 files (by day, by shard). AT1Catalog records each file's global min/max per numeric column (aggregated from that file's footer zone maps), so a query skips whole files whose bounds can't satisfy the predicate before opening them, and issues range reads only against the survivors.

from at1_catalog import AT1Catalog
cat = AT1Catalog("data_dir")              # many daily .at1 files (or a list of paths)
rows, stats = cat.scan(where={"ts": (lo, hi)}, select=["aggId", "ts"])
# stats: files_total / files_skipped / files_scanned / file_bytes_read

A time query over a year of daily files can touch a handful of files and read ~1% of the dataset: files_skipped vs files_scanned in stats shows exactly how many were ruled out unopened.
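
File-level skipping can be sketched as below. The catalog dict, the file names, and plan_files are assumptions for the example and do not reflect AT1Catalog's internals; only the stats key names follow the ones listed above.

```python
def plan_files(catalog, col, lo=None, hi=None):
    """Split files into survivors and skipped using per-file (min, max) bounds.

    `catalog` maps path -> {column: (min, max)}; None is an open bound and
    bounds are treated as inclusive (an assumption).
    """
    survivors, skipped = [], 0
    for path, bounds in catalog.items():
        fmin, fmax = bounds[col]
        if (lo is not None and fmax < lo) or (hi is not None and fmin > hi):
            skipped += 1            # ruled out without opening the file
        else:
            survivors.append(path)
    stats = {
        "files_total": len(catalog),
        "files_skipped": skipped,
        "files_scanned": len(survivors),
    }
    return survivors, stats

DAY = 86_400
catalog = {
    "2024-01-01.at1": {"ts": (0 * DAY, 1 * DAY - 1)},
    "2024-01-02.at1": {"ts": (1 * DAY, 2 * DAY - 1)},
    "2024-01-03.at1": {"ts": (2 * DAY, 3 * DAY - 1)},
}
survivors, stats = plan_files(catalog, "ts", lo=90_000, hi=100_000)
```

Here the window falls inside the second day, so two of the three files are excluded before any I/O happens.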

The S3 gateway — serve ?select & meter bytes

When you want a server (so engines configured for s3://, such as Spark, Trino, DuckDB httpfs, and Snowflake external stages, can reach it), s3_gateway.py puts a read-only S3 API in front of the cold tier. It implements GetObject (with HTTP Range, including the suffix form Parquet engines use for footer reads), HeadObject, ListObjectsV2, and an S3-Select-style ?select that pushes predicates straight onto the .at1: only the touched column blocks are decompressed, and the full Parquet is never materialized for a select.

python s3_gateway.py serve ROOT --bucket at1 \
    --access-key AK --secret-key SK --port 9000      # SigV4 auth; loopback-only without keys

# S3-Select-style pushdown straight off the .at1 (no full file materialized):
POST /at1/trades.parquet?select
  {"where": {"agg_id": ["=", 100]}, "select": ["agg_id", "price"], "limit": 1000}
# response carries stats.bytes_read vs total block bytes -- the saving is observable
  • Auth: header-based AWS Signature Version 4, verified against a configured access-key/secret pair — the same scheme real S3 clients sign with, so DuckDB/Spark/Trino need only an endpoint override and path-style URLs. With no credentials it refuses to bind a non-loopback host (fail closed).
  • Metering: each read is metered on the I/O / egress billing axis, and the gateway tracks bytes_served, select_bytes_read, and pushdown_bytes_saved (served at /_at1/stats with a live ticker at /_at1/ticker).
  • Honest scope: read-only cold tier. PUT/DELETE return 405, addressing is path-style only, and presigned query-string auth is not supported yet.
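
The Range forms a GetObject handler has to resolve, including the suffix form, can be sketched like this. resolve_range is a hypothetical helper, not code from s3_gateway.py, and it handles a single byte range only.

```python
def resolve_range(header, size):
    """Resolve a single-range 'bytes=' header against an object of `size` bytes.

    Returns (start, end_inclusive), or None when the range is unsatisfiable
    (a server would normally answer 416 in that case).
    """
    unit, _, spec = header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        raise ValueError("only a single bytes range is handled in this sketch")
    first, sep, last = spec.strip().partition("-")
    if not sep:
        raise ValueError("malformed range")
    if first == "":
        # Suffix form 'bytes=-N': the last N bytes of the object.
        n = int(last)
        if n == 0 or size == 0:
            return None
        return max(size - n, 0), size - 1
    start = int(first)
    if start >= size:
        return None
    end = int(last) if last else size - 1    # 'bytes=N-' runs to the end
    if end < start:
        raise ValueError("malformed range")  # servers typically ignore such a header
    return start, min(end, size - 1)

tail = resolve_range("bytes=-8", 1000)   # e.g. the 8-byte tail a Parquet reader asks for first
```

A Parquet file ends with a 4-byte footer length followed by the PAR1 magic, which is why readers typically start with a short suffix request before fetching the footer itself.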

Same byte-exact archive underneath: a select returns values while GetObject still serves the exact original bytes. See docs/ICEBERG.md for the cold-tier gateway design.