Operational tools
Three command-line tools wrap the AT-1 pipeline for the jobs that come before and around production use: sizing a migration, auto-tiering cold data, and attesting archives for compliance. Each runs the same verified, byte-exact pipeline as at1 compress. They never estimate from a lookup table, and they show the files where AT-1 only ties xz, not just the wins.
at1-doctor — measure what AT-1 would save
at1-doctor scan /data [--sample-mb 8] [--max-files 200]
    [--rate-storage 0.023] [--report savings.html]
Scans a directory and sample-compresses every file through the real verified pipeline (auto codec selection + the byte-exact gate), then emits a self-contained HTML savings report. Ratios are measured on the first --sample-mb of each file; file-level totals are extrapolated from that measured ratio; the $/GB-month storage rate you pass is the one stated assumption. Already-compressed formats (gz, zst, xz, png, mp4, pdf, …) are skipped.
file                    size            codec      sample ratio  note
events/2026-05.ndjson   1,204,338,112   qjson      11.34x        measured
trades/ticks.csv        880,201,003     qcolumnar  6.10x         measured
assets/logo.bin         4,096,000       -          -             ties xz -- no structural win
scanned 3 file(s), 2.089 GB in 7s (sample=8 MB/file -- ratios measured, totals extrapolated)
projected after AT-1: 0.241 GB -> ~$510/yr storage at $0.023/GB-mo (assumption)
When to use: point it at a prospect's or a production data directory to size a migration and produce a defensible savings number, with the provenance of every figure labelled and the no-win files shown rather than hidden. Add --report savings.html for a shareable artifact.
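The measure-then-extrapolate arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not at1-doctor's implementation: lzma (xz) stands in for AT-1's codecs, and the function names sample_ratio and project are hypothetical. It assumes decimal gigabytes (1 GB = 1e9 bytes) and a flat $/GB-month rate.

```python
import lzma
import os


def sample_ratio(path, sample_mb=8):
    """Compress the first sample_mb MB of a file; return original/compressed."""
    with open(path, "rb") as f:
        sample = f.read(sample_mb * 1024 * 1024)
    if not sample:
        return 1.0  # empty file: nothing to gain
    # Assumption: lzma is only a stand-in for the real codec selection.
    compressed = lzma.compress(sample)
    return len(sample) / len(compressed)


def project(path, rate_per_gb_month=0.023, sample_mb=8):
    """Extrapolate a whole-file size and yearly storage cost from the sample."""
    size = os.path.getsize(path)
    ratio = sample_ratio(path, sample_mb)      # measured on the sample
    projected_bytes = size / ratio             # extrapolated, not measured
    yearly_cost = projected_bytes / 1e9 * rate_per_gb_month * 12
    return {
        "size": size,
        "ratio": ratio,
        "projected_bytes": projected_bytes,
        "yearly_cost": yearly_cost,
    }
```

Only the ratio is a measurement here; the projected size and the cost inherit both the sampling assumption (the first megabytes are representative of the whole file) and the storage rate you supply.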
at1-watch — set-and-forget auto-tiering daemon
at1-watch DIR --older-than 7d [--interval 60] [--delete-original]
    [--include "*.csv,*.log,*.ndjson"] [--once] [--dry-run]
    [--verify-ledger]
Watches a directory and tiers files that have gone cold. The policy is conservative by default because it touches customer data:
- A file is tiered only when it is older than --older-than (mtime) and its size has been stable across two consecutive scans, so nothing mid-write gets tiered.
- Tiering runs the full gated pipeline (auto codec, query-optimized, verification gate, SHA-256 trailer) to <name>.at1 next to the file.
- The original is kept unless --delete-original is set, and even then it is deleted only after an independent decompress-and-compare against the recorded SHA-256.
- Every action lands in a hash-chained ledger (DIR/.at1_watch_ledger.jsonl); the timestamp is hashed into each link, so tampering is detectable with --verify-ledger.
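To make the hash-chained ledger concrete, here is a minimal Python sketch of the general technique. The field names (prev, ts, action, path, hash), the all-zero genesis value, and the JSON canonicalization are assumptions for illustration; the actual ledger schema is not specified here.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first link


def link_hash(prev_hash, timestamp, action, path):
    """Hash the previous link together with this entry, timestamp included."""
    payload = json.dumps(
        {"prev": prev_hash, "ts": timestamp, "action": action, "path": path},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(ledger, timestamp, action, path):
    """Append one entry whose hash commits to everything before it."""
    prev = ledger[-1]["hash"] if ledger else GENESIS
    entry = {"prev": prev, "ts": timestamp, "action": action, "path": path}
    entry["hash"] = link_hash(prev, timestamp, action, path)
    ledger.append(entry)
    return entry


def verify_ledger(ledger):
    """Return the index of the first broken link, or None if the chain holds."""
    prev = GENESIS
    for i, entry in enumerate(ledger):
        expected = link_hash(entry["prev"], entry["ts"], entry["action"], entry["path"])
        if entry["prev"] != prev or entry["hash"] != expected:
            return i
        prev = entry["hash"]
    return None
```

Because each hash covers the previous hash and the timestamp, editing any field of a past entry changes its hash and invalidates every later link, which is the property a check like --verify-ledger relies on.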
When to use: run it as a background daemon for hands-off lifecycle management of logs, exports, and telemetry. Start with --dry-run to preview, then --once for a single pass (it scans twice so the stability check has both samples), or leave it looping on --interval.
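The two-scan stability rule is why a single pass needs two samples. A minimal Python sketch of that check follows, with hypothetical helper names (scan_sizes, cold_files) and the age threshold expressed in seconds; it illustrates the policy and is not at1-watch's code.

```python
import os
import time


def scan_sizes(paths):
    """One scan: record the current size of every existing file."""
    return {p: os.path.getsize(p) for p in paths if os.path.isfile(p)}


def cold_files(prev_sizes, curr_sizes, older_than_s, now=None):
    """Files that are old enough AND unchanged in size across two scans."""
    now = time.time() if now is None else now
    cold = []
    for path, size in curr_sizes.items():
        # A file missing from the previous scan has no second sample yet.
        stable = prev_sizes.get(path) == size
        old_enough = now - os.path.getmtime(path) > older_than_s
        if stable and old_enough:
            cold.append(path)
    return sorted(cold)
```

A file that is still being appended to fails the size comparison even if its mtime looks old, and a file seen in only one scan is never eligible.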
at1-attest — cryptographic attestation for compliance & custody
at1-attest TABLE_DIR [--report attestation.html] [--deep] [--timestamp]
Produces a one-command attestation report for an AT-1 table or watch ledger — “this archive's full contents and history, cryptographically verified, as of now.” Three independently checkable layers:
- Contents: every live segment re-hashed and compared to its manifest SHA-256.
- History: the hash-chained event log recomputed end-to-end, so any edit to any past append/compact event breaks every later link.
- Bytes: with --deep, every segment is decompressed and its embedded integrity trailer re-verified (decode == original, byte-for-byte).
VERIFIED contents: 14 segment(s) re-hashed against the manifest
VERIFIED history: hash chain recomputed across 9 event(s) (appends/compactions)
VERIFIED bytes: 14 segment(s) decompressed; embedded SHA-256 trailer verified decode == original
root hash: 7f3a…c91d
verdict: ALL CHECKS VERIFIED
The report states exactly what was and was not checked, and emits a root hash over the segment hashes plus the chain head. This is evidence generation, not a signature scheme: pair the root hash with your own timestamping/signing — RFC 3161 (--timestamp fetches a token from a public TSA), sigstore, or a notarized email — for third-party non-repudiation.
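As an illustration of the contents check and of a root hash over the segment hashes plus the chain head, here is a minimal Python sketch. The manifest shape (a path-to-digest mapping), the sorted concatenation order, and the function names are assumptions; the exact construction at1-attest uses is not documented here.

```python
import hashlib


def sha256_file(path, chunk=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()


def check_contents(manifest):
    """manifest: {segment_path: expected_hex}. Return the paths that mismatch."""
    return [
        path
        for path, expected in sorted(manifest.items())
        if sha256_file(path) != expected
    ]


def root_hash(segment_hashes, chain_head):
    """One digest over all segment hashes (sorted) plus the chain head."""
    h = hashlib.sha256()
    for digest in sorted(segment_hashes):  # assumed order: sorted hex digests
        h.update(bytes.fromhex(digest))
    h.update(bytes.fromhex(chain_head))
    return h.hexdigest()
```

Under these assumptions the root hash changes if any segment or any ledger event changes, so it is the single value worth handing to an external timestamping or signing service.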
When to use: for compliance and chain-of-custody — proving to an auditor or counterparty that an archived dataset and its full edit history are intact and untampered as of a given moment.