Ingest from a URL
at1 fetch pulls a file from a URL and writes a verified .at1 in one step. For line-oriented data the HTTP body is fed through the encoder in flight, so the raw plaintext is never written to disk: peak disk usage is the size of the compressed output, not the original. You can move a 50 GB log out of object storage and only ever land the compressed copy.
    at1 fetch https://logs.example.com/app.log app.at1
    at1 fetch https://data.example.com/events.csv events.at1 columnar
    at1 fetch https://1000genomes.example/chr22.vcf chr22.at1 vcf --backend zstd
- Codec: pass one explicitly, or omit it and AT-1 infers it from the URL extension or the Content-Type header.
- Plaintext never stored for the line-streamable codecs (log, columnar, vcf, ssh, json): the download is compressed chunk-by-chunk and only the .at1 is written.
- Verified as it lands: every chunk is decoded back and checked against the source bytes during compression, and a SHA-256 integrity trailer is written over the original. On a mismatch, AT-1 refuses to trust the output.
- Backends: --backend xz (max ratio, default) or zstd (fast); chunk size via --chunk-lines.
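
The chunk-by-chunk behaviour in the list above can be sketched in Python. This is an illustration under stated assumptions, not AT-1's implementation or file format: the standard-library lzma module stands in for the xz backend, and the names stream_compress, iter_line_chunks, and chunk_lines are hypothetical, chosen only for this example.

```python
import hashlib
import io
import lzma
from typing import BinaryIO, Iterator


def iter_line_chunks(src: BinaryIO, chunk_lines: int) -> Iterator[bytes]:
    """Yield groups of up to chunk_lines lines without reading the whole stream."""
    buf = []
    for line in src:
        buf.append(line)
        if len(buf) == chunk_lines:
            yield b"".join(buf)
            buf = []
    if buf:
        yield b"".join(buf)


def stream_compress(src: BinaryIO, dst: BinaryIO, chunk_lines: int = 1000) -> str:
    """Compress src into dst chunk by chunk; return the SHA-256 of the original.

    Assumption: each chunk becomes an independent xz stream. This is not the
    real .at1 layout, only a way to show the idea.
    """
    digest = hashlib.sha256()
    for chunk in iter_line_chunks(src, chunk_lines):
        packed = lzma.compress(chunk)
        # Decode the chunk straight back and compare it with the source bytes.
        if lzma.decompress(packed) != chunk:
            raise ValueError("round-trip mismatch; refusing to trust the output")
        digest.update(chunk)
        dst.write(packed)  # only compressed bytes ever reach dst
    return digest.hexdigest()
```

Any binary file-like object that yields lines can serve as the source, for example the response object returned by urllib.request.urlopen, so in this sketch the plaintext exists only one chunk at a time in memory.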
Confirm the result the same way you would any .at1:
    at1 integrity app.at1            # SHA-256 trailer: decode == original
    at1 decompress app.at1 app.log   # byte-for-byte identical to the source
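
What "decode == original" means can be shown with a short Python sketch. It assumes an xz payload and a SHA-256 hex digest recorded separately. How AT-1 actually stores its trailer is not described here, so decoded_sha256 and integrity_ok are hypothetical helpers, not part of the tool.

```python
import hashlib
import io
import lzma
from typing import BinaryIO


def decoded_sha256(src: BinaryIO, block_size: int = 1 << 16) -> str:
    """Decode an xz payload block by block and hash the recovered plaintext."""
    digest = hashlib.sha256()
    with lzma.open(src, "rb") as plain:
        # Read fixed-size blocks so the plaintext is never held in full.
        for block in iter(lambda: plain.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def integrity_ok(compressed: bytes, recorded_sha256: str) -> bool:
    """True when the decoded bytes hash to the recorded digest."""
    return decoded_sha256(io.BytesIO(compressed)) == recorded_sha256
```

The check passes only if decoding reproduces exactly the bytes that were hashed at ingest time, which is the same guarantee as comparing the decompressed file with the source.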
Honest scope: codecs that must see the whole file to choose a layout (e.g. whole-file auto selection, or non-line formats) fall back to a buffered mode. In that mode the download is written to a temporary file, compressed, then deleted, so the plaintext is on disk only transiently. AT-1 tells you which mode it used. Resuming an interrupted download (range GET) is not supported in this version. See also streaming & live ingest.
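
For comparison, the buffered fallback can be sketched as follows. This is a minimal Python illustration, assuming xz output via the standard-library lzma module; buffered_compress and its tmp_dir parameter are hypothetical and do not reflect AT-1's internals.

```python
import lzma
import os
import shutil
import tempfile
from typing import BinaryIO, Optional


def buffered_compress(src: BinaryIO, out_path: str,
                      tmp_dir: Optional[str] = None) -> None:
    """Spool src to a temporary file, compress the whole file, delete the spool."""
    fd, tmp_path = tempfile.mkstemp(suffix=".download", dir=tmp_dir)
    try:
        with os.fdopen(fd, "wb") as tmp:
            shutil.copyfileobj(src, tmp)  # the plaintext is on disk from here
        # A whole-file codec could inspect tmp_path at this point before
        # choosing a layout; this sketch simply compresses it as one stream.
        with open(tmp_path, "rb") as plain, lzma.open(out_path, "wb") as packed:
            shutil.copyfileobj(plain, packed)
    finally:
        os.remove(tmp_path)  # the plaintext is removed even if compression fails
```

Unlike the streaming path, peak disk usage here is roughly the original plus the compressed output, which is why the distinction between the two modes matters for large files.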