AT-1 Prompt Compiler

Cut your AI token bill — without changing a thing

A drop-in proxy that compiles every prompt before it reaches the model: it removes the repeated content your app re-sends each turn, so you pay for fewer tokens and get the same answers. Lossless, accuracy-verified, and fast enough that you'll never notice it.

up to 62%
fewer tokens on repeated chat context (lossless)
~5.5×
cheaper PDFs — route the text layer, not page images
50%
saved when the same image is attached twice
<1 ms
added latency — you won't feel it

Honest about what this is — and isn't

No codec can shrink the tokens a model actually reads — that's fixed by the tokenizer. So we don't compress your bytes; we stop you re-sending the same thing. On unique prose or one-off code there's little to remove and we'll save you little — and we'll tell you so. The big wins are on the repetitive traffic that dominates real agent and chat workloads.

How it works

It removes repetition, not meaning

Coding agents and chat apps re-send the same system prompt, tool definitions, files and images on every turn. The compiler sends each one once and references it after — the model still receives every byte of information, just not three copies of it.

Lossless, and proven not to change answers

The transform is byte-reversible, and we verified on two independent models that the model answers exactly as well over a compiled prompt as over the original. Optional deeper (lossy) modes exist for images and templated data — off by default, each clearly flagged.

Drop-in, invisible

Point your app at the AT-1 endpoint instead of the provider's — one line. Works with the OpenAI and Anthropic APIs, your existing SDKs, and coding agents, with no change to how your team works.

PDFs and images, handled right

PDFs are routed to their cheap text layer and their repeated page furniture is removed; identical images are de-duplicated. A lossless image re-encode saves nothing — token cost is set by resolution — so we don't pretend otherwise.

How you pay

Share of savings — you only pay when we save you money

We meter the tokens we remove from every request and bill a fraction of the money that saves you (you keep the large majority). No savings, no charge — so it can never cost you more than it saves. A free tier covers your first block of saved tokens each month, and high-volume teams can switch to a flat per-seat plan.

Turn it off anytime

No lock-in, ever — it's always one step to switch off or remove:

  • Skip a single request — add the header X-AT1-Compile: off and that one call is sent through untouched.
  • Pause everything — set AT1_PROMPTC_DISABLE=1 and the proxy becomes an invisible pass-through, exactly like calling the AI provider directly. No redeploy. Unset it to turn back on.
  • Uninstall completely— point your app's endpoint back at the provider (it was only a one-line change), or stop the service / npm uninstall. Your requests go straight to the model again, and we keep nothing of yours.