Cut your AI token bill — without changing a thing
A drop-in proxy that compiles every prompt before it reaches the model: it removes the repeated content your app re-sends each turn, so you pay for fewer tokens and get the same answers. Lossless, accuracy-verified, and fast enough that you'll never notice it.
Honest about what this is — and isn't
No codec can shrink the tokens a model actually reads — that's fixed by the tokenizer. So we don't compress your bytes; we stop you re-sending the same thing. On unique prose or one-off code there's little to remove and we'll save you little — and we'll tell you so. The big wins are on the repetitive traffic that dominates real agent and chat workloads.
How it works
It removes repetition, not meaning
Coding agents and chat apps re-send the same system prompt, tool definitions, files and images on every turn. The compiler sends each one once and references it after — the model still receives every byte of information, just not three copies of it.
Lossless, and proven not to change answers
The transform is byte-reversible, and we verified on two independent models that the model answers exactly as well over a compiled prompt as over the original. Optional deeper (lossy) modes exist for images and templated data — off by default, each clearly flagged.
Drop-in, invisible
Point your app at the AT-1 endpoint instead of the provider's — one line. Works with the OpenAI and Anthropic APIs, your existing SDKs, and coding agents, with no change to how your team works.
PDFs and images, handled right
PDFs are routed to their cheap text layer and their repeated page furniture is removed; identical images are de-duplicated. A lossless image re-encode saves nothing — token cost is set by resolution — so we don't pretend otherwise.
How you pay
We meter the tokens we remove from every request and bill a fraction of the money that saves you (you keep the large majority). No savings, no charge — so it can never cost you more than it saves. A free tier covers your first block of saved tokens each month, and high-volume teams can switch to a flat per-seat plan.
Turn it off anytime
No lock-in, ever — it's always one step to switch off or remove:
- Skip a single request — add the header
X-AT1-Compile: offand that one call is sent through untouched. - Pause everything — set
AT1_PROMPTC_DISABLE=1and the proxy becomes an invisible pass-through, exactly like calling the AI provider directly. No redeploy. Unset it to turn back on. - Uninstall completely— point your app's endpoint back at the provider (it was only a one-line change), or stop the service /
npm uninstall. Your requests go straight to the model again, and we keep nothing of yours.