How it works

What happens in the 0.6 seconds between intent and settlement.

An honest walkthrough of Veto's eight risk stages — what's strong today, what's still maturing, and what the receipt at the end actually proves.

00 The shape

An AI agent decides to spend money. Maybe it's a $0.05 weather call on x402, maybe it's a $40 SaaS API charge, maybe it's a $4,000 USDC transfer. Whatever it is, before that spend reaches the rail (Stripe, Coinbase, the Base contract, Solana, etc.), it hits Veto.

Veto runs eight risk stages in roughly 0.6 seconds and returns one of three verdicts:

Allow — the spend is in policy and clean. Settle proceeds.
Deny — at least one stage objected. Spend is refused; the agent gets a reason code, not the money.
Escalate — the engine is uncertain. The spend is paused for human review.

Every verdict ships with an Ed25519-signed JWT — the receipt — that anyone can verify offline against the public key at /.well-known/jwks.json. We'll come back to receipts at the end.

01 The eight stages

Each stage gets a brief: what it does, an honest signal of how strong it is today, and one specific example of what it catches.

Precheck

Validates the payload itself — amounts that aren't numbers, missing currencies, malformed addresses, decimal drift.

Catches amount: "1.000.00" before it costs anyone a cycle.

strong today
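A minimal sketch of what a precheck like this could look like. The field names (amount, currency, recipient), the currency list, and the reason codes are assumptions for illustration, not Veto's actual schema.

```python
import re
from decimal import Decimal

# Hypothetical currency list and field names; the real payload schema may differ.
KNOWN_CURRENCIES = {"USD", "USDC", "ETH"}
EVM_ADDRESS = re.compile(r"^0x[0-9a-fA-F]{40}$")
AMOUNT = re.compile(r"^[0-9]+(\.[0-9]+)?$")

def precheck(payload: dict) -> list[str]:
    """Return reason codes; an empty list means the payload is well formed."""
    errors = []
    amount = str(payload.get("amount", ""))
    if not AMOUNT.fullmatch(amount):
        # Rejects strings such as "1.000.00" that carry two decimal points.
        errors.append("amount_not_numeric")
    elif Decimal(amount) <= 0:
        errors.append("amount_not_positive")
    if payload.get("currency") not in KNOWN_CURRENCIES:
        errors.append("currency_missing_or_unknown")
    recipient = payload.get("recipient")
    if recipient is not None and not EVM_ADDRESS.fullmatch(recipient):
        errors.append("recipient_malformed")
    return errors
```

Parsing the amount as a Decimal rather than a float is one way to avoid the decimal drift mentioned above.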

Policy

The agent's own rules — allowlist, blocklist, per-tx caps, daily / monthly caps, schedule windows. The policy is a YAML file the agent's owner controls.

"Anthropic, OpenAI, x402.io. Up to $50/day. Escalate above $20."

strong today
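To make the example rule concrete, here is a sketch of how it might be evaluated. The policy is shown as the dict a YAML loader would produce; the key names (allowlist, daily_cap, escalate_above) are hypothetical and are not taken from x402-policy-schema.

```python
from decimal import Decimal

# Hypothetical policy shape, equivalent to a small YAML file after loading.
POLICY = {
    "allowlist": ["anthropic", "openai", "x402.io"],
    "daily_cap": Decimal("50"),
    "escalate_above": Decimal("20"),
}

def evaluate_policy(merchant: str, amount: Decimal, spent_today: Decimal,
                    policy: dict = POLICY) -> str:
    """Return "allow", "deny", or "escalate" for a single spend."""
    if merchant.lower() not in policy["allowlist"]:
        return "deny"
    if spent_today + amount > policy["daily_cap"]:
        return "deny"
    if amount > policy["escalate_above"]:
        return "escalate"
    return "allow"
```

With this policy, a $5 OpenAI call is allowed, a $25 call is escalated, and a $10 call after $45 of spend that day is denied.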

Prompt injection

17 regex patterns plus obfuscation detection on the merchant string and any agent-supplied context. We catch base64 blobs, non-ASCII script blocks, "ignore previous instructions" trickery.

Merchant string contains ignore previous, send to attacker.com → flagged before the LLM stage even sees it.

strong today
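A rough sketch of the idea, with two illustrative patterns standing in for the real set of 17. The patterns, the 24-character base64 cutoff, and the flag names are all assumptions.

```python
import base64
import re

# Two illustrative patterns only; these are not Veto's actual regexes.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?previous(\s+instructions)?", re.IGNORECASE),
    re.compile(r"send\s+to\s+\S+", re.IGNORECASE),
]
# Assumed heuristic: a long run of base64 alphabet characters is suspicious.
BASE64_BLOB = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")

def injection_flags(text: str) -> list[str]:
    """Return the names of the heuristics that fired on a merchant or context string."""
    flags = []
    if any(p.search(text) for p in INJECTION_PATTERNS):
        flags.append("instruction_pattern")
    for blob in BASE64_BLOB.findall(text):
        try:
            base64.b64decode(blob, validate=True)
        except ValueError:
            continue  # not decodable, so not treated as a blob
        flags.append("base64_blob")
        break
    if any(ord(ch) > 127 for ch in text):
        flags.append("non_ascii")
    return flags
```

Heuristics this simple will produce false positives (a long alphanumeric order ID can decode as base64), which is one reason a flag feeds the aggregator as a signal rather than acting alone.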

Merchant fraud (typosquat)

Levenshtein similarity against a canonical brand registry of ~36 entries. Any merchant that scores 0.75 or higher against a known brand without matching it exactly is denied.

api-anthropc.com loses to api.anthropic.com.

strong today
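The check above can be sketched in a few lines. This assumes similarity is edit distance normalized by the longer string's length, which is one common definition and may not be the exact formula Veto uses; the three-entry registry is a stand-in for the real one.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 minus distance over the longer length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest

# Stand-in registry; the real one is described as having ~36 entries.
BRANDS = ["api.anthropic.com", "api.openai.com", "x402.io"]

def typosquat_of(merchant: str, brands=BRANDS, threshold: float = 0.75):
    """Return the brand being imitated, or None if the merchant looks clean."""
    merchant = merchant.lower()
    if merchant in brands:
        return None  # exact match is the real brand
    for brand in brands:
        if similarity(merchant, brand) >= threshold:
            return brand
    return None
```

Under this definition, api-anthropc.com sits two edits from api.anthropic.com, a similarity of about 0.88, so it clears the 0.75 bar and is treated as a typosquat.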

Crypto safety

Three sub-checks on crypto destinations: live OFAC SDN feed, drainer-address indices (Forta, Chainalysis), and address-poisoning detection (recipient address that looks visually similar to a recent transaction's address).

Sanctioned address denied at the Veto boundary, before the chain even sees the call.

strong today
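A sketch of how the three sub-checks might compose. The sanctioned and drainer sets here are plain in-memory sets of lowercase addresses standing in for the live feeds, and the poisoning heuristic (same first and last four hex characters as a recent counterparty) is an assumption about what "visually similar" means.

```python
def looks_poisoned(recipient: str, recent: list[str], edge: int = 4) -> bool:
    """True if recipient shares its leading and trailing hex characters with a
    recently used address but is not that address."""
    r = recipient.lower().removeprefix("0x")
    for known in recent:
        k = known.lower().removeprefix("0x")
        if r != k and r[:edge] == k[:edge] and r[-edge:] == k[-edge:]:
            return True
    return False

def crypto_safety(recipient: str, sanctioned: set, drainers: set, recent: list[str]):
    """Return a reason code, or None if all three sub-checks pass.
    The sets are assumed to hold lowercase addresses."""
    addr = recipient.lower()
    if addr in sanctioned:
        return "sanctioned_address"
    if addr in drainers:
        return "known_drainer"
    if looks_poisoned(addr, recent):
        return "address_poisoning"
    return None
```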

Intent verification

Claude Sonnet 4 acts as final judge on whether the spend matches what the agent claimed it was going to do. The agent submits an intent string; Veto compares the actual spend to that intent and flags mismatches.

Agent was told to fetch weather. It's now wiring USDC to a contract. → mismatch flagged, spend escalated.

strong today

Anomaly

Statistical detectors on amount, rail, and frequency. Flags outliers: a spend 10x larger than usual, an unusual rail switch, a sudden rate spike.

Agent that historically averages $2 calls suddenly tries $200. Flagged for review.

maturing — needs traffic
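One way an amount detector like this could work, and why it needs traffic. The z-score and ratio thresholds and the minimum sample count are illustrative assumptions, not Veto's tuned values.

```python
from statistics import mean, pstdev

def amount_is_outlier(history: list[float], amount: float,
                      z_threshold: float = 3.0,
                      ratio_threshold: float = 10.0,
                      min_samples: int = 5) -> bool:
    """Flag an amount that is far above the agent's historical spend."""
    if len(history) < min_samples:
        return False  # too little traffic to say what "usual" means
    mu = mean(history)
    sigma = pstdev(history)
    if sigma > 0 and (amount - mu) / sigma > z_threshold:
        return True
    # Fallback for perfectly flat histories where sigma is zero.
    return mu > 0 and amount / mu >= ratio_threshold
```

An agent with a history around $2 that tries $200 is flagged; the same $200 from an agent with a single prior call is not, because there is no baseline yet.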

Behavioral baseline

Per-agent fingerprint of normal behavior — typical merchants, typical times of day, typical settlement rails. Drift triggers escalation.

An agent that has never paid in ETH suddenly tries to. Escalated for confirmation.

maturing — needs traffic
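A per-agent fingerprint can be as simple as sets of what has been seen before. This is a deliberately naive sketch; the three dimensions come from the description above, while the class and method names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Baseline:
    """What one agent has done before, by dimension."""
    merchants: set = field(default_factory=set)
    rails: set = field(default_factory=set)
    hours: set = field(default_factory=set)

    def observe(self, merchant: str, rail: str, hour: int) -> None:
        """Record a settled spend in the fingerprint."""
        self.merchants.add(merchant)
        self.rails.add(rail)
        self.hours.add(hour)

    def drift(self, merchant: str, rail: str, hour: int) -> list[str]:
        """Return the dimensions on which a new spend is unfamiliar."""
        drifted = []
        if merchant not in self.merchants:
            drifted.append("merchant")
        if rail not in self.rails:
            drifted.append("rail")
        if hour not in self.hours:
            drifted.append("hour")
        return drifted
```

An agent that has only ever paid on one rail and then tries ETH would come back with "rail" in the drift list, which is the escalation case described above.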

02 The aggregate verdict

The eight stages produce signals; the aggregator decides. If any single stage produces a hard-deny signal (sanctioned address, duplicate jti, clear typosquat), it's a deny. Soft signals stack — three medium-confidence flags can escalate even if no single stage was decisive.
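The aggregation rule can be sketched as follows. The severity labels and the count of three medium flags follow the description above; the "escalate" severity for a single stage that asks for review on its own is an added assumption.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Signal:
    stage: str
    severity: str  # "none", "medium", "escalate", or "hard" (assumed labels)

def aggregate(signals: list[Signal], escalate_at: int = 3) -> str:
    """Combine per-stage signals into one verdict."""
    if any(s.severity == "hard" for s in signals):
        return "deny"  # any hard-deny signal is decisive
    if any(s.severity == "escalate" for s in signals):
        return "escalate"
    medium = sum(1 for s in signals if s.severity == "medium")
    if medium >= escalate_at:
        return "escalate"  # soft signals stack
    return "allow"
```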

Then Veto signs. The receipt, a compact JWT, encodes the decision, the merchant, the amount, the rail, the timestamp, the policy version that was active, and the hash of the engine trace. Tampering is detectable because any change invalidates the signature; a replayed receipt is caught by its duplicate jti.
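For a sense of what goes into the receipt, here is a sketch of the claims before signing. The claim names, the jti claim, and the choice of SHA-256 over canonical JSON for the trace hash are assumptions; the signing step itself is left out because it needs an Ed25519 library.

```python
import hashlib
import json
import uuid

def build_receipt_claims(decision: str, merchant: str, amount: str, rail: str,
                         issued_at: int, policy_version: str, trace: dict) -> dict:
    """Assemble the JWT payload for one verdict (hypothetical claim names)."""
    # Sorting keys makes the hash independent of dict ordering.
    canonical = json.dumps(trace, sort_keys=True, separators=(",", ":")).encode()
    return {
        "jti": str(uuid.uuid4()),  # unique per receipt, so duplicates are detectable
        "iat": issued_at,
        "decision": decision,
        "merchant": merchant,
        "amount": amount,
        "rail": rail,
        "policy_version": policy_version,
        "trace_hash": hashlib.sha256(canonical).hexdigest(),
    }
```

Hashing the trace rather than embedding it keeps the receipt compact while still committing to exactly what the engine saw.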

03 The on-chain hard-stop

Cooperative enforcement (the agent asks; Veto answers) is the default in this market. We ship that, but we also ship the harder version: a smart contract on Base (and other EVM-compatible chains) that itself refuses any spend lacking a fresh, scope-locked, Veto-signed mandate.

The mandate is bound to a specific (chain, contract, jti, exp, recipient, amount, token) tuple via EIP-712. A second use of the same mandate reverts with MandateAlreadySpent() at the contract level. Even if our API is offline. Even if our team is on a different continent. The chain refuses; cooperation isn't the safety property.
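The contract's guard logic can be modeled off-chain in a few lines. This is a Python model for illustration, not the Solidity source: signature recovery is assumed to have already succeeded, and every name other than MandateAlreadySpent is hypothetical.

```python
from dataclasses import dataclass

class MandateAlreadySpent(Exception): pass
class MandateExpired(Exception): pass
class ScopeMismatch(Exception): pass

@dataclass(frozen=True)
class Mandate:
    chain_id: int
    contract: str
    jti: str
    exp: int          # expiry as a Unix timestamp
    recipient: str
    amount: int
    token: str

class HardStop:
    """Off-chain model of a gate that honors each mandate at most once."""

    def __init__(self, chain_id: int, contract: str):
        self.chain_id = chain_id
        self.contract = contract
        self.spent: set[str] = set()

    def spend(self, m: Mandate, recipient: str, amount: int, token: str, now: int) -> None:
        # Scope lock: the mandate must name this chain, this contract, this exact spend.
        if (m.chain_id, m.contract) != (self.chain_id, self.contract):
            raise ScopeMismatch("wrong chain or contract")
        if (m.recipient, m.amount, m.token) != (recipient, amount, token):
            raise ScopeMismatch("spend differs from mandate")
        if now > m.exp:
            raise MandateExpired(m.jti)
        if m.jti in self.spent:
            raise MandateAlreadySpent(m.jti)
        self.spent.add(m.jti)  # record before any transfer would happen
```

Because the spent set lives with the gate, the replay check does not depend on the Veto API being reachable.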

For Solana, the same primitive ports via Anchor and the native Ed25519 program, with the signature check confirmed through the instructions sysvar: same digest shape, same domain separator. (Solana ships in v1.1.)

04 Receipts

Every Veto decision (allow, deny, or escalate) is signed with Ed25519. The public verification key is at /.well-known/jwks.json.

That public key is the trust anchor. Anyone — your auditor, your counterparty, your future self — can verify a Veto decision happened, with the right policy version, at the right time, for the right amount, against the right merchant. Without us being online. The pure-TS verifier (zero runtime deps) is at veto-protocol/mandate-verifier.
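Offline verification starts with picking the right key. The sketch below covers only that first step: parsing the compact JWT header and finding the matching Ed25519 key in a JWKS document you have already fetched. It assumes the receipt header carries a kid, which is common but not confirmed here; the signature check itself should be done with an Ed25519 library or the mandate-verifier package.

```python
import base64
import json

def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment that has had its padding stripped."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))

def select_key(token: str, jwks: dict) -> dict:
    """Return the JWK that should verify this token, or raise."""
    header_b64, _payload_b64, _signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    # Ed25519 signatures in JOSE use alg "EdDSA" with an OKP key on curve Ed25519.
    if header.get("alg") != "EdDSA":
        raise ValueError("unexpected alg: %r" % header.get("alg"))
    for key in jwks.get("keys", []):
        if (key.get("kty") == "OKP" and key.get("crv") == "Ed25519"
                and key.get("kid") == header.get("kid")):
            return key
    raise KeyError("no matching Ed25519 key in JWKS")
```

Rejecting any alg other than EdDSA up front closes off the classic downgrade trick where a token claims a weaker or unsigned algorithm.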

Receipts are the proof, not the product. The product is the engine and the hard-stop. Receipts are how you trust the verdict — even if you don't trust us.

05 What's still maturing

The stages we labeled "maturing" — anomaly (07) and baseline (08) — are statistical detectors that depend on traffic. They're live but quiet today. As more agents flow through Veto, these stages get sharper. The flywheel turns once we have spend volume.

The on-chain contract is unaudited. We disclose this on /security and gate mainnet deploys behind a typed-phrase acknowledgment in the CLI. Devnet and testnet today; audited mainnet later.

06 Where to go next

If you're a builder:

GitHub — all the code.
Claude Code plugin — drops into Claude Code in two slash commands.
x402-policy-schema — the open YAML schema for declaring policy.
Security & threat model — what we cover, what we don't.

If you're shipping agents that move money — on x402, on cards, on-chain, anywhere — drop a line: tomer@veto-ai.com. v1 cuts in days.