Trust Graduation Protocol
Autonomy should not be switched on. It should be earned.
Most agentic products ask the wrong question:
What can this model automate?
Mission asks a narrower one:
What has this agent earned the right to do for this person, in this action class, under these constraints?
That is the Trust Graduation Protocol.
Autonomy is not a global setting. It is not owned by a model, a workspace, or a product toggle. It is earned in a specific relationship: one principal, one agent, one action class, one history of observed evidence.
A system may be good enough to prepare a relationship follow-up and still have no right to send an external email. It may be safe enough to clean a local tracker and still have no right to publish a public post.
Trust is local. Trust is contextual. Trust can move down.
The Problem
The usual product pattern is capability first, governance second. The model can write, so let it draft. The model can browse, so let it research. The model can use tools, so let it call tools.
That works for a demo. It is weak as a production contract.
Capability is not permission. A good draft is not evidence that the system should send. A confident answer is not evidence that the system understood the principal. A successful sample run is not evidence that the same action is safe for another recipient, channel, obligation, or moment.
The deeper failure is memory. Most systems do not remember the difference between prepared work, approved work, edited work, rejected work, sent work, and work that produced a real outcome.
The Contract
A Trust-Graduation-compliant system must answer that core question before every external-effect action. The phrase "action class" matters. Trust is not one global score; each class carries its own.
Required v0.1 classes include draft.compose, draft.response, email.send.internal, email.send.external, calendar.create, and social.post.public.
Implementations can add more classes: proposal.submit, payment.initiate, file.delete, contract.review, press.respond. But they should not collapse them. Drafting, sending, posting, deleting, booking, and paying are different surfaces. Each needs its own evidence.
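One way to keep classes from collapsing is to key every trust record by principal, agent, and action class. The sketch below is illustrative only: TrustKey, TrustLedger, and their methods are hypothetical names, not part of the published library. It shows that evidence recorded for one class is invisible to every other class.

```python
from dataclasses import dataclass

# The six classes listed as required in v0.1.
REQUIRED_V01_CLASSES = frozenset({
    "draft.compose", "draft.response",
    "email.send.internal", "email.send.external",
    "calendar.create", "social.post.public",
})

@dataclass(frozen=True)
class TrustKey:
    """Hypothetical key: trust lives in one principal/agent/class relationship."""
    principal: str
    agent: str
    action_class: str

class TrustLedger:
    """Hypothetical in-memory ledger that counts observations per key."""

    def __init__(self, extra_classes=()):
        # Implementations may add classes, but never merge existing ones.
        self.classes = REQUIRED_V01_CLASSES | frozenset(extra_classes)
        self._counts = {}

    def record(self, key):
        if key.action_class not in self.classes:
            raise ValueError(f"unknown action class: {key.action_class!r}")
        self._counts[key] = self._counts.get(key, 0) + 1

    def observations(self, key):
        # No fallback to a sibling class or a global score.
        return self._counts.get(key, 0)

ledger = TrustLedger(extra_classes=["payment.initiate"])
drafting = TrustKey("alice", "agent-1", "draft.response")
sending = TrustKey("alice", "agent-1", "email.send.external")
for _ in range(3):
    ledger.record(drafting)
print(ledger.observations(drafting), ledger.observations(sending))  # 3 0
```

Three approved drafts leave the external-send class at zero observations, which is the behavior the paragraph above asks for.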
Evidence, Not Vibes
The protocol counts observed decisions and outcomes.
Useful evidence: approved drafts, minor edits, heavy rewrites, held drafts, rejected drafts, sent messages with receipts, replies recovered, meetings booked, operator corrections, rollback events, trust issues.
Not evidence: more generated drafts, model confidence, synthetic simulation lift, prepared work with no decision, success in a different workspace.
This distinction is the heart of the protocol. Agents can produce infinite output. Output volume should not create trust. Decisions and outcomes should.
v0.1 normalizes evidence into eight labels: sent, approved, minor_edit, edited, heavy_rewrite, held, rejected, dropped.
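A minimal sketch of that normalization step, assuming upstream events arrive as dictionaries with label, action_class, and workspace fields. The field names, the EvidenceRow shape, and the normalize function are assumptions made for illustration; the published evidence schema may differ.

```python
from dataclasses import dataclass
from typing import Optional

# The eight v0.1 evidence labels.
EVIDENCE_LABELS = frozenset({
    "sent", "approved", "minor_edit", "edited",
    "heavy_rewrite", "held", "rejected", "dropped",
})

@dataclass(frozen=True)
class EvidenceRow:
    """Hypothetical row shape: one observed decision or outcome."""
    action_class: str
    label: str
    workspace: str

def normalize(event: dict, workspace: str) -> Optional[EvidenceRow]:
    """Return an EvidenceRow for an observed decision, or None if it is not evidence."""
    if event.get("workspace") != workspace:
        return None  # success in a different workspace does not count here
    label = event.get("label")
    if label not in EVIDENCE_LABELS:
        return None  # generated drafts, confidence scores, simulations, undecided work
    return EvidenceRow(event["action_class"], label, workspace)

approved = normalize(
    {"label": "approved", "action_class": "draft.response", "workspace": "ws-1"}, "ws-1")
generated = normalize(
    {"label": "generated", "action_class": "draft.response", "workspace": "ws-1"}, "ws-1")
elsewhere = normalize(
    {"label": "sent", "action_class": "draft.response", "workspace": "ws-2"}, "ws-1")
print(approved is not None, generated, elsewhere)  # True None None
```

Anything that is not one of the eight labels simply produces no row, so output volume cannot inflate the evidence count.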
Posterior Trust
The reference implementation starts with a weak symmetric prior: Beta(2, 2).
Each non-neutral evidence row updates the posterior for its action class. The system reports mean, credible interval, sample count, evidence quality, current tier, and whether more evidence is needed.
The conservative v0.1 rule is simple: needs_more_evidence is true when an action class has fewer than 10 samples or its credible interval is wider than 0.35.
Sparse evidence does not graduate. Uncertain evidence does not graduate. The agent can still prepare work, but the product should be honest about why it is not advancing.
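The update and the conservative rule can be sketched in a few lines. Several details here are assumptions, not statements about the reference implementation: which labels count as positive, negative, or neutral; that the sample count includes only non-neutral rows; and that the credible interval is a 95% equal-tailed interval. The interval is computed by plain numerical integration to avoid any dependency.

```python
PRIOR_ALPHA, PRIOR_BETA = 2.0, 2.0  # weak symmetric prior, Beta(2, 2)
MIN_SAMPLES = 10
MAX_INTERVAL_WIDTH = 0.35

# ASSUMPTION: this label split is illustrative. "edited" and "held"
# are treated as neutral here and do not move the posterior.
POSITIVE = frozenset({"sent", "approved", "minor_edit"})
NEGATIVE = frozenset({"heavy_rewrite", "rejected", "dropped"})

def credible_interval(alpha, beta, mass=0.95, steps=10_000):
    """Equal-tailed interval of Beta(alpha, beta) via midpoint-rule integration."""
    xs = [(i + 0.5) / steps for i in range(steps)]
    dens = [x ** (alpha - 1) * (1 - x) ** (beta - 1) for x in xs]
    total = sum(dens)
    lo_target, hi_target = (1 - mass) / 2, 1 - (1 - mass) / 2
    lower = upper = None
    acc = 0.0
    for x, d in zip(xs, dens):
        acc += d / total
        if lower is None and acc >= lo_target:
            lower = x
        if acc >= hi_target:
            upper = x
            break
    return lower, upper

def posterior(labels):
    """Summarize one action class from its list of evidence labels."""
    successes = sum(1 for label in labels if label in POSITIVE)
    failures = sum(1 for label in labels if label in NEGATIVE)
    alpha, beta = PRIOR_ALPHA + successes, PRIOR_BETA + failures
    lower, upper = credible_interval(alpha, beta)
    samples = successes + failures
    return {
        "mean": alpha / (alpha + beta),
        "interval": (lower, upper),
        "samples": samples,
        "needs_more_evidence": samples < MIN_SAMPLES
                               or (upper - lower) > MAX_INTERVAL_WIDTH,
    }

sparse = posterior(["approved"] * 4)
steady = posterior(["approved"] * 30 + ["rejected"] * 2)
print(sparse["needs_more_evidence"], steady["needs_more_evidence"])  # True False
```

Four approvals give a flattering mean of 0.75 but still report needs_more_evidence, because the sample count is under 10. That is the honesty the rule is meant to enforce.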
Tiers
gated: prepare and present for approval.
supervised: positive evidence; narrow internal action may be allowed.
auto_capped: stronger evidence; bounded internal action under an explicit cap.
review: violation or rejection pattern; side effects halt.
External-effect classes must not auto-promote past gated without explicit principal action.
That includes external email, public social posts, money movement, legal actions, public submissions, and any act that materially affects the principal's reputation or obligations.
The protocol can observe reliability in those classes. It can report readiness. It should not quietly convert reliability into permission.
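The gate on external-effect classes can be expressed as a cap applied after the earned tier is computed. In this sketch the mean thresholds (0.75 and 0.90), the principal_granted parameter, and the choice to return the lower of the earned and granted tiers are all assumptions; the text above specifies only the tier names and the rule that external-effect classes never auto-promote past gated.

```python
# Tiers in ascending order of autonomy; "review" sits outside the ladder.
TIER_ORDER = ["gated", "supervised", "auto_capped"]

# ASSUMPTION: hypothetical thresholds on the posterior mean.
SUPERVISED_MEAN = 0.75
AUTO_CAPPED_MEAN = 0.90

# A subset of external-effect classes, taken from the required v0.1 list.
EXTERNAL_EFFECT = frozenset({"email.send.external", "social.post.public"})

def tier(action_class, mean, needs_more_evidence,
         violation=False, principal_granted=None):
    """Return the tier an action class may operate at right now."""
    if violation:
        return "review"  # violation or rejection pattern: side effects halt
    if needs_more_evidence or mean < SUPERVISED_MEAN:
        earned = "gated"
    elif mean < AUTO_CAPPED_MEAN:
        earned = "supervised"
    else:
        earned = "auto_capped"
    if action_class in EXTERNAL_EFFECT:
        # Reliability is reported, never silently converted into permission.
        allowed = principal_granted or "gated"
        return min(earned, allowed, key=TIER_ORDER.index)
    return earned

print(tier("email.send.internal", 0.92, False))  # auto_capped
print(tier("email.send.external", 0.92, False))  # gated
print(tier("email.send.external", 0.92, False, principal_granted="supervised"))  # supervised
```

The same posterior yields different tiers for internal and external email, and only an explicit grant from the principal moves the external class upward.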
The Ladder
For human-facing products, Mission maps the protocol into a plain ladder: Observe, Prepare, Stage, Execute Narrow, Delegate, Govern.
v0.1 formalizes the lower part. That is intentional. Most agentic systems have not yet earned the upper part.
The first trustworthy promise is not: this agent can do everything for you.
It is: this agent never exceeds what it has earned.
Why Mission Uses It
Mission is a local-first execution layer for real work: relationships, drafts, open loops, proof, social signal, research, and next moves.
Trust Graduation keeps the product honest. A Gmail reply draft can improve from the user's edits. A DM draft can remain approval-gated until voice evidence is strong. A public post package can be prepared without being posted. A task can be staged without becoming an external commitment. A loop closes only when there is a receipt.
Preparation comes before automation.
The agent earns its boundary through the operator's decisions.
For Builders
The protocol is implementable without Mission.
Minimum contract: recordEvidence(row), posterior(actionClass), tier(actionClass), graduationReady(actionClass), requiresReview(actionClass).
The important rules: evidence is observed, not imagined. Trust is per action class. Trust can move down. External-effect actions remain explicitly gated. The principal's real decisions shape the autonomy boundary.
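As a rough illustration, the minimum contract could be written as a structural interface. The five method names come from the contract above, rendered in Python's snake_case; the argument and return types, the AlwaysGated class, and the permitted helper are assumptions added for the example.

```python
from typing import Any, Mapping, Protocol, runtime_checkable

@runtime_checkable
class TrustGraduation(Protocol):
    """The five operations of the minimum contract (types are assumed)."""
    def record_evidence(self, row: Mapping[str, Any]) -> None: ...
    def posterior(self, action_class: str) -> Mapping[str, Any]: ...
    def tier(self, action_class: str) -> str: ...
    def graduation_ready(self, action_class: str) -> bool: ...
    def requires_review(self, action_class: str) -> bool: ...

class AlwaysGated:
    """Hypothetical trivial implementation: stores rows, never graduates."""

    def __init__(self):
        self.rows = []

    def record_evidence(self, row):
        self.rows.append(dict(row))

    def posterior(self, action_class):
        count = sum(1 for r in self.rows if r.get("action_class") == action_class)
        return {"samples": count, "needs_more_evidence": True}

    def tier(self, action_class):
        return "gated"

    def graduation_ready(self, action_class):
        return False

    def requires_review(self, action_class):
        return False

def permitted(system: TrustGraduation, action_class: str) -> bool:
    """Hypothetical guard: ask the system before any side effect runs."""
    if system.requires_review(action_class):
        return False
    return system.tier(action_class) != "gated"

system = AlwaysGated()
system.record_evidence({"action_class": "draft.response", "label": "approved"})
print(isinstance(system, TrustGraduation), permitted(system, "draft.response"))  # True False
```

A system that only ever prepares work and presents it for approval already satisfies the interface, which matches the claim that preparation comes before automation.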
The next phase of agentic software should not be more reckless autonomy. It should be earned autonomy.
Links
GitHub protocol library: https://github.com/gomission/trust-graduation
Mission reference implementation
Evidence schema · Posterior schema
Written by Ronen Tanchum. Developed at Phenomena Labs.
The Trust Graduation core package is currently published under the Apache-2.0 license.

