From 299ded3cdf33268437e72e9b64dcadd5da705d1b Mon Sep 17 00:00:00 2001
From: Eve
Date: Thu, 7 May 2026 15:46:30 +0800
Subject: [PATCH] docs: define reporting governance plugin product

---
 .../agent-reporting-governance-plugin.md | 157 ++++++++++++++++++
 1 file changed, 157 insertions(+)
 create mode 100644 docs/architecture/agent-reporting-governance-plugin.md

diff --git a/docs/architecture/agent-reporting-governance-plugin.md b/docs/architecture/agent-reporting-governance-plugin.md
new file mode 100644
index 0000000..da778f2
--- /dev/null
+++ b/docs/architecture/agent-reporting-governance-plugin.md
@@ -0,0 +1,157 @@
+# Agent Reporting Governance Plugin
+
+## Product Definition
+
+### Problem statement
+
+Multi-agent systems fail in predictable reporting ways long before they fail technically. A task may be dispatched correctly, work may even be progressing, yet the operator still loses control because reporting is missing, delayed, misleading, or unverifiable.
+
+Today, many of these expectations live only in prompts, habits, or reviewer discipline. That is not enough. Prompt-only norms are easy to forget, easy to bypass, hard to audit, and inconsistent across personas, workspaces, runtimes, and machines. The result is the same class of operational failure repeating in different forms:
+
+- no report
+- late report
+- fake progress
+- forgotten checkpoint
+- unverified completion claims
+- subagent result not forwarded
+- placeholder or proxy reports that are presented as if they were final, direct, or verified
+
+The Agent Reporting Governance Plugin exists to turn reporting from a soft expectation into an enforceable, portable product capability.
+
+### Target users
+
+This plugin is for teams and operators who rely on agents to do real work and need trustworthy operational visibility, especially:
+
+- primary operators supervising one or more agents
+- people running subagent-based workflows where completion can be lost between child and main sessions
+- workspace owners who need common reporting rules across multiple projects
+- platform builders who want reporting controls to survive prompt changes and runtime differences
+- reviewers and auditors who need evidence about what was actually reported, when, and with what verification status
+
+### Core outcomes
+
+The product should produce a small set of high-value outcomes:
+
+1. **Reporting becomes governable**
+   Reporting requirements are expressed as policy and enforcement, not just advice in prompts.
+
+2. **Operators can distinguish real progress from narrative progress**
+   The system should make it harder for an agent to appear active without providing meaningful status evidence.
+
+3. **Critical workflow handoffs become visible**
+   Especially for subagent orchestration, the system should surface when a child result exists but was not forwarded through the required reporting path.
+
+4. **Completion claims become policy-aware**
+   A task should not be treated as cleanly complete when required checkpoints, evidence, or verification steps are missing.
+
+5. **Placeholder/proxy reporting stays honest**
+   If an agent is reporting a guess, summary, relay, or provisional state rather than a verified direct result, that status must be explicitly labeled.
+
+6. **Reporting rules become reusable across environments**
+   The same governance model should be deployable across different machines, workspaces, and runtime implementations.
+
+### Why prompt-only rules are insufficient
+
+Prompt instructions are necessary, but they are not a durable control surface for governance.
+
+Prompt-only rules are insufficient because:
+
+- **they are not reliably enforced** — an agent can ignore, forget, or partially follow them
+- **they are not portable** — behavior changes when the persona, system prompt, workspace conventions, or wrapper prompt changes
+- **they are hard to audit** — operators cannot easily inspect which reporting obligations were active and whether they were satisfied
+- **they break under orchestration** — subagent dispatch, forwarded results, checkpoints, and verification claims need machine-checkable state, not only natural language reminders
+- **they invite ambiguity** — an agent can produce language that sounds compliant without actually meeting reporting requirements
+- **they are weak against operational drift** — over time, informal rules become inconsistently applied across repos, sessions, and machines
+
+This product therefore treats prompts as guidance, but not as the sole enforcement mechanism. Governance must be embodied in plugin logic, policy artifacts, adapters, receipts, and verifiable state transitions.
+
+## Product scope
+
+The Agent Reporting Governance Plugin defines and enforces reporting obligations for agent workflows. In MVP and beyond, its scope is specifically about governing reporting quality, timing, and truthfulness — not about replacing all workflow logic.
+
+### In scope
+
+The plugin governs whether required report states and transitions are present, missing, late, misleading, or improperly represented.
+
+It must explicitly govern these cases:
+
+- **No report**: required status or completion reporting never happened
+- **Late report**: a required report arrived after the allowed checkpoint or reporting window
+- **Fake progress**: the agent reported activity or confidence without enough grounding evidence, concrete advancement, or truthful status framing
+- **Forgotten checkpoint**: an expected intermediate report or review checkpoint was skipped
+- **Unverified completion claims**: the agent claimed a task was done, fixed, passing, or otherwise complete without the required verification status or evidence
+- **Subagent result not forwarded**: a child/subagent result existed or completed, but the required result was not forwarded into the main reporting path
+- **Placeholder/proxy reports**: relayed, guessed, summarized, pending, or surrogate reports are allowed only if they are explicitly labeled as placeholders, proxies, or otherwise non-final/non-direct
+
+### Functional scope
+
+The product should support policy-driven handling for questions such as:
+
+- what kinds of reports are required for a task type or workflow stage
+- when a checkpoint becomes overdue
+- what evidence is required before a completion claim is considered valid
+- how a forwarded subagent result is distinguished from a missing or silently dropped result
+- how provisional, relayed, or second-hand reporting must be labeled
+- what enforcement action should occur when governance rules are violated
+
+### Enforcement scope
+
+The plugin is intended to provide reusable governance primitives such as:
+
+- reporting policy evaluation
+- checkpoint and lateness evaluation
+- completion-claim verification gates
+- forwarding/relay integrity checks for subagent workflows
+- labeling requirements for placeholder or proxy reports
+- structured outputs that adapters can use to block, warn, escalate, annotate, or require correction
+
+### Portability requirements
+
+Portability is a first-class product requirement, not a future nice-to-have.
+
+The plugin must:
+
+- **not depend on one persona, one prompt, or one workspace**
+- **work across machines**
+- **support multiple runtimes through adapters**
+- **treat policy packs and deployment profiles as versioned artifacts**
+
+This means the product definition assumes:
+
+- reporting governance rules live outside any single prompt personality
+- enforcement logic can be installed in different workspaces without rewriting the core policy model
+- adapter layers translate runtime-specific events into a common governance model
+- policy packs can be pinned, reviewed, diffed, promoted, and rolled back like other versioned operational artifacts
+- deployment profiles can express environment-specific wiring without forking the product definition itself
+
+## Non-goals
+
+To keep the product sharp, the following are explicitly out of scope for this plugin:
+
+- building a general-purpose workflow engine for every agent behavior
+- deciding whether a task is product-correct or technically correct in all domains
+- replacing human judgment for approval, acceptance, or managerial sign-off
+- guaranteeing that an agent never lies; the product instead focuses on making false or unsupported reporting detectable and governable
+- prescribing one universal UX, one runtime, one transport, or one storage backend
+- coupling governance to one vendor model, one OpenClaw persona, or one repository layout
+- forcing all workflows to use subagents; the plugin must still help in single-agent flows, but subagent forwarding integrity is a key governed case
+
+## Product boundaries and design stance
+
+This plugin should be understood as governance infrastructure for reporting behavior.
+
+It is not merely a documentation bundle and not merely a prompt patch. It is a product layer that sits between workflow events and operator trust. Its job is to convert ambiguous reporting norms into explicit, inspectable policy outcomes.
+
+The design stance is:
+
+- **policy over folklore**
+- **evidence over vibes**
+- **portable adapters over workspace-specific hacks**
+- **explicit labels over ambiguous proxy language**
+- **governed completion over self-declared completion**
+
+## Summary
+
+The Agent Reporting Governance Plugin is a portable enforcement layer for trustworthy agent reporting. It exists because prompts alone cannot reliably prevent missing reports, late reports, fake progress, skipped checkpoints, unverified completion claims, dropped subagent results, or mislabeled placeholder/proxy reporting.
+
+Its value is simple: operators should be able to trust not only that work happened, but that reporting about the work was timely, honest, verifiable, and transferable across environments.