spec: add evidence model for reporting governance

2026-05-07 16:02:37 +08:00
parent 6921217ca3
commit e1cc6bfa53
2 changed files with 654 additions and 0 deletions
--- a/docs/specs/reporting-governance-evidence-model.md
+++ b/docs/specs/reporting-governance-evidence-model.md
@@ -0,0 +1,466 @@
+# Reporting Governance Evidence Model
+
+## Purpose
+
+This document defines the canonical evidence model for the reporting-governance plugin. It complements the event model by defining what counts as evidence, how evidence strength is classified, and what minimum evidence is required before an agent may make progress or completion claims.
+
+The model exists to prevent two recurring failure modes:
+
+1. **Fake progress**: status reports that sound active but contain no new artifact, output, or decision trace.
+2. **Blackhole reporting**: real work or failures happen, but the operator receives no auditable evidence trail that the work happened, stalled, failed, or completed.
+
+## Relationship to the event model
+
+The event model in `docs/specs/reporting-governance-event-model.md` defines the canonical envelope and event types used by the plugin. This evidence model defines the structure and interpretation of the artifacts that are referenced by the event envelope's `evidence_refs` field and evaluated by the `task_evidence_attached` / `task_claimed_complete` logic.
+
+Alignment rules:
+
+- Evidence items are attached to canonical events through `evidence_refs` in the event envelope.
+- `task_evidence_attached` should represent the addition of one or more canonical evidence items.
+- `task_claimed_complete` must be evaluated against the evidence quality rules in this document.
+- `subagent_completed`, `subagent_result_forwarded`, `subagent_result_not_forwarded`, `silence_timeout`, and `forced_operator_update` often depend on evidence to determine whether reporting was real, merely narrative, or absent.
+- `operator_context`, `task_id`, `agent_id`, and `correlation_id` from the event envelope should be preserved in evidence metadata so evidence can be audited across a task chain.
+
+## Canonical evidence item
+
+Every evidence item should use the following logical shape.
+
+| Field | Type | Required | Notes |
+| --- | --- | --- | --- |
+| `evidence_id` | string | yes | Unique identifier for the evidence item. UUID recommended. |
+| `task_id` | string | yes | Must match the governed task. |
+| `correlation_id` | string | yes | Links evidence across related parent/child workflows. |
+| `agent_id` | string | yes | Agent responsible for generating or attaching the evidence. |
+| `class` | string | yes | Canonical evidence class from this document. |
+| `quality` | string | yes | Canonical evidence quality level from this document. |
+| `summary` | string | yes | Short human-readable description of what the evidence proves. |
+| `captured_at` | string | yes | RFC 3339 timestamp for when the evidence was captured or recorded. |
+| `refs` | array | yes | One or more concrete references to artifacts, messages, files, logs, or outputs. |
+| `supports` | object | yes | What claims or governance checks this evidence supports. |
+| `source_event_id` | string | no | Event that introduced or attached the evidence. |
+| `metadata` | object | no | Adapter/runtime-specific details, hashes, command, reviewer, etc. |
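+
+As a rough illustration of the fields above, a single evidence item might look like the following. All identifiers, paths, and values are hypothetical (the `metadata.note` field marks them as such), and the `refs` and `supports` shapes follow the recommendations in the subsections below.
+
+```json
+{
+  "evidence_id": "3f2b1c9e-8a4d-4e57-9b1a-6c2d7e0f4a11",
+  "task_id": "task-example-001",
+  "correlation_id": "corr-example-001",
+  "agent_id": "agent-example",
+  "class": "verification_output",
+  "quality": "strong",
+  "summary": "pytest run for the governed task passed",
+  "captured_at": "2026-05-07T15:40:00+08:00",
+  "refs": [
+    {
+      "kind": "command_output",
+      "ref": "artifacts/verify/pytest.txt",
+      "label": "pytest verification output"
+    }
+  ],
+  "supports": {
+    "claim_types": ["completion", "verified_completion"],
+    "verification_state": "verified",
+    "governance_checks": ["anti_fake_progress"]
+  },
+  "metadata": {
+    "note": "illustrative values only",
+    "exit_code": 0
+  }
+}
+```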
+
+### `refs`
+
+Each `refs` item should map cleanly to the event model's `evidence_refs` convention.
+
+Recommended shape:
+
+```json
+{
+  "kind": "command_output",
+  "ref": "artifacts/verify/pytest.txt",
+  "label": "pytest verification output",
+  "sha256": "5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03",
+  "mime_type": "text/plain"
+}
+```
+
+Recommended `kind` values remain aligned with the event model:
+
+- `command_output`
+- `file`
+- `url`
+- `message`
+- `commit`
+- `log_excerpt`
+- `screenshot`
+- `schema_validation`
+
+### `supports`
+
+The `supports` object should declare intended use so policy can evaluate claims consistently.
+
+Recommended fields:
+
+```json
+{
+  "claim_types": ["progress", "completion"],
+  "verification_state": "verified",
+  "governance_checks": ["result_forwarding_integrity", "anti_fake_progress"]
+}
+```
+
+Recommended `claim_types` values:
+
+- `progress`
+- `completion`
+- `verified_completion`
+- `failure_report`
+- `dispatch_report`
+
+Recommended `verification_state` values:
+
+- `unverified`
+- `partially_verified`
+- `verified`
+- `operator_confirmed`
+
+## Canonical evidence classes
+
+The following classes are the stable governance API surface for evidence classification.
+
+### 1. `tool_output`
+
+Evidence captured directly from a tool or command execution.
+
+Typical examples:
+- test runner output
+- lint output
+- schema validation output
+- command result showing a created worktree or launched process
+
+Typical strength:
+- `moderate` by default when it shows new execution output
+- `strong` when paired with hashes, exit code, or clearly task-specific success criteria
+- `weak` if partial or ambiguous
+
+### 2. `file_change`
+
+Evidence that code, config, docs, or state files changed in a way relevant to the task.
+
+Typical examples:
+- git diff excerpt
+- committed file with path and hash
+- generated patch or schema file
+
+Typical strength:
+- `moderate` for an auditable diff or saved file artifact
+- `strong` when tied to a commit or hash and clearly linked to the task
+- `weak` for a vague statement that a file was edited without artifact linkage
+
+### 3. `verification_output`
+
+Evidence that a verification step ran and produced a result.
+
+Typical examples:
+- unit/integration test output
+- JSON Schema validation result
+- HTTP 200 deployment check
+- review checklist result
+
+Typical strength:
+- `strong` by default when the verification target and output are clear
+- `decisive` when it is the authoritative acceptance check for the claim
+- `weak` if the verification is incomplete or not task-specific
+
+### 4. `decision_record`
+
+Evidence that a meaningful decision, judgment, or review outcome was recorded.
+
+Typical examples:
+- rationale for choosing a schema field
+- peer review note accepting or rejecting a claim
+- policy evaluation result requiring escalation
+
+Typical strength:
+- `moderate` when it captures a concrete decision with rationale
+- `strong` when the decision is signed off by the responsible reviewer/operator and linked to artifacts
+- `weak` if it is just an unsupported opinion
+
+### 5. `external_reply`
+
+Evidence originating from an external system, reviewer, or operator-visible endpoint outside the executing agent.
+
+Typical examples:
+- operator acknowledgement with requested changes
+- CI service callback
+- webhook response from a deployment or review system
+- channel message confirming receipt or failure notice
+
+Typical strength:
+- `moderate` when authentic and task-linked
+- `strong` when it is an authoritative acceptance/rejection signal
+- `weak` if context is incomplete
+
+### 6. `runtime_artifact`
+
+Evidence emitted by the runtime or orchestration layer, especially for lifecycle and anti-blackhole auditing.
+
+Typical examples:
+- subagent completion artifact
+- forwarding receipt
+- watchdog trigger record
+- dispatch binding artifact
+
+Typical strength:
+- `moderate` when it proves state transition or runtime action
+- `strong` when it includes immutable IDs, timestamps, or integrity fields
+- `decisive` when it is the authoritative record for forwarding integrity or watchdog enforcement
+
+### 7. `operator_message`
+
+Evidence from an operator-visible message that contains new substantive reporting content.
+
+Typical examples:
+- checkpoint report with a new artifact link
+- forced update message disclosing failure
+- completion report quoting verification output and commit reference
+
+Typical strength:
+- `moderate` when it carries new task-specific evidence references
+- `strong` when it includes artifact references plus explicit review or acceptance context
+- `none` or `weak` when it is only narrative and has no new artifact
+
+## Evidence quality levels
+
+Evidence quality is a policy signal, not a moral judgment. It describes how strongly an item supports a claim.
+
+### `none`
+The item provides no claim-supporting evidence.
+
+Use when:
+- there is no new artifact
+- the message is only a heartbeat, reminder, or timestamp
+- the content repeats prior claims without new proof
+
+### `weak`
+There is some task-related signal, but it is incomplete, ambiguous, or too easy to fake.
+
+Use when:
+- partial output is shown without clear success/failure meaning
+- a message describes work without an auditable artifact
+- a file is mentioned but not linked, hashed, diffed, or otherwise inspectable
+
+### `moderate`
+There is enough concrete, task-specific evidence to support a normal completion or progress claim.
+
+Use when:
+- there is new tool output, file artifact, runtime artifact, or decision trace
+- the evidence is auditable and clearly tied to the task
+- a reasonable reviewer could inspect it and conclude the claimed step happened
+
+### `strong`
+The evidence is concrete, auditable, and includes verification or authoritative traceability.
+
+Use when:
+- verification output is present and task-specific
+- multiple evidence classes corroborate each other
+- immutable references such as commit SHAs, result IDs, or message anchors are present
+
+### `decisive`
+The evidence is authoritative for the claim and leaves little room for ambiguity.
+
+Use when:
+- the acceptance condition is directly proven by the evidence
+- an authoritative system-of-record result is attached
+- runtime integrity records prove the exact event under dispute
+
+## Evidence-to-claim mapping
+
+These are minimum governance thresholds.
+
+### Progress claim
+A progress claim requires **at least one new evidence item** since the previous checkpoint.
+
+Interpretation:
+- Repeating the same summary with no new artifact is not progress.
+- A new evidence item may be `tool_output`, `file_change`, `runtime_artifact`, `decision_record`, `verification_output`, `external_reply`, or an `operator_message` that itself contains a new artifact reference.
+- The item does not need to be strong, but it must be new and carry real evidence: `weak` or better, never `none`. Governance policies may require `moderate` for specific high-risk tasks.
+
+### Completion claim
+A completion claim requires **at least one `moderate`, `strong`, or `decisive` evidence item** supporting `completion`.
+
+Interpretation:
+- Narrative-only completion claims are invalid.
+- A file change alone may be enough only if it is auditable and task-specific.
+- For sensitive tasks, policies may require multiple corroborating items, but the baseline threshold is `moderate+`.
+
+### Verified completion claim
+A verified completion claim requires **at least one `strong` or `decisive` evidence item** supporting `verified_completion`.
+
+Interpretation:
+- Verification output is the usual path.
+- A review record tied to verification artifacts may also satisfy the threshold.
+- `moderate` evidence is not enough for verified completion.
+
+## Anti-fake-progress rules
+
+The following patterns should be treated as suspicious or invalid unless paired with new evidence:
+
+- status narration with no artifact reference
+- repeated “still working” reports with unchanged evidence set
+- claiming a file was updated without diff, path, commit, or saved artifact
+- claiming verification passed without actual verification output
+- claiming a subagent finished when no completion artifact or forwarding record exists
+
+Policy recommendation:
+- If a checkpoint contains only `none` evidence, it should not satisfy `task_checkpoint_sent` obligations for progress-bearing reports.
+- If a claim references earlier evidence only, it should be marked as a repeat rather than new progress.
+
+## Anti-blackhole rules
+
+Evidence must help prevent invisible work and invisible failure.
+
+Policy recommendation:
+- A failed subagent spawn should emit runtime and/or operator-visible evidence immediately.
+- A completed subagent should produce a runtime artifact that can later be matched to a forwarding record.
+- A forced operator update should carry evidence showing why silence or blocking occurred.
+- If a watchdog fires, the watchdog artifact itself is evidence and should be retained as a `runtime_artifact`.
+
+## Negative examples: non-evidence
+
+The following should be classified as `none` unless paired with a new supporting artifact.
+
+### 1. Heartbeat only
+Example:
+> HEARTBEAT_OK
+
+Why it is not evidence:
+- It proves liveness of a message path, not task progress.
+- It does not show work, verification, or a new decision.
+
+### 2. Timestamp update only
+Example:
+> Updated at 15:40.
+
+Why it is not evidence:
+- Time passing is not proof of work.
+- No artifact or state transition is shown.
+
+### 3. Reminder acknowledgement only
+Example:
+> Got it, I’ll keep watching.
+
+Why it is not evidence:
+- Acknowledgement is not task execution evidence.
+- It does not support progress or completion.
+
+### 4. Repeated summary with no new artifact
+Example:
+> Still implementing the schema and reviewing consistency.
+
+Why it is not evidence:
+- It may be true, but it is not auditable by itself.
+- Without a new file diff, output, or decision record, it cannot satisfy a progress checkpoint.
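+
+For comparison, the heartbeat from the first negative example could be recorded as an evidence item classified `none`. This is only a sketch: the identifiers and message reference are hypothetical, and whether the schema accepts an empty `claim_types` array is an assumption.
+
+```json
+{
+  "evidence_id": "9c1d2e3f-4a5b-4c6d-8e7f-0a1b2c3d4e5f",
+  "task_id": "task-example-001",
+  "correlation_id": "corr-example-001",
+  "agent_id": "agent-example",
+  "class": "operator_message",
+  "quality": "none",
+  "summary": "Heartbeat only; no new artifact, output, or decision",
+  "captured_at": "2026-05-07T15:45:00+08:00",
+  "refs": [
+    {
+      "kind": "message",
+      "ref": "messages/heartbeat-0042",
+      "label": "HEARTBEAT_OK"
+    }
+  ],
+  "supports": {
+    "claim_types": [],
+    "verification_state": "unverified"
+  },
+  "metadata": {
+    "note": "illustrative values only"
+  }
+}
+```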
+
+## Evaluation guidance
+
+### Single-item evaluation
+
+A single evidence item should be scored based on:
+- specificity to the task
+- auditability of the referenced artifact
+- recency and novelty since the prior checkpoint
+- independence or authoritativeness of the source
+- whether it directly proves the claim being made
+
+### Multi-item evaluation
+
+Multiple items may raise confidence when they corroborate each other.
+
+Examples:
+- `file_change` + `tool_output` can raise the support for a claim from `moderate` to `strong`
+- `subagent_completed` runtime artifact + `subagent_result_forwarded` runtime artifact together provide strong proof of anti-blackhole compliance
+- `verification_output` + `decision_record` from review can support `verified_completion`
+
+## Examples supporting governance goals
+
+### Example A: valid progress checkpoint that prevents fake progress
+
+Claim:
+> Implemented the evidence schema draft.
+
+Evidence set:
+1. `file_change` — `schemas/reporting-governance/evidence.schema.json` saved with hash and diff
+2. `operator_message` — checkpoint message linking the schema file and summarizing the new fields
+
+Assessment:
+- New evidence exists since the last checkpoint.
+- Progress claim is valid.
+- If the file artifact is auditable, this is at least `moderate`.
+
+### Example B: invalid progress checkpoint that should be rejected
+
+Claim:
+> Still making progress on the evidence model.
+
+Evidence set:
+1. `operator_message` containing only the sentence above
+
+Assessment:
+- No new artifact.
+- Quality is `none` or `weak` at best.
+- Should not count as progress for checkpoint compliance.
+
+### Example C: completion claim supported by moderate evidence
+
+Claim:
+> Evidence model spec and schema are implemented.
+
+Evidence set:
+1. `file_change` — spec file created
+2. `file_change` — schema file created
+3. `tool_output` — git status or diff showing exactly those files
+
+Assessment:
+- Completion claim meets `moderate+` threshold.
+- Not yet verified completion unless verification output is also attached.
+
+### Example D: verified completion claim supported by strong evidence
+
+Claim:
+> Evidence model implementation is complete and verified.
+
+Evidence set:
+1. `file_change` — committed schema and spec files with commit SHA
+2. `verification_output` — JSON Schema parses and validation command succeeds
+3. `decision_record` — reviewer notes internal consistency check passed
+
+Assessment:
+- Verified completion threshold is met because verification is explicit and auditable.
+- This is `strong`, possibly `decisive` if the verification is the authoritative acceptance gate.
+
+### Example E: anti-blackhole forwarding proof
+
+Claim:
+> Subagent result was surfaced correctly to the operator path.
+
+Evidence set:
+1. `runtime_artifact` — `subagent_completed` result artifact with `result_ref`
+2. `runtime_artifact` — `subagent_result_forwarded` record referencing the same result
+3. `operator_message` — forwarded completion message with anchor/message ref
+
+Assessment:
+- Correlated runtime artifacts prove the result did not disappear silently.
+- Strong support for the result-forwarding integrity gate.
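+
+One way the two correlated runtime artifacts from this example might be represented is sketched below. The shared `correlation_id` and the `result_ref` carried in `metadata` are what tie the forwarding record to the completion artifact. Placing `result_ref` under `metadata` is an assumption, and all identifiers and paths are hypothetical.
+
+```json
+[
+  {
+    "evidence_id": "ev-example-completed",
+    "task_id": "task-example-001",
+    "correlation_id": "corr-example-001",
+    "agent_id": "subagent-example",
+    "class": "runtime_artifact",
+    "quality": "strong",
+    "summary": "subagent_completed result artifact",
+    "captured_at": "2026-05-07T15:50:00+08:00",
+    "refs": [
+      {
+        "kind": "file",
+        "ref": "artifacts/subagent/result-001.json",
+        "label": "subagent result"
+      }
+    ],
+    "supports": {
+      "claim_types": ["completion"],
+      "verification_state": "unverified",
+      "governance_checks": ["result_forwarding_integrity"]
+    },
+    "metadata": {
+      "note": "illustrative values only",
+      "result_ref": "result-001"
+    }
+  },
+  {
+    "evidence_id": "ev-example-forwarded",
+    "task_id": "task-example-001",
+    "correlation_id": "corr-example-001",
+    "agent_id": "agent-example",
+    "class": "runtime_artifact",
+    "quality": "strong",
+    "summary": "subagent_result_forwarded record for the same result",
+    "captured_at": "2026-05-07T15:51:00+08:00",
+    "refs": [
+      {
+        "kind": "message",
+        "ref": "messages/forward-001",
+        "label": "forwarded completion message"
+      }
+    ],
+    "supports": {
+      "claim_types": ["completion"],
+      "verification_state": "unverified",
+      "governance_checks": ["result_forwarding_integrity"]
+    },
+    "metadata": {
+      "note": "illustrative values only",
+      "result_ref": "result-001"
+    }
+  }
+]
+```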
+
+### Example F: forced disclosure after failure
+
+Claim:
+> Dispatch failure was reported immediately instead of becoming a blackhole.
+
+Evidence set:
+1. `runtime_artifact` — `subagent_spawn_failed`
+2. `runtime_artifact` — `forced_operator_update`
+3. `external_reply` or `operator_message` — operator-visible failure notice
+
+Assessment:
+- Strong anti-blackhole evidence because failure was both recorded and surfaced.
+
+## Recommended policy hooks
+
+The plugin should support policy checks such as:
+
+- reject completion claims without `moderate+` evidence
+- reject verified completion claims without `strong+` evidence
+- reject progress checkpoints with zero new evidence items
+- flag repeated `none` evidence checkpoints as fake-progress risk
+- flag subagent completion without forwarding evidence as blackhole risk
+- require preservation of correlated runtime artifacts for watchdog investigations
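+
+The checks above could, for instance, be expressed as a policy configuration along these lines. This spec does not define a policy file format, so every key name here is a hypothetical illustration; only the quality levels and claim types come from this document.
+
+```json
+{
+  "note": "hypothetical policy format; key names are not defined by this spec",
+  "min_quality": {
+    "progress": "weak",
+    "completion": "moderate",
+    "verified_completion": "strong"
+  },
+  "require_new_evidence_for_progress": true,
+  "flags": {
+    "repeated_none_checkpoints": "fake_progress_risk",
+    "subagent_completed_without_forwarding": "blackhole_risk"
+  },
+  "retain_correlated_runtime_artifacts": true
+}
+```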
+
+## Schema alignment note
+
+The JSON Schema in `schemas/reporting-governance/evidence.schema.json` encodes the canonical evidence shape and enumerations from this document. It is intended to align with:
+
+- `schemas/reporting-governance/event-envelope.schema.json`
+- `schemas/reporting-governance/events.schema.json`
+
+In particular:
+- top-level identifiers (`task_id`, `correlation_id`, `agent_id`) match the event envelope vocabulary
+- `refs` items use the same reference conventions as `evidence_refs`
+- claim-support vocabulary is compatible with `task_evidence_attached`, `task_claimed_complete`, and runtime integrity events
+
+## Stability
+
+The evidence classes and quality levels in this document are canonical and should be treated as stable governance API values. Future revisions may extend metadata and policy rules, but should avoid breaking the canonical class or quality enumerations without an explicit version change.