spec: add mvp governance policy packs

2026-05-07 16:52:22 +08:00
parent 4ab944101b
commit 7763e1e462
5 changed files with 1288 additions and 0 deletions
--- a/docs/specs/reporting-governance-policy-packs.md
+++ b/docs/specs/reporting-governance-policy-packs.md
@@ -0,0 +1,436 @@
+# Reporting Governance Policy Packs
+
+## Purpose
+
+This document defines the MVP policy-pack format for the reporting-governance plugin and specifies the first four reusable packs:
+
+- `no-silence`
+- `no-fake-progress`
+- `verified-completion-only`
+- `mandatory-checkpoint-structure`
+
+These packs convert reporting expectations into portable policy artifacts that can be versioned, reviewed, pinned, and enforced across runtimes.
+
+They are designed to align with:
+
+- `docs/specs/reporting-governance-event-model.md`
+- `docs/specs/reporting-governance-evidence-model.md`
+- `docs/specs/reporting-governance-decision-model.md`
+
+## Design goals
+
+The MVP policy-pack system should:
+
+- express reporting rules as versioned artifacts instead of prompt folklore
+- map canonical events plus evidence into canonical decisions
+- make operator-visible follow-up obligations explicit
+- preserve auditability when progress or completion claims are rewritten, blocked, or downgraded
+- capture the concrete failure modes observed today:
+  - result arrived but no operator-visible follow-up
+  - promised follow-up not delivered
+  - next-step claim without evidence
+
+## Policy-pack scope
+
+A policy pack is a deployable bundle of one or more governance rules focused on a single reporting risk area.
+
+For MVP, each pack is a single `policy.yaml` file located at:
+
+- `policy-packs/<pack-id>/policy.yaml`
+
+Future versions may allow multiple rule files, shared templates, profile overlays, or environment-specific wiring. MVP intentionally keeps the artifact simple.
+
+## Canonical policy-pack structure
+
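+Every policy pack should use the following top-level shape. The example is abbreviated: its rule omits some fields that the tables below mark as required, such as `intent` and parts of `decision_output`.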
+
+```yaml
+apiVersion: reporting-governance/v1alpha1
+kind: PolicyPack
+metadata:
+  id: no-silence
+  title: No Silence
+  version: 1.0.0
+  summary: >-
+    Prevent missed checkpoints, invisible subagent handoffs, and silent task execution.
+  owner: reporting-governance-plugin
+  severity_default: high
+  applies_to:
+    runtimes: [openclaw]
+    task_modes: [interactive, silent]
+  tags: [reporting, checkpoints, anti-blackhole]
+
+spec:
+  evaluation_mode: any_rule_match
+  rules:
+    - id: no-silence.missed-checkpoint
+      title: Missed checkpoint requires immediate visible recovery
+      triggers:
+        event_types: [task_checkpoint_due, silence_timeout]
+      conditions:
+        all:
+          - fact: checkpoint.is_overdue
+            equals: true
+      evidence_requirements:
+        progress:
+          min_new_items_since_last_checkpoint: 1
+      decision_output:
+        decision: force_checkpoint
+        severity: high
+        suggested_status: in_progress
+      operator_message_templates:
+        checkpoint_forced: >-
+          Required update: this task exceeded its reporting window...
+```
+
+## Top-level fields
+
+### `apiVersion`
+Stable format version for the policy artifact.
+
+### `kind`
+Must be `PolicyPack` for MVP.
+
+### `metadata`
+Describes the pack as a versioned operational artifact.
+
+Required metadata fields:
+
+| Field | Type | Notes |
+| --- | --- | --- |
+| `id` | string | Stable pack identifier. |
+| `title` | string | Human-readable pack name. |
+| `version` | string | Semantic or release version. |
+| `summary` | string | Short operator/reviewer summary. |
+| `owner` | string | Responsible component or team. |
+| `severity_default` | string | Default severity when a rule does not override it. |
+| `applies_to` | object | Runtime/workflow applicability. |
+| `tags` | array | Search and grouping labels. |
+
+Recommended `applies_to` fields:
+
+- `runtimes`
+- `task_modes`
+- `workflow_shapes`
+- `channels`
+
+### `spec`
+Contains evaluation behavior and rule definitions.
+
+Required spec fields:
+
+| Field | Type | Notes |
+| --- | --- | --- |
+| `evaluation_mode` | string | MVP values: `any_rule_match` or `first_match`. |
+| `rules` | array | One or more policy rules. |
+
+## Rule structure
+
+Each rule should use the following shape.
+
+| Field | Type | Required | Notes |
+| --- | --- | --- | --- |
+| `id` | string | yes | Stable rule identifier used as `policy_id` in decisions. |
+| `title` | string | yes | Human-readable rule title. |
+| `intent` | string | yes | Short explanation of the behavior being governed. |
+| `triggers` | object | yes | Canonical event types or derived trigger states. |
+| `conditions` | object | yes | Boolean logic over event, evidence, decision, or task facts. |
+| `evidence_requirements` | object | yes | Minimum evidence required for the governed claim or checkpoint. |
+| `decision_output` | object | yes | Canonical decision-model output template. |
+| `operator_message_templates` | object | yes | Operator-facing message templates for forced notices, rewrites, and review requests. |
+| `notes` | array | no | Design notes, caveats, or alignment comments. |
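+
+As a rough illustration of the table above, a rule that fills in every required field might look like the sketch below. The rule ID, the `medium` severity, the `attach_new_evidence` action name, and the `operator_notice.required` sub-field are assumptions made for this example; the canonical values belong to the decision model.
+
+```yaml
+# Illustrative sketch only; values marked "assumed" are not defined by this spec.
+- id: no-fake-progress.no-new-evidence        # assumed rule id
+  title: Progress claim without new evidence is annotated as a placeholder
+  intent: Stop narrative-only updates from being reported as progress.
+  triggers:
+    event_types: [task_checkpoint_sent]
+    claim_types: [progress]
+  conditions:
+    all:
+      - fact: evidence.new_items_since_last_checkpoint
+        less_than: 1
+  evidence_requirements:
+    progress:
+      min_new_items_since_last_checkpoint: 1
+  decision_output:
+    decision: annotate_placeholder
+    severity: medium                          # assumed severity value
+    reason: No new evidence since the previous checkpoint.
+    suggested_status: in_progress
+    required_actions: [attach_new_evidence]   # assumed action name
+    operator_notice:
+      required: true                          # assumed sub-field
+  operator_message_templates:
+    placeholder_rewrite: >-
+      Placeholder update: no new evidence has been recorded since the last checkpoint.
+  notes:
+    - Illustrative only.
+```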
+
+## `triggers`
+
+`triggers` identifies when a rule should evaluate.
+
+Recommended fields:
+
+| Field | Type | Notes |
+| --- | --- | --- |
+| `event_types` | array | Canonical event types from the event model. |
+| `derived_signals` | array | Normalized evaluator signals such as `checkpoint_overdue`. |
+| `claim_types` | array | Claim categories such as `progress`, `completion`, `verified_completion`. |
+
+Example:
+
+```yaml
+triggers:
+  event_types: [task_claimed_complete]
+  claim_types: [completion, verified_completion]
+```
+
+## `conditions`
+
+`conditions` describes what must be true for the rule to fire.
+
+MVP uses structured boolean groups:
+
+- `all`
+- `any`
+- `not`
+
+Each leaf condition references a normalized evaluator fact, for example:
+
+- `checkpoint.is_overdue`
+- `checkpoint.externalized_path_valid`
+- `forwarding.result_available_without_visible_followup`
+- `evidence.new_items_since_last_checkpoint`
+- `evidence.completion_min_quality`
+- `claim.promised_followup_due`
+- `claim.next_step_has_supporting_evidence`
+- `message.has_required_checkpoint_fields`
+
+Common comparators:
+
+- `equals`
+- `not_equals`
+- `greater_than`
+- `less_than`
+- `in`
+- `contains`
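+
+A minimal sketch of how the boolean groups and comparators might combine is shown below. The leaf shape (`fact` plus one comparator) follows the canonical example; nesting a group inside a list and wrapping a single leaf in `not` are assumptions about the syntax, not something this document fixes.
+
+```yaml
+# Fires when the checkpoint is overdue AND (no new evidence OR a promised
+# follow-up is due), but NOT when a valid externalized checkpoint path exists.
+conditions:
+  all:
+    - fact: checkpoint.is_overdue
+      equals: true
+    - any:                                   # assumed: groups may nest inside lists
+        - fact: evidence.new_items_since_last_checkpoint
+          less_than: 1
+        - fact: claim.promised_followup_due
+          equals: true
+    - not:                                   # assumed: `not` wraps a single condition
+        fact: checkpoint.externalized_path_valid
+        equals: true
+```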
+
+## `evidence_requirements`
+
+`evidence_requirements` connects the pack to the canonical evidence model.
+
+Recommended fields:
+
+| Field | Type | Notes |
+| --- | --- | --- |
+| `progress.min_new_items_since_last_checkpoint` | integer | Minimum new evidence items for progress-bearing reports. |
+| `progress.allowed_quality_floor` | string | Minimum evidence quality allowed for progress claims. |
+| `completion.min_quality` | string | Minimum quality for completion claims, usually `moderate`. |
+| `verified_completion.min_quality` | string | Minimum quality for verified completion, usually `strong`. |
+| `must_reference_event_types` | array | Event types that must be cited in operator-visible notices. |
+| `must_reference_evidence_classes` | array | Evidence classes required for the claim path. |
+
+Semantics:
+
+- Progress claims must satisfy the evidence model's requirement of at least one **new** evidence item since the previous checkpoint.
+- Completion claims must meet the evidence model threshold of at least `moderate` support.
+- Verified completion claims must meet the evidence model threshold of at least `strong` support.
+- Some policies require operator-visible disclosure even when the evidence is runtime-derived rather than user-facing.
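+
+Put together, the recommended fields could be written as in the sketch below. It assumes the dotted field names in the table map to nested YAML keys, as the canonical example does for `progress`; the referenced event type and evidence classes are picked only for illustration.
+
+```yaml
+evidence_requirements:
+  progress:
+    min_new_items_since_last_checkpoint: 1   # at least one new item
+  completion:
+    min_quality: moderate
+  verified_completion:
+    min_quality: strong
+  must_reference_event_types: [task_claimed_complete]
+  must_reference_evidence_classes: [tool_output, file_change]
+```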
+
+## `decision_output`
+
+`decision_output` must map to the canonical decision model.
+
+Required fields:
+
+| Field | Type | Notes |
+| --- | --- | --- |
+| `decision` | string | Canonical decision value such as `block`, `force_checkpoint`, `downgrade_status`, `annotate_placeholder`, `require_review`. |
+| `severity` | string | Canonical severity. |
+| `reason` | string | Stable rationale template. |
+| `suggested_status` | string or null | Recommended workflow status. |
+| `required_actions` | array | Canonical required actions. |
+| `operator_notice` | object | Operator-visible notice requirements. |
+
+Rule-specific variable interpolation is allowed in messages and reasons, but the resulting decision must still be expressible in the decision model defined in `docs/specs/reporting-governance-decision-model.md`.
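+
+For instance, a `decision_output` for an unsupported completion claim might be sketched as follows. The action name and the `operator_notice` sub-fields are assumptions; the authoritative shapes are defined in the decision model.
+
+```yaml
+decision_output:
+  decision: downgrade_status
+  severity: high
+  reason: Completion was claimed without evidence of at least moderate quality.
+  suggested_status: pending_verification
+  required_actions: [attach_completion_evidence]   # assumed action name
+  operator_notice:                                 # sub-fields below are assumed
+    required: true
+    must_reference: [task_claimed_complete]
+```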
+
+## `operator_message_templates`
+
+Each rule must provide operator-facing templates so the runtime does not have to improvise compliance language.
+
+Recommended template keys:
+
+- `checkpoint_forced`
+- `placeholder_rewrite`
+- `review_required`
+- `status_downgraded`
+- `blocked`
+
+Template goals:
+
+- clearly distinguish provisional vs verified states
+- disclose when evidence is missing, repeated, or ambiguous
+- reference required events or artifacts when the decision model says `must_reference`
+- make follow-up obligations explicit
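+
+The sketch below shows one way the recommended keys might be filled in with those goals in mind. The wording is illustrative only and is not taken from the shipped packs.
+
+```yaml
+operator_message_templates:
+  checkpoint_forced: >-
+    Forced checkpoint: the reporting window was missed. The state below is provisional.
+  placeholder_rewrite: >-
+    Placeholder update: no new evidence since the last checkpoint. Nothing here is verified.
+  status_downgraded: >-
+    Status changed to pending_verification: the completion claim lacks the required evidence.
+  review_required: >-
+    Operator review requested: the completion evidence is ambiguous.
+  blocked: >-
+    Report blocked: required checkpoint fields are missing.
+```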
+
+## Alignment with event, evidence, and decision models
+
+### Event model alignment
+
+Policy packs must trigger on canonical events or on facts derived from them.
+
+Important event-model mappings for MVP packs:
+
+- `task_checkpoint_due`
+- `task_checkpoint_sent`
+- `task_claimed_complete`
+- `subagent_completed`
+- `subagent_result_forwarded`
+- `subagent_result_not_forwarded`
+- `silence_timeout`
+- `forced_operator_update`
+- `operator_review_requested`
+
+### Evidence model alignment
+
+Policy packs must interpret evidence using the evidence model’s classes and quality levels.
+
+Important evidence concepts for MVP packs:
+
+- no new evidence since last checkpoint
+- narrative-only or reminder-only updates count as `none`
+- next-step claims should cite new `decision_record`, `tool_output`, `file_change`, `runtime_artifact`, or other supported evidence when they assert concrete advancement
+- completion claims require evidence of at least `moderate` quality
+- verified completion claims require evidence of at least `strong` quality
+
+### Decision model alignment
+
+Policy packs must emit decisions from the canonical decision vocabulary.
+
+Important decision patterns for MVP:
+
+- `force_checkpoint` for missed checkpoint and dropped visible follow-up
+- `annotate_placeholder` for status reports that would otherwise overstate progress
+- `downgrade_status` for unsupported completion claims
+- `require_review` for ambiguous completion evidence or borderline acceptance cases
+- `block` when a required report structure or hard prerequisite is absent
+
+## MVP policy packs
+
+### 1. No Silence
+
+Purpose:
+- prevent missed checkpoints from becoming blackholes
+- prevent child-result arrival from remaining invisible to the operator
+- prevent “silent” tasks unless a valid externalized checkpoint path exists
+
+Required failure modes covered:
+- missed checkpoint
+- subagent result not forwarded
+- silent task without valid externalized checkpoint path
+- result arrived but no operator-visible follow-up
+- promised follow-up not delivered
+
+Typical decisions:
+- `force_checkpoint`
+- `block`
+- `escalate` for repeated or critical cases
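+
+As an abbreviated sketch, a rule covering the "subagent result not forwarded" failure mode might look like this. The rule ID is hypothetical, and required fields such as `intent`, `evidence_requirements`, and `operator_message_templates` are left out to keep the example short.
+
+```yaml
+# Abbreviated sketch; not a complete rule.
+- id: no-silence.subagent-result-not-forwarded   # hypothetical rule id
+  title: Subagent result must reach the operator
+  triggers:
+    event_types: [subagent_completed, subagent_result_not_forwarded]
+  conditions:
+    all:
+      - fact: forwarding.result_available_without_visible_followup
+        equals: true
+  decision_output:
+    decision: force_checkpoint
+    severity: high
+    suggested_status: in_progress
+```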
+
+### 2. No Fake Progress
+
+Purpose:
+- prevent narrative activity from being treated as real progress
+- require new evidence for progress-bearing checkpoints
+- stop reminder-only or repeated status-only updates from being framed as advancement
+
+Required failure modes covered:
+- no new evidence since last checkpoint
+- repeated status-only updates
+- reminder-only activity presented as progress
+- next-step claim without evidence
+- promised follow-up not delivered, where the promised update never gains supporting evidence
+
+Typical decisions:
+- `annotate_placeholder`
+- `rewrite`
+- `force_checkpoint` when silence risk is also active
+
+### 3. Verified Completion Only
+
+Purpose:
+- prevent unsupported completion from being treated as accepted completion
+- automatically downgrade unsupported completion to `pending_verification`
+- route ambiguous completion to operator review
+
+Required behaviors and failure modes covered:
+- no completion without required evidence
+- auto-downgrade to `pending_verification`
+- operator review path when completion is ambiguous
+- next-step or closure claim unsupported by evidence
+
+Typical decisions:
+- `downgrade_status`
+- `require_review`
+- `block` for hard invalid completion attempts
+
+### 4. Mandatory Checkpoint Structure
+
+Purpose:
+- require reports to include a stable operator-usable structure
+- ensure that every checkpoint answers the core managerial questions
+- prevent reports that look active but omit actionable next-state information
+
+Required checkpoint fields (or their equivalents):
+- current status
+- completed this segment
+- next step
+- next report condition
+- whether operator intervention is needed
+
+Typical decisions:
+- `rewrite`
+- `block`
+- `force_checkpoint` when missing structure would otherwise hide the actual task state
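+
+To make the required structure concrete, a checkpoint that covers all five items might carry a payload like the sketch below. The key names and sample values are hypothetical; this document does not define a checkpoint payload format.
+
+```yaml
+# Hypothetical checkpoint payload; key names are illustrative.
+checkpoint:
+  current_status: in_progress
+  completed_this_segment:
+    - Parsed the input files
+  next_step: Run validation on the parsed output
+  next_report_condition: When validation finishes
+  operator_intervention_needed: false
+```
+
+A report missing any of these keys would be the kind of message the `message.has_required_checkpoint_fields` fact is meant to catch.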
+
+## Evaluator guidance
+
+### Rule ordering
+
+Recommended MVP evaluation order across packs:
+
+1. `no-silence`
+2. `mandatory-checkpoint-structure`
+3. `no-fake-progress`
+4. `verified-completion-only`
+
+Rationale:
+- visibility failures are corrected first
+- the structure of the operator-visible report is checked next
+- truthfulness of progress wording follows
+- closure and acceptance semantics come last
+
+### Multiple matches
+
+If multiple rules match, the evaluator should prefer the most safety-preserving outcome.
+
+Recommended precedence when conflicts occur, highest first:
+
+1. `escalate`
+2. `block`
+3. `force_checkpoint`
+4. `downgrade_status`
+5. `require_review`
+6. `rewrite`
+7. `annotate_placeholder`
+8. `allow`
+
+If one rule requires an operator-visible notice and another does not, the safer behavior is to preserve the notice requirement.
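+
+As a worked illustration of the precedence list above, suppose two rules match the same event. The record layout below is hypothetical and only serves to show the outcome; the rule IDs are examples.
+
+```yaml
+# Hypothetical worked example, not a format defined by this spec.
+matched:
+  - policy_id: no-fake-progress.no-new-evidence
+    decision: annotate_placeholder      # rank 7 in the list above
+    operator_notice_required: false
+  - policy_id: no-silence.missed-checkpoint
+    decision: force_checkpoint          # rank 3 in the list above
+    operator_notice_required: true
+resolved:
+  decision: force_checkpoint            # the higher-ranked decision wins
+  operator_notice_required: true        # the notice requirement is preserved
+```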
+
+### Audit trail
+
+When a rule rewrites, annotates, blocks, or downgrades a claim, the runtime should preserve:
+
+- original attempted message or claim text
+- triggering event IDs when available
+- attached or missing evidence summary
+- resulting canonical decision object
+- operator-visible message actually sent
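+
+One possible shape for such a record is sketched below. Every key name, the event ID, and the rule ID are placeholders chosen for illustration; the document does not prescribe an audit record format.
+
+```yaml
+# Hypothetical audit record; key names are illustrative.
+audit_record:
+  original_claim_text: "Task complete."
+  triggering_event_ids: [evt-001]       # placeholder ID
+  evidence_summary:
+    attached: []
+    missing: ["completion evidence of at least moderate quality"]
+  decision:
+    policy_id: verified-completion-only.unsupported-completion   # hypothetical rule id
+    decision: downgrade_status
+    severity: high
+    suggested_status: pending_verification
+  operator_message_sent: >-
+    Status changed to pending_verification: the completion claim lacks the required evidence.
+```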
+
+## Future evolution
+
+Potential future extensions include:
+
+- multi-file packs with shared templates
+- environment/profile overlays
+- schema-backed pack validation
+- rule suppression windows for approved maintenance flows
+- pack composition and inheritance
+- rule telemetry for false-positive tuning
+
+## Summary
+
+The MVP policy-pack model gives the reporting-governance plugin a portable enforcement surface. It standardizes how reporting rules are declared, how they consume canonical events and evidence, and how they produce operator-visible decisions.
+
+The four initial packs focus on the observed operational failure modes that matter most:
+
+- silence
+- fake progress
+- unsupported completion
+- structurally incomplete checkpoints
+
+Together, they provide a practical first layer of governance that is explicit, reviewable, and aligned with the event, evidence, and decision models.