spec: add mvp governance policy packs

This commit is contained in:
Eve
2026-05-07 16:52:22 +08:00
parent 4ab944101b
commit 7763e1e462
5 changed files with 1288 additions and 0 deletions

View File

@@ -0,0 +1,436 @@
# Reporting Governance Policy Packs
## Purpose
This document defines the MVP policy-pack format for the reporting-governance plugin and specifies the first four reusable packs:
- `no-silence`
- `no-fake-progress`
- `verified-completion-only`
- `mandatory-checkpoint-structure`
These packs convert reporting expectations into portable policy artifacts that can be versioned, reviewed, pinned, and enforced across runtimes.
They are designed to align with:
- `docs/specs/reporting-governance-event-model.md`
- `docs/specs/reporting-governance-evidence-model.md`
- `docs/specs/reporting-governance-decision-model.md`
## Design goals
The MVP policy-pack system should:
- express reporting rules as versioned artifacts instead of prompt folklore
- map canonical events plus evidence into canonical decisions
- make operator-visible follow-up obligations explicit
- preserve auditability when progress or completion claims are rewritten, blocked, or downgraded
- capture the concrete failure modes observed today:
- result arrived but no operator-visible follow-up
- promised follow-up not delivered
- next-step claim without evidence
## Policy-pack scope
A policy pack is a deployable bundle of one or more governance rules focused on a single reporting risk area.
For MVP, each pack is a single `policy.yaml` file living at:
- `policy-packs/<pack-id>/policy.yaml`
Future versions may allow multiple rule files, shared templates, profile overlays, or environment-specific wiring. MVP intentionally keeps the artifact simple.
## Canonical policy-pack structure
Every policy pack should use the following top-level shape.
```yaml
apiVersion: reporting-governance/v1alpha1
kind: PolicyPack
metadata:
id: no-silence
title: No Silence
version: 1.0.0
summary: >-
Prevent missed checkpoints, invisible subagent handoffs, and silent task execution.
owner: reporting-governance-plugin
severity_default: high
applies_to:
runtimes: [openclaw]
task_modes: [interactive, silent]
tags: [reporting, checkpoints, anti-blackhole]
spec:
evaluation_mode: any_rule_match
rules:
- id: no-silence.missed-checkpoint
title: Missed checkpoint requires immediate visible recovery
triggers:
event_types: [task_checkpoint_due, silence_timeout]
conditions:
all:
- fact: checkpoint.is_overdue
equals: true
evidence_requirements:
progress:
min_new_items_since_last_checkpoint: 1
decision_output:
decision: force_checkpoint
severity: high
suggested_status: in_progress
operator_message_templates:
checkpoint_forced: >-
Required update: this task exceeded its reporting window...
```
## Top-level fields
### `apiVersion`
Stable format version for the policy artifact.
### `kind`
Must be `PolicyPack` for MVP.
### `metadata`
Describes the pack as a versioned operational artifact.
Required metadata fields:
| Field | Type | Notes |
| --- | --- | --- |
| `id` | string | Stable pack identifier. |
| `title` | string | Human-readable pack name. |
| `version` | string | Semantic or release version. |
| `summary` | string | Short operator/reviewer summary. |
| `owner` | string | Responsible component or team. |
| `severity_default` | string | Default severity when a rule does not override it. |
| `applies_to` | object | Runtime/workflow applicability. |
| `tags` | array | Search and grouping labels. |
Recommended `applies_to` fields:
- `runtimes`
- `task_modes`
- `workflow_shapes`
- `channels`
### `spec`
Contains evaluation behavior and rule definitions.
Required spec fields:
| Field | Type | Notes |
| --- | --- | --- |
| `evaluation_mode` | string | MVP values: `any_rule_match` or `first_match`. |
| `rules` | array | One or more policy rules. |
## Rule structure
Each rule should use the following shape.
| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| `id` | string | yes | Stable rule identifier used as `policy_id` in decisions. |
| `title` | string | yes | Human-readable rule title. |
| `intent` | string | yes | Short explanation of the behavior being governed. |
| `triggers` | object | yes | Canonical event types or derived trigger states. |
| `conditions` | object | yes | Boolean logic over event, evidence, decision, or task facts. |
| `evidence_requirements` | object | yes | Minimum evidence required for the governed claim or checkpoint. |
| `decision_output` | object | yes | Canonical decision-model output template. |
| `operator_message_templates` | object | yes | Operator-facing message templates for forced notices, rewrites, and review requests. |
| `notes` | array | no | Design notes, caveats, or alignment comments. |
## `triggers`
`triggers` identifies when a rule should evaluate.
Recommended fields:
| Field | Type | Notes |
| --- | --- | --- |
| `event_types` | array | Canonical event types from the event model. |
| `derived_signals` | array | Normalized evaluator signals such as `checkpoint_overdue`. |
| `claim_types` | array | Claim categories such as `progress`, `completion`, `verified_completion`. |
Example:
```yaml
triggers:
event_types: [task_claimed_complete]
claim_types: [completion, verified_completion]
```
## `conditions`
`conditions` describes what must be true for the rule to fire.
MVP uses structured boolean groups:
- `all`
- `any`
- `not`
Each leaf condition references a normalized evaluator fact, for example:
- `checkpoint.is_overdue`
- `checkpoint.externalized_path_valid`
- `forwarding.result_available_without_visible_followup`
- `evidence.new_items_since_last_checkpoint`
- `evidence.completion_min_quality`
- `claim.promised_followup_due`
- `claim.next_step_has_supporting_evidence`
- `message.has_required_checkpoint_fields`
Common comparators:
- `equals`
- `not_equals`
- `greater_than`
- `less_than`
- `in`
- `contains`
## `evidence_requirements`
`evidence_requirements` connects the pack to the canonical evidence model.
Recommended fields:
| Field | Type | Notes |
| --- | --- | --- |
| `progress.min_new_items_since_last_checkpoint` | integer | Minimum new evidence items for progress-bearing reports. |
| `progress.allowed_quality_floor` | string | Minimum evidence quality allowed for progress claims. |
| `completion.min_quality` | string | Minimum quality for completion claims, usually `moderate`. |
| `verified_completion.min_quality` | string | Minimum quality for verified completion, usually `strong`. |
| `must_reference_event_types` | array | Event types that must be cited in operator-visible notices. |
| `must_reference_evidence_classes` | array | Evidence classes required for the claim path. |
Semantics:
- Progress claims must respect the evidence model requirement for at least one **new** evidence item since the previous checkpoint.
- Completion claims must meet the evidence model threshold of at least `moderate` support.
- Verified completion claims must meet the evidence model threshold of at least `strong` support.
- Some policies require operator-visible disclosure even when the evidence is runtime-derived rather than user-facing.
## `decision_output`
`decision_output` must map to the canonical decision model.
Required fields:
| Field | Type | Notes |
| --- | --- | --- |
| `decision` | string | Canonical decision value such as `block`, `force_checkpoint`, `downgrade_status`, `annotate_placeholder`, `require_review`. |
| `severity` | string | Canonical severity. |
| `reason` | string | Stable rationale template. |
| `suggested_status` | string or null | Recommended workflow status. |
| `required_actions` | array | Canonical required actions. |
| `operator_notice` | object | Operator-visible notice requirements. |
Rule-specific variable interpolation is allowed in messages and reasons, but the resulting decision must still be expressible in the decision model defined in `docs/specs/reporting-governance-decision-model.md`.
## `operator_message_templates`
Each rule must provide operator-facing templates so the runtime does not have to improvise compliance language.
Recommended template keys:
- `checkpoint_forced`
- `placeholder_rewrite`
- `review_required`
- `status_downgraded`
- `blocked`
Template goals:
- clearly distinguish provisional vs verified states
- disclose when evidence is missing, repeated, or ambiguous
- reference required events or artifacts when the decision model says `must_reference`
- make follow-up obligations explicit
## Alignment with event, evidence, and decision models
### Event model alignment
Policy packs must trigger from canonical events and derived facts built from them.
Important event-model mappings for MVP packs:
- `task_checkpoint_due`
- `task_checkpoint_sent`
- `task_claimed_complete`
- `subagent_completed`
- `subagent_result_forwarded`
- `subagent_result_not_forwarded`
- `silence_timeout`
- `forced_operator_update`
- `operator_review_requested`
### Evidence model alignment
Policy packs must interpret evidence using the evidence models classes and quality levels.
Important evidence concepts for MVP packs:
- no new evidence since last checkpoint
- narrative-only or reminder-only updates count as `none`
- next-step claims should cite new `decision_record`, `tool_output`, `file_change`, `runtime_artifact`, or other supported evidence when they assert concrete advancement
- completion claims require `moderate+`
- verified completion claims require `strong+`
### Decision model alignment
Policy packs must emit decisions from the canonical decision vocabulary.
Important decision patterns for MVP:
- `force_checkpoint` for missed checkpoint and dropped visible follow-up
- `annotate_placeholder` for status reports that would otherwise overstate progress
- `downgrade_status` for unsupported completion claims
- `require_review` for ambiguous completion evidence or borderline acceptance cases
- `block` when a required report structure or hard prerequisite is absent
## MVP policy packs
### 1. No Silence
Purpose:
- prevent missed checkpoints from becoming blackholes
- prevent child-result arrival from remaining invisible to the operator
- prevent “silent” tasks unless a valid externalized checkpoint path exists
Required failure modes covered:
- missed checkpoint
- subagent result not forwarded
- silent task without valid externalized checkpoint path
- result arrived but no operator-visible follow-up
- promised follow-up not delivered
Typical decisions:
- `force_checkpoint`
- `block`
- `escalate` for repeated or critical cases
### 2. No Fake Progress
Purpose:
- prevent narrative activity from being treated as real progress
- require new evidence for progress-bearing checkpoints
- stop reminder-only or repeated status-only updates from being framed as advancement
Required failure modes covered:
- no new evidence since last checkpoint
- repeated status-only updates
- reminder-only activity presented as progress
- next-step claim without evidence
- promised follow-up not delivered when the promised next update never gains evidence
Typical decisions:
- `annotate_placeholder`
- `rewrite`
- `force_checkpoint` when silence risk is also active
### 3. Verified Completion Only
Purpose:
- prevent unsupported completion from being treated as accepted completion
- automatically downgrade unsupported completion to `pending_verification`
- route ambiguous completion to operator review
Required failure modes covered:
- no completion without required evidence
- auto-downgrade to `pending_verification`
- operator review path when completion is ambiguous
- next-step or closure claim unsupported by evidence
Typical decisions:
- `downgrade_status`
- `require_review`
- `block` for hard invalid completion attempts
### 4. Mandatory Checkpoint Structure
Purpose:
- require reports to include a stable operator-usable structure
- ensure that every checkpoint answers the core managerial questions
- prevent reports that look active but omit actionable next-state information
Required structure equivalents:
- current status
- completed this segment
- next step
- next report condition
- whether operator intervention is needed
Typical decisions:
- `rewrite`
- `block`
- `force_checkpoint` when missing structure would otherwise hide the actual task state
## Evaluator guidance
### Rule ordering
Recommended MVP evaluation order:
1. `no-silence`
2. `mandatory-checkpoint-structure`
3. `no-fake-progress`
4. `verified-completion-only`
Rationale:
- visibility failures should be corrected first
- then the structure of the operator-visible report
- then truthfulness of progress wording
- then closure and acceptance semantics
### Multiple matches
If multiple rules match, the evaluator should prefer the most safety-preserving outcome.
Recommended precedence when conflicts occur:
1. `escalate`
2. `block`
3. `force_checkpoint`
4. `downgrade_status`
5. `require_review`
6. `rewrite`
7. `annotate_placeholder`
8. `allow`
If one rule requires an operator-visible notice and another does not, the safer behavior is to preserve the notice requirement.
### Audit trail
When a rule rewrites, annotates, blocks, or downgrades a claim, the runtime should preserve:
- original attempted message or claim text
- triggering event IDs when available
- attached or missing evidence summary
- resulting canonical decision object
- operator-visible message actually sent
## Future evolution
Potential future extensions include:
- multi-file packs with shared templates
- environment/profile overlays
- schema-backed pack validation
- rule suppression windows for approved maintenance flows
- pack composition and inheritance
- rule telemetry for false-positive tuning
## Summary
The MVP policy-pack model gives the reporting-governance plugin a portable enforcement surface. It standardizes how reporting rules are declared, how they consume canonical events and evidence, and how they produce operator-visible decisions.
The four initial packs focus on the observed operational failure modes that matter most:
- silence
- fake progress
- unsupported completion
- structurally incomplete checkpoints
Together, they provide a practical first layer of governance that is explicit, reviewable, and aligned with the event, evidence, and decision models.