reporting-governance-plugin/docs/specs/reporting-governance-policy-packs.md
2026-05-07 16:52:22 +08:00

# Reporting Governance Policy Packs
## Purpose
This document defines the MVP policy-pack format for the reporting-governance plugin and specifies the first four reusable packs:
- `no-silence`
- `no-fake-progress`
- `verified-completion-only`
- `mandatory-checkpoint-structure`
These packs convert reporting expectations into portable policy artifacts that can be versioned, reviewed, pinned, and enforced across runtimes.
They are designed to align with:
- `docs/specs/reporting-governance-event-model.md`
- `docs/specs/reporting-governance-evidence-model.md`
- `docs/specs/reporting-governance-decision-model.md`
## Design goals
The MVP policy-pack system should:
- express reporting rules as versioned artifacts instead of prompt folklore
- map canonical events plus evidence into canonical decisions
- make operator-visible follow-up obligations explicit
- preserve auditability when progress or completion claims are rewritten, blocked, or downgraded
- capture the concrete failure modes observed today:
  - result arrived but no operator-visible follow-up
  - promised follow-up not delivered
  - next-step claim without evidence
## Policy-pack scope
A policy pack is a deployable bundle of one or more governance rules focused on a single reporting risk area.
For MVP, each pack is a single `policy.yaml` file living at:
- `policy-packs/<pack-id>/policy.yaml`
Future versions may allow multiple rule files, shared templates, profile overlays, or environment-specific wiring. MVP intentionally keeps the artifact simple.
## Canonical policy-pack structure
Every policy pack should use the following top-level shape.
```yaml
apiVersion: reporting-governance/v1alpha1
kind: PolicyPack
metadata:
  id: no-silence
  title: No Silence
  version: 1.0.0
  summary: >-
    Prevent missed checkpoints, invisible subagent handoffs, and silent task execution.
  owner: reporting-governance-plugin
  severity_default: high
  applies_to:
    runtimes: [openclaw]
    task_modes: [interactive, silent]
  tags: [reporting, checkpoints, anti-blackhole]
spec:
  evaluation_mode: any_rule_match
  rules:
    - id: no-silence.missed-checkpoint
      title: Missed checkpoint requires immediate visible recovery
      triggers:
        event_types: [task_checkpoint_due, silence_timeout]
      conditions:
        all:
          - fact: checkpoint.is_overdue
            equals: true
      evidence_requirements:
        progress:
          min_new_items_since_last_checkpoint: 1
      decision_output:
        decision: force_checkpoint
        severity: high
        suggested_status: in_progress
      operator_message_templates:
        checkpoint_forced: >-
          Required update: this task exceeded its reporting window...
```
## Top-level fields
### `apiVersion`
Stable format version for the policy artifact.
### `kind`
Must be `PolicyPack` for MVP.
### `metadata`
Describes the pack as a versioned operational artifact.
Required metadata fields:
| Field | Type | Notes |
| --- | --- | --- |
| `id` | string | Stable pack identifier. |
| `title` | string | Human-readable pack name. |
| `version` | string | Semantic or release version. |
| `summary` | string | Short operator/reviewer summary. |
| `owner` | string | Responsible component or team. |
| `severity_default` | string | Default severity when a rule does not override it. |
| `applies_to` | object | Runtime/workflow applicability. |
| `tags` | array | Search and grouping labels. |
Recommended `applies_to` fields:
- `runtimes`
- `task_modes`
- `workflow_shapes`
- `channels`
### `spec`
Contains evaluation behavior and rule definitions.
Required spec fields:
| Field | Type | Notes |
| --- | --- | --- |
| `evaluation_mode` | string | MVP values: `any_rule_match` or `first_match`. |
| `rules` | array | One or more policy rules. |
## Rule structure
Each rule should use the following shape.
| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| `id` | string | yes | Stable rule identifier used as `policy_id` in decisions. |
| `title` | string | yes | Human-readable rule title. |
| `intent` | string | yes | Short explanation of the behavior being governed. |
| `triggers` | object | yes | Canonical event types or derived trigger states. |
| `conditions` | object | yes | Boolean logic over event, evidence, decision, or task facts. |
| `evidence_requirements` | object | yes | Minimum evidence required for the governed claim or checkpoint. |
| `decision_output` | object | yes | Canonical decision-model output template. |
| `operator_message_templates` | object | yes | Operator-facing message templates for forced notices, rewrites, and review requests. |
| `notes` | array | no | Design notes, caveats, or alignment comments. |
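The required-field tables above lend themselves to a mechanical check before a pack is loaded. The following is a minimal sketch, assuming the pack has already been parsed from `policy.yaml` into a plain Python dict. The function name `validate_pack`, the error-string format, and the choice to treat all four top-level keys as mandatory are illustrative assumptions, not part of the plugin.

```python
REQUIRED_TOP_LEVEL = ("apiVersion", "kind", "metadata", "spec")
REQUIRED_METADATA = (
    "id", "title", "version", "summary",
    "owner", "severity_default", "applies_to", "tags",
)
REQUIRED_SPEC = ("evaluation_mode", "rules")
REQUIRED_RULE = (
    "id", "title", "intent", "triggers", "conditions",
    "evidence_requirements", "decision_output", "operator_message_templates",
)
EVALUATION_MODES = ("any_rule_match", "first_match")


def validate_pack(pack: dict) -> list[str]:
    """Return human-readable problems; an empty list means the pack passed."""
    problems = [f"missing top-level field: {k}" for k in REQUIRED_TOP_LEVEL if k not in pack]
    if pack.get("kind") != "PolicyPack":
        problems.append("kind must be PolicyPack")

    metadata = pack.get("metadata") or {}
    problems += [f"missing metadata field: {k}" for k in REQUIRED_METADATA if k not in metadata]

    spec = pack.get("spec") or {}
    problems += [f"missing spec field: {k}" for k in REQUIRED_SPEC if k not in spec]
    if "evaluation_mode" in spec and spec["evaluation_mode"] not in EVALUATION_MODES:
        problems.append(f"unknown evaluation_mode: {spec['evaluation_mode']}")

    rules = spec.get("rules") or []
    if "rules" in spec and not rules:
        problems.append("spec.rules must contain at least one rule")
    for index, rule in enumerate(rules):
        # Fall back to a positional label when the rule has no id yet.
        label = rule.get("id", f"rules[{index}]")
        problems += [f"{label}: missing rule field: {k}" for k in REQUIRED_RULE if k not in rule]
    return problems
```

Returning a list of problems instead of raising on the first one lets a reviewer see every gap in a pack in a single pass.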
## `triggers`
`triggers` identifies when a rule should evaluate.
Recommended fields:
| Field | Type | Notes |
| --- | --- | --- |
| `event_types` | array | Canonical event types from the event model. |
| `derived_signals` | array | Normalized evaluator signals such as `checkpoint_overdue`. |
| `claim_types` | array | Claim categories such as `progress`, `completion`, `verified_completion`. |
Example:
```yaml
triggers:
  event_types: [task_claimed_complete]
  claim_types: [completion, verified_completion]
```
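This document does not spell out how the trigger fields combine, so the sketch below picks one plausible reading and labels it as an assumption: every trigger field that is present must match (AND across fields), and any listed value within a field may match (OR within a list). The function name `rule_is_triggered` and its parameters are hypothetical.

```python
from __future__ import annotations

TRIGGER_FIELDS = ("event_types", "claim_types", "derived_signals")


def rule_is_triggered(
    triggers: dict,
    event_type: str,
    claim_type: str | None = None,
    active_signals: frozenset[str] = frozenset(),
) -> bool:
    """Assumed semantics: AND across present fields, OR within each list."""
    if "event_types" in triggers and event_type not in triggers["event_types"]:
        return False
    if "claim_types" in triggers and claim_type not in triggers["claim_types"]:
        return False
    if "derived_signals" in triggers and not active_signals & set(triggers["derived_signals"]):
        return False
    # In this sketch a rule that declares no trigger fields never evaluates.
    return any(name in triggers for name in TRIGGER_FIELDS)
```

With the example triggers above, a `task_claimed_complete` event carrying a `completion` claim would evaluate the rule, while the same event carrying a `progress` claim would not.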
## `conditions`
`conditions` describes what must be true for the rule to fire.
MVP uses structured boolean groups:
- `all`
- `any`
- `not`
Each leaf condition references a normalized evaluator fact, for example:
- `checkpoint.is_overdue`
- `checkpoint.externalized_path_valid`
- `forwarding.result_available_without_visible_followup`
- `evidence.new_items_since_last_checkpoint`
- `evidence.completion_min_quality`
- `claim.promised_followup_due`
- `claim.next_step_has_supporting_evidence`
- `message.has_required_checkpoint_fields`
Common comparators:
- `equals`
- `not_equals`
- `greater_than`
- `less_than`
- `in`
- `contains`
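The boolean groups and comparators listed above can be evaluated with a small recursive function. This is a minimal sketch, assuming conditions arrive as nested dicts parsed from YAML, that `not` wraps a single child node, and that facts live in a flat mapping keyed by their dotted names. The names `evaluate` and `COMPARATORS` are illustrative.

```python
from typing import Any, Callable, Mapping

COMPARATORS: dict[str, Callable[[Any, Any], bool]] = {
    "equals": lambda actual, expected: actual == expected,
    "not_equals": lambda actual, expected: actual != expected,
    "greater_than": lambda actual, expected: actual > expected,
    "less_than": lambda actual, expected: actual < expected,
    "in": lambda actual, expected: actual in expected,
    "contains": lambda actual, expected: expected in actual,
}


def evaluate(node: Mapping[str, Any], facts: Mapping[str, Any]) -> bool:
    """Evaluate a condition tree against normalized evaluator facts."""
    if "all" in node:
        return all(evaluate(child, facts) for child in node["all"])
    if "any" in node:
        return any(evaluate(child, facts) for child in node["any"])
    if "not" in node:
        return not evaluate(node["not"], facts)

    # Leaf condition: one fact plus one comparator.
    # A missing fact raises KeyError so gaps fail loudly instead of silently passing.
    actual = facts[node["fact"]]
    for name, compare in COMPARATORS.items():
        if name in node:
            return compare(actual, node[name])
    raise ValueError(f"leaf condition for {node['fact']!r} has no known comparator")


facts = {
    "checkpoint.is_overdue": True,
    "evidence.new_items_since_last_checkpoint": 0,
}
condition = {
    "all": [
        {"fact": "checkpoint.is_overdue", "equals": True},
        {"not": {"fact": "evidence.new_items_since_last_checkpoint", "greater_than": 0}},
    ]
}
print(evaluate(condition, facts))  # True
```

Raising on a missing fact is a deliberate choice in this sketch: treating an absent fact as `False` would let a `not` group fire on data the evaluator never actually saw.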
## `evidence_requirements`
`evidence_requirements` connects the pack to the canonical evidence model.
Recommended fields:
| Field | Type | Notes |
| --- | --- | --- |
| `progress.min_new_items_since_last_checkpoint` | integer | Minimum new evidence items for progress-bearing reports. |
| `progress.allowed_quality_floor` | string | Minimum evidence quality allowed for progress claims. |
| `completion.min_quality` | string | Minimum quality for completion claims, usually `moderate`. |
| `verified_completion.min_quality` | string | Minimum quality for verified completion, usually `strong`. |
| `must_reference_event_types` | array | Event types that must be cited in operator-visible notices. |
| `must_reference_evidence_classes` | array | Evidence classes required for the claim path. |
Semantics:
- Progress claims must respect the evidence model requirement for at least one **new** evidence item since the previous checkpoint.
- Completion claims must meet the evidence model threshold of at least `moderate` support.
- Verified completion claims must meet the evidence model threshold of at least `strong` support.
- Some policies require operator-visible disclosure even when the evidence is runtime-derived rather than user-facing.
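The first three threshold rules above reduce to an ordered comparison. The sketch below assumes a four-step quality scale; this document only names `none`, `moderate`, and `strong`, so the `weak` level and the exact ordering are assumptions that should be checked against the evidence model. The function names are hypothetical.

```python
# Assumed ordering, weakest to strongest. "weak" is a hypothetical level.
QUALITY_ORDER = ("none", "weak", "moderate", "strong")

# Minimum quality per claim type, as stated in the semantics above.
MIN_QUALITY = {"completion": "moderate", "verified_completion": "strong"}


def meets_quality(actual: str, minimum: str) -> bool:
    """True when `actual` is at least as strong as `minimum`."""
    return QUALITY_ORDER.index(actual) >= QUALITY_ORDER.index(minimum)


def claim_is_supported(claim_type: str, quality: str, new_items_since_last_checkpoint: int) -> bool:
    """Apply the progress, completion, and verified-completion thresholds."""
    if claim_type == "progress":
        # Progress needs at least one new evidence item since the last checkpoint.
        return new_items_since_last_checkpoint >= 1
    return meets_quality(quality, MIN_QUALITY[claim_type])
```

For example, a `verified_completion` claim backed only by `moderate` evidence is unsupported under these thresholds, no matter how many new items were collected.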
## `decision_output`
`decision_output` must map to the canonical decision model.
Required fields:
| Field | Type | Notes |
| --- | --- | --- |
| `decision` | string | Canonical decision value such as `block`, `force_checkpoint`, `downgrade_status`, `annotate_placeholder`, `require_review`. |
| `severity` | string | Canonical severity. |
| `reason` | string | Stable rationale template. |
| `suggested_status` | string or null | Recommended workflow status. |
| `required_actions` | array | Canonical required actions. |
| `operator_notice` | object | Operator-visible notice requirements. |
Rule-specific variable interpolation is allowed in messages and reasons, but the resulting decision must still be expressible in the decision model defined in `docs/specs/reporting-governance-decision-model.md`.
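As an illustration of how a rule's `decision_output` template might become a concrete decision, the sketch below checks the required fields, copies the rule `id` into `policy_id`, and interpolates the `reason`. The interpolation syntax is not defined in this document, so the `str.format`-style `{placeholder}` form used here is an assumption, as is the function name `build_decision`.

```python
REQUIRED_DECISION_FIELDS = (
    "decision", "severity", "reason",
    "suggested_status", "required_actions", "operator_notice",
)


def build_decision(rule: dict, variables: dict) -> dict:
    """Turn a rule's decision_output template into a concrete decision dict."""
    output = rule["decision_output"]
    missing = [name for name in REQUIRED_DECISION_FIELDS if name not in output]
    if missing:
        raise ValueError(f"{rule['id']}: decision_output is missing {missing}")

    decision = dict(output)
    # The rule id doubles as policy_id in emitted decisions.
    decision["policy_id"] = rule["id"]
    # Assumed interpolation syntax: str.format-style {placeholders}.
    decision["reason"] = output["reason"].format_map(variables)
    return decision
```

Copying the template before filling it in keeps the loaded pack immutable, so the same rule can be rendered repeatedly with different variables.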
## `operator_message_templates`
Each rule must provide operator-facing templates so the runtime does not have to improvise compliance language.
Recommended template keys:
- `checkpoint_forced`
- `placeholder_rewrite`
- `review_required`
- `status_downgraded`
- `blocked`
Template goals:
- clearly distinguish provisional vs verified states
- disclose when evidence is missing, repeated, or ambiguous
- reference required events or artifacts when the decision model says `must_reference`
- make follow-up obligations explicit
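One way a runtime could pick and fill a template is sketched below. The mapping from decision values to template keys is inferred from the names alone and is an assumption, as are the `{placeholder}` interpolation syntax, the `task_id` variable, and the function name `render_operator_message`.

```python
# Assumed pairing of canonical decisions with the recommended template keys.
TEMPLATE_KEY_FOR_DECISION = {
    "force_checkpoint": "checkpoint_forced",
    "annotate_placeholder": "placeholder_rewrite",
    "require_review": "review_required",
    "downgrade_status": "status_downgraded",
    "block": "blocked",
}


def render_operator_message(templates: dict, decision: str, variables: dict) -> str:
    """Select the template for a decision and fill in its placeholders."""
    key = TEMPLATE_KEY_FOR_DECISION[decision]
    if key not in templates:
        raise KeyError(f"rule provides no {key!r} template for decision {decision!r}")
    return templates[key].format_map(variables)


templates = {
    "checkpoint_forced": "Required update: task {task_id} exceeded its reporting window.",
}
print(render_operator_message(templates, "force_checkpoint", {"task_id": "T-1"}))
```

Failing when a template is absent, instead of falling back to generated text, matches the stated goal that the runtime should not improvise compliance language.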
## Alignment with event, evidence, and decision models
### Event model alignment
Policy packs must trigger from canonical events and derived facts built from them.
Important event-model mappings for MVP packs:
- `task_checkpoint_due`
- `task_checkpoint_sent`
- `task_claimed_complete`
- `subagent_completed`
- `subagent_result_forwarded`
- `subagent_result_not_forwarded`
- `silence_timeout`
- `forced_operator_update`
- `operator_review_requested`
### Evidence model alignment
Policy packs must interpret evidence using the evidence model's classes and quality levels.
Important evidence concepts for MVP packs:
- no new evidence since last checkpoint
- narrative-only or reminder-only updates count as `none`
- next-step claims should cite new `decision_record`, `tool_output`, `file_change`, `runtime_artifact`, or other supported evidence when they assert concrete advancement
- completion claims require `moderate+`
- verified completion claims require `strong+`
### Decision model alignment
Policy packs must emit decisions from the canonical decision vocabulary.
Important decision patterns for MVP:
- `force_checkpoint` for missed checkpoint and dropped visible follow-up
- `annotate_placeholder` for status reports that would otherwise overstate progress
- `downgrade_status` for unsupported completion claims
- `require_review` for ambiguous completion evidence or borderline acceptance cases
- `block` when a required report structure or hard prerequisite is absent
## MVP policy packs
### 1. No Silence
Purpose:
- prevent missed checkpoints from becoming blackholes
- prevent child-result arrival from remaining invisible to the operator
- prevent “silent” tasks unless a valid externalized checkpoint path exists
Required failure modes covered:
- missed checkpoint
- subagent result not forwarded
- silent task without valid externalized checkpoint path
- result arrived but no operator-visible follow-up
- promised follow-up not delivered
Typical decisions:
- `force_checkpoint`
- `block`
- `escalate` for repeated or critical cases
### 2. No Fake Progress
Purpose:
- prevent narrative activity from being treated as real progress
- require new evidence for progress-bearing checkpoints
- stop reminder-only or repeated status-only updates from being framed as advancement
Required failure modes covered:
- no new evidence since last checkpoint
- repeated status-only updates
- reminder-only activity presented as progress
- next-step claim without evidence
- promised follow-up not delivered when the promised next update never gains evidence
Typical decisions:
- `annotate_placeholder`
- `rewrite`
- `force_checkpoint` when silence risk is also active
### 3. Verified Completion Only
Purpose:
- prevent unsupported completion from being treated as accepted completion
- automatically downgrade unsupported completion to `pending_verification`
- route ambiguous completion to operator review
Required behaviors and failure modes covered:
- no completion without required evidence
- auto-downgrade to `pending_verification`
- operator review path when completion is ambiguous
- next-step or closure claim unsupported by evidence
Typical decisions:
- `downgrade_status`
- `require_review`
- `block` for hard invalid completion attempts
### 4. Mandatory Checkpoint Structure
Purpose:
- require reports to include a stable operator-usable structure
- ensure that every checkpoint answers the core managerial questions
- prevent reports that look active but omit actionable next-state information
Required structure equivalents:
- current status
- completed this segment
- next step
- next report condition
- whether operator intervention is needed
Typical decisions:
- `rewrite`
- `block`
- `force_checkpoint` when missing structure would otherwise hide the actual task state
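A structural check for the five required elements could look like the sketch below. It assumes the checkpoint has already been normalized into a dict; the key names (`current_status`, `next_step`, and so on) are hypothetical, since this document lists the elements only in prose.

```python
# Hypothetical field keys mapped to the structure elements listed above.
REQUIRED_CHECKPOINT_FIELDS = {
    "current_status": "current status",
    "completed_this_segment": "completed this segment",
    "next_step": "next step",
    "next_report_condition": "next report condition",
    "operator_intervention_needed": "whether operator intervention is needed",
}


def missing_checkpoint_fields(report: dict) -> list[str]:
    """Return the labels of required elements that are absent or blank."""
    missing = []
    for key, label in REQUIRED_CHECKPOINT_FIELDS.items():
        value = report.get(key)
        # False is a valid answer (for example, no intervention needed); None and "" are not.
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(label)
    return missing
```

An empty result means the report has the required structure; a non-empty result gives the evaluator the exact gaps to cite in a `rewrite` or `block` decision.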
## Evaluator guidance
### Rule ordering
Recommended MVP evaluation order:
1. `no-silence`
2. `mandatory-checkpoint-structure`
3. `no-fake-progress`
4. `verified-completion-only`
Rationale:
- visibility failures should be corrected first
- then the structure of the operator-visible report
- then truthfulness of progress wording
- then closure and acceptance semantics
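The recommended order can be applied with a simple rank lookup. This is a sketch only: placing unrecognized packs last, in their original relative order, is an assumption, and `order_packs` is a hypothetical helper.

```python
RECOMMENDED_PACK_ORDER = [
    "no-silence",
    "mandatory-checkpoint-structure",
    "no-fake-progress",
    "verified-completion-only",
]


def order_packs(pack_ids: list[str]) -> list[str]:
    """Sort pack ids into the recommended MVP evaluation order."""
    rank = {pack_id: position for position, pack_id in enumerate(RECOMMENDED_PACK_ORDER)}
    # sorted() is stable, so unknown packs keep their relative order at the end.
    return sorted(pack_ids, key=lambda pack_id: rank.get(pack_id, len(rank)))
```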
### Multiple matches
If multiple rules match, the evaluator should prefer the most safety-preserving outcome.
Recommended precedence when conflicts occur:
1. `escalate`
2. `block`
3. `force_checkpoint`
4. `downgrade_status`
5. `require_review`
6. `rewrite`
7. `annotate_placeholder`
8. `allow`
If one rule requires an operator-visible notice and another does not, the safer behavior is to preserve the notice requirement.
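The precedence list and the notice-preservation rule above can be combined into one resolution step. In this sketch, each matched decision is a dict with a `decision` value and a boolean `operator_notice_required` flag; that flag is a simplification of the `operator_notice` object and is hypothetical, as is returning `allow` when nothing matched.

```python
# Highest precedence first, as recommended above.
DECISION_PRECEDENCE = [
    "escalate", "block", "force_checkpoint", "downgrade_status",
    "require_review", "rewrite", "annotate_placeholder", "allow",
]


def resolve(decisions: list[dict]) -> dict:
    """Pick the most safety-preserving decision and keep any notice requirement."""
    if not decisions:
        # Assumption: no matching rule means the report is allowed as-is.
        return {"decision": "allow", "operator_notice_required": False}
    winner = min(decisions, key=lambda d: DECISION_PRECEDENCE.index(d["decision"]))
    merged = dict(winner)
    # If any matching rule demands an operator-visible notice, the result does too.
    merged["operator_notice_required"] = any(
        d.get("operator_notice_required", False) for d in decisions
    )
    return merged
```

For example, a `block` without a notice requirement combined with an `annotate_placeholder` that has one resolves to `block` with the notice requirement preserved.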
### Audit trail
When a rule rewrites, annotates, blocks, or downgrades a claim, the runtime should preserve:
- original attempted message or claim text
- triggering event IDs when available
- attached or missing evidence summary
- resulting canonical decision object
- operator-visible message actually sent
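The preserved items above map naturally onto a small immutable record. The following is a sketch of one possible shape; the class name `AuditRecord`, its field names, and the JSON-lines serialization are assumptions and do not describe the plugin's storage format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AuditRecord:
    original_text: str            # original attempted message or claim text
    evidence_summary: str         # attached or missing evidence summary
    decision: dict                # resulting canonical decision object
    operator_message_sent: str    # operator-visible message actually sent
    triggering_event_ids: tuple[str, ...] = ()  # empty when IDs are unavailable


def to_json_line(record: AuditRecord) -> str:
    """Serialize a record as one JSON line, suitable for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)
```

Keeping the original text next to the message actually sent is what makes a rewrite or downgrade reviewable after the fact.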
## Future evolution
Potential future extensions include:
- multi-file packs with shared templates
- environment/profile overlays
- schema-backed pack validation
- rule suppression windows for approved maintenance flows
- pack composition and inheritance
- rule telemetry for false-positive tuning
## Summary
The MVP policy-pack model gives the reporting-governance plugin a portable enforcement surface. It standardizes how reporting rules are declared, how they consume canonical events and evidence, and how they produce operator-visible decisions.
The four initial packs focus on the observed operational failure modes that matter most:
- silence
- fake progress
- unsupported completion
- structurally incomplete checkpoints
Together, they provide a practical first layer of governance that is explicit, reviewable, and aligned with the event, evidence, and decision models.