spec: add mvp governance policy packs
docs/specs/reporting-governance-policy-packs.md
# Reporting Governance Policy Packs
## Purpose

This document defines the MVP policy-pack format for the reporting-governance plugin and specifies the first four reusable packs:

- `no-silence`
- `no-fake-progress`
- `verified-completion-only`
- `mandatory-checkpoint-structure`

These packs convert reporting expectations into portable policy artifacts that can be versioned, reviewed, pinned, and enforced across runtimes.

They are designed to align with:

- `docs/specs/reporting-governance-event-model.md`
- `docs/specs/reporting-governance-evidence-model.md`
- `docs/specs/reporting-governance-decision-model.md`

## Design goals

The MVP policy-pack system should:

- express reporting rules as versioned artifacts instead of prompt folklore
- map canonical events plus evidence into canonical decisions
- make operator-visible follow-up obligations explicit
- preserve auditability when progress or completion claims are rewritten, blocked, or downgraded
- capture the concrete failure modes observed today:
  - result arrived but no operator-visible follow-up
  - promised follow-up not delivered
  - next-step claim without evidence

## Policy-pack scope

A policy pack is a deployable bundle of one or more governance rules focused on a single reporting risk area.

For MVP, each pack is a single `policy.yaml` file living at:

- `policy-packs/<pack-id>/policy.yaml`

Future versions may allow multiple rule files, shared templates, profile overlays, or environment-specific wiring. MVP intentionally keeps the artifact simple.

## Canonical policy-pack structure

Every policy pack should use the following top-level shape.

```yaml
apiVersion: reporting-governance/v1alpha1
kind: PolicyPack
metadata:
  id: no-silence
  title: No Silence
  version: 1.0.0
  summary: >-
    Prevent missed checkpoints, invisible subagent handoffs, and silent task execution.
  owner: reporting-governance-plugin
  severity_default: high
  applies_to:
    runtimes: [openclaw]
    task_modes: [interactive, silent]
  tags: [reporting, checkpoints, anti-blackhole]

spec:
  evaluation_mode: any_rule_match
  rules:
    - id: no-silence.missed-checkpoint
      title: Missed checkpoint requires immediate visible recovery
      triggers:
        event_types: [task_checkpoint_due, silence_timeout]
      conditions:
        all:
          - fact: checkpoint.is_overdue
            equals: true
      evidence_requirements:
        progress:
          min_new_items_since_last_checkpoint: 1
      decision_output:
        decision: force_checkpoint
        severity: high
        suggested_status: in_progress
      operator_message_templates:
        checkpoint_forced: >-
          Required update: this task exceeded its reporting window...
```

## Top-level fields

### `apiVersion`
Stable format version for the policy artifact.

### `kind`
Must be `PolicyPack` for MVP.

### `metadata`
Describes the pack as a versioned operational artifact.

Required metadata fields:

| Field | Type | Notes |
| --- | --- | --- |
| `id` | string | Stable pack identifier. |
| `title` | string | Human-readable pack name. |
| `version` | string | Semantic or release version. |
| `summary` | string | Short operator/reviewer summary. |
| `owner` | string | Responsible component or team. |
| `severity_default` | string | Default severity when a rule does not override it. |
| `applies_to` | object | Runtime/workflow applicability. |
| `tags` | array | Search and grouping labels. |

Recommended `applies_to` fields:

- `runtimes`
- `task_modes`
- `workflow_shapes`
- `channels`

### `spec`
Contains evaluation behavior and rule definitions.

Required spec fields:

| Field | Type | Notes |
| --- | --- | --- |
| `evaluation_mode` | string | MVP values: `any_rule_match` or `first_match`. |
| `rules` | array | One or more policy rules. |

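The spec names the two `evaluation_mode` values without spelling out their behavior. The Python sketch below shows one plausible reading: `any_rule_match` collects every matching rule, while `first_match` stops at the first match in declared order. That reading, the `select_matches` helper, and the `matches` callback are assumptions for illustration, not part of the pack format.

```python
from typing import Callable


def select_matches(rules: list, matches: Callable[[dict], bool], evaluation_mode: str) -> list:
    """Return the ids of rules that fire under the given evaluation mode.

    `rules` is a list of parsed rule mappings; `matches` is a hypothetical
    callback that decides whether a single rule fires.
    """
    if evaluation_mode not in ("any_rule_match", "first_match"):
        raise ValueError(f"unsupported evaluation_mode: {evaluation_mode}")

    fired = []
    for rule in rules:
        if matches(rule):
            fired.append(rule["id"])
            if evaluation_mode == "first_match":
                # Assumed semantics: stop at the first match in declared order.
                break
    return fired
```

Under this reading, rule order inside a pack only matters for `first_match`.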
## Rule structure

Each rule should use the following shape.

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| `id` | string | yes | Stable rule identifier used as `policy_id` in decisions. |
| `title` | string | yes | Human-readable rule title. |
| `intent` | string | yes | Short explanation of the behavior being governed. |
| `triggers` | object | yes | Canonical event types or derived trigger states. |
| `conditions` | object | yes | Boolean logic over event, evidence, decision, or task facts. |
| `evidence_requirements` | object | yes | Minimum evidence required for the governed claim or checkpoint. |
| `decision_output` | object | yes | Canonical decision-model output template. |
| `operator_message_templates` | object | yes | Operator-facing message templates for forced notices, rewrites, and review requests. |
| `notes` | array | no | Design notes, caveats, or alignment comments. |

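Until schema-backed validation exists, the rule table above can be checked with a few lines of code. The following Python sketch assumes a rule has already been parsed from `policy.yaml` into a plain dict; `validate_rule` and `REQUIRED_RULE_FIELDS` are hypothetical names, and the type mapping (string to `str`, object to `dict`, array to `list`) is an assumption about how a YAML loader would represent the fields.

```python
# Required rule fields and the Python type each is assumed to parse into.
REQUIRED_RULE_FIELDS = {
    "id": str,
    "title": str,
    "intent": str,
    "triggers": dict,
    "conditions": dict,
    "evidence_requirements": dict,
    "decision_output": dict,
    "operator_message_templates": dict,
}


def validate_rule(rule: dict) -> list:
    """Return a list of human-readable problems; an empty list means the rule passes."""
    problems = []
    for name, expected in REQUIRED_RULE_FIELDS.items():
        if name not in rule:
            problems.append(f"missing required field: {name}")
        elif not isinstance(rule[name], expected):
            problems.append(f"field {name} should be {expected.__name__}")
    # `notes` is optional, but must be an array when present.
    if "notes" in rule and not isinstance(rule["notes"], list):
        problems.append("field notes should be list")
    return problems
```

Returning a list of problems rather than raising on the first one lets a reviewer see every gap in a pack at once.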
## `triggers`

`triggers` identifies when a rule should evaluate.

Recommended fields:

| Field | Type | Notes |
| --- | --- | --- |
| `event_types` | array | Canonical event types from the event model. |
| `derived_signals` | array | Normalized evaluator signals such as `checkpoint_overdue`. |
| `claim_types` | array | Claim categories such as `progress`, `completion`, `verified_completion`. |

Example:

```yaml
triggers:
  event_types: [task_claimed_complete]
  claim_types: [completion, verified_completion]
```

## `conditions`

`conditions` describes what must be true for the rule to fire.

MVP uses structured boolean groups:

- `all`
- `any`
- `not`

Each leaf condition references a normalized evaluator fact, for example:

- `checkpoint.is_overdue`
- `checkpoint.externalized_path_valid`
- `forwarding.result_available_without_visible_followup`
- `evidence.new_items_since_last_checkpoint`
- `evidence.completion_min_quality`
- `claim.promised_followup_due`
- `claim.next_step_has_supporting_evidence`
- `message.has_required_checkpoint_fields`

Common comparators:

- `equals`
- `not_equals`
- `greater_than`
- `less_than`
- `in`
- `contains`

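The boolean groups and comparators above can be evaluated with a short recursive function. The Python sketch below is illustrative only. It assumes facts arrive as a flat dict keyed by dotted fact name, that `all` and `any` hold lists of conditions, and that `not` wraps a single nested condition; the spec does not fix any of these shapes.

```python
import operator

# Comparator name -> function(actual_fact_value, expected_value).
COMPARATORS = {
    "equals": operator.eq,
    "not_equals": operator.ne,
    "greater_than": operator.gt,
    "less_than": operator.lt,
    "in": lambda actual, expected: actual in expected,
    "contains": lambda actual, expected: expected in actual,
}


def evaluate(condition: dict, facts: dict) -> bool:
    """Evaluate a condition tree against normalized evaluator facts."""
    if "all" in condition:
        return all(evaluate(child, facts) for child in condition["all"])
    if "any" in condition:
        return any(evaluate(child, facts) for child in condition["any"])
    if "not" in condition:
        # Assumed shape: `not` wraps exactly one nested condition.
        return not evaluate(condition["not"], facts)

    # Leaf condition: {"fact": <name>, <comparator>: <expected>}.
    # A fact missing from `facts` raises KeyError rather than silently passing.
    actual = facts[condition["fact"]]
    for name, compare in COMPARATORS.items():
        if name in condition:
            return bool(compare(actual, condition[name]))
    raise ValueError(f"no known comparator in condition: {condition}")
```

For example, a rule that fires when a checkpoint is overdue and no new evidence has arrived could be checked like this:

```python
facts = {
    "checkpoint.is_overdue": True,
    "evidence.new_items_since_last_checkpoint": 0,
}
condition = {
    "all": [
        {"fact": "checkpoint.is_overdue", "equals": True},
        {"not": {"fact": "evidence.new_items_since_last_checkpoint", "greater_than": 0}},
    ]
}
fires = evaluate(condition, facts)
```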
## `evidence_requirements`

`evidence_requirements` connects the pack to the canonical evidence model.

Recommended fields:

| Field | Type | Notes |
| --- | --- | --- |
| `progress.min_new_items_since_last_checkpoint` | integer | Minimum new evidence items for progress-bearing reports. |
| `progress.allowed_quality_floor` | string | Minimum evidence quality allowed for progress claims. |
| `completion.min_quality` | string | Minimum quality for completion claims, usually `moderate`. |
| `verified_completion.min_quality` | string | Minimum quality for verified completion, usually `strong`. |
| `must_reference_event_types` | array | Event types that must be cited in operator-visible notices. |
| `must_reference_evidence_classes` | array | Evidence classes required for the claim path. |

Semantics:

- Progress claims must respect the evidence model requirement for at least one **new** evidence item since the previous checkpoint.
- Completion claims must meet the evidence model threshold of at least `moderate` support.
- Verified completion claims must meet the evidence model threshold of at least `strong` support.
- Some policies require operator-visible disclosure even when the evidence is runtime-derived rather than user-facing.

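The thresholds above reduce to a count check for progress and an ordered quality comparison for completion. The Python sketch below shows one way to express that. The full quality ordering is an assumption: this document names only `none`, `moderate`, and `strong`, so the `weak` level and the helper names are hypothetical and the evidence model remains the authority.

```python
# Assumed ordering, lowest to highest. "weak" is a placeholder level that this
# document does not name; consult the evidence model for the real scale.
QUALITY_ORDER = ["none", "weak", "moderate", "strong"]

# Defaults taken from the semantics listed above.
DEFAULT_MIN_QUALITY = {"completion": "moderate", "verified_completion": "strong"}


def meets_quality(actual: str, minimum: str) -> bool:
    """True when `actual` is at least as strong as `minimum`."""
    return QUALITY_ORDER.index(actual) >= QUALITY_ORDER.index(minimum)


def evidence_satisfied(claim_type: str, requirements: dict, new_items: int, quality: str) -> bool:
    """Check a claim against a rule's parsed `evidence_requirements` mapping."""
    if claim_type == "progress":
        needed = requirements.get("progress", {}).get(
            "min_new_items_since_last_checkpoint", 1
        )
        return new_items >= needed
    if claim_type in DEFAULT_MIN_QUALITY:
        floor = requirements.get(claim_type, {}).get(
            "min_quality", DEFAULT_MIN_QUALITY[claim_type]
        )
        return meets_quality(quality, floor)
    raise ValueError(f"unknown claim type: {claim_type}")
```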
## `decision_output`

`decision_output` must map to the canonical decision model.

Required fields:

| Field | Type | Notes |
| --- | --- | --- |
| `decision` | string | Canonical decision value such as `block`, `force_checkpoint`, `downgrade_status`, `annotate_placeholder`, `require_review`. |
| `severity` | string | Canonical severity. |
| `reason` | string | Stable rationale template. |
| `suggested_status` | string or null | Recommended workflow status. |
| `required_actions` | array | Canonical required actions. |
| `operator_notice` | object | Operator-visible notice requirements. |

Rule-specific variable interpolation is allowed in messages and reasons, but the resulting decision must still be expressible in the decision model defined in `docs/specs/reporting-governance-decision-model.md`.

## `operator_message_templates`

Each rule must provide operator-facing templates so the runtime does not have to improvise compliance language.

Recommended template keys:

- `checkpoint_forced`
- `placeholder_rewrite`
- `review_required`
- `status_downgraded`
- `blocked`

Template goals:

- clearly distinguish provisional vs verified states
- disclose when evidence is missing, repeated, or ambiguous
- reference required events or artifacts when the decision model says `must_reference`
- make follow-up obligations explicit

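Variable interpolation into these templates could look like the Python sketch below. The spec does not define a placeholder syntax, so the `${name}` form (Python's standard `string.Template`), the `render_operator_message` helper, and the sample template text are all assumptions made for illustration.

```python
from string import Template


def render_operator_message(templates: dict, key: str, variables: dict) -> str:
    """Render one operator-facing template from a rule's template mapping.

    Raises KeyError when the template key or any placeholder variable is
    missing, so the runtime never sends a half-filled compliance message.
    """
    if key not in templates:
        raise KeyError(f"rule provides no template for: {key}")
    return Template(templates[key]).substitute(variables)


# Hypothetical template text, not taken from any shipped pack.
templates = {
    "status_downgraded": "Status changed to ${status}: evidence quality is ${quality}.",
}
message = render_operator_message(
    templates,
    "status_downgraded",
    {"status": "pending_verification", "quality": "none"},
)
```

Failing loudly on a missing variable is a deliberate choice in this sketch: a partially rendered notice would undercut the goal of not improvising compliance language.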
## Alignment with event, evidence, and decision models

### Event model alignment

Policy packs must trigger from canonical events and derived facts built from them.

Important event-model mappings for MVP packs:

- `task_checkpoint_due`
- `task_checkpoint_sent`
- `task_claimed_complete`
- `subagent_completed`
- `subagent_result_forwarded`
- `subagent_result_not_forwarded`
- `silence_timeout`
- `forced_operator_update`
- `operator_review_requested`

### Evidence model alignment

Policy packs must interpret evidence using the evidence model’s classes and quality levels.

Important evidence concepts for MVP packs:

- no new evidence since last checkpoint
- narrative-only or reminder-only updates count as `none`
- next-step claims should cite new `decision_record`, `tool_output`, `file_change`, `runtime_artifact`, or other supported evidence when they assert concrete advancement
- completion claims require `moderate+`
- verified completion claims require `strong+`

### Decision model alignment

Policy packs must emit decisions from the canonical decision vocabulary.

Important decision patterns for MVP:

- `force_checkpoint` for missed checkpoint and dropped visible follow-up
- `annotate_placeholder` for status reports that would otherwise overstate progress
- `downgrade_status` for unsupported completion claims
- `require_review` for ambiguous completion evidence or borderline acceptance cases
- `block` when a required report structure or hard prerequisite is absent

## MVP policy packs

### 1. No Silence

Purpose:
- prevent missed checkpoints from becoming blackholes
- prevent child-result arrival from remaining invisible to the operator
- prevent “silent” tasks unless a valid externalized checkpoint path exists

Required failure modes covered:
- missed checkpoint
- subagent result not forwarded
- silent task without valid externalized checkpoint path
- result arrived but no operator-visible follow-up
- promised follow-up not delivered

Typical decisions:
- `force_checkpoint`
- `block`
- `escalate` for repeated or critical cases

### 2. No Fake Progress

Purpose:
- prevent narrative activity from being treated as real progress
- require new evidence for progress-bearing checkpoints
- stop reminder-only or repeated status-only updates from being framed as advancement

Required failure modes covered:
- no new evidence since last checkpoint
- repeated status-only updates
- reminder-only activity presented as progress
- next-step claim without evidence
- promised follow-up not delivered when the promised next update never gains evidence

Typical decisions:
- `annotate_placeholder`
- `rewrite`
- `force_checkpoint` when silence risk is also active

### 3. Verified Completion Only

Purpose:
- prevent unsupported completion from being treated as accepted completion
- automatically downgrade unsupported completion to `pending_verification`
- route ambiguous completion to operator review

Required failure modes covered:
- no completion without required evidence
- auto-downgrade to `pending_verification`
- operator review path when completion is ambiguous
- next-step or closure claim unsupported by evidence

Typical decisions:
- `downgrade_status`
- `require_review`
- `block` for hard invalid completion attempts

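The downgrade and review paths of this pack can be sketched as a small decision function. The Python below is a simplified illustration, not the pack's actual rule set: the `ambiguous` flag, the quality ordering (including the unnamed `weak` level), the choice to check ambiguity before the threshold, and the `decide_completion` name are all assumptions. The `block` path for hard invalid attempts is left out.

```python
# Assumed quality ordering; "weak" is a hypothetical intermediate level.
QUALITY_ORDER = ["none", "weak", "moderate", "strong"]


def decide_completion(claim_type: str, quality: str, ambiguous: bool = False) -> dict:
    """Map a completion claim to a decision and suggested status.

    `claim_type` is "completion" or "verified_completion".
    """
    floor = "strong" if claim_type == "verified_completion" else "moderate"

    if ambiguous:
        # Ambiguous evidence goes to the operator instead of being auto-resolved.
        return {"decision": "require_review", "suggested_status": None}
    if QUALITY_ORDER.index(quality) < QUALITY_ORDER.index(floor):
        # Unsupported completion is downgraded, never accepted as complete.
        return {"decision": "downgrade_status", "suggested_status": "pending_verification"}
    return {"decision": "allow", "suggested_status": None}
```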
### 4. Mandatory Checkpoint Structure

Purpose:
- require reports to include a stable operator-usable structure
- ensure that every checkpoint answers the core managerial questions
- prevent reports that look active but omit actionable next-state information

Required structure equivalents:
- current status
- completed this segment
- next step
- next report condition
- whether operator intervention is needed

Typical decisions:
- `rewrite`
- `block`
- `force_checkpoint` when missing structure would otherwise hide the actual task state

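A structural check for the five required elements might look like the Python sketch below. The field keys are hypothetical machine names for the structure equivalents listed above; the spec does not define them, and a real evaluator would presumably derive `message.has_required_checkpoint_fields` from something similar.

```python
# Hypothetical keys for the five required structure equivalents.
REQUIRED_CHECKPOINT_FIELDS = (
    "current_status",
    "completed_this_segment",
    "next_step",
    "next_report_condition",
    "operator_intervention_needed",
)


def missing_checkpoint_fields(report: dict) -> list:
    """Return the required fields that are absent or blank in a checkpoint report."""
    missing = []
    for name in REQUIRED_CHECKPOINT_FIELDS:
        value = report.get(name)
        # A boolean False is a valid answer (for example, no intervention
        # needed), so only None and blank strings count as missing.
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(name)
    return missing
```

An empty result would correspond to a structurally complete checkpoint; a non-empty one is the kind of input that could lead to `rewrite` or `block`.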
## Evaluator guidance

### Rule ordering

Recommended MVP evaluation order:

1. `no-silence`
2. `mandatory-checkpoint-structure`
3. `no-fake-progress`
4. `verified-completion-only`

Rationale:
- visibility failures should be corrected first
- then the structure of the operator-visible report
- then truthfulness of progress wording
- then closure and acceptance semantics

### Multiple matches

If multiple rules match, the evaluator should prefer the most safety-preserving outcome.

Recommended precedence when conflicts occur:

1. `escalate`
2. `block`
3. `force_checkpoint`
4. `downgrade_status`
5. `require_review`
6. `rewrite`
7. `annotate_placeholder`
8. `allow`

If one rule requires an operator-visible notice and another does not, the safer behavior is to preserve the notice requirement.

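The precedence list and the notice rule above can be combined into a single resolver. The Python sketch below is one possible shape: each matched rule is assumed to contribute a dict with a `decision` value and a boolean `notice_required` flag. That flag is a simplification of the `operator_notice` object, and returning `allow` when nothing matched is also an assumption.

```python
# Most safety-preserving decision first, as recommended above.
DECISION_PRECEDENCE = [
    "escalate",
    "block",
    "force_checkpoint",
    "downgrade_status",
    "require_review",
    "rewrite",
    "annotate_placeholder",
    "allow",
]


def resolve(decisions: list) -> dict:
    """Pick the winning decision among all matched rules."""
    if not decisions:
        # Assumption: no matching rule means the report is allowed as-is.
        return {"decision": "allow", "notice_required": False}

    winner = min(decisions, key=lambda d: DECISION_PRECEDENCE.index(d["decision"]))
    resolved = dict(winner)
    # Preserve the notice requirement if any matched rule asked for one,
    # even when the winning rule itself did not.
    resolved["notice_required"] = any(d.get("notice_required", False) for d in decisions)
    return resolved
```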
### Audit trail

When a rule rewrites, annotates, blocks, or downgrades a claim, the runtime should preserve:

- original attempted message or claim text
- triggering event IDs when available
- attached or missing evidence summary
- resulting canonical decision object
- operator-visible message actually sent

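The five preserved items map naturally onto one immutable record per governed claim. The Python sketch below is an illustration only; the `AuditRecord` class, its field names, and the JSON serialization are assumptions, since the spec does not prescribe a storage format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AuditRecord:
    """One audit entry for a rewritten, annotated, blocked, or downgraded claim."""

    original_text: str          # original attempted message or claim text
    evidence_summary: str       # attached or missing evidence summary
    decision: dict              # resulting canonical decision object
    sent_message: str           # operator-visible message actually sent
    trigger_event_ids: tuple = ()  # triggering event IDs, when available


def to_json(record: AuditRecord) -> str:
    """Serialize a record with stable key order so entries diff cleanly."""
    return json.dumps(asdict(record), sort_keys=True)
```

Making the record frozen reflects the intent that an audit entry is written once and never edited afterwards.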
## Future evolution

Potential future extensions include:

- multi-file packs with shared templates
- environment/profile overlays
- schema-backed pack validation
- rule suppression windows for approved maintenance flows
- pack composition and inheritance
- rule telemetry for false-positive tuning

## Summary

The MVP policy-pack model gives the reporting-governance plugin a portable enforcement surface. It standardizes how reporting rules are declared, how they consume canonical events and evidence, and how they produce operator-visible decisions.

The four initial packs focus on the observed operational failure modes that matter most:

- silence
- fake progress
- unsupported completion
- structurally incomplete checkpoints

Together, they provide a practical first layer of governance that is explicit, reviewable, and aligned with the event, evidence, and decision models.