# Reporting Governance Policy Packs

## Purpose

This document defines the MVP policy-pack format for the reporting-governance plugin and specifies the first four reusable packs:

- `no-silence`
- `no-fake-progress`
- `verified-completion-only`
- `mandatory-checkpoint-structure`

These packs convert reporting expectations into portable policy artifacts that can be versioned, reviewed, pinned, and enforced across runtimes.

They are designed to align with:

- `docs/specs/reporting-governance-event-model.md`
- `docs/specs/reporting-governance-evidence-model.md`
- `docs/specs/reporting-governance-decision-model.md`

## Design goals

The MVP policy-pack system should:

- express reporting rules as versioned artifacts instead of prompt folklore
- map canonical events plus evidence into canonical decisions
- make operator-visible follow-up obligations explicit
- preserve auditability when progress or completion claims are rewritten, blocked, or downgraded
- capture the concrete failure modes observed today:
  - result arrived but no operator-visible follow-up
  - promised follow-up not delivered
  - next-step claim without evidence

## Policy-pack scope

A policy pack is a deployable bundle of one or more governance rules focused on a single reporting risk area.

For MVP, each pack is a single `policy.yaml` file living at:

- `policy-packs/<pack-id>/policy.yaml`

Future versions may allow multiple rule files, shared templates, profile overlays, or environment-specific wiring. MVP intentionally keeps the artifact simple.

## Canonical policy-pack structure

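Every policy pack should use the following top-level shape. The example is abbreviated: its rule omits some fields that later sections mark as required, such as `intent` on the rule and `reason`, `required_actions`, and `operator_notice` in `decision_output`.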

```yaml
apiVersion: reporting-governance/v1alpha1
kind: PolicyPack
metadata:
  id: no-silence
  title: No Silence
  version: 1.0.0
  summary: >-
    Prevent missed checkpoints, invisible subagent handoffs, and silent task execution.
  owner: reporting-governance-plugin
  severity_default: high
  applies_to:
    runtimes: [openclaw]
    task_modes: [interactive, silent]
  tags: [reporting, checkpoints, anti-blackhole]

spec:
  evaluation_mode: any_rule_match
  rules:
    - id: no-silence.missed-checkpoint
      title: Missed checkpoint requires immediate visible recovery
      triggers:
        event_types: [task_checkpoint_due, silence_timeout]
      conditions:
        all:
          - fact: checkpoint.is_overdue
            equals: true
      evidence_requirements:
        progress:
          min_new_items_since_last_checkpoint: 1
      decision_output:
        decision: force_checkpoint
        severity: high
        suggested_status: in_progress
      operator_message_templates:
        checkpoint_forced: >-
          Required update: this task exceeded its reporting window...
```

## Top-level fields

### `apiVersion`
Stable format version for the policy artifact.

### `kind`
Must be `PolicyPack` for MVP.

### `metadata`
Describes the pack as a versioned operational artifact.

Required metadata fields:

| Field | Type | Notes |
| --- | --- | --- |
| `id` | string | Stable pack identifier. |
| `title` | string | Human-readable pack name. |
| `version` | string | Semantic or release version. |
| `summary` | string | Short operator/reviewer summary. |
| `owner` | string | Responsible component or team. |
| `severity_default` | string | Default severity when a rule does not override it. |
| `applies_to` | object | Runtime/workflow applicability. |
| `tags` | array | Search and grouping labels. |

Recommended `applies_to` fields:

- `runtimes`
- `task_modes`
- `workflow_shapes`
- `channels`
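
As an illustration, an `applies_to` block using all four recommended fields might look like the sketch below. The `runtimes` and `task_modes` values come from the canonical example above; the `workflow_shapes` and `channels` values are hypothetical placeholders, since this document does not define their vocabularies.

```yaml
applies_to:
  runtimes: [openclaw]
  task_modes: [interactive, silent]
  # Hypothetical values: the allowed vocabularies are not defined in this spec.
  workflow_shapes: [single_agent, parent_with_subagents]
  channels: [operator_chat]
```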

### `spec`
Contains evaluation behavior and rule definitions.

Required spec fields:

| Field | Type | Notes |
| --- | --- | --- |
| `evaluation_mode` | string | MVP values: `any_rule_match` or `first_match`. |
| `rules` | array | One or more policy rules. |

## Rule structure

Each rule should use the following shape.

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| `id` | string | yes | Stable rule identifier used as `policy_id` in decisions. |
| `title` | string | yes | Human-readable rule title. |
| `intent` | string | yes | Short explanation of the behavior being governed. |
| `triggers` | object | yes | Canonical event types or derived trigger states. |
| `conditions` | object | yes | Boolean logic over event, evidence, decision, or task facts. |
| `evidence_requirements` | object | yes | Minimum evidence required for the governed claim or checkpoint. |
| `decision_output` | object | yes | Canonical decision-model output template. |
| `operator_message_templates` | object | yes | Operator-facing message templates for forced notices, rewrites, and review requests. |
| `notes` | array | no | Design notes, caveats, or alignment comments. |
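
To make the required fields concrete, here is a sketch of one complete rule for the `no-fake-progress` pack. Field names follow the table above. The rule ID, the `severity` value, the `reason` text, the `required_actions` entry, and the key under `operator_notice` are illustrative assumptions; the authoritative vocabularies live in the decision model.

```yaml
- id: no-fake-progress.no-new-evidence        # hypothetical rule ID
  title: Progress claim without new evidence is annotated as a placeholder
  intent: >-
    Stop status-only updates from being presented as advancement.
  triggers:
    event_types: [task_checkpoint_sent]
    claim_types: [progress]
  conditions:
    all:
      - fact: evidence.new_items_since_last_checkpoint
        less_than: 1
  evidence_requirements:
    progress:
      min_new_items_since_last_checkpoint: 1
  decision_output:
    decision: annotate_placeholder
    severity: medium                           # assumed severity value
    reason: No new evidence since the previous checkpoint.
    suggested_status: in_progress
    required_actions: [attach_new_evidence]    # hypothetical action name
    operator_notice:
      required: true                           # hypothetical key
  operator_message_templates:
    placeholder_rewrite: >-
      Placeholder update: no new evidence has been recorded since the last checkpoint.
  notes:
    - Illustrative sketch only; not one of the shipped rules.
```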

## `triggers`

`triggers` identifies when a rule should evaluate.

Recommended fields:

| Field | Type | Notes |
| --- | --- | --- |
| `event_types` | array | Canonical event types from the event model. |
| `derived_signals` | array | Normalized evaluator signals such as `checkpoint_overdue`. |
| `claim_types` | array | Claim categories such as `progress`, `completion`, `verified_completion`. |

Example:

```yaml
triggers:
  event_types: [task_claimed_complete]
  claim_types: [completion, verified_completion]
```

## `conditions`

`conditions` describes what must be true for the rule to fire.

MVP uses structured boolean groups:

- `all`
- `any`
- `not`

Each leaf condition references a normalized evaluator fact, for example:

- `checkpoint.is_overdue`
- `checkpoint.externalized_path_valid`
- `forwarding.result_available_without_visible_followup`
- `evidence.new_items_since_last_checkpoint`
- `evidence.completion_min_quality`
- `claim.promised_followup_due`
- `claim.next_step_has_supporting_evidence`
- `message.has_required_checkpoint_fields`

Common comparators:

- `equals`
- `not_equals`
- `greater_than`
- `less_than`
- `in`
- `contains`
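
The groups can be nested. The sketch below assumes that `any` takes a list in the same way `all` does, and that `not` wraps a single condition; neither detail is fixed by this document. All facts and comparators come from the lists above.

```yaml
conditions:
  all:
    - fact: claim.promised_followup_due
      equals: true
    - any:
        - fact: evidence.new_items_since_last_checkpoint
          less_than: 1
        - fact: claim.next_step_has_supporting_evidence
          equals: false
    - not:
        fact: checkpoint.externalized_path_valid
        equals: true
```

Read top to bottom, this fires when a promised follow-up is due, the claim lacks new or supporting evidence, and no valid externalized checkpoint path exists.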

## `evidence_requirements`

`evidence_requirements` connects the pack to the canonical evidence model.

Recommended fields:

| Field | Type | Notes |
| --- | --- | --- |
| `progress.min_new_items_since_last_checkpoint` | integer | Minimum new evidence items for progress-bearing reports. |
| `progress.allowed_quality_floor` | string | Minimum evidence quality allowed for progress claims. |
| `completion.min_quality` | string | Minimum quality for completion claims, usually `moderate`. |
| `verified_completion.min_quality` | string | Minimum quality for verified completion, usually `strong`. |
| `must_reference_event_types` | array | Event types that must be cited in operator-visible notices. |
| `must_reference_evidence_classes` | array | Evidence classes required for the claim path. |

Semantics:

- Progress claims must satisfy the evidence model's requirement of at least one **new** evidence item since the previous checkpoint.
- Completion claims must meet the evidence model threshold of at least `moderate` support.
- Verified completion claims must meet the evidence model threshold of at least `strong` support.
- Some policies require operator-visible disclosure even when the evidence is runtime-derived rather than user-facing.
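
The dotted names in the table are assumed to map to nested keys, as `progress.min_new_items_since_last_checkpoint` does in the canonical example. Under that assumption, a rule governing completion and verified completion claims might declare:

```yaml
evidence_requirements:
  completion:
    min_quality: moderate
  verified_completion:
    min_quality: strong
  # The event type and evidence classes are names used elsewhere in this
  # document; which ones a real rule requires is a design decision.
  must_reference_event_types: [task_claimed_complete]
  must_reference_evidence_classes: [tool_output, file_change]
```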

## `decision_output`

`decision_output` must map to the canonical decision model.

Required fields:

| Field | Type | Notes |
| --- | --- | --- |
| `decision` | string | Canonical decision value such as `block`, `force_checkpoint`, `downgrade_status`, `annotate_placeholder`, `require_review`. |
| `severity` | string | Canonical severity. |
| `reason` | string | Stable rationale template. |
| `suggested_status` | string or null | Recommended workflow status. |
| `required_actions` | array | Canonical required actions. |
| `operator_notice` | object | Operator-visible notice requirements. |

Rule-specific variable interpolation is allowed in messages and reasons, but the resulting decision must still be expressible in the decision model defined in `docs/specs/reporting-governance-decision-model.md`.
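
A sketch of a fully populated `decision_output` for an unsupported completion claim follows. The `decision`, `severity`, and `suggested_status` values appear elsewhere in this document; the `required_actions` entries, the keys under `operator_notice`, and the `{{ }}` interpolation syntax are assumptions that should be checked against the decision model.

```yaml
decision_output:
  decision: downgrade_status
  severity: high
  # Hypothetical interpolation syntax; the spec allows variables but does not define it.
  reason: "Completion claim has {{ evidence.completion_min_quality }} evidence; moderate or better is required."
  suggested_status: pending_verification
  required_actions:                 # hypothetical action names
    - attach_completion_evidence
    - notify_operator
  operator_notice:                  # hypothetical keys
    required: true
    template: status_downgraded
```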

## `operator_message_templates`

Each rule must provide operator-facing templates so the runtime does not have to improvise compliance language.

Recommended template keys:

- `checkpoint_forced`
- `placeholder_rewrite`
- `review_required`
- `status_downgraded`
- `blocked`

Template goals:

- clearly distinguish provisional vs verified states
- disclose when evidence is missing, repeated, or ambiguous
- reference required events or artifacts when the decision model says `must_reference`
- make follow-up obligations explicit
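
A sketch of a template set covering the five recommended keys follows. The wording and the `{{ }}` placeholder are illustrative assumptions, not shipped text.

```yaml
operator_message_templates:
  checkpoint_forced: >-
    Required update: a checkpoint was due and no report was sent.
    Current status is provisional until new evidence is attached.
  placeholder_rewrite: >-
    Placeholder update: no new evidence since the last checkpoint.
    Original wording was adjusted to avoid implying progress.
  review_required: >-
    Review requested: completion evidence is ambiguous.
    Please confirm or reject the completion claim.
  status_downgraded: >-
    Status changed to pending_verification: the completion claim
    lacks the required evidence. Next report when evidence is attached.
  blocked: >-
    Report blocked: required checkpoint fields are missing
    ({{ missing_fields }}).
```

Each message names the state (provisional, pending, blocked), says what evidence is missing, and states the follow-up obligation, matching the template goals above.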

## Alignment with event, evidence, and decision models

### Event model alignment

Policy packs must trigger from canonical events and derived facts built from them.

Important event-model mappings for MVP packs:

- `task_checkpoint_due`
- `task_checkpoint_sent`
- `task_claimed_complete`
- `subagent_completed`
- `subagent_result_forwarded`
- `subagent_result_not_forwarded`
- `silence_timeout`
- `forced_operator_update`
- `operator_review_requested`

### Evidence model alignment

Policy packs must interpret evidence using the evidence model’s classes and quality levels.

Important evidence concepts for MVP packs:

- no new evidence since the last checkpoint
- narrative-only or reminder-only updates count as `none`, that is, as no evidence
- next-step claims should cite new `decision_record`, `tool_output`, `file_change`, `runtime_artifact`, or other supported evidence when they assert concrete advancement
- completion claims require at least `moderate` quality
- verified completion claims require at least `strong` quality

### Decision model alignment

Policy packs must emit decisions from the canonical decision vocabulary.

Important decision patterns for MVP:

- `force_checkpoint` for missed checkpoints and dropped operator-visible follow-ups
- `annotate_placeholder` for status reports that would otherwise overstate progress
- `downgrade_status` for unsupported completion claims
- `require_review` for ambiguous completion evidence or borderline acceptance cases
- `block` when a required report structure or hard prerequisite is absent

## MVP policy packs

### 1. No Silence

Purpose:
- prevent missed checkpoints from becoming blackholes
- prevent child-result arrival from remaining invisible to the operator
- prevent “silent” tasks unless a valid externalized checkpoint path exists

Required failure modes covered:
- missed checkpoint
- subagent result not forwarded
- silent task without valid externalized checkpoint path
- result arrived but no operator-visible follow-up
- promised follow-up not delivered

Typical decisions:
- `force_checkpoint`
- `block`
- `escalate` for repeated or critical cases
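
As a sketch, the "result arrived but no operator-visible follow-up" failure mode could be expressed with the trigger and condition below. Only the matching logic and the decision are shown; the other required rule fields are omitted for brevity, and the rule ID is hypothetical.

```yaml
- id: no-silence.result-without-followup    # hypothetical rule ID
  triggers:
    event_types: [subagent_completed, subagent_result_not_forwarded]
  conditions:
    all:
      - fact: forwarding.result_available_without_visible_followup
        equals: true
  decision_output:
    decision: force_checkpoint
    severity: high
    suggested_status: in_progress
```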

### 2. No Fake Progress

Purpose:
- prevent narrative activity from being treated as real progress
- require new evidence for progress-bearing checkpoints
- stop reminder-only or repeated status-only updates from being framed as advancement

Required failure modes covered:
- no new evidence since last checkpoint
- repeated status-only updates
- reminder-only activity presented as progress
- next-step claim without evidence
- promised follow-up not delivered because the promised update never gains supporting evidence

Typical decisions:
- `annotate_placeholder`
- `rewrite`
- `force_checkpoint` when silence risk is also active

### 3. Verified Completion Only

Purpose:
- prevent unsupported completion from being treated as accepted completion
- automatically downgrade unsupported completion to `pending_verification`
- route ambiguous completion to operator review

Required failure modes covered:
- completion claimed without the required evidence
- unsupported completion not automatically downgraded to `pending_verification`
- ambiguous completion accepted without an operator review path
- next-step or closure claim unsupported by evidence

Typical decisions:
- `downgrade_status`
- `require_review`
- `block` for hard invalid completion attempts

### 4. Mandatory Checkpoint Structure

Purpose:
- require reports to include a stable operator-usable structure
- ensure that every checkpoint answers the core managerial questions
- prevent reports that look active but omit actionable next-state information

Required structure (each report must include these items or their equivalents):
- current status
- completed this segment
- next step
- next report condition
- whether operator intervention is needed

Typical decisions:
- `rewrite`
- `block`
- `force_checkpoint` when missing structure would otherwise hide the actual task state
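
For illustration, a checkpoint that satisfies all five structure items could carry the following fields. The key names and the values are hypothetical; the pack requires equivalents of these items, not this exact schema.

```yaml
checkpoint:
  current_status: in_progress
  completed_this_segment:
    - Collected tool output for the first two subtasks
  next_step: Run verification on the remaining subtask
  next_report_condition: When verification finishes or the next checkpoint is due
  operator_intervention_needed: false
```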

## Evaluator guidance

### Rule ordering

Recommended MVP evaluation order, by pack:

1. `no-silence`
2. `mandatory-checkpoint-structure`
3. `no-fake-progress`
4. `verified-completion-only`

Rationale:
- visibility failures are corrected first
- the structure of the operator-visible report is checked next
- then the truthfulness of progress wording
- closure and acceptance semantics come last

### Multiple matches

If multiple rules match, the evaluator should prefer the most safety-preserving outcome.

Recommended precedence when conflicts occur:

1. `escalate`
2. `block`
3. `force_checkpoint`
4. `downgrade_status`
5. `require_review`
6. `rewrite`
7. `annotate_placeholder`
8. `allow`

If one rule requires an operator-visible notice and another does not, the safer behavior is to preserve the notice requirement.
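
As a worked example of the precedence list, suppose three rules match the same event. The record below is written in YAML purely for readability; the key names and rule IDs are hypothetical and do not define an evaluator output format.

```yaml
matched_rules:
  - policy_id: verified-completion-only.unsupported-completion
    decision: downgrade_status        # rank 4 in the precedence list
    operator_notice_required: false
  - policy_id: verified-completion-only.ambiguous-evidence
    decision: require_review          # rank 5
    operator_notice_required: true
  - policy_id: no-fake-progress.no-new-evidence
    decision: annotate_placeholder    # rank 7
    operator_notice_required: false
resolved:
  decision: downgrade_status          # highest-ranked decision among the matches
  operator_notice_required: true      # kept because one matching rule required it
```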

### Audit trail

When a rule rewrites, annotates, blocks, or downgrades a claim, the runtime should preserve:

- original attempted message or claim text
- triggering event IDs when available
- attached or missing evidence summary
- resulting canonical decision object
- operator-visible message actually sent
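
A sketch of an audit record holding the five items above follows. All key names, IDs, and values are hypothetical; the actual decision object shape is defined by the decision model.

```yaml
audit_record:
  original_claim_text: "Task complete."
  triggering_event_ids: [evt-0001]          # hypothetical ID format
  evidence_summary:
    attached: []
    missing: [completion evidence at moderate quality or better]
  decision:
    policy_id: verified-completion-only.unsupported-completion   # hypothetical rule ID
    decision: downgrade_status
    suggested_status: pending_verification
  operator_message_sent: >-
    Status changed to pending_verification: the completion claim lacks the required evidence.
```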

## Future evolution

Potential future extensions include:

- multi-file packs with shared templates
- environment/profile overlays
- schema-backed pack validation
- rule suppression windows for approved maintenance flows
- pack composition and inheritance
- rule telemetry for false-positive tuning

## Summary

The MVP policy-pack model gives the reporting-governance plugin a portable enforcement surface. It standardizes how reporting rules are declared, how they consume canonical events and evidence, and how they produce operator-visible decisions.

The four initial packs focus on the observed operational failure modes that matter most:

- silence
- fake progress
- unsupported completion
- structurally incomplete checkpoints

Together, they provide a practical first layer of governance that is explicit, reviewable, and aligned with the event, evidence, and decision models.