reporting-governance-plugin/docs/runbooks/silent-long-task-reminder-contract.md

# Silent Long-Task Reminder Contract

This runbook defines the minimum contract for any reminder / forced checkpoint used by a silent long-task.

## Goal

A reminder must not be a vague "still working?" ping.
It must carry enough context to force an honest state update.

## Minimum fields

Every reminder / forced checkpoint should include:

- **this is a reminder / forced checkpoint**
- **task name**
- **original goal**
- **what counts as evidence**
- **what to do if still unfinished**
- **what to do if there is no new evidence**
- **how to hand off if owner judgement is needed**

## Contract shape

Recommended structure:

1. `Reminder:` This is a forced checkpoint for a silent long-task.
2. `Task:` <task name>
3. `Goal:` <what the task is trying to achieve>
4. `Evidence that counts:` <file change / verification output / decision / blocker change / external result>
5. `If unfinished:` report current status, latest real evidence, next step, next report condition.
6. `If no new evidence:` do not claim progress; downgrade to `paused` or `blocked`.
7. `If owner judgement is needed:` use the planned handoff path (for Telegram, usually button-path).

## Example reminder text

### Example 1: Research task

Reminder: This is a forced checkpoint for a silent long-task.
Task: Research OpenClawGovernor concepts
Goal: Identify what concepts are worth borrowing into our current agent governance model.
Evidence that counts: new findings, comparisons, extracted design principles, concrete conclusions.
If unfinished: report current status, real findings so far, next step, and next report condition.
If no new evidence: do not claim progress; mark `paused` or `blocked`.
If owner judgement is needed: hand off with the planned decision path.

### Example 2: Debugging task

Reminder: This is a forced checkpoint for a silent long-task.
Task: Diagnose Telegram reply closure bug
Goal: Find why button-first ordering still fails in some full-flow cases.
Evidence that counts: reproduced failure, narrowed root cause, code/path identified, mitigation validated.
If unfinished: report current status, latest concrete evidence, next debugging step, and next report condition.
If no new evidence: stop pretending active progress; mark `blocked` or `paused`.
If owner judgement is needed: summarize findings and use the planned handoff.

### Example 3: Delegated / waiting task

Reminder: This is a forced checkpoint for a silent long-task.
Task: Wait for delegated survey result
Goal: Receive and validate the subagent survey result for next-step decisions.
Evidence that counts: new subagent output, session history retrieved, re-dispatch done, blocker identified.
If unfinished: report current waiting state, last real evidence, next action, and next report condition.
If no new evidence: do not keep task cosmetically active; mark `blocked` or `paused`.
If owner judgement is needed: use the planned Telegram decision handoff.

## Writing guidance

- Write reminder text so it reads naturally when fired.
- Avoid internal jargon without context.
- Make the task and goal explicit enough that the forced checkpoint is self-contained.
- A reminder should force honesty, not enable fake activity.