openclaw/approved-plan-continuity-hard-gate

openclaw@cowbay.org 111cf27634 feat: export continuity hard-gate and watchdog workstream

2026-04-24 12:36:31 +08:00

19 KiB

Subagent Anti-Blackhole / Completion-Delivery Watchdog Implementation Plan

For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

Goal: Prevent B-class fake timeouts where a subagent finishes, stalls, or loses its return path off-thread and the main conversation never receives a trustworthy completion update.

Architecture: Build this in very small layers: first define receipts and states, then pin the blackhole cases with fail-first tests, then implement deterministic receipt-state logic, then add done-but-not-forwarded recovery decisions, then add owner-visible reporting rules and scenario simulations. Keep all early slices file-backed and test-driven before touching any live-session integration.

Tech Stack: Node.js, MJS test runners, file-backed JSON state, OpenClaw subagent/session concepts, docs/runbooks

Task 1: Define dispatch receipt fields

Files:

Modify: docs/runbooks/subagent-anti-blackhole.md

Step 1: Write the receipt field list

Define only dispatch fields:
- runId
- childSessionKey
- dispatchAt
- expectedBy
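
As a rough illustration of what a dispatch receipt could look like, here is a minimal sketch. The four field names come from the list above; the `makeDispatchReceipt` helper, the `slaMs` parameter, and the choice of ISO-8601 strings for timestamps are assumptions made for this example, not something the plan specifies.

```javascript
// Hypothetical helper: builds a dispatch receipt with the four listed fields.
// Assumptions: timestamps are ISO-8601 strings, and expectedBy = dispatchAt + slaMs.
function makeDispatchReceipt({ runId, childSessionKey, dispatchAt, slaMs }) {
  if (!runId || !childSessionKey) {
    throw new Error('runId and childSessionKey are required');
  }
  const dispatched = new Date(dispatchAt);
  if (Number.isNaN(dispatched.getTime())) {
    throw new Error('dispatchAt must be a valid date');
  }
  return {
    runId,
    childSessionKey,
    dispatchAt: dispatched.toISOString(),
    expectedBy: new Date(dispatched.getTime() + slaMs).toISOString(),
  };
}

// Example values are invented for illustration.
const receipt = makeDispatchReceipt({
  runId: 'run-001',
  childSessionKey: 'child-abc',
  dispatchAt: '2026-04-24T04:00:00.000Z',
  slaMs: 10 * 60 * 1000,
});
console.log(JSON.stringify(receipt, null, 2));
```

Deriving `expectedBy` at dispatch time keeps the later SLA check a plain timestamp comparison.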

Step 2: Verify file contains the new field names

Run: grep -n "runId\|childSessionKey\|dispatchAt\|expectedBy" docs/runbooks/subagent-anti-blackhole.md

Expected: matching lines found

Step 3: Commit

git add docs/runbooks/subagent-anti-blackhole.md
git commit -m "docs: define subagent dispatch receipt fields"

Task 2: Define completion receipt fields

Files:

Modify: docs/runbooks/subagent-anti-blackhole.md

Step 1: Write the completion field list

Define only completion fields:
- completionReceivedAt
- forwardedToMain
- resultSource
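
A completion receipt could be sketched along the same lines. The three field names are taken from the list above; the `makeCompletionReceipt` helper and the candidate `resultSource` values are hypothetical, since the plan does not enumerate them.

```javascript
// Hypothetical set of resultSource tags; the plan does not define these values.
const RESULT_SOURCES = ['completion_event', 'history_fetch', 'respawn'];

// Hypothetical helper: builds the completion half of a receipt.
// Assumptions: forwardedToMain is a boolean and the timestamp is an ISO-8601 string.
function makeCompletionReceipt({ completionReceivedAt, forwardedToMain, resultSource }) {
  if (!RESULT_SOURCES.includes(resultSource)) {
    throw new Error(`unknown resultSource: ${resultSource}`);
  }
  return {
    completionReceivedAt: new Date(completionReceivedAt).toISOString(),
    forwardedToMain: Boolean(forwardedToMain),
    resultSource,
  };
}

const completion = makeCompletionReceipt({
  completionReceivedAt: '2026-04-24T04:05:00.000Z',
  forwardedToMain: true,
  resultSource: 'completion_event',
});
console.log(JSON.stringify(completion));
```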

Step 2: Verify file contains the new field names

Run: grep -n "completionReceivedAt\|forwardedToMain\|resultSource" docs/runbooks/subagent-anti-blackhole.md

Expected: matching lines found

Step 3: Commit

git add docs/runbooks/subagent-anti-blackhole.md
git commit -m "docs: define subagent completion receipt fields"

Task 3: Define watchdog statuses

Files:

Modify: docs/runbooks/subagent-anti-blackhole.md

Step 1: Add the status enum

Define:
- active
- suspect_delivery_failure
- done_but_not_forwarded
- completed
- recovered
- blocked
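
One way the status enum might later be mirrored in code is a frozen object plus a validator. The six status strings are the ones listed above; the `WatchdogStatus` and `isWatchdogStatus` names are hypothetical.

```javascript
// The six status values from the list above, frozen so they cannot be mutated.
// The constant and function names are assumptions for this sketch.
const WatchdogStatus = Object.freeze({
  ACTIVE: 'active',
  SUSPECT_DELIVERY_FAILURE: 'suspect_delivery_failure',
  DONE_BUT_NOT_FORWARDED: 'done_but_not_forwarded',
  COMPLETED: 'completed',
  RECOVERED: 'recovered',
  BLOCKED: 'blocked',
});

// Returns true only for one of the six known status strings.
function isWatchdogStatus(value) {
  return Object.values(WatchdogStatus).includes(value);
}

console.log(isWatchdogStatus('recovered'));
console.log(isWatchdogStatus('timed_out'));
```

Rejecting unknown strings early helps keep a typo in a state file from being read as a valid status.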

Step 2: Verify status names exist

Run: grep -n "suspect_delivery_failure\|done_but_not_forwarded\|recovered" docs/runbooks/subagent-anti-blackhole.md

Expected: matching lines found

Step 3: Commit

git add docs/runbooks/subagent-anti-blackhole.md
git commit -m "docs: define subagent watchdog statuses"

Task 4: Define B-class failure modes

Files:

Modify: docs/runbooks/subagent-anti-blackhole.md

Step 1: Write the failure mode bullets

Add:
- done but not forwarded
- no completion event received
- session exists but no result bounce
- unclear slow-run vs delivery failure

Step 2: Verify phrases exist

Run: grep -n "done but not forwarded\|completion event\|result bounce\|delivery failure" docs/runbooks/subagent-anti-blackhole.md

Expected: matching lines found

Step 3: Commit

git add docs/runbooks/subagent-anti-blackhole.md
git commit -m "docs: define B-class subagent failure modes"

Task 5: Create watchdog script skeleton

Files:

Create: scripts/subagent_delivery_watchdog.mjs

Step 1: Create the script shell

Add CLI parsing and a placeholder JSON response.
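
A skeleton along these lines might be enough for this step. It is only a sketch: `--compact` and `--input` are the flags used in the verification command for this task, but their exact semantics and the shape of the placeholder response are assumptions.

```javascript
// Hypothetical argument parser for the two flags used by the verification command.
function parseArgs(argv) {
  const opts = { compact: false, input: null };
  for (let i = 0; i < argv.length; i += 1) {
    if (argv[i] === '--compact') {
      opts.compact = true;
    } else if (argv[i] === '--input') {
      opts.input = argv[i + 1] ?? null;
      i += 1; // skip the flag's value
    } else {
      throw new Error(`unknown argument: ${argv[i]}`);
    }
  }
  return opts;
}

// Placeholder JSON response; the field names here are invented for illustration.
function placeholderResponse(opts) {
  const body = { ok: true, status: 'placeholder', input: opts.input };
  return opts.compact ? JSON.stringify(body) : JSON.stringify(body, null, 2);
}

// In the real script this would be: parseArgs(process.argv.slice(2))
console.log(placeholderResponse(parseArgs(['--compact', '--input', '/dev/null'])));
```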

Step 2: Verify it runs

Run: node scripts/subagent_delivery_watchdog.mjs --compact --input /dev/null || true

Expected: the script exists and runs well enough to support the test work in the following tasks (a non-zero exit is tolerated at this stage)

Step 3: Commit

git add scripts/subagent_delivery_watchdog.mjs
git commit -m "chore: add subagent delivery watchdog skeleton"

Task 6: Create watchdog test skeleton

Files:

Create: scripts/test_subagent_delivery_watchdog.mjs

Step 1: Create the test shell

Add basic harness structure and fixture runner.
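
The harness could start as small as the sketch below. The `createHarness` and `assertEqual` names are hypothetical, and the plan does not prescribe any particular test framework.

```javascript
// Hypothetical minimal harness: registers named cases, runs them, collects failures.
function createHarness() {
  const cases = [];
  return {
    test(name, fn) {
      cases.push({ name, fn });
    },
    run() {
      const failures = [];
      for (const { name, fn } of cases) {
        try {
          fn();
        } catch (err) {
          failures.push({ name, message: err.message });
        }
      }
      return { total: cases.length, failed: failures.length, failures };
    },
  };
}

// Strict-equality assertion that throws with a readable message.
function assertEqual(actual, expected, label) {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${expected}, got ${actual}`);
  }
}

const harness = createHarness();
harness.test('placeholder passes', () => assertEqual(1 + 1, 2, 'sum'));
harness.test('placeholder fails', () => assertEqual('active', 'completed', 'status'));
const summary = harness.run();
console.log(JSON.stringify(summary, null, 2));
// A real runner would likely set process.exitCode = 1 when summary.failed > 0.
```

Collecting failures instead of stopping at the first one lets a fail-first run show every missing behaviour at once.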

Step 2: Verify test file executes

Run: node scripts/test_subagent_delivery_watchdog.mjs || true

Expected: test runner executes, even if failing

Step 3: Commit

git add scripts/test_subagent_delivery_watchdog.mjs
git commit -m "test: add subagent watchdog test skeleton"

Task 7: Add active-before-SLA test

Files:

Modify: scripts/test_subagent_delivery_watchdog.mjs

Step 1: Write the test

dispatch exists
no completion receipt yet
current time still before SLA
expect active

Step 2: Run test to verify it fails

Run: node scripts/test_subagent_delivery_watchdog.mjs

Expected: FAIL on missing logic

Step 3: Commit

git add scripts/test_subagent_delivery_watchdog.mjs
git commit -m "test: require active status before SLA breach"

Task 8: Add suspect-delivery-failure test

Files:

Modify: scripts/test_subagent_delivery_watchdog.mjs

Step 1: Write the test

dispatch exists
no completion receipt
current time beyond SLA
expect suspect_delivery_failure

Step 2: Run test to verify it fails

Run: node scripts/test_subagent_delivery_watchdog.mjs

Expected: FAIL on new assertion

Step 3: Commit

git add scripts/test_subagent_delivery_watchdog.mjs
git commit -m "test: detect suspected delivery failure after SLA"

Task 9: Add completed-status test

Files:

Modify: scripts/test_subagent_delivery_watchdog.mjs

Step 1: Write the test

dispatch exists
completion receipt exists
expect completed

Step 2: Run test to verify it fails

Run: node scripts/test_subagent_delivery_watchdog.mjs

Expected: FAIL on completed path

Step 3: Commit

git add scripts/test_subagent_delivery_watchdog.mjs
git commit -m "test: close watchdog on completion receipt"

Task 10: Add state shape fixture

Files:

Create: state/subagent-delivery-watchdog/README.md
Create: state/subagent-delivery-watchdog/.gitkeep

Step 1: Define the state JSON shape in README

Include receipt fields and status fields.

Step 2: Verify files exist

Run: test -f state/subagent-delivery-watchdog/README.md && test -f state/subagent-delivery-watchdog/.gitkeep && echo OK

Expected: OK

Step 3: Commit

git add state/subagent-delivery-watchdog/README.md state/subagent-delivery-watchdog/.gitkeep
git commit -m "docs: define watchdog state storage shape"

Task 11: Implement dispatch receipt write

Files:

Modify: scripts/subagent_delivery_watchdog.mjs

Step 1: Add a function to write dispatch receipt state

Only handle a new dispatch record.
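
A dispatch write limited to new records might look like the following sketch. It assumes the state is a plain object keyed by `runId` and leaves persisting it to a JSON file to the caller; the `recordDispatch` name and the record layout are hypothetical.

```javascript
// Hypothetical pure function: returns a new state object with the dispatch added.
// Assumption: state is keyed by runId; writing it to disk is left to the caller.
function recordDispatch(state, receipt) {
  if (Object.prototype.hasOwnProperty.call(state, receipt.runId)) {
    throw new Error(`dispatch already recorded for ${receipt.runId}`);
  }
  return {
    ...state,
    [receipt.runId]: { dispatch: { ...receipt }, completion: null },
  };
}

const next = recordDispatch({}, {
  runId: 'run-001',
  childSessionKey: 'child-abc',
  dispatchAt: '2026-04-24T04:00:00.000Z',
  expectedBy: '2026-04-24T04:10:00.000Z',
});
console.log(JSON.stringify(next, null, 2));
```

Refusing to overwrite an existing `runId` keeps this slice to the "new dispatch record" case only.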

Step 2: Run tests

Run: node scripts/test_subagent_delivery_watchdog.mjs

Expected: some tests still fail, but dispatch state path exists

Step 3: Commit

git add scripts/subagent_delivery_watchdog.mjs
git commit -m "feat: write subagent dispatch receipt state"

Task 12: Implement completion receipt write

Files:

Modify: scripts/subagent_delivery_watchdog.mjs

Step 1: Add a function to write completion receipt state

Only update completion-related fields.
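
The completion write can be sketched so that it touches nothing but the three completion fields. As before, the `recordCompletion` name and the record layout (`dispatch` / `completion` keyed by `runId`) are assumptions for illustration.

```javascript
// Hypothetical pure function: copies only the three completion fields onto the record.
// Assumption: state is keyed by runId and each record has dispatch/completion parts.
function recordCompletion(state, runId, completion) {
  const record = state[runId];
  if (!record) {
    throw new Error(`no dispatch recorded for ${runId}`);
  }
  return {
    ...state,
    [runId]: {
      ...record,
      completion: {
        completionReceivedAt: completion.completionReceivedAt,
        forwardedToMain: completion.forwardedToMain,
        resultSource: completion.resultSource,
      },
    },
  };
}

const before = { 'run-001': { dispatch: { runId: 'run-001' }, completion: null } };
const after = recordCompletion(before, 'run-001', {
  completionReceivedAt: '2026-04-24T04:05:00.000Z',
  forwardedToMain: true,
  resultSource: 'completion_event',
  unrelated: 'ignored', // extra keys are dropped on purpose
});
console.log(JSON.stringify(after['run-001'].completion));
```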

Step 2: Run tests

Run: node scripts/test_subagent_delivery_watchdog.mjs

Expected: some tests still fail, but completion data path exists

Step 3: Commit

git add scripts/subagent_delivery_watchdog.mjs
git commit -m "feat: write subagent completion receipt state"

Task 13: Implement status recompute for active/completed/suspect

Files:

Modify: scripts/subagent_delivery_watchdog.mjs

Step 1: Add status recompute logic

Implement only:
- active
- suspect_delivery_failure
- completed
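
The three-status recompute can be sketched as a deterministic function of the record and the current time. This assumes `expectedBy` is the SLA deadline and that "beyond SLA" means strictly later than it; the `recomputeStatus` name and record layout are hypothetical.

```javascript
// Hypothetical recompute for the three basic statuses.
// Assumptions: record.dispatch.expectedBy is the SLA deadline (ISO-8601 string),
// and a non-null record.completion means a completion receipt exists.
function recomputeStatus(record, now) {
  if (record.completion) {
    return 'completed';
  }
  const deadline = Date.parse(record.dispatch.expectedBy);
  return now.getTime() > deadline ? 'suspect_delivery_failure' : 'active';
}

const record = {
  dispatch: { expectedBy: '2026-04-24T04:10:00.000Z' },
  completion: null,
};
console.log(recomputeStatus(record, new Date('2026-04-24T04:05:00.000Z')));
console.log(recomputeStatus(record, new Date('2026-04-24T04:15:00.000Z')));
```

Passing `now` in as an argument, rather than reading the clock inside the function, is what lets the earlier tests pin "before SLA" and "beyond SLA" exactly.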

Step 2: Run tests

Run: node scripts/test_subagent_delivery_watchdog.mjs

Expected: the tests from Tasks 7-9 pass

Step 3: Commit

git add scripts/subagent_delivery_watchdog.mjs scripts/test_subagent_delivery_watchdog.mjs
git commit -m "feat: recompute basic watchdog statuses"

Task 14: Add done-but-not-forwarded test

Files:

Modify: scripts/test_subagent_delivery_watchdog.mjs

Step 1: Write the test

child run marked done
no completion receipt in main thread
expect done_but_not_forwarded

Step 2: Run tests to verify it fails

Run: node scripts/test_subagent_delivery_watchdog.mjs

Expected: FAIL on new assertion

Step 3: Commit

git add scripts/test_subagent_delivery_watchdog.mjs
git commit -m "test: detect done but not forwarded state"

Task 15: Implement done-but-not-forwarded state

Files:

Modify: scripts/subagent_delivery_watchdog.mjs

Step 1: Add done-but-not-forwarded detection

Use child-done signal + missing completion receipt.
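
The detection rule could be expressed roughly as below. The plan does not say how the child-done signal is represented, so the boolean `childDone` field is an assumption, as is checking it before the SLA comparison.

```javascript
// Hypothetical status function extended with the done-but-not-forwarded case.
// Assumptions: record.childDone is a boolean child-done signal (invented field name),
// and this case takes precedence over the SLA check.
function statusWithChildSignal(record, now) {
  if (record.completion) {
    return 'completed';
  }
  if (record.childDone === true) {
    return 'done_but_not_forwarded';
  }
  const deadline = Date.parse(record.dispatch.expectedBy);
  return now.getTime() > deadline ? 'suspect_delivery_failure' : 'active';
}

const stuck = {
  dispatch: { expectedBy: '2026-04-24T04:10:00.000Z' },
  completion: null,
  childDone: true,
};
console.log(statusWithChildSignal(stuck, new Date('2026-04-24T04:05:00.000Z')));
```

Checking the child-done signal first means a finished child is reported as a delivery problem even before the SLA expires, which is the distinction this task is after.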

Step 2: Run tests

Run: node scripts/test_subagent_delivery_watchdog.mjs

Expected: done-but-not-forwarded test passes

Step 3: Commit

git add scripts/subagent_delivery_watchdog.mjs scripts/test_subagent_delivery_watchdog.mjs
git commit -m "feat: detect done without forwarded completion"

Task 16: Add first recovery-action test

Files:

Modify: scripts/test_subagent_delivery_watchdog.mjs

Step 1: Write fetch-history recovery test

done but not forwarded
no prior recovery action
expect recovery decision fetch_history

Step 2: Run tests to verify it fails

Run: node scripts/test_subagent_delivery_watchdog.mjs

Expected: FAIL on recovery decision

Step 3: Commit

git add scripts/test_subagent_delivery_watchdog.mjs
git commit -m "test: fetch history after missing forwarded completion"

Task 17: Implement fetch-history recovery decision

Files:

Modify: scripts/subagent_delivery_watchdog.mjs

Step 1: Add minimal recovery decision logic

Return fetch_history for first-time done-but-not-forwarded.

Step 2: Run tests

Run: node scripts/test_subagent_delivery_watchdog.mjs

Expected: fetch-history recovery test passes

Step 3: Commit

git add scripts/subagent_delivery_watchdog.mjs scripts/test_subagent_delivery_watchdog.mjs
git commit -m "feat: recover with history fetch first"

Task 18: Add respawn-escalation test

Files:

Modify: scripts/test_subagent_delivery_watchdog.mjs

Step 1: Write the respawn test

recovery already attempted once
still no forwarded completion
expect respawn

Step 2: Run tests to verify it fails

Run: node scripts/test_subagent_delivery_watchdog.mjs

Expected: FAIL on respawn decision

Step 3: Commit

git add scripts/test_subagent_delivery_watchdog.mjs
git commit -m "test: escalate to respawn after failed recovery"

Task 19: Implement respawn decision

Files:

Modify: scripts/subagent_delivery_watchdog.mjs

Step 1: Add respawn logic

Return respawn when fetch-history path did not recover delivery.

Step 2: Run tests

Run: node scripts/test_subagent_delivery_watchdog.mjs

Expected: respawn test passes

Step 3: Commit

git add scripts/subagent_delivery_watchdog.mjs scripts/test_subagent_delivery_watchdog.mjs
git commit -m "feat: respawn after failed delivery recovery"

Task 20: Add blocked-escalation test

Files:

Modify: scripts/test_subagent_delivery_watchdog.mjs

Step 1: Write the blocked test

repeated recovery failure
expect blocked plus owner-visible reporting requirement

Step 2: Run tests to verify it fails

Run: node scripts/test_subagent_delivery_watchdog.mjs

Expected: FAIL on blocked escalation

Step 3: Commit

git add scripts/test_subagent_delivery_watchdog.mjs
git commit -m "test: escalate repeated delivery failures to blocked"

Task 21: Implement blocked escalation

Files:

Modify: scripts/subagent_delivery_watchdog.mjs

Step 1: Add blocked escalation logic

repeated recovery failure -> blocked
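
Taken together, the recovery ordering (history fetch first, then respawn, then blocked) might be sketched as one small decision function. The `recoveryAttempts` list and the shape of the returned object are hypothetical, and the sketch assumes the ladder applies only to the done_but_not_forwarded status.

```javascript
// Hypothetical recovery ladder: fetch_history -> respawn -> blocked.
// Assumptions: record.recoveryAttempts lists actions already tried (invented field),
// and the ladder only applies to the done_but_not_forwarded status.
function decideRecovery(record) {
  if (record.status !== 'done_but_not_forwarded') {
    return null; // nothing to recover in this sketch
  }
  const attempts = record.recoveryAttempts ?? [];
  if (!attempts.includes('fetch_history')) {
    return { decision: 'fetch_history' };
  }
  if (!attempts.includes('respawn')) {
    return { decision: 'respawn' };
  }
  // Both recovery paths have failed: escalate and require an owner-visible report.
  return { decision: 'blocked', ownerReportRequired: true };
}

const base = { status: 'done_but_not_forwarded' };
console.log(decideRecovery({ ...base, recoveryAttempts: [] }).decision);
console.log(decideRecovery({ ...base, recoveryAttempts: ['fetch_history'] }).decision);
console.log(decideRecovery({ ...base, recoveryAttempts: ['fetch_history', 'respawn'] }).decision);
```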

Step 2: Run tests

Run: node scripts/test_subagent_delivery_watchdog.mjs

Expected: blocked escalation test passes

Step 3: Commit

git add scripts/subagent_delivery_watchdog.mjs scripts/test_subagent_delivery_watchdog.mjs
git commit -m "feat: block repeated subagent delivery failures"

Task 22: Add owner-visible reporting rule for suspect state

Files:

Modify: WORKFLOW.md
Modify: AGENTS.md
Modify: docs/runbooks/subagent-anti-blackhole.md

Step 1: Add suspect-state reporting rule

If SLA is crossed with no completion receipt, the owner must be informed.

Step 2: Verify text exists

Run: grep -RIn "SLA\|suspect_delivery_failure" WORKFLOW.md AGENTS.md docs/runbooks/subagent-anti-blackhole.md

Expected: matching lines found

Step 3: Commit

git add WORKFLOW.md AGENTS.md docs/runbooks/subagent-anti-blackhole.md
git commit -m "docs: require reporting on suspect delivery failure"

Task 23: Add owner-visible reporting rule for done-but-not-forwarded

Files:

Modify: WORKFLOW.md
Modify: AGENTS.md
Modify: docs/runbooks/subagent-anti-blackhole.md

Step 1: Add done-but-not-forwarded reporting rule

Must state that result exists but did not bounce back.
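
If the reporting rules are later backed by code, an owner-visible message could be produced along these lines. The `ownerReport` function and the exact wording are assumptions; the plan only requires that the report say a result exists but did not bounce back, and that a crossed SLA with no completion receipt be reported.

```javascript
// Hypothetical formatter for owner-visible reports; the message wording is invented.
// Returns null for statuses that, in this sketch, need no owner report.
function ownerReport(record) {
  switch (record.status) {
    case 'done_but_not_forwarded':
      return `Subagent ${record.runId}: done but not forwarded. `
        + 'A result exists but did not bounce back to the main thread.';
    case 'suspect_delivery_failure':
      return `Subagent ${record.runId}: SLA crossed with no completion receipt; `
        + 'suspected delivery failure.';
    default:
      return null;
  }
}

console.log(ownerReport({ runId: 'run-001', status: 'done_but_not_forwarded' }));
console.log(ownerReport({ runId: 'run-002', status: 'suspect_delivery_failure' }));
```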

Step 2: Verify text exists

Run: grep -RIn "done but not forwarded\|did not bounce back" WORKFLOW.md AGENTS.md docs/runbooks/subagent-anti-blackhole.md

Expected: matching lines found

Step 3: Commit

git add WORKFLOW.md AGENTS.md docs/runbooks/subagent-anti-blackhole.md
git commit -m "docs: require reporting on missing forwarded completion"

Task 24: Add rule to fetch history before respawn

Files:

Modify: WORKFLOW.md
Modify: docs/runbooks/subagent-delivery-recovery.md

Step 1: Add the history-first rule

Done-but-not-forwarded should prefer fetch_history before respawn.

Step 2: Verify text exists

Run: grep -RIn "fetch_history\|before respawn" WORKFLOW.md docs/runbooks/subagent-delivery-recovery.md

Expected: matching lines found

Step 3: Commit

git add WORKFLOW.md docs/runbooks/subagent-delivery-recovery.md
git commit -m "docs: prefer history fetch before respawn"

Task 25: Add no-silent-waiting-after-SLA rule

Files:

Modify: WORKFLOW.md
Modify: AGENTS.md

Step 1: Add the no-silent-waiting rule

Once SLA is crossed, silent waiting is forbidden.

Step 2: Verify text exists

Run: grep -RIn "silent waiting\|SLA" WORKFLOW.md AGENTS.md

Expected: matching lines found

Step 3: Commit

git add WORKFLOW.md AGENTS.md
git commit -m "docs: forbid silent waiting after subagent SLA"

Task 26: Create blackhole scenario test shell

Files:

Create: scripts/test_subagent_blackhole_scenarios.mjs

Step 1: Create the scenario test shell

Add empty scenario harness.

Step 2: Verify file runs

Run: node scripts/test_subagent_blackhole_scenarios.mjs || true

Expected: file executes, even if not complete

Step 3: Commit

git add scripts/test_subagent_blackhole_scenarios.mjs
git commit -m "test: add subagent blackhole scenario harness"

Task 27: Add normal-completion scenario

Files:

Modify: scripts/test_subagent_blackhole_scenarios.mjs

Step 1: Write the scenario

dispatch -> completion receipt -> completed

Step 2: Run tests

Run: node scripts/test_subagent_blackhole_scenarios.mjs

Expected: the scenario may still fail until engine wiring is ready

Step 3: Commit

git add scripts/test_subagent_blackhole_scenarios.mjs
git commit -m "test: add normal subagent completion scenario"

Task 28: Add slow-but-active scenario

Files:

Modify: scripts/test_subagent_blackhole_scenarios.mjs

Step 1: Write the scenario

dispatch before SLA -> active

Step 2: Run tests

Run: node scripts/test_subagent_blackhole_scenarios.mjs

Expected: scenario result captured

Step 3: Commit

git add scripts/test_subagent_blackhole_scenarios.mjs
git commit -m "test: add slow but active subagent scenario"

Task 29: Add done-but-not-forwarded scenario

Files:

Modify: scripts/test_subagent_blackhole_scenarios.mjs

Step 1: Write the scenario

child done -> no completion receipt -> fetch_history

Step 2: Run tests

Run: node scripts/test_subagent_blackhole_scenarios.mjs

Expected: scenario result captured

Step 3: Commit

git add scripts/test_subagent_blackhole_scenarios.mjs
git commit -m "test: add done but not forwarded scenario"

Task 30: Add missing-completion-event scenario

Files:

Modify: scripts/test_subagent_blackhole_scenarios.mjs

Step 1: Write the scenario

no bounce, no completion receipt, beyond SLA -> suspect delivery failure

Step 2: Run tests

Run: node scripts/test_subagent_blackhole_scenarios.mjs

Expected: scenario result captured

Step 3: Commit

git add scripts/test_subagent_blackhole_scenarios.mjs
git commit -m "test: add missing completion event scenario"

Task 31: Add repeated-failure escalation scenario

Files:

Modify: scripts/test_subagent_blackhole_scenarios.mjs

Step 1: Write the scenario

fetch_history fails -> respawn fails -> blocked
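
A scenario like this one can be simulated without any live session by feeding scripted outcomes through a decision function and recording the trace. The sketch below is self-contained; `nextDecision`, `runScenario`, and the mapping of a successful recovery to the `recovered` status are assumptions for illustration.

```javascript
// Hypothetical decision ladder, restated here so the scenario is self-contained.
function nextDecision(attempts) {
  if (!attempts.includes('fetch_history')) return 'fetch_history';
  if (!attempts.includes('respawn')) return 'respawn';
  return 'blocked';
}

// Hypothetical scenario runner. `outcomes` maps an action name to true (the action
// recovered delivery) or false (it failed). Returns the decision trace and final status.
function runScenario(outcomes) {
  const attempts = [];
  const trace = [];
  for (;;) {
    const decision = nextDecision(attempts);
    trace.push(decision);
    if (decision === 'blocked') {
      return { trace, finalStatus: 'blocked' };
    }
    if (outcomes[decision]) {
      // Assumption: a successful recovery action maps to the `recovered` status.
      return { trace, finalStatus: 'recovered' };
    }
    attempts.push(decision);
  }
}

const result = runScenario({ fetch_history: false, respawn: false });
console.log(result.trace.join(' -> '), '=>', result.finalStatus);
// prints "fetch_history -> respawn -> blocked => blocked"
```

Because the outcomes are plain data, the same runner can cover the earlier scenarios by changing only the input object.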

Step 2: Run tests

Run: node scripts/test_subagent_blackhole_scenarios.mjs

Expected: scenario result captured

Step 3: Commit

git add scripts/test_subagent_blackhole_scenarios.mjs
git commit -m "test: add repeated blackhole escalation scenario"

Task 32: Run the full local watchdog test set

Files:

Modify if needed: scripts/test_subagent_delivery_watchdog.mjs
Modify if needed: scripts/test_subagent_blackhole_scenarios.mjs

Step 1: Run the combined tests

Run:

node scripts/test_subagent_delivery_watchdog.mjs
node scripts/test_subagent_blackhole_scenarios.mjs

Expected: PASS

Step 2: Fix only minimal wiring needed for all-pass

Keep changes scoped to watchdog logic/tests.

Step 3: Commit

git add scripts/test_subagent_delivery_watchdog.mjs scripts/test_subagent_blackhole_scenarios.mjs scripts/subagent_delivery_watchdog.mjs
git commit -m "test: pass full subagent blackhole watchdog suite"

Task 33: Peer review watchdog state logic

Files:

Review: scripts/subagent_delivery_watchdog.mjs
Review: scripts/test_subagent_delivery_watchdog.mjs

Step 1: Request reviewer focus on receipt state logic

Verify statuses and transitions match B-class failure goals.

Step 2: Record reviewer verdict

Include commands and findings.

Step 3: Commit any follow-up fixes if needed

# only if reviewer requests changes
git add <changed-files>
git commit -m "fix: address watchdog state review feedback"

Task 34: Peer review recovery decisions

Files:

Review: scripts/subagent_delivery_watchdog.mjs
Review: docs/runbooks/subagent-delivery-recovery.md

Step 1: Request reviewer focus on recovery ordering

Verify fetch-history before respawn and blocked escalation.

Step 2: Record reviewer verdict

Include commands and findings.

Step 3: Commit any follow-up fixes if needed

# only if reviewer requests changes
git add <changed-files>
git commit -m "fix: address recovery decision review feedback"

Task 35: Peer review scenario coverage and handoff

Files:

Review: scripts/test_subagent_blackhole_scenarios.mjs
Review: docs/runbooks/subagent-anti-blackhole.md
Review: docs/runbooks/subagent-delivery-recovery.md

Step 1: Request reviewer focus on blackhole realism

Confirm this targets fake timeout / no-bounce cases, not just slow work.

Step 2: Record verification output

Include exact commands and reviewer verdict.

Step 3: Final state

Leave task in pending_verification; do not mark complete.
