feat: export continuity hard-gate and watchdog workstream

This commit is contained in:
2026-04-24 12:36:31 +08:00
commit 111cf27634
24 changed files with 3648 additions and 0 deletions

248
AGENTS.md Normal file
View File

@@ -0,0 +1,248 @@
# AGENTS.md - Your Workspace
This folder is home. Treat it that way.
## First Run
If `BOOTSTRAP.md` exists, that's your birth certificate. Follow it, figure out who you are, then delete it. You won't need it again.
## Every Session
Before doing anything else:
1. Read `SOUL.md` — this is who you are
2. Read `USER.md` — this is who you're helping
3. Read `WORKFLOW.md` — this is your active operating rulebook
4. Read `memory/YYYY-MM-DD.md` (today + yesterday) for recent context
5. **If in MAIN SESSION** (direct chat with your human): Also read `MEMORY.md`
Don't ask permission. Just do it.
### Critical Operating Rule
If you dispatch a subagent and **5 minutes pass without a result**, you must immediately:
1. Check subagent status (`done` / `active`)
2. If no result arrived or forwarding seems broken, **respawn immediately**
3. If it is already done but the result was not delivered, fetch it via `sessions_history` when permitted and sync it back
4. **Report status to your human immediately** — never let it become a black hole
5. if want user to choose/select/prefer options , use telegram-inline-button
### Long-Task Governor
If the request is **not ordinary single-turn general chat**, you must read and follow:
- `skills/long-task-governor/SKILL.md`
Use it whenever work requires any of:
- follow-up work
- external waiting
- repo / file / system inspection
- task state
- checkpointing
- subagent delegation
- any "half-done" intermediate state
Do not treat non-chat work as ordinary reply flow.
### Reply Closure Rule
On Telegram, if the final actionable part of your reply needs the human to decide, confirm, approve, stop, continue, rerun, or choose a next step:
- do **not** let plain text go out first
- do **not** say buttons will be used unless you are actually sending them first
- prefer sending real inline buttons with the `message` tool and then return `NO_REPLY`
- otherwise execute the most reasonable next step directly
If you fail this, call it a workflow violation and correct it immediately.
## Memory
You wake up fresh each session. These files are your continuity:
- **Daily notes:** `memory/YYYY-MM-DD.md` (create `memory/` if needed) — raw logs of what happened
- **Long-term:** `MEMORY.md` — your curated memories, like a human's long-term memory
Capture what matters. Decisions, context, things to remember. Skip the secrets unless asked to keep them.
### 🧠 MEMORY.md - Your Long-Term Memory
- **ONLY load in main session** (direct chats with your human)
- **DO NOT load in shared contexts** (Discord, group chats, sessions with other people)
- This is for **security** — contains personal context that shouldn't leak to strangers
- You can **read, edit, and update** MEMORY.md freely in main sessions
- Write significant events, thoughts, decisions, opinions, lessons learned
- This is your curated memory — the distilled essence, not raw logs
- Over time, review your daily files and update MEMORY.md with what's worth keeping
### 📝 Write It Down - No "Mental Notes"!
- **Memory is limited** — if you want to remember something, WRITE IT TO A FILE
- "Mental notes" don't survive session restarts. Files do.
- When someone says "remember this" → update `memory/YYYY-MM-DD.md` or relevant file
- When you learn a lesson → update AGENTS.md, TOOLS.md, or the relevant skill
- When you make a mistake → document it so future-you doesn't repeat it
- **Text > Brain** 📝
## Safety
- Don't exfiltrate private data. Ever.
- Don't run destructive commands without asking.
- `trash` > `rm` (recoverable beats gone forever)
- When in doubt, ask.
## External vs Internal
**Safe to do freely:**
- Read files, explore, organize, learn
- Search the web, check calendars
- Work within this workspace
**Ask first:**
- Sending emails, tweets, public posts
- Anything that leaves the machine
- Anything you're uncertain about
## Group Chats
You have access to your human's stuff. That doesn't mean you _share_ their stuff. In groups, you're a participant — not their voice, not their proxy. Think before you speak.
### 💬 Know When to Speak!
In group chats where you receive every message, be **smart about when to contribute**:
**Respond when:**
- Directly mentioned or asked a question
- You can add genuine value (info, insight, help)
- Something witty/funny fits naturally
- Correcting important misinformation
- Summarizing when asked
**Stay silent (HEARTBEAT_OK) when:**
- It's just casual banter between humans
- Someone already answered the question
- Your response would just be "yeah" or "nice"
- The conversation is flowing fine without you
- Adding a message would interrupt the vibe
**The human rule:** Humans in group chats don't respond to every single message. Neither should you. Quality > quantity. If you wouldn't send it in a real group chat with friends, don't send it.
**Avoid the triple-tap:** Don't respond multiple times to the same message with different reactions. One thoughtful response beats three fragments.
Participate, don't dominate.
### 😊 React Like a Human!
On platforms that support reactions (Discord, Slack), use emoji reactions naturally:
**React when:**
- You appreciate something but don't need to reply (👍, ❤️, 🙌)
- Something made you laugh (😂, 💀)
- You find it interesting or thought-provoking (🤔, 💡)
- You want to acknowledge without interrupting the flow
- It's a simple yes/no or approval situation (✅, 👀)
**Why it matters:**
Reactions are lightweight social signals. Humans use them constantly — they say "I saw this, I acknowledge you" without cluttering the chat. You should too.
**Don't overdo it:** One reaction per message max. Pick the one that fits best.
## Tools
Skills provide your tools. When you need one, check its `SKILL.md`. Keep local notes (camera names, SSH details, voice preferences) in `TOOLS.md`.
**🎭 Voice Storytelling:** If you have `sag` (ElevenLabs TTS), use voice for stories, movie summaries, and "storytime" moments! Way more engaging than walls of text. Surprise people with funny voices.
**📝 Platform Formatting:**
- **Discord/WhatsApp:** No markdown tables! Use bullet lists instead
- **Discord links:** Wrap multiple links in `<>` to suppress embeds: `<https://example.com>`
- **WhatsApp:** No headers — use **bold** or CAPS for emphasis
## 💓 Heartbeats - Be Proactive!
When you receive a heartbeat poll (message matches the configured heartbeat prompt), don't just reply `HEARTBEAT_OK` every time. Use heartbeats productively!
Default heartbeat prompt:
`Read HEARTBEAT.md if it exists (workspace context). Follow it strictly. Do not infer or repeat old tasks from prior chats. If nothing needs attention, reply HEARTBEAT_OK.`
You are free to edit `HEARTBEAT.md` with a short checklist or reminders. Keep it small to limit token burn.
### Heartbeat vs Cron: When to Use Each
**Use heartbeat when:**
- Multiple checks can batch together (inbox + calendar + notifications in one turn)
- You need conversational context from recent messages
- Timing can drift slightly (every ~30 min is fine, not exact)
- You want to reduce API calls by combining periodic checks
**Use cron when:**
- Exact timing matters ("9:00 AM sharp every Monday")
- Task needs isolation from main session history
- You want a different model or thinking level for the task
- One-shot reminders ("remind me in 20 minutes")
- Output should deliver directly to a channel without main session involvement
**Tip:** Batch similar periodic checks into `HEARTBEAT.md` instead of creating multiple cron jobs. Use cron for precise schedules and standalone tasks.
**Things to check (rotate through these, 2-4 times per day):**
- **Emails** - Any urgent unread messages?
- **Calendar** - Upcoming events in next 24-48h?
- **Mentions** - Twitter/social notifications?
- **Weather** - Relevant if your human might go out?
**Track your checks** in `memory/heartbeat-state.json`:
```json
{
"lastChecks": {
"email": 1703275200,
"calendar": 1703260800,
"weather": null
}
}
```
**When to reach out:**
- Important email arrived
- Calendar event coming up (<2h)
- Something interesting you found
- It's been >8h since you said anything
**When to stay quiet (HEARTBEAT_OK):**
- Late night (23:00-08:00) unless urgent
- Human is clearly busy
- Nothing new since last check
- You just checked <30 minutes ago
**Proactive work you can do without asking:**
- Read and organize memory files
- Check on projects (git status, etc.)
- Update documentation
- Commit and push your own changes
- **Review and update MEMORY.md** (see below)
### 🔄 Memory Maintenance (During Heartbeats)
Periodically (every few days), use a heartbeat to:
1. Read through recent `memory/YYYY-MM-DD.md` files
2. Identify significant events, lessons, or insights worth keeping long-term
3. Update `MEMORY.md` with distilled learnings
4. Remove outdated info from MEMORY.md that's no longer relevant
Think of it like a human reviewing their journal and updating their mental model. Daily files are raw notes; MEMORY.md is curated wisdom.
The goal: Be helpful without being annoying. Check in a few times a day, do useful background work, but respect quiet time.
## Make It Yours
This is a starting point. Add your own conventions, style, and rules as you figure out what works.

8
README.md Normal file
View File

@@ -0,0 +1,8 @@
# Approved Plan Continuity Hard Gate
A focused extraction of recent OpenClaw workflow hardening work around:
- approved-plan continuity hard-gate
- dispatch receipt binding
- anti-blackhole / completion-delivery watchdog groundwork
This repo was exported from a larger workspace to isolate the relevant implementation and tests.

166
WORKFLOW.md Normal file
View File

@@ -0,0 +1,166 @@
## Subagent Timeout Rule
Subagent 指派後 **5 分鐘內若無結果**
1. 立刻查狀態done / active
2. 若無結果回拋或疑似轉發失敗 → **立刻重派**(不等待)
3. 若已 done 但結果未送達 → 以 `sessions_history` 直接拉取並同步到 Forum / 回覆
4. **同時立即向總管回報**,不可黑洞
## Communication Rule
- 先講結論
- 回覆簡短
- 若失敗,直接明講失敗與目前狀態
- 不要把失敗包裝成「進行中」
- **任何需要重啟 gateway 的動作,必須先取得總管明確同意,不能先做後報**
## Long-Task Governor Rule
- 只要工作**不是 ordinary single-turn general chat**,就必須套用 `skills/long-task-governor/SKILL.md`
- ordinary general chat 的判準:只有在「可單輪完整回完、無後續追蹤、無外部等待、無查檔/查系統/查資料、無 task state、無 checkpoint、無 subagent、無做到一半中間態」全部成立時才可視為一般 chat。
- **只要任一條件不成立,就視為 long task**。
- 一旦進入 long task
- 必須建立或更新最小 task record
- 必須使用五種正式狀態之一:`active / waiting_user / blocked / paused / pending_verification`
- 必須遵守 checkpoint 五欄格式
- 必須遵守 no-fake-progress 與 stop-clock gate
- 若回覆前其實已進入非一般 chat 工作流,卻仍以「普通聊天」方式直接回完,視為流程違規。
## Silent Long-Task Rule
- 若 long-task 啟動後**不會自然立刻產生下一則對總管的輸出**,則它屬於 `silent long-task`
- 任何 silent long-task 在啟動時都必須同步定義:
- 第一個回報節點(時間 / 階段 / 事件)
- 若尚未完成時的回報內容
- 若沒有新證據時的狀態轉移(`paused` / `blocked`
- 若最後需要總管判定handoff 方式(例如 button-path
- 任何 silent long-task 都不得只靠內部記憶與口頭承諾維持;應優先綁定外部化 checkpoint / reminder / cron 類觸發。
- 若沒有外部化觸發可綁,則該任務**不應以 silent 模式啟動**,而應維持在立即 follow-up 模式。
- 啟動前應參考:`docs/runbooks/silent-long-task-decision-tree.md`
- 若 silent long-task 啟動後沒有這個強制回報節點,之後出現「為什麼沒消息了?」就視為流程違規,而不是單純延遲。
## Checkpoint Rule
- checkpoint **不是結案**;它只是長任務中的階段回報,不代表可以在送出後直接停住。
- checkpoint 發出後,只能進入以下其中一種狀態:
- **繼續執行**
- **待您回覆**
- **阻塞中**
- **Pending Verification**
- **禁止 checkpoint 後靜默停住**。若沒有後續行動、沒有明確等待對象、也沒有狀態轉移,則不應送出該 checkpoint。
- 每次 long task checkpoint 一律固定包含以下五欄:
- **目前狀態**
- **本段完成**
- **下一步**
- **下次回報條件**
- **是否需要您介入**
- 若 checkpoint 已承諾回報條件、時間點或觸發事件,但後續未依承諾履行,視為 **`checkpoint 失續`**。
- 若任務尚未結束,就必須在 checkpoint 後明確持續執行、等待回覆、標記阻塞,或進入 Pending Verification不得用 checkpoint 取代後續推進。
## Watchdog Rule
- **Checkpoint / gate / 自查** 是給 Eve 自己的內部規則;**watchdog** 則是外部巡查機制,兩者不能混為一談。
- watchdog 必須**獨立於 Eve 自己的記憶與自我提醒**;不能把「我心裡記得 10 分鐘要回報」當成 watchdog。
- 任何 long task 一旦承諾固定週期回報,就必須**同時註冊外部 watchdog**。
- 外部 watchdog 的**預設週期是 10 分鐘**;除非該任務另有明確指定。
- watchdog 到點時,若沒有新的里程碑或回報,必須**強制觸發至少一種外部行動**
- 對總管主動回報
- 查 subagent 狀態 / 拉 history
- 重派,或改成本機直接查
- 若 watchdog 到點後,**未觸發上述任一行動**,定義為 **watchdog 失效 / 視同流程故障**
## No-Fake-Progress Rule
- **狀態同步不算進度**。以下動作一律不得宣稱為 long task 的新進展:
- 單純更新 `lastMilestoneAt`
- 單純更新 `lastObservedActivityAt`
- 單純回應 reminder / watchdog 催辦
- 重複回報「仍無新證據」
- 若 checkpoint 內容只有上述項目,應明確標示為**狀態同步**,不得寫成「本段完成了修復進度」。
- 若連續 **3 次 checkpoint** 都沒有出現以下任一項,視為**空轉 / 停滯**
- 新的檔案變更
- 新的驗證輸出
- 新的決策或結論
- blocker 狀態改變
- 一旦判定為**空轉 / 停滯**,必須立刻擇一處理,不得繼續把任務維持在表面 active
- 改判為 `paused`
- 改判為 `blocked`
- 明講目前只剩狀態同步,停止週期性續報
- 回報總管並請求新的實作方向或決策
- **禁止用回報節奏冒充任務推進**。有 checkpoint 並不代表有進度;若沒有新證據,就必須承認沒有推進。
- **禁止讓 watchdog 變成被服務的對象**。watchdog 的存在是為了監督 long task不是讓 Eve 只靠更新 milestone 來續命。
### Long Task Stop-Clock Gate
- 若 long task 已進入空轉 / 停滯,就必須**停止時鐘**
- 停用週期性 reminder / watchdog
- 明確標記為 `paused` / `blocked`
- 若任務仍保持 `active`,就必須能指出**此刻正在推進的具體動作**;不能只剩「等待下次回報」。
- 若無法指出具體推進動作,預設應改判為 `paused`,而不是繼續續報。
- 例外僅限:
- 正在等待外部長時間執行且已有可驗證證據(例如 build/test/deploy 正在跑)
- 正在等待總管回覆且已明確標示 `待您回覆`
- 已進入 `Pending Verification`
### Telegram Choice Gate硬閘門
在 Telegram 上,只要我的回覆是在請總管「選一個 / 確認 / 延後 / 決定下一步」,就**禁止**用純文字收尾成:
- `A / B / C`
- `1 / 2 / 3`
- `如果你要...`
- `如果你要,我下一步可以:...`
- `要不要我...`
正確做法只有兩種:
1. **直接替總管做最合理的下一步**(若不需要總管決策)
2. **改用 Telegram inline buttons**(若真的需要總管選)
補充規則:
- 若最後一段本質上是在讓總管做選擇,就不要再用一般 chat reply 直接送出
- 若已經寫出 A/B/C 或 1/2/3視為還沒完成回覆必須先改寫成按鈕或改成直接執行
- 若我最後仍送出純文字選單,這不是記憶缺漏,而是**違反 Telegram Choice Gate**
- 例外:純資訊訊息、需要總管自由輸入文字的問題、或超過 5 個選項的情境
- 超過 5 個選項時,先縮成較高層選擇,或用 `Show more` / `更多` 類按鈕,不要直接丟一長串文字選單
- **當我提供「A/B/C」「1/2/3」這種下一步選項時預設應直接用按鈕不應再問總管用哪一個文字代號回覆**
- **若總管明確指出我又犯了這類錯,下一步應優先修 gate / 規則 / 流程,不要再用新的文字選單問總管要怎麼修**
違規標準說法:
- `我違反 Telegram Choice Gate這則本應使用 inline buttons卻用了純文字選單。`
### Telegram修錯優先規則
若總管指出「你的最後幾行本來就該是按鈕」或同義意見:
- 不要再用新的純文字 `A/B/C``1/2/3` 問總管下一步
- 若需要總管同意修改,應直接用按鈕送出 `OK / 先看改法` 之類的選項
- 若總管已明確表示「就是要你修」,優先直接進入修復流程,不要再把修復方案包成純文字選單
### Reply Closure Button Gate
- 在 Telegram 上,只要**回覆的最後可執行部分**需要總管做選擇、確認、批准、停止、繼續、重跑、收下或決定下一步,就不能只用普通文字結尾。
- 這時只能做兩件事:
1. **真的送出 inline buttons**
2. 若其實不需要總管決策,**直接執行最合理的下一步**
- 「文字裡說會用按鈕」但沒有實際送出按鈕,視為**同樣違規**。
- 這條 gate 特別適用於:
- long-task checkpoint 收尾
- 測試結果判定
- accept / rerun / stop 類互動
- approval / confirm 類收尾
### Two-phase gate硬閘門先報備→再執行
以下動作一律視為「對外 / 非瑣碎」:
- 發 Lobby 訊息message tool send
- 指派 / 重派 subagentsessions_spawn
- 重啟 / stop / start 任一 systemd 服務systemctl
- 修改任何非瑣碎檔案(包含看板/設定/程式碼)
**執行規則:**
1) 在私聊先回一行「報備」:`我要做X原因Y風險Z請回覆 OK 才會執行`
2) **在你回覆精確字串 `OK` 前,嚴禁呼叫任何上述工具/動作**
3) 若我不小心已經執行,必須立刻回報「違規」並停止後續動作(不得補做當作沒發生)。
## Multi-Agent Broadcast Mode
- 預設工作模式:**私聊指揮 + Lobby 完整代理會議轉播**
- 總管在私聊下指令Alice 在內部分派次代理
- 次代理之間的重要互動、追問、分歧、覆核意見,應盡量轉播到 Lobby
- 最終收斂結論仍回覆總管私聊
- 轉播時應標示代理名/角色,降低閱讀混亂
- 若任務敏感或涉及不宜外放內容,先暫停完整轉播並向總管確認

View File

@@ -0,0 +1,410 @@
# Approved-Plan Continuity Hard-Gate Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Prevent approved-plan flows from stopping after a task completes by requiring a real next-dispatch receipt unless the workflow explicitly transitions to `waiting_user`, `blocked`, or `pending_verification`.
**Architecture:** Build this in very small slices. First define continuity receipt fields and failure states, then pin the continuity failure with fail-first tests, then implement a minimal evaluator, then bind planner output to real dispatch receipts, then enforce reply-closure continuity. Keep every slice narrow enough to verify in isolation.
**Tech Stack:** Node.js, MJS test runners, file-backed JSON receipts, force-recall hook integration
---
### Task 1: Define continuity receipt fields
**Files:**
- Create: `docs/runbooks/approved-plan-continuity.md`
**Step 1: Write only the receipt field list**
- Define:
- `planId`
- `currentTask`
- `nextDerivedAction`
- `dispatchedAt`
**Step 2: Verify file exists**
Run: `test -f docs/runbooks/approved-plan-continuity.md && echo OK`
Expected: `OK`
**Step 3: Commit**
```bash
git add docs/runbooks/approved-plan-continuity.md
git commit -m "docs: define continuity dispatch receipt core fields"
```
### Task 2: Define receipt linkage fields
**Files:**
- Modify: `docs/runbooks/approved-plan-continuity.md`
**Step 1: Add linkage fields**
- Define:
- `dispatchRunId`
- `childSessionKey`
- `replyClosureState`
**Step 2: Verify field names exist**
Run: `grep -n "dispatchRunId\|childSessionKey\|replyClosureState" docs/runbooks/approved-plan-continuity.md`
Expected: matching lines found
**Step 3: Commit**
```bash
git add docs/runbooks/approved-plan-continuity.md
git commit -m "docs: define continuity receipt linkage fields"
```
### Task 3: Define legal terminal states
**Files:**
- Modify: `docs/runbooks/approved-plan-continuity.md`
**Step 1: Add legal closure states**
- Define the only legal non-dispatch closures:
- `waiting_user`
- `blocked`
- `pending_verification`
**Step 2: Verify text exists**
Run: `grep -n "waiting_user\|blocked\|pending_verification" docs/runbooks/approved-plan-continuity.md`
Expected: matching lines found
**Step 3: Commit**
```bash
git add docs/runbooks/approved-plan-continuity.md
git commit -m "docs: define legal approved-plan terminal states"
```
### Task 4: Create continuity gate script skeleton
**Files:**
- Create: `scripts/approved_plan_continuity_gate.mjs`
**Step 1: Add CLI skeleton**
- Support `--input` and placeholder JSON output.
**Step 2: Verify it runs**
Run: `node scripts/approved_plan_continuity_gate.mjs --compact --input /dev/null || true`
Expected: placeholder response or controlled failure
**Step 3: Commit**
```bash
git add scripts/approved_plan_continuity_gate.mjs
git commit -m "chore: add approved-plan continuity gate skeleton"
```
### Task 5: Create continuity gate test skeleton
**Files:**
- Create: `scripts/test_approved_plan_continuity_gate.mjs`
**Step 1: Add test harness skeleton**
- Basic runner + fixture helper only.
**Step 2: Verify it runs**
Run: `node scripts/test_approved_plan_continuity_gate.mjs || true`
Expected: test runner executes
**Step 3: Commit**
```bash
git add scripts/test_approved_plan_continuity_gate.mjs
git commit -m "test: add continuity gate test skeleton"
```
### Task 6: Add fail-first test for missing dispatch receipt
**Files:**
- Modify: `scripts/test_approved_plan_continuity_gate.mjs`
**Step 1: Write the test**
- task complete
- next action known
- no dispatch receipt
- not waiting/blocked/pending_verification
- expect continuity failure
**Step 2: Run tests to verify it fails**
Run: `node scripts/test_approved_plan_continuity_gate.mjs`
Expected: FAIL
**Step 3: Commit**
```bash
git add scripts/test_approved_plan_continuity_gate.mjs
git commit -m "test: fail when approved plan step stops without dispatch receipt"
```
### Task 7: Add pass test for existing dispatch receipt
**Files:**
- Modify: `scripts/test_approved_plan_continuity_gate.mjs`
**Step 1: Write the test**
- task complete
- next action known
- dispatch receipt exists
- expect pass
**Step 2: Run tests to verify it fails**
Run: `node scripts/test_approved_plan_continuity_gate.mjs`
Expected: FAIL until evaluator exists
**Step 3: Commit**
```bash
git add scripts/test_approved_plan_continuity_gate.mjs
git commit -m "test: allow approved plan step with dispatch receipt"
```
### Task 8: Add pass test for waiting_user closure
**Files:**
- Modify: `scripts/test_approved_plan_continuity_gate.mjs`
**Step 1: Write the test**
- task complete
- next action known
- no dispatch receipt
- replyClosureState=`waiting_user`
- expect pass
**Step 2: Run tests to verify it fails**
Run: `node scripts/test_approved_plan_continuity_gate.mjs`
Expected: FAIL until evaluator exists
**Step 3: Commit**
```bash
git add scripts/test_approved_plan_continuity_gate.mjs
git commit -m "test: allow waiting_user continuity closure"
```
### Task 9: Add pass test for blocked closure
**Files:**
- Modify: `scripts/test_approved_plan_continuity_gate.mjs`
**Step 1: Write the test**
- replyClosureState=`blocked`
- expect pass
**Step 2: Run tests to verify it fails**
Run: `node scripts/test_approved_plan_continuity_gate.mjs`
Expected: FAIL until evaluator exists
**Step 3: Commit**
```bash
git add scripts/test_approved_plan_continuity_gate.mjs
git commit -m "test: allow blocked continuity closure"
```
### Task 10: Add pass test for pending_verification closure
**Files:**
- Modify: `scripts/test_approved_plan_continuity_gate.mjs`
**Step 1: Write the test**
- replyClosureState=`pending_verification`
- expect pass
**Step 2: Run tests to verify it fails**
Run: `node scripts/test_approved_plan_continuity_gate.mjs`
Expected: FAIL until evaluator exists
**Step 3: Commit**
```bash
git add scripts/test_approved_plan_continuity_gate.mjs
git commit -m "test: allow pending verification continuity closure"
```
### Task 11: Implement minimal continuity evaluator
**Files:**
- Modify: `scripts/approved_plan_continuity_gate.mjs`
**Step 1: Add evaluator logic**
- Fail only when:
- approved plan task complete
- next action known
- no dispatch receipt
- and not in legal terminal state
**Step 2: Run tests**
Run: `node scripts/test_approved_plan_continuity_gate.mjs`
Expected: PASS for Tasks 6-10
**Step 3: Commit**
```bash
git add scripts/approved_plan_continuity_gate.mjs scripts/test_approved_plan_continuity_gate.mjs
git commit -m "feat: evaluate approved-plan continuity closure"
```
### Task 12: Create dispatch binding skeleton
**Files:**
- Create: `scripts/approved_plan_dispatch_binding.mjs`
**Step 1: Add CLI skeleton**
- Support input parsing and placeholder receipt output.
**Step 2: Verify it runs**
Run: `node scripts/approved_plan_dispatch_binding.mjs --compact --input /dev/null || true`
Expected: placeholder response or controlled failure
**Step 3: Commit**
```bash
git add scripts/approved_plan_dispatch_binding.mjs
git commit -m "chore: add approved-plan dispatch binding skeleton"
```
### Task 13: Add fail-first test for planner action without bound dispatch
**Files:**
- Modify: `scripts/test_approved_plan_continuity_gate.mjs`
**Step 1: Write the test**
- planner returns `derivedAction`
- but no dispatch receipt is written
- expect fail
**Step 2: Run tests to verify it fails**
Run: `node scripts/test_approved_plan_continuity_gate.mjs`
Expected: FAIL
**Step 3: Commit**
```bash
git add scripts/test_approved_plan_continuity_gate.mjs
git commit -m "test: fail when derived action has no bound dispatch"
```
### Task 14: Add pass test for planner action with bound dispatch receipt
**Files:**
- Modify: `scripts/test_approved_plan_continuity_gate.mjs`
**Step 1: Write the test**
- planner returns `derivedAction`
- receipt is written
- expect pass
**Step 2: Run tests to verify it fails**
Run: `node scripts/test_approved_plan_continuity_gate.mjs`
Expected: FAIL until binding exists
**Step 3: Commit**
```bash
git add scripts/test_approved_plan_continuity_gate.mjs
git commit -m "test: pass when derived action is bound to dispatch receipt"
```
### Task 15: Define continuity receipt state storage
**Files:**
- Create: `state/approved-plan-continuity/.gitkeep`
- Create: `state/approved-plan-continuity/README.md`
**Step 1: Write the state shape**
- Include receipt filenames and minimum fields.
**Step 2: Verify files exist**
Run: `test -f state/approved-plan-continuity/README.md && test -f state/approved-plan-continuity/.gitkeep && echo OK`
Expected: `OK`
**Step 3: Commit**
```bash
git add state/approved-plan-continuity/.gitkeep state/approved-plan-continuity/README.md
git commit -m "docs: define approved-plan continuity receipt storage"
```
### Task 16: Implement minimal dispatch receipt writer
**Files:**
- Modify: `scripts/approved_plan_dispatch_binding.mjs`
**Step 1: Write dispatch receipts**
- When a known action is truly bound, write file-backed receipt.
**Step 2: Run tests**
Run: `node scripts/test_approved_plan_continuity_gate.mjs`
Expected: binding tests pass
**Step 3: Commit**
```bash
git add scripts/approved_plan_dispatch_binding.mjs scripts/test_approved_plan_continuity_gate.mjs state/approved-plan-continuity/.gitkeep state/approved-plan-continuity/README.md
git commit -m "feat: write approved-plan continuity dispatch receipts"
```
### Task 17: Add fail-first regression for “task done but stopped”
**Files:**
- Modify: `scripts/test_approved_plan_continuity_gate.mjs`
**Step 1: Write the regression test**
- completed task
- next step known
- no dispatch receipt
- reply tries to close anyway
- expect violation
**Step 2: Run tests to verify it fails if regression exists**
Run: `node scripts/test_approved_plan_continuity_gate.mjs`
Expected: PASS after fix, but must detect regression if broken
**Step 3: Commit**
```bash
git add scripts/test_approved_plan_continuity_gate.mjs
git commit -m "test: lock regression for task done but stopped"
```
### Task 18: Hook continuity gate into force-recall handler
**Files:**
- Modify: `hooks/force-recall/handler.ts`
**Step 1: Wire continuity gate into reply closure path**
- Enforce continuity before normal closeout.
**Step 2: Run targeted verification**
Run:
- `node scripts/test_approved_plan_continuity_gate.mjs`
- `node scripts/test_force_recall_long_task_preflight.mjs`
- `node --check hooks/force-recall/handler.ts`
Expected: PASS
**Step 3: Commit**
```bash
git add hooks/force-recall/handler.ts scripts/approved_plan_continuity_gate.mjs scripts/approved_plan_dispatch_binding.mjs scripts/test_approved_plan_continuity_gate.mjs
git commit -m "feat: enforce approved-plan continuity at reply closure"
```
### Task 19: Peer review continuity evaluator and binding
**Files:**
- Review: `scripts/approved_plan_continuity_gate.mjs`
- Review: `scripts/approved_plan_dispatch_binding.mjs`
- Review: `scripts/test_approved_plan_continuity_gate.mjs`
**Step 1: Request review**
- Focus: does this really fix continuity failure instead of adding prompt-only guidance?
**Step 2: Record verdict**
- Include commands and findings.
**Step 3: Apply follow-up fixes if needed**
```bash
# only if reviewer requests changes
git add <changed-files>
git commit -m "fix: address continuity gate review feedback"
```
### Task 20: Peer review hook integration and handoff
**Files:**
- Review: `hooks/force-recall/handler.ts`
- Review: `docs/runbooks/approved-plan-continuity.md`
- Review: `state/approved-plan-continuity/README.md`
**Step 1: Request review**
- Focus: can approved-plan task completion still stop without dispatch receipt?
**Step 2: Record verification output**
- Include commands and reviewer verdict.
**Step 3: Final state**
- Leave task in `pending_verification`; do not mark complete.

View File

@@ -0,0 +1,686 @@
# Subagent Anti-Blackhole / Completion-Delivery Watchdog Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Prevent B-class fake timeouts where a subagent finishes, stalls, or loses its return path off-thread and the main conversation never receives a trustworthy completion update.
**Architecture:** Build this in very small layers: first define receipts and states, then pin the blackhole cases with fail-first tests, then implement deterministic receipt-state logic, then add done-but-not-forwarded recovery decisions, then add owner-visible reporting rules and scenario simulations. Keep all early slices file-backed and test-driven before touching any live-session integration.
**Tech Stack:** Node.js, MJS test runners, file-backed JSON state, OpenClaw subagent/session concepts, docs/runbooks
---
### Task 1: Define dispatch receipt fields
**Files:**
- Modify: `docs/runbooks/subagent-anti-blackhole.md`
**Step 1: Write the receipt field list**
- Define only dispatch fields:
- `runId`
- `childSessionKey`
- `dispatchAt`
- `expectedBy`
**Step 2: Verify file contains the new field names**
Run: `grep -n "runId\|childSessionKey\|dispatchAt\|expectedBy" docs/runbooks/subagent-anti-blackhole.md`
Expected: matching lines found
**Step 3: Commit**
```bash
git add docs/runbooks/subagent-anti-blackhole.md
git commit -m "docs: define subagent dispatch receipt fields"
```
### Task 2: Define completion receipt fields
**Files:**
- Modify: `docs/runbooks/subagent-anti-blackhole.md`
**Step 1: Write the completion field list**
- Define only completion fields:
- `completionReceivedAt`
- `forwardedToMain`
- `resultSource`
**Step 2: Verify file contains the new field names**
Run: `grep -n "completionReceivedAt\|forwardedToMain\|resultSource" docs/runbooks/subagent-anti-blackhole.md`
Expected: matching lines found
**Step 3: Commit**
```bash
git add docs/runbooks/subagent-anti-blackhole.md
git commit -m "docs: define subagent completion receipt fields"
```
### Task 3: Define watchdog statuses
**Files:**
- Modify: `docs/runbooks/subagent-anti-blackhole.md`
**Step 1: Add the status enum**
- Define:
- `active`
- `suspect_delivery_failure`
- `done_but_not_forwarded`
- `completed`
- `recovered`
- `blocked`
**Step 2: Verify status names exist**
Run: `grep -n "suspect_delivery_failure\|done_but_not_forwarded\|recovered" docs/runbooks/subagent-anti-blackhole.md`
Expected: matching lines found
**Step 3: Commit**
```bash
git add docs/runbooks/subagent-anti-blackhole.md
git commit -m "docs: define subagent watchdog statuses"
```
### Task 4: Define B-class failure modes
**Files:**
- Modify: `docs/runbooks/subagent-anti-blackhole.md`
**Step 1: Write the failure mode bullets**
- Add:
- done but not forwarded
- no completion event received
- session exists but no result bounce
- unclear slow-run vs delivery failure
**Step 2: Verify phrases exist**
Run: `grep -n "done but not forwarded\|completion event\|result bounce\|delivery failure" docs/runbooks/subagent-anti-blackhole.md`
Expected: matching lines found
**Step 3: Commit**
```bash
git add docs/runbooks/subagent-anti-blackhole.md
git commit -m "docs: define B-class subagent failure modes"
```
### Task 5: Create watchdog script skeleton
**Files:**
- Create: `scripts/subagent_delivery_watchdog.mjs`
**Step 1: Create the script shell**
- Add CLI parsing and a placeholder JSON response.
**Step 2: Verify it runs**
Run: `node scripts/subagent_delivery_watchdog.mjs --compact --input /dev/null || true`
Expected: script exists and is executable enough for next test work
**Step 3: Commit**
```bash
git add scripts/subagent_delivery_watchdog.mjs
git commit -m "chore: add subagent delivery watchdog skeleton"
```
### Task 6: Create watchdog test skeleton
**Files:**
- Create: `scripts/test_subagent_delivery_watchdog.mjs`
**Step 1: Create the test shell**
- Add basic harness structure and fixture runner.
**Step 2: Verify test file executes**
Run: `node scripts/test_subagent_delivery_watchdog.mjs || true`
Expected: test runner executes, even if failing
**Step 3: Commit**
```bash
git add scripts/test_subagent_delivery_watchdog.mjs
git commit -m "test: add subagent watchdog test skeleton"
```
### Task 7: Add active-before-SLA test
**Files:**
- Modify: `scripts/test_subagent_delivery_watchdog.mjs`
**Step 1: Write the test**
- dispatch exists
- no completion receipt yet
- current time still before SLA
- expect `active`
**Step 2: Run test to verify it fails**
Run: `node scripts/test_subagent_delivery_watchdog.mjs`
Expected: FAIL on missing logic
**Step 3: Commit**
```bash
git add scripts/test_subagent_delivery_watchdog.mjs
git commit -m "test: require active status before SLA breach"
```
### Task 8: Add suspect-delivery-failure test
**Files:**
- Modify: `scripts/test_subagent_delivery_watchdog.mjs`
**Step 1: Write the test**
- dispatch exists
- no completion receipt
- current time beyond SLA
- expect `suspect_delivery_failure`
**Step 2: Run test to verify it fails**
Run: `node scripts/test_subagent_delivery_watchdog.mjs`
Expected: FAIL on new assertion
**Step 3: Commit**
```bash
git add scripts/test_subagent_delivery_watchdog.mjs
git commit -m "test: detect suspected delivery failure after SLA"
```
### Task 9: Add completed-status test
**Files:**
- Modify: `scripts/test_subagent_delivery_watchdog.mjs`
**Step 1: Write the test**
- dispatch exists
- completion receipt exists
- expect `completed`
**Step 2: Run test to verify it fails**
Run: `node scripts/test_subagent_delivery_watchdog.mjs`
Expected: FAIL on completed path
**Step 3: Commit**
```bash
git add scripts/test_subagent_delivery_watchdog.mjs
git commit -m "test: close watchdog on completion receipt"
```
### Task 10: Add state shape fixture
**Files:**
- Create: `state/subagent-delivery-watchdog/README.md`
- Create: `state/subagent-delivery-watchdog/.gitkeep`
**Step 1: Define the state JSON shape in README**
- Include receipt fields and status fields.
**Step 2: Verify files exist**
Run: `test -f state/subagent-delivery-watchdog/README.md && test -f state/subagent-delivery-watchdog/.gitkeep && echo OK`
Expected: `OK`
**Step 3: Commit**
```bash
git add state/subagent-delivery-watchdog/README.md state/subagent-delivery-watchdog/.gitkeep
git commit -m "docs: define watchdog state storage shape"
```
### Task 11: Implement dispatch receipt write
**Files:**
- Modify: `scripts/subagent_delivery_watchdog.mjs`
**Step 1: Add a function to write dispatch receipt state**
- Only handle a new dispatch record.
**Step 2: Run tests**
Run: `node scripts/test_subagent_delivery_watchdog.mjs`
Expected: some tests still fail, but dispatch state path exists
**Step 3: Commit**
```bash
git add scripts/subagent_delivery_watchdog.mjs
git commit -m "feat: write subagent dispatch receipt state"
```
### Task 12: Implement completion receipt write
**Files:**
- Modify: `scripts/subagent_delivery_watchdog.mjs`
**Step 1: Add a function to write completion receipt state**
- Only update completion-related fields.
**Step 2: Run tests**
Run: `node scripts/test_subagent_delivery_watchdog.mjs`
Expected: some tests still fail, but completion data path exists
**Step 3: Commit**
```bash
git add scripts/subagent_delivery_watchdog.mjs
git commit -m "feat: write subagent completion receipt state"
```
### Task 13: Implement status recompute for active/completed/suspect
**Files:**
- Modify: `scripts/subagent_delivery_watchdog.mjs`
**Step 1: Add status recompute logic**
- Implement only:
- `active`
- `suspect_delivery_failure`
- `completed`
**Step 2: Run tests**
Run: `node scripts/test_subagent_delivery_watchdog.mjs`
Expected: Task 7-9 tests pass
**Step 3: Commit**
```bash
git add scripts/subagent_delivery_watchdog.mjs scripts/test_subagent_delivery_watchdog.mjs
git commit -m "feat: recompute basic watchdog statuses"
```
### Task 14: Add done-but-not-forwarded test
**Files:**
- Modify: `scripts/test_subagent_delivery_watchdog.mjs`
**Step 1: Write the test**
- child run marked done
- no completion receipt in main thread
- expect `done_but_not_forwarded`
**Step 2: Run tests to verify it fails**
Run: `node scripts/test_subagent_delivery_watchdog.mjs`
Expected: FAIL on new assertion
**Step 3: Commit**
```bash
git add scripts/test_subagent_delivery_watchdog.mjs
git commit -m "test: detect done but not forwarded state"
```
### Task 15: Implement done-but-not-forwarded state
**Files:**
- Modify: `scripts/subagent_delivery_watchdog.mjs`
**Step 1: Add done-but-not-forwarded detection**
- Use child-done signal + missing completion receipt.
**Step 2: Run tests**
Run: `node scripts/test_subagent_delivery_watchdog.mjs`
Expected: done-but-not-forwarded test passes
**Step 3: Commit**
```bash
git add scripts/subagent_delivery_watchdog.mjs scripts/test_subagent_delivery_watchdog.mjs
git commit -m "feat: detect done without forwarded completion"
```
### Task 16: Add first recovery-action test
**Files:**
- Modify: `scripts/test_subagent_delivery_watchdog.mjs`
**Step 1: Write fetch-history recovery test**
- done but not forwarded
- no prior recovery action
- expect recovery decision `fetch_history`
**Step 2: Run tests to verify it fails**
Run: `node scripts/test_subagent_delivery_watchdog.mjs`
Expected: FAIL on recovery decision
**Step 3: Commit**
```bash
git add scripts/test_subagent_delivery_watchdog.mjs
git commit -m "test: fetch history after missing forwarded completion"
```
### Task 17: Implement fetch-history recovery decision
**Files:**
- Modify: `scripts/subagent_delivery_watchdog.mjs`
**Step 1: Add minimal recovery decision logic**
- Return `fetch_history` for first-time done-but-not-forwarded.
**Step 2: Run tests**
Run: `node scripts/test_subagent_delivery_watchdog.mjs`
Expected: fetch-history recovery test passes
**Step 3: Commit**
```bash
git add scripts/subagent_delivery_watchdog.mjs scripts/test_subagent_delivery_watchdog.mjs
git commit -m "feat: recover with history fetch first"
```
### Task 18: Add respawn-escalation test
**Files:**
- Modify: `scripts/test_subagent_delivery_watchdog.mjs`
**Step 1: Write the respawn test**
- recovery already attempted once
- still no forwarded completion
- expect `respawn`
**Step 2: Run tests to verify it fails**
Run: `node scripts/test_subagent_delivery_watchdog.mjs`
Expected: FAIL on respawn decision
**Step 3: Commit**
```bash
git add scripts/test_subagent_delivery_watchdog.mjs
git commit -m "test: escalate to respawn after failed recovery"
```
### Task 19: Implement respawn decision
**Files:**
- Modify: `scripts/subagent_delivery_watchdog.mjs`
**Step 1: Add respawn logic**
- Return `respawn` when fetch-history path did not recover delivery.
**Step 2: Run tests**
Run: `node scripts/test_subagent_delivery_watchdog.mjs`
Expected: respawn test passes
**Step 3: Commit**
```bash
git add scripts/subagent_delivery_watchdog.mjs scripts/test_subagent_delivery_watchdog.mjs
git commit -m "feat: respawn after failed delivery recovery"
```
### Task 20: Add blocked-escalation test
**Files:**
- Modify: `scripts/test_subagent_delivery_watchdog.mjs`
**Step 1: Write the blocked test**
- repeated recovery failure
- expect `blocked` plus owner-visible reporting requirement
**Step 2: Run tests to verify it fails**
Run: `node scripts/test_subagent_delivery_watchdog.mjs`
Expected: FAIL on blocked escalation
**Step 3: Commit**
```bash
git add scripts/test_subagent_delivery_watchdog.mjs
git commit -m "test: escalate repeated delivery failures to blocked"
```
### Task 21: Implement blocked escalation
**Files:**
- Modify: `scripts/subagent_delivery_watchdog.mjs`
**Step 1: Add blocked escalation logic**
- repeated recovery failure -> `blocked`
**Step 2: Run tests**
Run: `node scripts/test_subagent_delivery_watchdog.mjs`
Expected: blocked escalation test passes
**Step 3: Commit**
```bash
git add scripts/subagent_delivery_watchdog.mjs scripts/test_subagent_delivery_watchdog.mjs
git commit -m "feat: block repeated subagent delivery failures"
```
### Task 22: Add owner-visible reporting rule for suspect state
**Files:**
- Modify: `WORKFLOW.md`
- Modify: `AGENTS.md`
- Modify: `docs/runbooks/subagent-anti-blackhole.md`
**Step 1: Add suspect-state reporting rule**
- If SLA is crossed with no completion receipt, the owner must be informed.
**Step 2: Verify text exists**
Run: `grep -RIn "SLA\|suspect_delivery_failure" WORKFLOW.md AGENTS.md docs/runbooks/subagent-anti-blackhole.md`
Expected: matching lines found
**Step 3: Commit**
```bash
git add WORKFLOW.md AGENTS.md docs/runbooks/subagent-anti-blackhole.md
git commit -m "docs: require reporting on suspect delivery failure"
```
### Task 23: Add owner-visible reporting rule for done-but-not-forwarded
**Files:**
- Modify: `WORKFLOW.md`
- Modify: `AGENTS.md`
- Modify: `docs/runbooks/subagent-anti-blackhole.md`
**Step 1: Add done-but-not-forwarded reporting rule**
- Must state that result exists but did not bounce back.
**Step 2: Verify text exists**
Run: `grep -RIn "done but not forwarded\|did not bounce back" WORKFLOW.md AGENTS.md docs/runbooks/subagent-anti-blackhole.md`
Expected: matching lines found
**Step 3: Commit**
```bash
git add WORKFLOW.md AGENTS.md docs/runbooks/subagent-anti-blackhole.md
git commit -m "docs: require reporting on missing forwarded completion"
```
### Task 24: Add rule to fetch history before respawn
**Files:**
- Modify: `WORKFLOW.md`
- Modify: `docs/runbooks/subagent-delivery-recovery.md`
**Step 1: Add the history-first rule**
- Done-but-not-forwarded should prefer `fetch_history` before `respawn`.
**Step 2: Verify text exists**
Run: `grep -RIn "fetch_history\|before respawn" WORKFLOW.md docs/runbooks/subagent-delivery-recovery.md`
Expected: matching lines found
**Step 3: Commit**
```bash
git add WORKFLOW.md docs/runbooks/subagent-delivery-recovery.md
git commit -m "docs: prefer history fetch before respawn"
```
### Task 25: Add no-silent-waiting-after-SLA rule
**Files:**
- Modify: `WORKFLOW.md`
- Modify: `AGENTS.md`
**Step 1: Add the no-silent-waiting rule**
- Once SLA is crossed, silent waiting is forbidden.
**Step 2: Verify text exists**
Run: `grep -RIn "silent waiting\|SLA" WORKFLOW.md AGENTS.md`
Expected: matching lines found
**Step 3: Commit**
```bash
git add WORKFLOW.md AGENTS.md
git commit -m "docs: forbid silent waiting after subagent SLA"
```
### Task 26: Create blackhole scenario test shell
**Files:**
- Create: `scripts/test_subagent_blackhole_scenarios.mjs`
**Step 1: Create the scenario test shell**
- Add empty scenario harness.
**Step 2: Verify file runs**
Run: `node scripts/test_subagent_blackhole_scenarios.mjs || true`
Expected: file executes, even if not complete
**Step 3: Commit**
```bash
git add scripts/test_subagent_blackhole_scenarios.mjs
git commit -m "test: add subagent blackhole scenario harness"
```
### Task 27: Add normal-completion scenario
**Files:**
- Modify: `scripts/test_subagent_blackhole_scenarios.mjs`
**Step 1: Write the scenario**
- dispatch -> completion receipt -> completed
**Step 2: Run tests**
Run: `node scripts/test_subagent_blackhole_scenarios.mjs`
Expected: scenario still may fail until engine wiring is ready
**Step 3: Commit**
```bash
git add scripts/test_subagent_blackhole_scenarios.mjs
git commit -m "test: add normal subagent completion scenario"
```
### Task 28: Add slow-but-active scenario
**Files:**
- Modify: `scripts/test_subagent_blackhole_scenarios.mjs`
**Step 1: Write the scenario**
- dispatch before SLA -> active
**Step 2: Run tests**
Run: `node scripts/test_subagent_blackhole_scenarios.mjs`
Expected: scenario result captured
**Step 3: Commit**
```bash
git add scripts/test_subagent_blackhole_scenarios.mjs
git commit -m "test: add slow but active subagent scenario"
```
### Task 29: Add done-but-not-forwarded scenario
**Files:**
- Modify: `scripts/test_subagent_blackhole_scenarios.mjs`
**Step 1: Write the scenario**
- child done -> no completion receipt -> fetch_history
**Step 2: Run tests**
Run: `node scripts/test_subagent_blackhole_scenarios.mjs`
Expected: scenario result captured
**Step 3: Commit**
```bash
git add scripts/test_subagent_blackhole_scenarios.mjs
git commit -m "test: add done but not forwarded scenario"
```
### Task 30: Add missing-completion-event scenario
**Files:**
- Modify: `scripts/test_subagent_blackhole_scenarios.mjs`
**Step 1: Write the scenario**
- no bounce, no completion receipt, beyond SLA -> suspect delivery failure
**Step 2: Run tests**
Run: `node scripts/test_subagent_blackhole_scenarios.mjs`
Expected: scenario result captured
**Step 3: Commit**
```bash
git add scripts/test_subagent_blackhole_scenarios.mjs
git commit -m "test: add missing completion event scenario"
```
### Task 31: Add repeated-failure escalation scenario
**Files:**
- Modify: `scripts/test_subagent_blackhole_scenarios.mjs`
**Step 1: Write the scenario**
- fetch_history fails -> respawn fails -> blocked
**Step 2: Run tests**
Run: `node scripts/test_subagent_blackhole_scenarios.mjs`
Expected: scenario result captured
**Step 3: Commit**
```bash
git add scripts/test_subagent_blackhole_scenarios.mjs
git commit -m "test: add repeated blackhole escalation scenario"
```
### Task 32: Run the full local watchdog test set
**Files:**
- Modify if needed: `scripts/test_subagent_delivery_watchdog.mjs`
- Modify if needed: `scripts/test_subagent_blackhole_scenarios.mjs`
**Step 1: Run the combined tests**
Run:
- `node scripts/test_subagent_delivery_watchdog.mjs`
- `node scripts/test_subagent_blackhole_scenarios.mjs`
Expected: PASS
**Step 2: Fix only minimal wiring needed for all-pass**
- Keep changes scoped to watchdog logic/tests.
**Step 3: Commit**
```bash
git add scripts/test_subagent_delivery_watchdog.mjs scripts/test_subagent_blackhole_scenarios.mjs scripts/subagent_delivery_watchdog.mjs
git commit -m "test: pass full subagent blackhole watchdog suite"
```
### Task 33: Peer review watchdog state logic
**Files:**
- Review: `scripts/subagent_delivery_watchdog.mjs`
- Review: `scripts/test_subagent_delivery_watchdog.mjs`
**Step 1: Request reviewer focus on receipt state logic**
- Verify statuses and transitions match B-class failure goals.
**Step 2: Record reviewer verdict**
- Include commands and findings.
**Step 3: Commit any follow-up fixes if needed**
```bash
# only if reviewer requests changes
git add <changed-files>
git commit -m "fix: address watchdog state review feedback"
```
### Task 34: Peer review recovery decisions
**Files:**
- Review: `scripts/subagent_delivery_watchdog.mjs`
- Review: `docs/runbooks/subagent-delivery-recovery.md`
**Step 1: Request reviewer focus on recovery ordering**
- Verify fetch-history before respawn and blocked escalation.
**Step 2: Record reviewer verdict**
- Include commands and findings.
**Step 3: Commit any follow-up fixes if needed**
```bash
# only if reviewer requests changes
git add <changed-files>
git commit -m "fix: address recovery decision review feedback"
```
### Task 35: Peer review scenario coverage and handoff
**Files:**
- Review: `scripts/test_subagent_blackhole_scenarios.mjs`
- Review: `docs/runbooks/subagent-anti-blackhole.md`
- Review: `docs/runbooks/subagent-delivery-recovery.md`
**Step 1: Request reviewer focus on blackhole realism**
- Confirm this targets fake timeout / no-bounce cases, not just slow work.
**Step 2: Record verification output**
- Include exact commands and reviewer verdict.
**Step 3: Final state**
- Leave task in `pending_verification`; do not mark complete.

View File

@@ -0,0 +1,56 @@
# Approved Plan Continuity
## Continuity receipt core fields
### `planId`
- The identifier of the approved plan that the continuity receipt belongs to.
- Use this field to associate the receipt with one specific approved plan.
### `currentTask`
- The task from the approved plan that is currently being executed or has just completed.
- Use this field to record which plan task the receipt is about.
### `nextDerivedAction`
- The next concrete action derived from the current task that should be dispatched to continue the workflow.
- Use this field to record the intended follow-up action for continuity.
### `dispatchedAt`
- The timestamp indicating when the next derived action was actually dispatched.
- Use this field to record when the continuity handoff occurred.
## Continuity receipt linkage fields
### `dispatchRunId`
- The unique identifier for the dispatch run that produced or recorded the next-step continuity handoff.
- Use this field to link the receipt to one concrete dispatch execution, not just a planned action.
- This field is for receipt linkage and traceability only; it does not by itself define continuity-gate pass/fail behavior.
### `childSessionKey`
- The session linkage key for the child session or spawned execution context that receives the dispatched next action.
- Use this field to connect the continuity receipt to the specific downstream session that should carry the workflow forward.
- This field records linkage identity only; it does not by itself imply hook integration or dispatch binding logic.
### `replyClosureState`
- The closure state recorded at the point the current reply is being closed.
- Use this field to state whether the reply closed under a dispatch-linked continuation path or some separately defined terminal closure state.
- This field is defined here as a receipt field only; legal closure states and gate enforcement are defined in later tasks.
## Legal terminal states
These are the only legal non-dispatch terminal states for an approved-plan continuity closure. If a reply closes without a real next-dispatch receipt, `replyClosureState` must be one of the states below.
### `waiting_user`
- Use this state only when the approved-plan workflow cannot continue until the user provides a decision, approval, missing information, or some other explicit user response.
- This state means the workflow is intentionally paused on user input, not silently stopped.
- Do not use this state when the next step could already be dispatched without further user involvement.
### `blocked`
- Use this state only when the approved-plan workflow cannot proceed because of an external blocker, dependency, permission issue, outage, or other constraint that is not resolved by the current executor.
- This state means progress is prevented by a real blocking condition, not by omission of the next dispatch.
- Do not use this state to explain away a missing continuity handoff when execution could still continue.
### `pending_verification`
- Use this state only when the implementation or execution step is done enough that the workflow should stop specifically for verification, validation, review, or confirmation of results.
- This state means the next meaningful action is to verify what was already produced, rather than to dispatch another implementation step immediately.
- Do not use this state for incomplete work that still has an undispatched next action.

View File

@@ -0,0 +1,70 @@
# Subagent Anti-Blackhole Runbook
## Dispatch receipt fields
Dispatch receipt 僅定義子代理派發當下所需的欄位,用來識別本次派發、關聯子 session以及標記預期完成時限。
- `runId`: 本次 subagent dispatch 的唯一執行識別碼。用於把同一次任務派發、後續狀態檢查與回報關聯到同一個 run。
- `childSessionKey`: 子代理 session 的穩定關聯鍵。用於把 dispatch receipt 對應到實際被派發出去的 child session。
- `dispatchAt`: dispatch receipt 寫入時間,也就是主流程實際派發 subagent 的時間戳記。建議使用可排序的標準時間格式。
- `expectedBy`: 依照當次任務 SLA 或預估完成時間計算出的期望完成時間戳記。用於判斷目前仍屬正常執行中,或已超過預期等待窗口。
> 本節僅定義 dispatch receipt 欄位,不涵蓋 completion receipts、watchdog logic、recovery 流程或其他後續 task。
## Minimal example
```json
{
"runId": "run_2026-04-24_001",
"childSessionKey": "agent:engineering:subagent:example",
"dispatchAt": "2026-04-24T10:00:00+08:00",
"expectedBy": "2026-04-24T10:15:00+08:00"
}
```
## Completion receipt fields
Completion receipt 僅定義子代理完成結果被接收到之後所需記錄的欄位,用來區分「子代理已完成」與「結果是否已成功轉交 main conversation」。
- `completionReceivedAt`: 主流程或監看機制實際收到 completion/result 的時間戳記。用於確認子代理何時已經完成並回傳結果,不再只靠 `expectedBy` 推估。
- `forwardedToMain`: 布林欄位,表示該 completion/result 是否已成功轉送到 main conversation。用於區分「已收到結果」與「已完成主線回報」這兩個不同狀態。
- `resultSource`: completion/result 的來源標記,例如來自主動 completion push、補抓回來的 session 狀態,或其他明確來源。用於後續判讀結果是正常送達還是經由補救路徑取得。
> 本節僅定義 completion receipt 欄位,不涵蓋 watchdog logic、recovery 流程、scenario tests 或其他後續 task。
## Watchdog statuses
Watchdog status 僅定義監看子代理完成投遞狀態時可使用的狀態列舉,用於區分仍在正常等待、疑似投遞失敗、結果已存在但未轉交,以及已完成或已卡住等情況。
- `active`: dispatch receipt 已存在,且目前仍在 `expectedBy` 之前,也還沒有任何 completion receipt。表示子代理仍在正常等待窗口內watchdog 只需持續觀察,不應提前視為異常。
- `suspect_delivery_failure`: dispatch receipt 已存在、目前已超過 `expectedBy`,但主流程仍未收到 completion receipt。表示尚無法證明子代理失敗或成功只能判定為疑似 completion delivery 出問題,需進入明確的人工可見關注狀態。
- `done_but_not_forwarded`: 已有可信訊號顯示子代理工作其實做完了,但 main thread 仍沒有對應的 forwarded completion receipt。表示結果可能存在於 child session 或其他回傳路徑上,只是沒有成功 bounce 回主線。
- `completed`: completion receipt 已被主流程接收,且結果已成功進入主線回報路徑。表示此 run 的 watchdog 可視為正常閉合,不再屬於 blackhole 風險案例。
- `recovered`: 先前曾落入 `suspect_delivery_failure``done_but_not_forwarded`,之後透過後續確認或補抓,已把結果重新接回可追蹤狀態。此狀態只定義「已從異常投遞風險中恢復」的語意,不在本 task 提前定義 recovery logic。
- `blocked`: watchdog 已判定目前無法再以被動等待來解釋狀態,且該 run 需要明確升級處理或人工介入。此狀態只定義「已卡住、不可再默默等待」的語意,不在本 task 提前定義 escalation 或處置流程。
> 本節僅定義 watchdog statuses 的語意與邊界,不提前實作 recovery logic、receipt state code、scenario tests 或其他後續 task。
## B-class failure modes
B-class failure modes 指的是「子代理工作本身不一定真的 timeout但主線沒有收到可信 completion 回報」的假 timeout 類型。這一類問題的核心不是先判定 child 一定失敗,而是先區分執行端、事件投遞端與主線轉交端哪一段失聯。
- **done but not forwarded**child session 內已有可信跡象顯示工作完成例如子代理已產出最終回報、session 狀態顯示 done或可確認 completion 已存在於子線;但 main conversation 沒有收到對應的 forwarded result。這類型代表「結果已存在但沒有被成功轉交到主線」。
- **no completion event received**:主流程已完成 dispatch且等待時間已逼近或超過 `expectedBy`,但主線完全沒有收到任何 completion event。此時不能直接斷言 child 一定還在跑,也不能直接斷言 child 已失敗;只能先明確標記為「主線未收到 completion event」避免把 delivery 問題誤判成單純執行逾時。
- **session exists but no result bounce**:可確認 child session 仍存在、可被查到,甚至可見到該 session 有持續活動或已留下結果內容,但沒有任何 result bounce 回到 main conversation。這類型比前一類更明確指出session 並未消失,問題在於結果沒有沿正常回傳路徑反彈回主線。
- **unclear slow-run vs delivery failure**:目前只知道主線等待已超過預期,但還無法分辨 child 是真的慢、仍在執行,還是其實已完成卻發生 delivery failure。這個 failure mode 的定義重點是保留不確定性:在證據不足時,不應把所有超時都歸類成 slow run也不應直接假設是 delivery failure。
> 本節只定義 B-class 假 timeout failure modes 的語意邊界與彼此差異,不提前實作 recovery logic、receipt state code、watchdog script 或 scenario tests。
## Completion receipt example
```json
{
"completionReceivedAt": "2026-04-24T10:12:34+08:00",
"forwardedToMain": true,
"resultSource": "completion_push"
}
```

View File

@@ -0,0 +1,29 @@
---
name: force-recall
description: "Prepend mandatory RULEBOOK/SOUL recall block before the agent sees inbound messages"
homepage: https://docs.openclaw.ai/automation/hooks
metadata:
{ "openclaw": { "emoji": "🧠", "events": ["message:preprocessed"], "always": true } }
---
# Force Recall Hook (MVP)
This hook enforces a **recall gate** by prepending a short, high-salience block to every inbound message *after* media/link enrichment and *before* the agent sees it.
Goal: **Before any technical action/tooling**, the agent must recall key rules from `docs/RULEBOOK.md` + `SOUL.md`.
## Behavior
- Listens on `message:preprocessed`
- Injects a `RECALL_GATE` prefix into `context.bodyForAgent`
- Optional debug: set `OPENCLAW_FORCE_RECALL_DEBUG=1` to append a one-line marker (visible in the agent prompt)
## Why this MVP
OpenClaw hooks currently provide reliable interception at the message boundary (`message:preprocessed`). This is the earliest stable point to force rules into the model's working context without patching core.
## Disable
```bash
openclaw hooks disable force-recall
```

View File

@@ -0,0 +1,532 @@
import fs from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { execFile } from "node:child_process";
import { promisify } from "node:util";
const execFileAsync = promisify(execFile);
const LONG_TASK_WRAPPER_TIMEOUT_MS = 8000;
const LONG_TASK_GATE_LOCK_TIMEOUT_MS = 8000;
const LONG_TASK_AUTO_CHAIN_PLANNER_TIMEOUT_MS = 8000;
type AutoChainPlanResult = {
plannerStatus: string;
derivedAction: string;
dispatchMode: string;
reason: string;
requiredEvidence?: string[];
autoChainAllowed: boolean;
};
type GateLockResult = {
gateRequired: boolean;
gateStatus: "not_applicable" | "pass" | "fail";
reasons?: string[];
requiredEvidence?: Array<{
evidenceKey?: string;
acceptedFields?: string[];
requiredValue?: string;
}>;
allowedResponseModes?: string[];
};
function clamp(s: string, max = 1200): string {
if (!s) return s;
if (s.length <= max) return s;
return s.slice(0, max) + "\n…(truncated)…";
}
async function safeReadText(filePath: string): Promise<string | null> {
try {
const raw = await fs.readFile(filePath, "utf-8");
const trimmed = raw.trim();
return trimmed ? trimmed : null;
} catch {
return null;
}
}
async function getReadableCheckpointArtifact(workspaceDir: string, wrapperResult: any): Promise<{ relativePath: string; absolutePath: string; content: string; } | null> {
const relativePath = typeof wrapperResult?.externalizedCheckpointPath === "string"
? wrapperResult.externalizedCheckpointPath.trim()
: "";
if (!relativePath) return null;
const absolutePath = path.resolve(workspaceDir, relativePath);
try {
const raw = await fs.readFile(absolutePath, "utf-8");
const content = raw.trim();
if (!content) return null;
return { relativePath, absolutePath, content };
} catch {
return null;
}
}
async function runJsonScript(scriptPath: string, workspaceDir: string, input: Record<string, unknown>, timeout: number): Promise<any | null> {
let tempInputPath: string | null = null;
try {
tempInputPath = path.join(
os.tmpdir(),
`openclaw-hook-${path.basename(scriptPath, path.extname(scriptPath))}-${process.pid}-${Date.now()}.json`,
);
await fs.writeFile(tempInputPath, JSON.stringify(input), "utf-8");
const { stdout } = await execFileAsync("node", [scriptPath, "--compact", "--input", tempInputPath], {
cwd: workspaceDir,
maxBuffer: 1024 * 1024,
timeout,
});
return JSON.parse(stdout);
} catch {
return null;
} finally {
if (tempInputPath) {
await fs.unlink(tempInputPath).catch(() => {});
}
}
}
async function runLongTaskWrapper(workspaceDir: string, ctx: any): Promise<any | null> {
const wrapperPath = path.join(workspaceDir, "scripts", "long_task_governor_wrapper.mjs");
const input = {
requestText: (ctx.body ?? ctx.content ?? ctx.bodyForAgent ?? "") as string,
hasFilesOrSystems: false,
needsWaiting: false,
needsSubagent: false,
needsOwnerDecision: false,
canReplyNow: false,
taskName: "Hook preflight classification",
currentStep: "Classifying request at preprocessed hook",
nextStep: "Carry governor recommendation into prompt context",
nextReportCondition: "At next meaningful milestone",
waitingOn: "none",
blocker: "none",
checkpointTrigger: "",
externalizedTrigger: "",
triggerKind: "",
};
return runJsonScript(wrapperPath, workspaceDir, input, LONG_TASK_WRAPPER_TIMEOUT_MS);
}
function buildProgressEvidence(wrapperResult: any, readableCheckpointArtifact: { relativePath: string; absolutePath: string; content: string; } | null): Record<string, unknown> | null {
const candidate = wrapperResult?.progressEvidence;
if (!candidate || typeof candidate !== "object" || Array.isArray(candidate)) {
return null;
}
const progressEvidence: Record<string, unknown> = {};
const sessionKey = typeof candidate.sessionKey === "string"
? candidate.sessionKey.trim()
: "";
if (sessionKey) {
progressEvidence.sessionKey = sessionKey;
}
const runId = typeof candidate.runId === "string"
? candidate.runId.trim()
: "";
if (runId) {
progressEvidence.runId = runId;
}
if (Array.isArray(candidate.modified_files) && candidate.modified_files.length > 0) {
progressEvidence.modified_files = candidate.modified_files;
}
const verificationResult = typeof candidate.verificationResult === "string"
? candidate.verificationResult.trim()
: "";
if (verificationResult) {
progressEvidence.verificationResult = verificationResult;
}
if (readableCheckpointArtifact) {
progressEvidence.checkpointPath = readableCheckpointArtifact.relativePath;
if (!progressEvidence.verificationResult) {
progressEvidence.verificationResult = `checkpoint artifact readable at ${readableCheckpointArtifact.relativePath}`;
}
}
return Object.keys(progressEvidence).length > 0 ? progressEvidence : null;
}
function shouldClaimProgression(wrapperResult: any, progressEvidence: Record<string, unknown> | null): boolean {
if (!wrapperResult || wrapperResult.classification !== "long_task") return false;
if (progressEvidence && Object.keys(progressEvidence).length > 0) return true;
const requiredNextAction = typeof wrapperResult.requiredNextAction === "string"
? wrapperResult.requiredNextAction.trim()
: "";
const progressingActionPrefixes = [
"dispatch_",
"handoff_",
"launch_",
"resume_",
"continue_",
"queue_",
"schedule_",
"run_",
"start_",
"spawn_",
];
if (requiredNextAction && progressingActionPrefixes.some((prefix) => requiredNextAction.startsWith(prefix))) {
return true;
}
return wrapperResult.silentLaunchOk === true;
}
function buildGateLockInput(wrapperResult: any, readableCheckpointArtifact: { relativePath: string; absolutePath: string; content: string; } | null): Record<string, unknown> {
if (!wrapperResult || wrapperResult.classification !== "long_task") {
return { classification: wrapperResult?.classification ?? "general_chat" };
}
const needsOwnerDecision = wrapperResult.needsOwnerDecision === true;
const silentCandidate = wrapperResult.silentCandidate === true;
const progressEvidence = buildProgressEvidence(wrapperResult, readableCheckpointArtifact);
const requiredNextAction = typeof wrapperResult.requiredNextAction === "string"
? wrapperResult.requiredNextAction.trim()
: "";
const hasConcreteExecutionEvidence = Boolean(
requiredNextAction
&& ![
"",
"proceed_with_normal_long_task_flow",
"proceed_with_silent_launch",
"define_first_checkpoint_trigger_before_silent_launch",
"bind_externalized_checkpoint_path_or_abort_silent_launch",
].includes(requiredNextAction),
);
const autoChainNextAction = hasConcreteExecutionEvidence ? requiredNextAction : "";
const executionEvidence = hasConcreteExecutionEvidence
? {
concreteNextAction: requiredNextAction,
}
: null;
const autoChainDispatchEvidence = hasConcreteExecutionEvidence
&& wrapperResult.autoChainDispatchEvidence
&& typeof wrapperResult.autoChainDispatchEvidence === "object"
&& !Array.isArray(wrapperResult.autoChainDispatchEvidence)
? wrapperResult.autoChainDispatchEvidence
: null;
const claimedProgression = shouldClaimProgression(wrapperResult, progressEvidence)
? "already progressing to the next step in background"
: "";
const progressEvidenceReason = claimedProgression && !progressEvidence
? "progression claim requires concrete evidence such as sessionKey, runId, modified_files, or verification result"
: "";
const hasExternalizedCheckpointEvidence = Boolean(readableCheckpointArtifact);
const hasButtonPathClosureEvidence = needsOwnerDecision && wrapperResult.silentLaunchOk === true;
return {
classification: wrapperResult.classification,
silentContinuation: silentCandidate,
claimedExecution: hasConcreteExecutionEvidence || (silentCandidate && wrapperResult.silentLaunchOk !== true),
needsOwnerDecision,
nextStep: hasConcreteExecutionEvidence ? requiredNextAction : "",
requiredNextAction: hasConcreteExecutionEvidence ? requiredNextAction : "",
concreteNextAction: hasConcreteExecutionEvidence ? requiredNextAction : "",
autoChainNextAction,
autoChainDispatchEvidence,
progressionClaim: claimedProgression,
claimedProgression: claimedProgression,
statusSummary: claimedProgression,
executionEvidence,
progressEvidence,
autoChainDispatchEvidenceReason: hasConcreteExecutionEvidence && !autoChainDispatchEvidence
? "explicit auto-chain next action requires dispatched-action evidence"
: "",
progressEvidenceReason,
sessionKey: typeof progressEvidence?.sessionKey === "string" ? progressEvidence.sessionKey : "",
runId: typeof progressEvidence?.runId === "string" ? progressEvidence.runId : "",
modified_files: Array.isArray(progressEvidence?.modified_files) ? progressEvidence.modified_files : [],
verificationResult: typeof progressEvidence?.verificationResult === "string" ? progressEvidence.verificationResult : "",
toolCallEvidence: "",
dispatchEvidence: "",
fileChangeEvidence: "",
verificationEvidence: "",
checkpointArtifactEvidence: hasExternalizedCheckpointEvidence ? readableCheckpointArtifact.relativePath : "",
externalizedCheckpointPath: hasExternalizedCheckpointEvidence ? readableCheckpointArtifact.relativePath : "",
externalizedTrigger: hasExternalizedCheckpointEvidence ? "hook-preflight-checkpoint" : "",
handoffMode: hasButtonPathClosureEvidence ? (wrapperResult.handoff?.mode ?? "button_path") : "direct_reply",
replyClosureMode: hasButtonPathClosureEvidence ? (wrapperResult.handoff?.mode ?? "button_path") : "direct_reply",
};
}
async function runLongTaskGateLock(workspaceDir: string, wrapperResult: any): Promise<GateLockResult | null> {
const gateLockPath = path.join(workspaceDir, "scripts", "long_task_gate_lock.mjs");
const readableCheckpointArtifact = await getReadableCheckpointArtifact(workspaceDir, wrapperResult);
const input = buildGateLockInput(wrapperResult, readableCheckpointArtifact);
return runJsonScript(gateLockPath, workspaceDir, input, LONG_TASK_GATE_LOCK_TIMEOUT_MS);
}
function buildAutoChainPlannerInput(gateLockResult: GateLockResult | null, wrapperResult: any): Record<string, unknown> {
const requiredNextAction = typeof wrapperResult?.requiredNextAction === "string"
? wrapperResult.requiredNextAction.trim()
: "";
const plannerInput: Record<string, unknown> = {
gateStatus: gateLockResult?.gateStatus ?? "not_applicable",
actorStage: "hook_preflight",
requiredNextAction,
};
if (!requiredNextAction) return plannerInput;
if (requiredNextAction === "dispatch_follow_up_subagent") {
plannerInput.actorStage = "implementer_result";
plannerInput.requiredNextAction = "request_spec_review";
if (wrapperResult?.autoChainDispatchEvidence && typeof wrapperResult.autoChainDispatchEvidence === "object" && !Array.isArray(wrapperResult.autoChainDispatchEvidence)) {
plannerInput.executionEvidence = wrapperResult.autoChainDispatchEvidence;
}
return plannerInput;
}
if (requiredNextAction === "dispatch_code_quality_review") {
plannerInput.actorStage = "spec_review";
plannerInput.requiredNextAction = "request_code_quality_review";
plannerInput.reviewOutcome = "pass";
if (wrapperResult?.reviewEvidence && typeof wrapperResult.reviewEvidence === "object" && !Array.isArray(wrapperResult.reviewEvidence)) {
plannerInput.reviewEvidence = wrapperResult.reviewEvidence;
}
return plannerInput;
}
if (requiredNextAction === "dispatch_fix_slice") {
plannerInput.actorStage = "review_result";
plannerInput.requiredNextAction = "fix_review_findings";
plannerInput.blocker = typeof wrapperResult?.silentLaunchReason === "string" && wrapperResult.silentLaunchReason.trim()
? wrapperResult.silentLaunchReason.trim()
: "hook_preflight_blocker";
if (wrapperResult?.blockerEvidence && typeof wrapperResult.blockerEvidence === "object" && !Array.isArray(wrapperResult.blockerEvidence)) {
plannerInput.blockerEvidence = wrapperResult.blockerEvidence;
}
return plannerInput;
}
if (requiredNextAction === "dispatch_spec_review") {
plannerInput.actorStage = "implementer_result";
plannerInput.requiredNextAction = "request_spec_review";
if (wrapperResult?.implementationEvidence && typeof wrapperResult.implementationEvidence === "object" && !Array.isArray(wrapperResult.implementationEvidence)) {
plannerInput.executionEvidence = wrapperResult.implementationEvidence;
}
return plannerInput;
}
return plannerInput;
}
async function runAutoChainPlanner(workspaceDir: string, gateLockResult: GateLockResult | null, wrapperResult: any): Promise<AutoChainPlanResult | null> {
if (!wrapperResult || wrapperResult.classification !== "long_task") return null;
const plannerPath = path.join(workspaceDir, "scripts", "plan_long_task_auto_chain.mjs");
const input = buildAutoChainPlannerInput(gateLockResult, wrapperResult);
return runJsonScript(plannerPath, workspaceDir, input, LONG_TASK_AUTO_CHAIN_PLANNER_TIMEOUT_MS);
}
function buildAutoChainPlanBlock(planResult: AutoChainPlanResult | null): string {
if (!planResult) {
return [
"[LONG_TASK_AUTO_CHAIN_PLAN]",
"plannerStatus=degraded",
"derivedAction=none",
"dispatchMode=no_dispatch",
"autoChainAllowed=false",
"reason=auto-chain planner unavailable during hook preflight",
"[/LONG_TASK_AUTO_CHAIN_PLAN]",
"",
].join("\n");
}
return [
"[LONG_TASK_AUTO_CHAIN_PLAN]",
`plannerStatus=${planResult.plannerStatus}`,
`derivedAction=${planResult.derivedAction}`,
`dispatchMode=${planResult.dispatchMode}`,
`autoChainAllowed=${planResult.autoChainAllowed}`,
`reason=${planResult.reason}`,
...((planResult.requiredEvidence ?? []).map((entry) => `requiredEvidence=${entry}`)),
"[/LONG_TASK_AUTO_CHAIN_PLAN]",
"",
].join("\n");
}
function buildWrapperEnforcement(wrapperResult: any): string[] {
const lines = [
"- Treat this as ingress preflight guidance from the wrapper MVP.",
];
if (wrapperResult.classification === "long_task") {
lines.push("- ENFORCEMENT: This request defaults to long-task governance; do not treat it as ordinary single-turn chat unless you can clearly justify overriding the classifier.");
lines.push("- ENFORCEMENT: If you proceed, prefer explicit task state and checkpoint discipline over ad-hoc continuation.");
}
if (wrapperResult.handoff?.mode === "button_path") {
lines.push("- ENFORCEMENT: Owner decision is expected; plan Telegram button-path early instead of ending with a plain-text menu.");
}
if (wrapperResult.silentCandidate === true && wrapperResult.silentLaunchOk === false) {
lines.push("- ENFORCEMENT: Silent launch is NOT allowed in the current form.");
lines.push("- ENFORCEMENT: Use the recommended fallback before proceeding.");
if (wrapperResult.requiredNextAction) {
lines.push(`- ENFORCEMENT: Required next action = ${wrapperResult.requiredNextAction}`);
}
} else if (wrapperResult.silentCandidate === true && wrapperResult.silentLaunchOk === true) {
lines.push("- ENFORCEMENT: Silent launch is only acceptable if you preserve externalized checkpoint discipline and do not rely on memory alone.");
}
return lines;
}
function buildWrapperHardGate(wrapperResult: any): string[] {
const lines: string[] = [];
if (wrapperResult.classification === "long_task") {
lines.push("- HARD_GATE: If you intend to proceed as ordinary chat, you must explicitly justify why long-task governance does not apply.");
}
if (wrapperResult.handoff?.mode === "button_path") {
lines.push("- HARD_GATE: Do not end this flow with a plain-text choice menu. Use Telegram inline buttons or execute the most reasonable next step directly.");
}
if (wrapperResult.silentCandidate === true && wrapperResult.silentLaunchOk === false) {
lines.push("- HARD_GATE: Do NOT launch or continue this task in silent mode in its current form.");
lines.push("- HARD_GATE: Before any silent execution, satisfy the required next action or downgrade to non-silent follow-up.");
}
return lines;
}
function buildGateLockBlock(gateLockResult: GateLockResult | null): string {
if (!gateLockResult) {
return [
"[LONG_TASK_GATE_LOCK]",
"gateStatus=degraded",
"gateRequired=unknown",
"- ENFORCEMENT: Gate-lock evaluator unavailable; keep existing long-task safeguards in force.",
"- ENFORCEMENT: Do not claim you have progressed into the next task or are already pushing the next step unless you have concrete progress evidence such as a sessionKey, runId, modified_files record, verification result, actual dispatch, tool calls, file changes, or a persisted checkpoint artifact.",
"- ENFORCEMENT: Hook inputs for any progression claim should carry progressEvidence (or equivalent concrete fields) so the gate can verify the claim.",
"- HARD_GATE: Evaluator unavailable is not permission to claim silent continuation or next-task progression without verifiable progress evidence.",
"- HARD_GATE: Fall back to a non-silent, evidence-preserving follow-up if you cannot prove checkpoint state or concrete execution.",
"[/LONG_TASK_GATE_LOCK]",
"",
].join("\n");
}
const lines = [
"[LONG_TASK_GATE_LOCK]",
`gateRequired=${gateLockResult.gateRequired}`,
`gateStatus=${gateLockResult.gateStatus}`,
...(gateLockResult.reasons ?? []).map((reason) => `reason=${reason}`),
...((gateLockResult.requiredEvidence ?? []).map((requirement) => {
const fields = (requirement.acceptedFields ?? []).join(",");
return `requiredEvidence=${requirement.evidenceKey ?? "unknown"};fields=${fields};requiredValue=${requirement.requiredValue ?? "unknown"}`;
})),
...((gateLockResult.allowedResponseModes ?? []).map((mode) => `allowedResponseMode=${mode}`)),
"- ENFORCEMENT: Do not claim you have progressed into the next task or are already pushing the next step unless you have concrete progress evidence such as a sessionKey, runId, modified_files record, verification result, actual dispatch, tool calls, file changes, or a persisted checkpoint artifact.",
"- ENFORCEMENT: Hook input should include progressEvidence (or equivalent concrete fields) whenever a progression claim is present.",
"- ENFORCEMENT: Forbidden path: plain-text handoff that pretends the long task is already continuing without an externalized checkpoint.",
"- ENFORCEMENT: Forbidden path: stating you have already entered the next task/step when the record only contains planning language and no concrete execution evidence.",
"- ENFORCEMENT: If hook input carries autoChainNextAction, it must also carry matching autoChainDispatchEvidence before the gate may pass that auto-chain step.",
];
if (gateLockResult.gateStatus === "fail") {
lines.push("- HARD_GATE: Block any plain-text handoff or silent-continuation claim when externalized checkpoint evidence is missing.");
lines.push("- HARD_GATE: Block any reply path that says you already moved into the next task or are advancing the next step without concrete progress evidence.");
lines.push("- HARD_GATE: If a progression claim exists, the hook input must supply progressEvidence (or equivalent concrete fields) before the claim can pass gate.");
lines.push("- HARD_GATE: Do not say you are already on the next task, already dispatched follow-up work, or already progressing in background unless you can point to a sessionKey, runId, modified_files record, verification result, actual tool execution, file changes, emitted messages, or checkpoint records.");
lines.push("- HARD_GATE: If required evidence is missing, ask for/produce the checkpoint or downgrade to a non-silent, evidence-preserving follow-up.");
lines.push("- HARD_GATE: If autoChainNextAction is explicit, you must actually dispatch it and surface autoChainDispatchEvidence; otherwise the gate fails.");
lines.push("- HARD_GATE: If owner decision is involved, do not replace button-path closure with plain-text handoff.");
}
lines.push("[/LONG_TASK_GATE_LOCK]", "");
return lines.join("\n");
}
/**
* Force Recall hook handler
*
* Event: message:preprocessed
* - Reads docs/RULEBOOK.md and SOUL.md from the resolved workspace
* - Prepends a recall gate block to context.bodyForAgent
* - Optionally injects wrapper MVP classification hints when available
*/
const forceRecall = async (event: any) => {
if (event?.type !== "message" || event?.action !== "preprocessed") return;
const ctx = event.context ?? {};
const workspaceDir: string | undefined = ctx.workspaceDir;
if (!workspaceDir) return;
const rulebookPath = path.join(workspaceDir, "docs", "RULEBOOK.md");
const soulPath = path.join(workspaceDir, "SOUL.md");
const [rulebook, soul, wrapperResult] = await Promise.all([
safeReadText(rulebookPath),
safeReadText(soulPath),
runLongTaskWrapper(workspaceDir, ctx),
]);
const gateLockResult = wrapperResult ? await runLongTaskGateLock(workspaceDir, wrapperResult) : null;
const autoChainPlanResult = wrapperResult ? await runAutoChainPlanner(workspaceDir, gateLockResult, wrapperResult) : null;
if (!rulebook && !soul && !wrapperResult && !gateLockResult && !autoChainPlanResult) return;
const wrapperBlock = wrapperResult
? [
"[LONG_TASK_GOVERNOR_PREFLIGHT]",
`classification=${wrapperResult.classification}`,
`silentCandidate=${wrapperResult.silentCandidate}`,
`needsCheckpoint=${wrapperResult.needsCheckpoint}`,
`needsSubagent=${wrapperResult.needsSubagent}`,
`needsOwnerDecision=${wrapperResult.needsOwnerDecision}`,
`silentLaunchOk=${wrapperResult.silentLaunchOk}`,
wrapperResult.silentLaunchReason ? `silentLaunchReason=${wrapperResult.silentLaunchReason}` : null,
wrapperResult.recommendedFallback ? `recommendedFallback=${wrapperResult.recommendedFallback}` : null,
wrapperResult.requiredNextAction ? `requiredNextAction=${wrapperResult.requiredNextAction}` : null,
wrapperResult.handoff?.mode ? `handoff.mode=${wrapperResult.handoff.mode}` : null,
...buildWrapperEnforcement(wrapperResult),
...buildWrapperHardGate(wrapperResult),
"[/LONG_TASK_GOVERNOR_PREFLIGHT]",
"",
]
.filter(Boolean)
.join("\n")
: "";
const gateLockBlock = buildGateLockBlock(gateLockResult);
const autoChainPlanBlock = buildAutoChainPlanBlock(autoChainPlanResult);
const recallBlock = [
"[RECALL_GATE] Mandatory recall before ANY technical action/tool use.",
"- You MUST consult and follow the key rules from RULEBOOK + SOUL.",
"- If you are about to run tools, change configs, modify code, or delegate agents: restate the applicable rules first.",
"",
wrapperBlock || null,
gateLockBlock,
autoChainPlanBlock,
rulebook ? `RULEBOOK (source: ${rulebookPath}):\n${clamp(rulebook, 1200)}` : null,
soul ? `SOUL (source: ${soulPath}):\n${clamp(soul, 1200)}` : null,
"[/RECALL_GATE]",
"",
]
.filter(Boolean)
.join("\n");
const prior = (ctx.bodyForAgent ?? ctx.body ?? ctx.content ?? "") as string;
const injected = `${recallBlock}${prior ? "\n" + prior : ""}`;
ctx.bodyForAgent = injected;
event.context = ctx;
if (process.env.OPENCLAW_FORCE_RECALL_DEBUG === "1") {
ctx.bodyForAgent += "\n\n[force-recall:debug] injected";
console.log(`[force-recall:debug] injected for chat=${ctx.chatId ?? "?"} msg=${ctx.messageId ?? "?"}`);
}
};
export default forceRecall;

View File

@@ -0,0 +1,109 @@
#!/usr/bin/env node
import fs from 'node:fs';
const LEGAL_TERMINAL_STATES = new Set(['waiting_user', 'blocked', 'pending_verification']);
function parseArgs(argv) {
let inputPath = null;
let compact = false;
for (let i = 0; i < argv.length; i += 1) {
const arg = argv[i];
if (arg === '--input') {
inputPath = argv[i + 1] ?? null;
i += 1;
continue;
}
if (arg.startsWith('--input=')) {
inputPath = arg.slice('--input='.length);
continue;
}
if (arg === '--compact') {
compact = true;
continue;
}
}
return { inputPath, compact };
}
function readInput(inputPath) {
if (!inputPath) {
return {
ok: false,
error: 'missing_required_input',
};
}
try {
const raw = fs.readFileSync(inputPath, 'utf8');
const parsed = JSON.parse(raw);
return {
ok: true,
bytes: Buffer.byteLength(raw, 'utf8'),
preview: raw.slice(0, 0),
parsed,
};
} catch (error) {
return {
ok: false,
error: error instanceof Error ? error.message : String(error),
};
}
}
function evaluateContinuity(payload) {
const taskComplete = payload?.taskState === 'complete';
const nextAction = payload?.nextDerivedAction ?? payload?.derivedAction ?? null;
const nextActionKnown = nextAction != null;
const hasDispatchReceipt = payload?.dispatchReceipt != null;
const closureState = payload?.replyClosureState ?? null;
const isLegalTerminalState = LEGAL_TERMINAL_STATES.has(closureState);
if (taskComplete && nextActionKnown && !hasDispatchReceipt && !isLegalTerminalState) {
return {
ok: false,
status: 'continuity_failure',
verdict: 'continuity_failure',
reason: 'missing_dispatch_receipt',
};
}
return {
ok: true,
status: 'pass',
verdict: 'pass',
};
}
const { inputPath, compact } = parseArgs(process.argv.slice(2));
const input = readInput(inputPath);
const evaluation = input.ok ? evaluateContinuity(input.parsed) : {
ok: false,
status: 'input_error',
verdict: 'input_error',
};
const response = {
...evaluation,
gate: 'approved_plan_continuity',
compact,
inputPath,
input: {
ok: input.ok,
...(input.ok
? {
bytes: input.bytes,
preview: input.preview,
}
: {
error: input.error,
}),
},
};
process.stdout.write(`${JSON.stringify(response)}
`);

View File

@@ -0,0 +1,194 @@
#!/usr/bin/env node
import fs from 'node:fs';
import path from 'node:path';
const DEFAULT_RECEIPT_DIR = path.resolve(process.cwd(), 'state/approved-plan-continuity');
function parseArgs(argv) {
let inputPath = null;
let compact = false;
let receiptDir = DEFAULT_RECEIPT_DIR;
for (let i = 0; i < argv.length; i += 1) {
const arg = argv[i];
if (arg === '--input') {
inputPath = argv[i + 1] ?? null;
i += 1;
continue;
}
if (arg.startsWith('--input=')) {
inputPath = arg.slice('--input='.length);
continue;
}
if (arg === '--receipt-dir') {
receiptDir = argv[i + 1] ? path.resolve(argv[i + 1]) : receiptDir;
i += 1;
continue;
}
if (arg.startsWith('--receipt-dir=')) {
receiptDir = path.resolve(arg.slice('--receipt-dir='.length));
continue;
}
if (arg === '--compact') {
compact = true;
continue;
}
}
return { inputPath, compact, receiptDir };
}
function readInput(inputPath) {
if (!inputPath) {
return {
ok: false,
error: 'missing_required_input',
};
}
try {
const raw = fs.readFileSync(inputPath, 'utf8');
const parsed = JSON.parse(raw);
return {
ok: true,
bytes: Buffer.byteLength(raw, 'utf8'),
parsed,
};
} catch (error) {
return {
ok: false,
error: error instanceof Error ? error.message : String(error),
};
}
}
function slugifySegment(value) {
return String(value)
.trim()
.toLowerCase()
.replace(/[^a-z0-9._-]+/g, '-')
.replace(/^-+|-+$/g, '')
.replace(/-{2,}/g, '-');
}
function buildReceipt(payload) {
const nextAction = payload?.nextDerivedAction ?? payload?.derivedAction ?? null;
const receipt = {
planId: payload?.planId ?? null,
currentTask: payload?.currentTask ?? null,
nextDerivedAction: nextAction,
dispatchedAt: payload?.dispatchedAt ?? null,
dispatchRunId: payload?.dispatchRunId ?? null,
childSessionKey: payload?.childSessionKey ?? null,
replyClosureState: payload?.replyClosureState ?? null,
};
return receipt;
}
function validateReceipt(receipt) {
const missing = [];
for (const field of [
'planId',
'currentTask',
'nextDerivedAction',
'dispatchedAt',
'dispatchRunId',
'childSessionKey',
'replyClosureState',
]) {
if (receipt[field] == null) {
missing.push(field);
}
}
const planIdSafe = slugifySegment(receipt.planId ?? '');
const dispatchRunIdSafe = slugifySegment(receipt.dispatchRunId ?? '');
if (!planIdSafe) missing.push('planId_filesystem_safe');
if (!dispatchRunIdSafe) missing.push('dispatchRunId_filesystem_safe');
return {
ok: missing.length === 0,
missing,
planIdSafe,
dispatchRunIdSafe,
};
}
function writeReceipt({ receipt, receiptDir, planIdSafe, dispatchRunIdSafe }) {
fs.mkdirSync(receiptDir, { recursive: true });
const receiptPath = path.join(receiptDir, `receipt-${planIdSafe}-${dispatchRunIdSafe}.json`);
fs.writeFileSync(receiptPath, `${JSON.stringify(receipt, null, 2)}\n`, 'utf8');
return receiptPath;
}
const { inputPath, compact, receiptDir } = parseArgs(process.argv.slice(2));
const input = readInput(inputPath);
let response;
if (!input.ok) {
response = {
ok: false,
status: 'input_error',
binding: 'approved_plan_dispatch',
compact,
inputPath,
receipt: null,
receiptPath: null,
input: {
ok: false,
error: input.error,
},
};
} else {
const receipt = buildReceipt(input.parsed);
const validation = validateReceipt(receipt);
if (!validation.ok) {
response = {
ok: false,
status: 'missing_required_receipt_fields',
binding: 'approved_plan_dispatch',
compact,
inputPath,
receipt,
receiptPath: null,
missing: validation.missing,
input: {
ok: true,
bytes: input.bytes,
},
};
} else {
const receiptPath = writeReceipt({
receipt,
receiptDir,
planIdSafe: validation.planIdSafe,
dispatchRunIdSafe: validation.dispatchRunIdSafe,
});
response = {
ok: true,
status: 'receipt_written',
binding: 'approved_plan_dispatch',
compact,
inputPath,
receipt,
receiptPath,
input: {
ok: true,
bytes: input.bytes,
},
};
}
}
process.stdout.write(`${JSON.stringify(response)}\n`);

View File

@@ -0,0 +1,285 @@
#!/usr/bin/env node
import fs from 'node:fs';
import path from 'node:path';
import process from 'node:process';
const ROOT_DIR = path.resolve(import.meta.dirname, '..');
const STATE_DIR = path.join(ROOT_DIR, 'state', 'subagent-delivery-watchdog');
function parseArgs(argv) {
const args = {
compact: false,
input: null,
help: false,
};
for (let i = 0; i < argv.length; i += 1) {
const token = argv[i];
if (token === '--compact') {
args.compact = true;
continue;
}
if (token === '--help' || token === '-h') {
args.help = true;
continue;
}
if (token === '--input') {
args.input = argv[i + 1] ?? null;
i += 1;
continue;
}
if (token.startsWith('--input=')) {
args.input = token.slice('--input='.length) || null;
continue;
}
}
return args;
}
function printHelp() {
const lines = [
'Usage: node scripts/subagent_delivery_watchdog.mjs [--compact] [--input <path>]',
'',
'Minimal CLI skeleton for the subagent delivery watchdog.',
];
process.stdout.write(`${lines.join('\n')}\n`);
}
function tryReadInput(inputPath) {
if (!inputPath) {
return {
path: null,
exists: false,
bytes: 0,
preview: '',
};
}
try {
const content = fs.readFileSync(inputPath, 'utf8');
return {
path: inputPath,
exists: true,
bytes: Buffer.byteLength(content, 'utf8'),
preview: content.slice(0, 200),
content,
};
} catch (error) {
return {
path: inputPath,
exists: false,
bytes: 0,
preview: '',
error: error instanceof Error ? error.message : String(error),
};
}
}
function tryParseJson(content) {
if (typeof content !== 'string' || content.length === 0) {
return null;
}
try {
return JSON.parse(content);
} catch {
return null;
}
}
function writeDispatchReceiptState(payload) {
if (!payload || typeof payload !== 'object') {
return null;
}
const { runId, childSessionKey, dispatchAt, expectedBy } = payload;
if (![runId, childSessionKey, dispatchAt, expectedBy].every((value) => typeof value === 'string' && value.length > 0)) {
return null;
}
fs.mkdirSync(STATE_DIR, { recursive: true });
const statePath = path.join(STATE_DIR, `${runId}.json`);
const dispatchRecord = {
runId,
childSessionKey,
dispatchAt,
expectedBy,
};
fs.writeFileSync(statePath, `${JSON.stringify(dispatchRecord, null, 2)}\n`, 'utf8');
return {
statePath,
record: dispatchRecord,
};
}
function writeCompletionReceiptState(payload) {
if (!payload || typeof payload !== 'object') {
return null;
}
const { runId } = payload;
const completionReceivedAt = payload.completionReceivedAt ?? payload.completionReceiptAt ?? null;
const forwardedToMain = payload.forwardedToMain;
const resultSource = payload.resultSource;
if (typeof runId !== 'string' || runId.length === 0) {
return null;
}
const completionUpdates = {};
if (typeof completionReceivedAt === 'string' && completionReceivedAt.length > 0) {
completionUpdates.completionReceivedAt = completionReceivedAt;
}
if (typeof forwardedToMain === 'boolean') {
completionUpdates.forwardedToMain = forwardedToMain;
}
if (typeof resultSource === 'string' && resultSource.length > 0) {
completionUpdates.resultSource = resultSource;
}
if (Object.keys(completionUpdates).length === 0) {
return null;
}
fs.mkdirSync(STATE_DIR, { recursive: true });
const statePath = path.join(STATE_DIR, `${runId}.json`);
let currentRecord = {};
if (fs.existsSync(statePath)) {
try {
currentRecord = JSON.parse(fs.readFileSync(statePath, 'utf8'));
} catch {
currentRecord = {};
}
}
const nextRecord = {
...currentRecord,
runId,
...completionUpdates,
};
fs.writeFileSync(statePath, `${JSON.stringify(nextRecord, null, 2)}\n`, 'utf8');
return {
statePath,
record: nextRecord,
updatedFields: Object.keys(completionUpdates),
};
}
function parseTime(value) {
if (typeof value !== 'string' || value.length === 0) {
return null;
}
const timestamp = Date.parse(value);
return Number.isNaN(timestamp) ? null : timestamp;
}
function recomputeStatus(payload) {
if (!payload || typeof payload !== 'object') {
return 'not_implemented';
}
const completionReceivedAt = payload.completionReceivedAt ?? payload.completionReceiptAt ?? null;
if (parseTime(completionReceivedAt) !== null) {
return 'completed';
}
const hasDispatch = [payload.runId, payload.childSessionKey, payload.dispatchAt, payload.expectedBy].every(
(value) => typeof value === 'string' && value.length > 0,
);
if (!hasDispatch) {
return 'not_implemented';
}
const childRunStatus = typeof payload.childRunStatus === 'string'
? payload.childRunStatus.trim().toLowerCase()
: null;
if (childRunStatus === 'done') {
return 'done_but_not_forwarded';
}
const expectedBy = parseTime(payload.expectedBy);
const currentTime = parseTime(payload.currentTime);
if (expectedBy === null || currentTime === null) {
return 'not_implemented';
}
if (currentTime > expectedBy) {
return 'suspect_delivery_failure';
}
return 'active';
}
function main() {
const args = parseArgs(process.argv.slice(2));
if (args.help) {
printHelp();
process.exit(0);
}
const input = tryReadInput(args.input);
const inputPayload = input.exists ? tryParseJson(input.content) : null;
const dispatchWrite = writeDispatchReceiptState(inputPayload);
const completionWrite = writeCompletionReceiptState(inputPayload);
const status = recomputeStatus(inputPayload);
if ('content' in input) {
delete input.content;
}
const records = [];
if (dispatchWrite) {
records.push(dispatchWrite.record);
}
if (completionWrite) {
records.push(completionWrite.record);
}
const response = {
ok: true,
tool: 'subagent_delivery_watchdog',
version: 'skeleton-v4',
mode: 'receipt-write',
args: {
compact: args.compact,
input: args.input,
},
input,
result: {
status,
message: status === 'not_implemented'
? 'Dispatch and completion receipt writes are implemented; status recompute only handles basic active/suspect/completed states.'
: 'Basic watchdog status recompute completed.',
records,
dispatchReceiptWrite: dispatchWrite,
completionReceiptWrite: completionWrite,
},
};
const spacing = args.compact ? 0 : 2;
process.stdout.write(`${JSON.stringify(response, null, spacing)}\n`);
}
main();

View File

@@ -0,0 +1,421 @@
#!/usr/bin/env node
import { mkdirSync, mkdtempSync, rmSync, writeFileSync } from 'node:fs';
import os from 'node:os';
import path from 'node:path';
import { spawnSync } from 'node:child_process';
import { fileURLToPath } from 'node:url';
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
const gateScript = path.join(__dirname, 'approved_plan_continuity_gate.mjs');
function createFixture(files = {}) {
const root = mkdtempSync(path.join(os.tmpdir(), 'approved-plan-continuity-'));
for (const [relativePath, content] of Object.entries(files)) {
const filePath = path.join(root, relativePath);
mkdirSync(path.dirname(filePath), { recursive: true });
writeFileSync(filePath, typeof content === 'string' ? content : `${JSON.stringify(content, null, 2)}\n`);
}
return {
root,
path(...segments) {
return path.join(root, ...segments);
},
cleanup() {
rmSync(root, { recursive: true, force: true });
},
};
}
function runGate({ args = [], stdin = null } = {}) {
const result = spawnSync(process.execPath, [gateScript, ...args], {
input: stdin,
encoding: 'utf8',
});
let json = null;
if (result.stdout && result.stdout.trim()) {
try {
json = JSON.parse(result.stdout);
} catch {
json = null;
}
}
return {
status: result.status,
stdout: result.stdout,
stderr: result.stderr,
json,
};
}
const tests = [
{
name: 'skeleton: gate script responds with placeholder envelope when given fixture input',
run() {
const fixture = createFixture({
'input.json': {
planId: 'plan-skeleton',
currentTask: 'task-5',
},
});
try {
const result = runGate({
args: ['--compact', '--input', fixture.path('input.json')],
});
if (result.status !== 0 && result.status !== null) {
throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
}
if (!result.json || typeof result.json !== 'object') {
throw new Error(`expected JSON output\nstdout=${result.stdout}`);
}
if (result.json.gate !== 'approved_plan_continuity') {
throw new Error(`expected gate=approved_plan_continuity, got ${JSON.stringify(result.json.gate)}`);
}
} finally {
fixture.cleanup();
}
},
},
{
name: 'continuity: fails when task is complete, next action is known, no dispatch receipt exists, and closure is not in an allowed terminal state',
run() {
const fixture = createFixture({
'input.json': {
planId: 'plan-missing-dispatch',
currentTask: 'task-6',
taskState: 'complete',
nextDerivedAction: {
type: 'message_subagent',
task: 'continue with task-7',
},
replyClosureState: 'completed',
dispatchReceipt: null,
},
});
try {
const result = runGate({
args: ['--compact', '--input', fixture.path('input.json')],
});
if (result.status !== 0 && result.status !== null) {
throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
}
if (!result.json || typeof result.json !== 'object') {
throw new Error(`expected JSON output\nstdout=${result.stdout}`);
}
if (result.json.ok !== false) {
throw new Error(`expected continuity failure ok=false, got ${JSON.stringify(result.json)}`);
}
if (result.json.verdict !== 'continuity_failure') {
throw new Error(`expected verdict=continuity_failure, got ${JSON.stringify(result.json.verdict)}`);
}
} finally {
fixture.cleanup();
}
},
},
{
name: 'continuity: fails when planner returns derivedAction without any bound dispatch receipt',
run() {
const fixture = createFixture({
'input.json': {
planId: 'plan-derived-action-without-bound-dispatch',
currentTask: 'task-6b',
taskState: 'complete',
derivedAction: {
type: 'message_subagent',
task: 'continue with task-7b',
},
replyClosureState: 'completed',
dispatchReceipt: null,
},
});
try {
const result = runGate({
args: ['--compact', '--input', fixture.path('input.json')],
});
if (result.status !== 0 && result.status !== null) {
throw new Error(`expected controlled execution, got status=${result.status}
${result.stderr || result.stdout}`);
}
if (!result.json || typeof result.json !== 'object') {
throw new Error(`expected JSON output
stdout=${result.stdout}`);
}
if (result.json.ok !== false) {
throw new Error(`expected continuity failure ok=false for derivedAction without dispatch receipt, got ${JSON.stringify(result.json)}`);
}
if (result.json.verdict !== 'continuity_failure') {
throw new Error(`expected verdict=continuity_failure for derivedAction without dispatch receipt, got ${JSON.stringify(result.json.verdict)}`);
}
} finally {
fixture.cleanup();
}
},
},
{
name: 'continuity: passes when task is complete, next action is known, and a dispatch receipt already exists',
run() {
const fixture = createFixture({
'input.json': {
planId: 'plan-existing-dispatch',
currentTask: 'task-6',
taskState: 'complete',
nextDerivedAction: {
type: 'message_subagent',
task: 'continue with task-7',
},
replyClosureState: 'completed',
dispatchReceipt: {
planId: 'plan-existing-dispatch',
currentTask: 'task-6',
nextDerivedAction: {
type: 'message_subagent',
task: 'continue with task-7',
},
dispatchedAt: '2026-04-24T11:55:00+08:00',
},
},
});
try {
const result = runGate({
args: ['--compact', '--input', fixture.path('input.json')],
});
if (result.status !== 0 && result.status !== null) {
throw new Error(`expected controlled execution, got status=${result.status}
${result.stderr || result.stdout}`);
}
if (!result.json || typeof result.json !== 'object') {
throw new Error(`expected JSON output
stdout=${result.stdout}`);
}
if (result.json.ok !== true) {
throw new Error(`expected continuity pass ok=true when dispatch receipt exists, got ${JSON.stringify(result.json)}`);
}
} finally {
fixture.cleanup();
}
},
},
{
name: 'continuity: passes when planner returns derivedAction and a bound dispatch receipt already exists',
run() {
const fixture = createFixture({
'input.json': {
planId: 'plan-derived-action-with-bound-dispatch',
currentTask: 'task-6c',
taskState: 'complete',
derivedAction: {
type: 'message_subagent',
task: 'continue with task-7c',
},
replyClosureState: 'completed',
dispatchReceipt: {
planId: 'plan-derived-action-with-bound-dispatch',
currentTask: 'task-6c',
derivedAction: {
type: 'message_subagent',
task: 'continue with task-7c',
},
dispatchedAt: '2026-04-24T12:05:00+08:00',
},
},
});
try {
const result = runGate({
args: ['--compact', '--input', fixture.path('input.json')],
});
if (result.status !== 0 && result.status !== null) {
throw new Error(`expected controlled execution, got status=${result.status}
${result.stderr || result.stdout}`);
}
if (!result.json || typeof result.json !== 'object') {
throw new Error(`expected JSON output
stdout=${result.stdout}`);
}
if (result.json.ok !== true) {
throw new Error(`expected continuity pass ok=true when derivedAction has bound dispatch receipt, got ${JSON.stringify(result.json)}`);
}
} finally {
fixture.cleanup();
}
},
},
{
name: 'continuity: passes when task is complete, next action is known, no dispatch receipt exists, and closure is waiting_user',
run() {
const fixture = createFixture({
'input.json': {
planId: 'plan-waiting-user-closure',
currentTask: 'task-8',
taskState: 'complete',
nextDerivedAction: {
type: 'message_subagent',
task: 'continue with task-9',
},
replyClosureState: 'waiting_user',
dispatchReceipt: null,
},
});
try {
const result = runGate({
args: ['--compact', '--input', fixture.path('input.json')],
});
if (result.status !== 0 && result.status !== null) {
throw new Error(`expected controlled execution, got status=${result.status}
${result.stderr || result.stdout}`);
}
if (!result.json || typeof result.json !== 'object') {
throw new Error(`expected JSON output
stdout=${result.stdout}`);
}
if (result.json.ok !== true) {
throw new Error(`expected continuity pass ok=true when closure is waiting_user, got ${JSON.stringify(result.json)}`);
}
} finally {
fixture.cleanup();
}
},
},
{
name: 'continuity: passes when task is complete, next action is known, no dispatch receipt exists, and closure is pending_verification',
run() {
const fixture = createFixture({
'input.json': {
planId: 'plan-pending-verification-closure',
currentTask: 'task-8b',
taskState: 'complete',
nextDerivedAction: {
type: 'message_subagent',
task: 'continue with task-9',
},
replyClosureState: 'pending_verification',
dispatchReceipt: null,
},
});
try {
const result = runGate({
args: ['--compact', '--input', fixture.path('input.json')],
});
if (result.status !== 0 && result.status !== null) {
throw new Error(`expected controlled execution, got status=${result.status}
${result.stderr || result.stdout}`);
}
if (!result.json || typeof result.json !== 'object') {
throw new Error(`expected JSON output
stdout=${result.stdout}`);
}
if (result.json.ok !== true) {
throw new Error(`expected continuity pass ok=true when closure is pending_verification, got ${JSON.stringify(result.json)}`);
}
} finally {
fixture.cleanup();
}
},
},
{
name: 'continuity: passes when task is complete, next action is known, no dispatch receipt exists, and closure is blocked',
run() {
const fixture = createFixture({
'input.json': {
planId: 'plan-blocked-closure',
currentTask: 'task-9',
taskState: 'complete',
nextDerivedAction: {
type: 'message_subagent',
task: 'continue with task-10',
},
replyClosureState: 'blocked',
dispatchReceipt: null,
},
});
try {
const result = runGate({
args: ['--compact', '--input', fixture.path('input.json')],
});
if (result.status !== 0 && result.status !== null) {
throw new Error(`expected controlled execution, got status=${result.status}
${result.stderr || result.stdout}`);
}
if (!result.json || typeof result.json !== 'object') {
throw new Error(`expected JSON output
stdout=${result.stdout}`);
}
if (result.json.ok !== true) {
throw new Error(`expected continuity pass ok=true when closure is blocked, got ${JSON.stringify(result.json)}`);
}
} finally {
fixture.cleanup();
}
},
},
];
const results = [];
let failed = false;
for (const test of tests) {
try {
test.run();
results.push({ test: test.name, ok: true });
} catch (error) {
failed = true;
results.push({
test: test.name,
ok: false,
error: error instanceof Error ? error.message : String(error),
});
}
}
const summary = {
total: tests.length,
passed: results.filter((entry) => entry.ok).length,
failed: results.filter((entry) => !entry.ok).length,
};
process.stdout.write(`${JSON.stringify({ summary, results }, null, 2)}\n`);
if (failed) process.exit(1);

View File

@@ -0,0 +1,245 @@
#!/usr/bin/env node
import assert from 'node:assert/strict';
import { mkdtempSync, rmSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import path from 'node:path';
import process from 'node:process';
import { spawnSync } from 'node:child_process';
const ROOT_DIR = path.resolve(import.meta.dirname, '..');
const WATCHDOG_SCRIPT = path.join(ROOT_DIR, 'scripts', 'subagent_delivery_watchdog.mjs');
function createFixtureRunner() {
const fixtureRoot = mkdtempSync(path.join(tmpdir(), 'subagent-watchdog-test-'));
function writeFixture(name, content) {
const fixturePath = path.join(fixtureRoot, name);
const body = typeof content === 'string' ? content : JSON.stringify(content, null, 2);
writeFileSync(fixturePath, body);
return fixturePath;
}
function runWatchdog(args = [], options = {}) {
const result = spawnSync(process.execPath, [WATCHDOG_SCRIPT, ...args], {
cwd: ROOT_DIR,
encoding: 'utf8',
...options,
});
return {
status: result.status,
signal: result.signal,
stdout: result.stdout ?? '',
stderr: result.stderr ?? '',
error: result.error ?? null,
};
}
function cleanup() {
rmSync(fixtureRoot, { recursive: true, force: true });
}
return {
fixtureRoot,
writeFixture,
runWatchdog,
cleanup,
};
}
const tests = [];
function test(name, fn) {
tests.push({ name, fn });
}
function printResult(prefix, name, detail = '') {
const suffix = detail ? ` ${detail}` : '';
process.stdout.write(`${prefix} ${name}${suffix}\n`);
}
test('fixture runner can invoke watchdog skeleton with a generated input file', () => {
const runner = createFixtureRunner();
try {
const inputPath = runner.writeFixture('dispatch.json', {
runId: 'fixture-run-001',
childSessionKey: 'session:test',
});
const result = runner.runWatchdog(['--compact', '--input', inputPath]);
assert.equal(result.status, 0, `expected zero exit status, got ${result.status}\n${result.stderr}`);
assert.equal(result.stderr, '');
const payload = JSON.parse(result.stdout);
assert.equal(payload.ok, true);
assert.equal(payload.tool, 'subagent_delivery_watchdog');
assert.equal(payload.result.status, 'not_implemented');
assert.equal(payload.input.path, inputPath);
assert.equal(payload.input.exists, true);
} finally {
runner.cleanup();
}
});
test('watchdog reports active before SLA when dispatch exists and no completion receipt has arrived yet', () => {
const runner = createFixtureRunner();
try {
const inputPath = runner.writeFixture('dispatch-before-sla.json', {
runId: 'fixture-run-active-before-sla',
childSessionKey: 'session:active-before-sla',
dispatchAt: '2026-04-24T10:00:00.000Z',
expectedBy: '2026-04-24T10:10:00.000Z',
currentTime: '2026-04-24T10:05:00.000Z',
});
const result = runner.runWatchdog(['--compact', '--input', inputPath]);
assert.equal(result.status, 0, `expected zero exit status, got ${result.status}
${result.stderr}`);
assert.equal(result.stderr, '');
const payload = JSON.parse(result.stdout);
assert.equal(payload.ok, true);
assert.equal(payload.input.path, inputPath);
assert.equal(payload.input.exists, true);
assert.equal(payload.result.status, 'active');
} finally {
runner.cleanup();
}
});
test('watchdog reports suspect delivery failure after SLA when dispatch exists and no completion receipt has arrived yet', () => {
const runner = createFixtureRunner();
try {
const inputPath = runner.writeFixture('dispatch-beyond-sla.json', {
runId: 'fixture-run-suspect-delivery-failure',
childSessionKey: 'session:suspect-delivery-failure',
dispatchAt: '2026-04-24T10:00:00.000Z',
expectedBy: '2026-04-24T10:10:00.000Z',
currentTime: '2026-04-24T10:15:00.000Z',
});
const result = runner.runWatchdog(['--compact', '--input', inputPath]);
assert.equal(result.status, 0, `expected zero exit status, got ${result.status}
${result.stderr}`);
assert.equal(result.stderr, '');
const payload = JSON.parse(result.stdout);
assert.equal(payload.ok, true);
assert.equal(payload.input.path, inputPath);
assert.equal(payload.input.exists, true);
assert.equal(payload.result.status, 'suspect_delivery_failure');
} finally {
runner.cleanup();
}
});
test('watchdog reports completed when dispatch exists and completion receipt has arrived', () => {
const runner = createFixtureRunner();
try {
const inputPath = runner.writeFixture('dispatch-completed.json', {
runId: 'fixture-run-completed',
childSessionKey: 'session:completed',
dispatchAt: '2026-04-24T10:00:00.000Z',
expectedBy: '2026-04-24T10:10:00.000Z',
currentTime: '2026-04-24T10:05:00.000Z',
completionReceiptAt: '2026-04-24T10:04:00.000Z',
});
const result = runner.runWatchdog(['--compact', '--input', inputPath]);
assert.equal(result.status, 0, `expected zero exit status, got ${result.status}
${result.stderr}`);
assert.equal(result.stderr, '');
const payload = JSON.parse(result.stdout);
assert.equal(payload.ok, true);
assert.equal(payload.input.path, inputPath);
assert.equal(payload.input.exists, true);
assert.equal(payload.result.status, 'completed');
} finally {
runner.cleanup();
}
});
test('watchdog reports done but not forwarded when child run is marked done without a main-thread completion receipt', () => {
const runner = createFixtureRunner();
try {
const inputPath = runner.writeFixture('dispatch-done-not-forwarded.json', {
runId: 'fixture-run-done-not-forwarded',
childSessionKey: 'session:done-not-forwarded',
dispatchAt: '2026-04-24T10:00:00.000Z',
expectedBy: '2026-04-24T10:10:00.000Z',
currentTime: '2026-04-24T10:05:00.000Z',
childRunStatus: 'done',
});
const result = runner.runWatchdog(['--compact', '--input', inputPath]);
assert.equal(result.status, 0, `expected zero exit status, got ${result.status}
${result.stderr}`);
assert.equal(result.stderr, '');
const payload = JSON.parse(result.stdout);
assert.equal(payload.ok, true);
assert.equal(payload.input.path, inputPath);
assert.equal(payload.input.exists, true);
assert.equal(payload.result.status, 'done_but_not_forwarded');
} finally {
runner.cleanup();
}
});
test('fixture runner exposes missing-input behavior for future fail-first cases', () => {
const runner = createFixtureRunner();
try {
const missingPath = path.join(runner.fixtureRoot, 'missing.json');
const result = runner.runWatchdog(['--compact', '--input', missingPath]);
assert.equal(result.status, 0, `expected zero exit status, got ${result.status}\n${result.stderr}`);
const payload = JSON.parse(result.stdout);
assert.equal(payload.ok, true);
assert.equal(payload.input.path, missingPath);
assert.equal(payload.input.exists, false);
assert.equal(payload.result.status, 'not_implemented');
} finally {
runner.cleanup();
}
});
function main() {
let passed = 0;
for (const { name, fn } of tests) {
try {
fn();
passed += 1;
printResult('PASS', name);
} catch (error) {
printResult('FAIL', name, error instanceof Error ? `- ${error.message}` : `- ${String(error)}`);
if (error instanceof Error && error.stack) {
process.stderr.write(`${error.stack}\n`);
}
process.exitCode = 1;
}
}
const failed = tests.length - passed;
process.stdout.write(`\nSummary: ${passed} passed, ${failed} failed, ${tests.length} total\n`);
}
main();

View File

View File

@@ -0,0 +1,62 @@
# Approved Plan Continuity Receipt Storage
This directory stores file-backed continuity receipts for approved-plan flows.
## Scope
This storage definition is intentionally minimal.
It defines only the receipt location, minimum receipt shape, and filename convention for continuity receipts.
It does **not** implement receipt writing, hook integration, dispatch orchestration, or gate evaluation logic.
## Receipt file format
- Format: JSON
- Encoding: UTF-8
- One receipt per file
## Minimum receipt shape
Each continuity receipt file must contain a JSON object with at least these fields:
- `planId`
- `currentTask`
- `nextDerivedAction`
- `dispatchedAt`
- `dispatchRunId`
- `childSessionKey`
- `replyClosureState`
### Minimal example
```json
{
"planId": "plan_2026_04_24_example",
"currentTask": "Task 15",
"nextDerivedAction": "dispatch next approved-plan task",
"dispatchedAt": "2026-04-24T12:00:00.000+08:00",
"dispatchRunId": "dispatch_2026_04_24_example",
"childSessionKey": "agent:engineering:subtask-example",
"replyClosureState": "open"
}
```
## Filename convention
Continuity receipt filenames must follow this pattern:
```text
receipt-<planId>-<dispatchRunId>.json
```
## Naming rules
- `<planId>` should match the receipt `planId`
- `<dispatchRunId>` should match the receipt `dispatchRunId`
- Use lowercase kebab-case or other filesystem-safe identifiers
- Do not reuse one file for multiple dispatch runs
## State interpretation
- A receipt in this directory represents a persisted continuity dispatch record for one approved-plan dispatch run.
- `replyClosureState` is stored alongside the dispatch linkage so later tasks can distinguish an active dispatch record from an allowed non-dispatch closure state.
- Legal non-dispatch closure values are defined by the plan/runbook logic outside this storage README.

View File

@@ -0,0 +1,12 @@
{
"planId": "plan_2026_04_24_example",
"currentTask": "task-16",
"nextDerivedAction": {
"type": "message_subagent",
"task": "continue with task-17"
},
"dispatchedAt": "2026-04-24T12:24:00.000+08:00",
"dispatchRunId": "dispatch_2026_04_24_example",
"childSessionKey": "agent:engineering:subtask-example",
"replyClosureState": "open"
}

View File

@@ -0,0 +1,81 @@
# Subagent Delivery Watchdog State Shape
This directory is reserved for file-backed state used by the subagent delivery watchdog.
## Purpose
The watchdog tracks whether a subagent dispatch has a matching completion receipt and whether the main thread has enough evidence to classify the run state without guessing.
This task defines the **state JSON shape only**. It does **not** implement receipt write logic, status recomputation, recovery behavior, or live integration.
## Suggested file model
One JSON document per dispatched subagent run.
Example path pattern:
- `state/subagent-delivery-watchdog/<runId>.json`
## State JSON shape
```json
{
"runId": "run_2026_04_24_abc123",
"childSessionKey": "agent:engineering:subagent:cd236af1-7d4a-4f4e-bccd-04e4f9a96c02",
"dispatchAt": "2026-04-24T10:40:00+08:00",
"expectedBy": "2026-04-24T10:50:00+08:00",
"completionReceivedAt": null,
"forwardedToMain": false,
"resultSource": null,
"status": "active",
"statusUpdatedAt": "2026-04-24T10:40:00+08:00",
"statusReason": "Dispatch receipt exists and SLA has not been crossed.",
"recoveryAction": null,
"recoveryAttemptCount": 0,
"lastRecoveryAt": null,
"notes": []
}
```
## Receipt fields
### Dispatch receipt fields
- `runId`: unique identifier for the dispatched subagent run.
- `childSessionKey`: session key or stable child-session identifier used to correlate the run.
- `dispatchAt`: ISO-8601 timestamp for when the subagent was dispatched.
- `expectedBy`: ISO-8601 timestamp for the watchdog SLA / expected completion deadline.
### Completion receipt fields
- `completionReceivedAt`: ISO-8601 timestamp for when a completion receipt was observed by the owner thread; `null` if not yet observed.
- `forwardedToMain`: boolean indicating whether the completion/result was confirmed forwarded back to the main thread.
- `resultSource`: source label for the result evidence, for example `completion_event`, `history_fetch`, or `manual_recovery`; `null` if no completion evidence exists yet.
## Status fields
- `status`: current watchdog classification. Expected values include:
- `active`
- `suspect_delivery_failure`
- `done_but_not_forwarded`
- `completed`
- `recovered`
- `blocked`
- `statusUpdatedAt`: ISO-8601 timestamp of the latest status evaluation/update.
- `statusReason`: short human-readable explanation for why the current status was assigned.
## Optional supporting fields
These fields are not a substitute for the required receipt/status fields, but they can support later tasks safely.
- `recoveryAction`: pending or last recovery decision, if any.
- `recoveryAttemptCount`: number of recovery attempts already made.
- `lastRecoveryAt`: ISO-8601 timestamp of the last recovery attempt.
- `notes`: append-only diagnostic notes.
## Constraints
- Receipt fields and status fields must remain explicit in stored state.
- `completionReceivedAt`, `resultSource`, and recovery-related fields may be `null` before any completion signal exists.
- `forwardedToMain` should remain `false` until the return path to the main thread is actually confirmed.
- Status must be derived from evidence; later implementation should not infer success without a receipt or equivalent recovery proof.

View File

@@ -0,0 +1,6 @@
{
"runId": "fixture-run-active-before-sla",
"childSessionKey": "session:active-before-sla",
"dispatchAt": "2026-04-24T10:00:00.000Z",
"expectedBy": "2026-04-24T10:10:00.000Z"
}

View File

@@ -0,0 +1,7 @@
{
"runId": "fixture-run-completed",
"childSessionKey": "session:completed",
"dispatchAt": "2026-04-24T10:00:00.000Z",
"expectedBy": "2026-04-24T10:10:00.000Z",
"completionReceivedAt": "2026-04-24T10:04:00.000Z"
}

View File

@@ -0,0 +1,6 @@
{
"runId": "fixture-run-done-not-forwarded",
"childSessionKey": "session:done-not-forwarded",
"dispatchAt": "2026-04-24T10:00:00.000Z",
"expectedBy": "2026-04-24T10:10:00.000Z"
}

View File

@@ -0,0 +1,6 @@
{
"runId": "fixture-run-suspect-delivery-failure",
"childSessionKey": "session:suspect-delivery-failure",
"dispatchAt": "2026-04-24T10:00:00.000Z",
"expectedBy": "2026-04-24T10:10:00.000Z"
}

View File

@@ -0,0 +1,9 @@
{
"runId": "preview-completion-write",
"childSessionKey": "session:preview",
"dispatchAt": "2026-04-24T10:00:00.000Z",
"expectedBy": "2026-04-24T10:10:00.000Z",
"completionReceivedAt": "2026-04-24T10:04:00.000Z",
"forwardedToMain": false,
"resultSource": "child_history"
}