feat: export continuity hard-gate and watchdog workstream

2026-04-24 12:36:31 +08:00
commit 111cf27634
24 changed files with 3648 additions and 0 deletions
--- a/docs/runbooks/approved-plan-continuity.md
+++ b/docs/runbooks/approved-plan-continuity.md
@@ -0,0 +1,56 @@
+# Approved Plan Continuity
+
+## Continuity receipt core fields
+
+### `planId`
+- The identifier of the approved plan that the continuity receipt belongs to.
+- Use this field to associate the receipt with one specific approved plan.
+
+### `currentTask`
+- The task from the approved plan that is currently being executed or has just completed.
+- Use this field to record which plan task the receipt is about.
+
+### `nextDerivedAction`
+- The next concrete action derived from the current task that should be dispatched to continue the workflow.
+- Use this field to record the intended follow-up action for continuity.
+
+### `dispatchedAt`
+- The timestamp indicating when the next derived action was actually dispatched.
+- Use this field to record when the continuity handoff occurred.
+
+## Continuity receipt linkage fields
+
+### `dispatchRunId`
+- The unique identifier for the dispatch run that produced or recorded the next-step continuity handoff.
+- Use this field to link the receipt to one concrete dispatch execution, not just a planned action.
+- This field is for receipt linkage and traceability only; it does not by itself define continuity-gate pass/fail behavior.
+
+### `childSessionKey`
+- The session linkage key for the child session or spawned execution context that receives the dispatched next action.
+- Use this field to connect the continuity receipt to the specific downstream session that should carry the workflow forward.
+- This field records linkage identity only; it does not by itself imply hook integration or dispatch binding logic.
+
+### `replyClosureState`
+- The closure state recorded at the point the current reply is being closed.
+- Use this field to state whether the reply closed under a dispatch-linked continuation path or some separately defined terminal closure state.
+- This field is defined here as a receipt field only; the legal terminal states are listed under "Legal terminal states" below, and gate enforcement is defined in later tasks.
+
+
+## Legal terminal states
+
+These are the only legal non-dispatch terminal states for an approved-plan continuity closure. If a reply closes without a real next-dispatch receipt, `replyClosureState` must be one of the states below.
+
+### `waiting_user`
+- Use this state only when the approved-plan workflow cannot continue until the user provides a decision, approval, missing information, or some other explicit user response.
+- This state means the workflow is intentionally paused on user input, not silently stopped.
+- Do not use this state when the next step could already be dispatched without further user involvement.
+
+### `blocked`
+- Use this state only when the approved-plan workflow cannot proceed because of an external blocker, dependency, permission issue, outage, or other constraint that is not resolved by the current executor.
+- This state means progress is prevented by a real blocking condition, not by omission of the next dispatch.
+- Do not use this state to explain away a missing continuity handoff when execution could still continue.
+
+### `pending_verification`
+- Use this state only when the implementation or execution step is complete enough that the workflow should stop specifically for verification, validation, review, or confirmation of results.
+- This state means the next meaningful action is to verify what was already produced, rather than to dispatch another implementation step immediately.
+- Do not use this state for incomplete work that still has an undispatched next action.
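
A minimal sketch of the closure rule stated in `approved-plan-continuity.md` above, assuming a continuity receipt is a plain dictionary keyed by the field names the runbook defines. The function name `check_reply_closure`, the decision to treat `dispatchRunId` plus `dispatchedAt` as evidence of a real next-dispatch receipt, the sample values, and the returned messages are all illustrative assumptions; the runbook itself leaves gate enforcement to later tasks.

```python
from typing import Optional

# The three legal non-dispatch terminal states named in the runbook.
LEGAL_TERMINAL_STATES = frozenset({"waiting_user", "blocked", "pending_verification"})

# Assumption: these two fields together count as evidence of a real dispatch.
DISPATCH_EVIDENCE_FIELDS = ("dispatchRunId", "dispatchedAt")


def check_reply_closure(receipt: dict) -> Optional[str]:
    """Return None if the closure is acceptable, else a short reason string."""
    for field in ("planId", "currentTask"):
        if not receipt.get(field):
            return f"missing core field: {field}"

    if all(receipt.get(f) for f in DISPATCH_EVIDENCE_FIELDS):
        return None  # closed under a dispatch-linked continuation path

    state = receipt.get("replyClosureState")
    if state in LEGAL_TERMINAL_STATES:
        return None  # closed under a legal terminal state
    return f"no dispatch receipt and illegal replyClosureState: {state!r}"


# Hypothetical receipts: one paused on user input, one that stopped silently.
paused = {"planId": "plan-1", "currentTask": "task-3", "replyClosureState": "waiting_user"}
silent_stop = {"planId": "plan-1", "currentTask": "task-3", "replyClosureState": "done"}
```

Under these assumptions, `check_reply_closure(paused)` returns `None`, while `check_reply_closure(silent_stop)` returns a reason string, because `done` is neither backed by a dispatch receipt nor one of the three legal terminal states.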
--- a/docs/runbooks/subagent-anti-blackhole.md
+++ b/docs/runbooks/subagent-anti-blackhole.md
@@ -0,0 +1,70 @@
+# Subagent Anti-Blackhole Runbook
+
+## Dispatch receipt fields
+
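+The dispatch receipt defines only the fields needed at the moment a subagent is dispatched: they identify the dispatch, link it to the child session, and mark the expected completion deadline.
+
+- `runId`: The unique execution identifier for this subagent dispatch. It ties the task dispatch, later status checks, and reports to the same run.
+- `childSessionKey`: A stable linkage key for the subagent session. It maps the dispatch receipt to the child session that was actually dispatched.
+- `dispatchAt`: The time the dispatch receipt was written, that is, the timestamp at which the main flow actually dispatched the subagent. A sortable standard time format is recommended.
+- `expectedBy`: The expected completion timestamp, computed from the task's SLA or estimated completion time. It determines whether the run is still within normal execution or has exceeded the expected waiting window.
+
+> This section defines only the dispatch receipt fields; it does not cover completion receipts, watchdog logic, recovery flows, or other later tasks.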
+
+## Minimal example
+
+```json
+{
+  "runId": "run_2026-04-24_001",
+  "childSessionKey": "agent:engineering:subagent:example",
+  "dispatchAt": "2026-04-24T10:00:00+08:00",
+  "expectedBy": "2026-04-24T10:15:00+08:00"
+}
+```
+
+## Completion receipt fields
+
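+The completion receipt defines only the fields recorded after a subagent's completion result has been received. They distinguish "the subagent has completed" from "the result has been successfully forwarded to the main conversation".
+
+- `completionReceivedAt`: The timestamp at which the main flow or the monitoring mechanism actually received the completion/result. It confirms when the subagent completed and returned its result, rather than relying on an estimate from `expectedBy`.
+- `forwardedToMain`: A boolean indicating whether the completion/result was successfully forwarded to the main conversation. It separates two distinct states: "result received" and "reported back on the main thread".
+- `resultSource`: A marker for where the completion/result came from, for example an active completion push, session state fetched after the fact, or another explicit source. It lets later analysis tell whether the result arrived normally or through a recovery path.
+
+> This section defines only the completion receipt fields; it does not cover watchdog logic, recovery flows, scenario tests, or other later tasks.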
+
+
+## Watchdog statuses
+
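+Watchdog statuses define only the enumeration of states available when monitoring subagent completion delivery. They distinguish normal waiting, suspected delivery failure, a result that exists but was not forwarded, and runs that are completed or stuck.
+
+- `active`: A dispatch receipt exists, the current time is still before `expectedBy`, and there is no completion receipt yet. The subagent is within its normal waiting window; the watchdog should keep observing and must not treat the run as anomalous early.
+- `suspect_delivery_failure`: A dispatch receipt exists and `expectedBy` has passed, but the main flow has still not received a completion receipt. Neither failure nor success of the subagent can be proven yet; the run can only be judged a suspected completion-delivery problem and must enter an explicit, human-visible attention state.
+- `done_but_not_forwarded`: Credible signals show the subagent's work is actually finished, but the main thread still has no corresponding forwarded completion receipt. The result may exist in the child session or on another return path and simply failed to bounce back to the main thread.
+- `completed`: The completion receipt has been received by the main flow and the result has successfully entered the main-thread reporting path. The watchdog for this run can be considered closed normally; it is no longer a blackhole risk case.
+- `recovered`: The run previously fell into `suspect_delivery_failure` or `done_but_not_forwarded`, and later confirmation or an after-the-fact fetch brought the result back into a traceable state. This status defines only the meaning "recovered from an anomalous delivery risk"; this task does not define recovery logic ahead of time.
+- `blocked`: The watchdog has determined that passive waiting can no longer explain the state and that the run needs explicit escalation or human intervention. This status defines only the meaning "stuck, must not keep waiting silently"; this task does not define escalation or handling flows ahead of time.
+
+> This section defines only the semantics and boundaries of the watchdog statuses; it does not implement recovery logic, receipt state code, scenario tests, or other later tasks ahead of time.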
+
+
+
+## B-class failure modes
+
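+B-class failure modes are the false-timeout cases in which the subagent's work has not necessarily timed out, but the main thread has not received a credible completion report. The core of this class is not to conclude first that the child failed, but to determine which segment lost contact: the execution side, the event delivery side, or the main-thread forwarding side.
+
+- **done but not forwarded**: The child session shows credible signs that the work is complete, for example the subagent has produced its final report, the session state shows done, or a completion can be confirmed on the child side; but the main conversation has not received the corresponding forwarded result. This mode means "the result exists but was not successfully forwarded to the main thread".
+- **no completion event received**: The main flow has completed the dispatch and the wait is approaching or past `expectedBy`, but the main thread has received no completion event at all. At this point one cannot assert that the child is still running, nor that it has failed; the run can only be marked explicitly as "main thread received no completion event", so that a delivery problem is not misjudged as a plain execution timeout.
+- **session exists but no result bounce**: The child session can be confirmed to still exist and be queryable, and it may even show ongoing activity or contain result content, but no result has bounced back to the main conversation. This mode is more specific than the previous one: the session has not disappeared; the problem is that the result did not travel back to the main thread along the normal return path.
+- **unclear slow-run vs delivery failure**: All that is known is that the main thread has waited longer than expected; it is not yet possible to tell whether the child is genuinely slow and still running, or has completed but hit a delivery failure. The point of this failure mode is to preserve uncertainty: when evidence is insufficient, not every timeout should be classified as a slow run, nor should a delivery failure be assumed outright.
+
+> This section defines only the semantic boundaries of, and differences between, the B-class false-timeout failure modes; it does not implement recovery logic, receipt state code, a watchdog script, or scenario tests ahead of time.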
+
+## Completion receipt example
+
+```json
+{
+  "completionReceivedAt": "2026-04-24T10:12:34+08:00",
+  "forwardedToMain": true,
+  "resultSource": "completion_push"
+}
+```
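
A minimal sketch of how four of the watchdog statuses in `subagent-anti-blackhole.md` above might be derived from a dispatch receipt and an optional completion receipt. It is not the commit's watchdog implementation. The function name `classify_run`, the `child_reports_done` flag (standing in for whatever "credible signal" is available), and the choice to treat a time exactly equal to `expectedBy` as still `active` are assumptions. `recovered` and `blocked` are left out because they depend on run history and escalation rules that the runbook deliberately does not define.

```python
from datetime import datetime
from typing import Optional


def classify_run(
    dispatch: dict,
    completion: Optional[dict],
    now: datetime,
    child_reports_done: bool = False,
) -> str:
    """Map receipts to one of four watchdog statuses (sketch only)."""
    if completion is not None and completion.get("completionReceivedAt"):
        if completion.get("forwardedToMain"):
            return "completed"
        # A result was received but never reached the main conversation.
        return "done_but_not_forwarded"

    if child_reports_done:
        # Hypothetical signal: the child session shows the work finished.
        return "done_but_not_forwarded"

    # Timestamps carry a UTC offset, so `now` must be timezone-aware too.
    expected_by = datetime.fromisoformat(dispatch["expectedBy"])
    if now <= expected_by:
        return "active"
    return "suspect_delivery_failure"


# Receipts copied from the runbook's two JSON examples.
dispatch = {
    "runId": "run_2026-04-24_001",
    "childSessionKey": "agent:engineering:subagent:example",
    "dispatchAt": "2026-04-24T10:00:00+08:00",
    "expectedBy": "2026-04-24T10:15:00+08:00",
}
completion = {
    "completionReceivedAt": "2026-04-24T10:12:34+08:00",
    "forwardedToMain": True,
    "resultSource": "completion_push",
}
```

With these receipts, a call made before `expectedBy` and without a completion receipt yields `active`, the same call made after `expectedBy` yields `suspect_delivery_failure`, and passing the completion receipt yields `completed`. Keeping the overdue-without-evidence case as `suspect_delivery_failure` rather than a plain timeout mirrors the runbook's point that a slow run and a delivery failure cannot be told apart without further evidence.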