From 545dfde7ca303b6e526e7e4ea5412c6f3d7b42bd Mon Sep 17 00:00:00 2001
From: Eve <eve@local>
Date: Wed, 22 Apr 2026 15:12:37 +0800
Subject: [PATCH] Add decision tree for silent long-task handling

---
 WORKFLOW.md                                   |   1 +
 .../silent-long-task-decision-tree.md         | 104 ++++++++++++++++++
 2 files changed, 105 insertions(+)
 create mode 100644 docs/runbooks/silent-long-task-decision-tree.md

diff --git a/WORKFLOW.md b/WORKFLOW.md
index 36cbe76..ae7af75 100644
--- a/WORKFLOW.md
+++ b/WORKFLOW.md
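@@ -36,6 +36,7 @@ If a subagent returns **no result within 5 minutes** of being assigned:
   - if the final result needs a ruling from the owner, the handoff method (e.g. button-path)
 - No silent long-task may be sustained by internal memory and verbal promises alone; it should first be bound to an externalized checkpoint / reminder / cron-style trigger.
 - If no externalized trigger can be bound, the task **should not be launched in silent mode** and should stay in immediate follow-up mode.
+- Before launch, consult: `docs/runbooks/silent-long-task-decision-tree.md`
 - If a silent long-task is launched without this forced reporting checkpoint, a later "why did it go quiet?" counts as a process violation, not a mere delay.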
 
 ## Checkpoint Rule
diff --git a/docs/runbooks/silent-long-task-decision-tree.md b/docs/runbooks/silent-long-task-decision-tree.md
new file mode 100644
index 0000000..57fbddd
--- /dev/null
+++ b/docs/runbooks/silent-long-task-decision-tree.md
@@ -0,0 +1,104 @@
+# Silent Long-Task Decision Tree
+
+Use this decision tree before launching any long-task that may become silent.
+
+## Step 1 — Will this finish in the current turn?
+
+If **yes**:
+- do **not** make it silent
+- keep it in immediate follow-up mode
+- finish or report within the current conversational flow
+
+If **no**:
+- continue to Step 2
+
+## Step 2 — Will the task naturally stop producing user-visible output for a while?
+
+Examples:
+- research before reporting back
+- debugging before a finding exists
+- waiting on subagents
+- background execution
+- long verification or staged analysis
+
+If **no**:
+- keep it as an ordinary long-task with immediate follow-up
+- no silent-mode launch needed
+
+If **yes**:
+- this is a **silent long-task**
+- continue to Step 3
+
+## Step 3 — Can an externalized checkpoint trigger be bound?
+
+Valid options include:
+- cron reminder
+- forced checkpoint trigger design
+- another reliable external wake path
+
+If **yes**:
+- define the first checkpoint trigger
+- define the unfinished-task report shape
+- define the no-evidence downgrade path (`paused` / `blocked`)
+- define final owner handoff if needed
+- then launch as silent long-task
+
+If **no**:
+- do **not** launch in silent mode
+- keep it non-silent
+- either stay in immediate follow-up mode or break the work into smaller visible steps
+
+## Step 4 — Does the final result require owner judgement?
+
+Examples:
+- pass / fail
+- accept / reject
+- rerun / stop
+- approval / confirm
+
+If **yes**:
+- pre-plan the final handoff path
+- on Telegram, prefer button-path
+- do not wait until the final paragraph to think about buttons
+
+If **no**:
+- standard checkpoint and completion path is enough
+
+## Step 5 — What if the checkpoint fires and the task is unfinished?
+
+Required response:
+- report current status honestly
+- show latest real evidence
+- show next step
+- show next report condition
+
+Do **not**:
+- claim progress without evidence
+- leave the task cosmetically `active`
+
+## Step 6 — What if the checkpoint fires and there is no new evidence?
+
+Required response:
+- downgrade to `paused` or `blocked`
+- explain why there is no new evidence
+- stop implying continuous progress
+
+## Practical summary
+
+### Use silent long-task mode when:
+- the task truly needs time between outputs
+- you can bind an externalized checkpoint path
+- you know how unfinished / no-evidence / final handoff states will be handled
+
+### Do not use silent long-task mode when:
+- the work can finish in the current turn
+- you cannot bind an externalized checkpoint path
+- you are only relying on memory or intention to "remember to report back"
+
+## Coverage against known failure cases
+
+This decision tree is meant to prevent:
+- "I’ll go do this and report back" blackholes
+- full tests that silently disappear between start and result
+- delegated/subagent waits with no forced follow-up
+- research/debug tasks that claim activity but produce no evidence
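
The six steps of the runbook above can be sketched as two small functions: one that picks the launch mode (Steps 1 to 4) and one that decides what status to report when a checkpoint fires (Steps 5 and 6). This is only an illustrative sketch. The names `Task`, `Mode`, `choose_mode`, and `on_checkpoint`, the `blocked` flag, and the `"done"` status are assumptions made for the example; only `active`, `paused`, and `blocked` come from the runbook text.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    IMMEDIATE_FOLLOW_UP = "immediate follow-up"
    ORDINARY_LONG_TASK = "ordinary long-task"
    SILENT_LONG_TASK = "silent long-task"


@dataclass
class Task:
    # Hypothetical fields, one per yes/no question in Steps 1-4.
    finishes_this_turn: bool
    goes_quiet: bool
    has_external_trigger: bool
    needs_owner_judgement: bool


def choose_mode(task: Task) -> tuple[Mode, list[str]]:
    """Return the launch mode and the items to define before launching."""
    if task.finishes_this_turn:        # Step 1: yes
        return Mode.IMMEDIATE_FOLLOW_UP, []
    if not task.goes_quiet:            # Step 2: no
        return Mode.ORDINARY_LONG_TASK, []
    if not task.has_external_trigger:  # Step 3: no
        # Never launch silent without an external wake path. Splitting the
        # work into smaller visible steps is the other allowed option.
        return Mode.IMMEDIATE_FOLLOW_UP, []
    prerequisites = [
        "first checkpoint trigger",
        "unfinished-task report shape",
        "no-evidence downgrade path",
    ]
    if task.needs_owner_judgement:     # Step 4: yes
        prerequisites.append("final owner handoff path")
    return Mode.SILENT_LONG_TASK, prerequisites


def on_checkpoint(finished: bool, has_new_evidence: bool,
                  blocked: bool = False) -> str:
    """Steps 5 and 6: the status to report when a checkpoint fires."""
    if finished:
        return "done"  # assumed label; the runbook does not name this state
    if has_new_evidence:
        return "active"  # Step 5: report status, evidence, next step
    # Step 6: no new evidence, so downgrade instead of staying "active".
    return "blocked" if blocked else "paused"
```

For example, `choose_mode(Task(False, True, True, True))` yields silent mode with four prerequisites to define, while the same task with `has_external_trigger=False` falls back to immediate follow-up.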