Enforce button-first ordering for Telegram decision closures

2026-04-22 12:13:35 +08:00
parent 26948948b2
commit a85403aa77
4 changed files with 20 additions and 7 deletions
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -46,9 +46,10 @@ Do not treat non-chat work as ordinary reply flow.
 ### Reply Closure Rule
 On Telegram, if the final actionable part of your reply needs the human to decide, confirm, approve, stop, continue, rerun, or choose a next step:
-- do **not** end with plain text only
+- do **not** let plain text go out first
-- do **not** say buttons will be used unless you are actually sending them
+- do **not** say buttons will be used unless you are actually sending them first
-- either send real inline buttons or execute the most reasonable next step directly
+- prefer sending real inline buttons with the `message` tool and then return `NO_REPLY`
+- otherwise execute the most reasonable next step directly
 If you fail this, call it a workflow violation and correct it immediately.
--- a/WORKFLOW_GATES.md
+++ b/WORKFLOW_GATES.md
@@ -2,14 +2,23 @@
 ## Reply-Closure Button Gate
-When operating on Telegram, if the **final actionable part** of a reply requires the user to choose, confirm, approve, stop, continue, or select a next step, the assistant must not end with plain text alone.
+When operating on Telegram, if the **final actionable part** of a reply requires the user to choose, confirm, approve, stop, continue, rerun, accept, or select a next step, the assistant must not let plain text reach the user first.
 ### Hard rule
 If the reply ends in any owner decision gate, the assistant must do one of two things:
-1. **Send real inline buttons**
+1. **Send real inline buttons first via the `message` tool**
 2. **Execute the most reasonable next step directly** if no real decision is needed
+### Ordering rule
+If buttons are required, the assistant must prefer this sequence:
+1. send the actual button message
+2. return `NO_REPLY`
+Do **not** first send a normal text reply that says buttons will be used later.
+Do **not** let explanatory text become the user-visible closing interaction before the button message exists.
 ### Forbidden behavior
 These are violations when used as the closing interaction on Telegram:
@@ -19,6 +28,7 @@ These are violations when used as the closing interaction on Telegram:
 - `如果你要，我可以...`
 - `要不要我繼續？`
 - saying buttons will be used, but not actually sending them
+- sending explanation first, and only later sending buttons after being corrected
 ### Required interpretation
@@ -31,7 +41,7 @@ The gate applies to:
 ### Violation standard
-If the assistant reaches a user-decision closure and no real inline buttons were delivered, treat it as a workflow violation even if the reply mentioned buttons in text.
+If the assistant reaches a user-decision closure and no real inline buttons were delivered first, treat it as a workflow violation even if the reply mentioned buttons in text.
 ### Corrective rule
--- a/memory/2026-04-22.md
+++ b/memory/2026-04-22.md
@@ -50,3 +50,4 @@
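 - The live test caught a real gap: when closing a long-task checkpoint / next-step on Telegram, I may still fall back to a plain-text `1 / 2 / 3` menu.
 - As a first fix, a skill rule was added: `Telegram interaction guard`, which forbids governed long-task flows from closing with a plain-text menu on Telegram; if a choice is needed, inline buttons must be used instead, otherwise the most reasonable next step is executed directly.
 - This was then reinforced at a more direct operational level with a new `Reply Closure Button Gate` concept: whenever the final actionable part of a reply needs the owner to decide, confirm, approve, stop, continue, rerun, or choose a next step, it is not enough to say in text that buttons will be used; real inline buttons must actually be sent, or the most reasonable next step executed directly.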
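+- A minimal regression test then confirmed an earlier root cause: it is not enough to "have buttons"; **if buttons are needed, the buttons must appear first**, rather than sending ordinary text and adding buttons afterwards. The rule was therefore tightened into an ordering rule: prefer sending real buttons with the `message` tool, then return `NO_REPLY`.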
--- a/skills/long-task-governor/SKILL.md
+++ b/skills/long-task-governor/SKILL.md
@@ -195,8 +195,9 @@ But this skill itself is the governance layer and should remain independently ma
 When operating under long-task governance on Telegram:
 - do not end checkpoints, test progress, or next-step decisions with plain-text menus like `1 / 2 / 3` or `A / B / C`
 - if a choice is genuinely required, use Telegram inline buttons
+- if buttons are required, send them first via the `message` tool rather than first producing a normal text reply
 - if no real choice is needed, execute the most reasonable next step directly
-- if the assistant accidentally emits a plain-text choice menu, treat that as a workflow violation and convert the lesson into a permanent rule
+- if the assistant accidentally emits a plain-text choice menu, or says buttons will be used without actually sending them first, treat that as a workflow violation and convert the lesson into a permanent rule
 This prevents governed long-task flows from degrading back into ambiguous text-only decision gates.
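
Illustrative note (not part of the commit): the ordering rule introduced above can be sketched as a small control-flow function. This is a minimal sketch under stated assumptions. `Closure`, `close_reply`, `violates_ordering`, and the `send_buttons` / `run_next_step` callbacks are hypothetical names invented for illustration; the real `message` tool signature and the way the runtime consumes `NO_REPLY` are not shown in this diff.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Sentinel named in the commit; how the runtime interprets it is an assumption.
NO_REPLY = "NO_REPLY"


@dataclass
class Closure:
    """Hypothetical model of the final actionable part of a reply."""
    text: str
    # An empty list means no real owner decision is needed.
    choices: List[str] = field(default_factory=list)


def close_reply(
    closure: Closure,
    send_buttons: Callable[[str, List[str]], None],  # stand-in for the `message` tool
    run_next_step: Callable[[], str],
) -> str:
    """Apply the ordering rule: buttons first, then NO_REPLY; otherwise act directly."""
    if closure.choices:
        # 1. send the actual button message before any plain text can go out
        send_buttons(closure.text, closure.choices)
        # 2. return NO_REPLY so no normal text reply follows
        return NO_REPLY
    # No real decision is needed: execute the most reasonable next step directly.
    return run_next_step()


def violates_ordering(events: List[str]) -> bool:
    """Check a decision closure's user-visible outputs, given in delivery order.

    Each event is "text" or "buttons". It is a violation if no buttons were
    delivered at all, or if plain text reached the user before the buttons.
    """
    return not events or events[0] != "buttons"
```

Under these assumptions, a closure with choices produces exactly one button message and no text reply, while `violates_ordering(["text", "buttons"])` flags the "explanation first, buttons later" pattern that the commit adds to the forbidden list.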