From 450b99fa5b4fba0056bd1705f6d568a3a438365e Mon Sep 17 00:00:00 2001
From: Eve <eve@local>
Date: Wed, 22 Apr 2026 12:41:42 +0800
Subject: [PATCH] Prefer button-driven flows for Telegram test verdicts

---
 WORKFLOW_GATES.md                  | 11 +++++++++++
 skills/long-task-governor/SKILL.md |  2 ++
 2 files changed, 13 insertions(+)

diff --git a/WORKFLOW_GATES.md b/WORKFLOW_GATES.md
index e7eb84f..796b129 100644
--- a/WORKFLOW_GATES.md
+++ b/WORKFLOW_GATES.md
@@ -31,6 +31,17 @@ This especially applies to:
 
 If the endpoint is predictably a user decision, the assistant should structure the run so that the final user-facing handoff is already prepared as a button interaction.
 
+### Button-driven test rule
+
+If a test or validation flow is known in advance to end in a Telegram pass/fail, accept/reject, or rerun/stop decision, do not start that test in the ordinary text-reply lane.
+
+Instead, the test must be treated as a **button-driven flow** from the beginning:
+1. use normal text only for internal progress while no user decision handoff is needed
+2. once the flow is designed around a final owner verdict, prepare the ending as a `message`-tool button handoff
+3. for short regression tests whose whole purpose is to verify button closure behavior, prefer opening and closing through the button path itself rather than narrating a long plain-text test body first
+
+This rule exists because "planning to use buttons at the end" has repeatedly resulted in plain text leaking out first.
+
 ### Forbidden behavior
 
 These are violations when used as the closing interaction on Telegram:
diff --git a/skills/long-task-governor/SKILL.md b/skills/long-task-governor/SKILL.md
index c9571d9..8894349 100644
--- a/skills/long-task-governor/SKILL.md
+++ b/skills/long-task-governor/SKILL.md
@@ -178,6 +178,7 @@ When this skill applies:
 5. before claiming progress, check for real evidence
 6. if no evidence and no concrete action, stop the clock
 7. if the run is clearly heading toward a user pass/fail or accept/reject judgement on Telegram, prepare a button-path before the final handoff
+8. if the test exists solely to validate Telegram decision closure, run it as a button-driven flow rather than a normal long plain-text report
 
 ---
 
@@ -198,6 +199,7 @@ When operating under long-task governance on Telegram:
 - if a choice is genuinely required, use Telegram inline buttons
 - if buttons are required, send them first via the `message` tool rather than first producing a normal text reply
 - if the workflow can already predict the final handoff is a user judgement, move to a button-path before the final closing paragraph
+- if the test's whole point is to validate button closure, prefer a button-driven flow from the outset
 - if no real choice is needed, execute the most reasonable next step directly
 - if the assistant accidentally emits a plain-text choice menu, or says buttons will be used without actually sending them first, treat that as a workflow violation and convert the lesson into a permanent rule