# WORKFLOW_GATES.md

## Reply-Closure Button Gate

When operating on Telegram, if the **final actionable part** of a reply requires the user to choose, confirm, approve, stop, continue, rerun, accept, or select a next step, the assistant must not let plain text reach the user first.

### Hard rule

If the reply ends in any owner decision gate, the assistant must do one of two things:
1. **Send real inline buttons first via the `message` tool**
2. **Execute the most reasonable next step directly** if no real decision is needed

### Ordering rule

If buttons are required, the assistant must prefer this sequence:
1. send the actual button message
2. return `NO_REPLY`

Do **not** first send a normal text reply that says buttons will be used later.
Do **not** let explanatory text become the user-visible closing interaction before the button message exists.
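The sequence above can be sketched in Python as follows. This is only an illustration under stated assumptions: the real `message` tool interface is not described in this document, so the `send_message` callable, the `channel` and `text` field names, and both helper functions are hypothetical. The nested `inline_keyboard` layout with `text` and `callback_data` follows the Telegram Bot API's inline keyboard shape, and `NO_REPLY` is the sentinel named in the rule.

```python
from typing import Callable

NO_REPLY = "NO_REPLY"  # sentinel named in the ordering rule


def build_button_payload(prompt: str, choices: list[tuple[str, str]]) -> dict:
    """Build a message payload with one inline button per choice.

    Each choice is (label, callback_data). The outer field names are
    assumptions; only the inline_keyboard nesting mirrors Telegram.
    """
    if not choices:
        raise ValueError("a decision gate needs at least one button")
    rows = [[{"text": label, "callback_data": data}] for label, data in choices]
    return {
        "channel": "telegram",  # hypothetical field
        "text": prompt,
        "reply_markup": {"inline_keyboard": rows},
    }


def close_with_buttons(
    send_message: Callable[[dict], None],  # stand-in for the `message` tool
    prompt: str,
    choices: list[tuple[str, str]],
) -> str:
    """Send the button message first, then hand back NO_REPLY."""
    send_message(build_button_payload(prompt, choices))  # step 1: real buttons
    return NO_REPLY  # step 2: no plain-text closing message


# Usage: a list stands in for the transport so the order is observable.
sent: list[dict] = []
reply = close_with_buttons(
    sent.append,
    "Regression run finished. Verdict?",
    [("Accept", "accept"), ("Rerun", "rerun"), ("Stop", "stop")],
)
```

Because the helper returns `NO_REPLY` only after the send call, there is no code path in which a plain-text closing message is produced before the button message exists.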

### Early-routing rule

If a workflow can already predict that its final step will require user judgement or acceptance, it must switch to a **button-path** early instead of waiting until the last paragraph.

This especially applies to:
- full long-task tests
- regression tests
- acceptance / fail verdict flows
- any report whose natural ending is "please judge this result"

If the endpoint is predictably a user decision, the assistant should structure the run so that the final user-facing handoff is already prepared as a button interaction.
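One way to picture early routing is a lane decision made when the workflow starts rather than when the last paragraph is written. The sketch below is hypothetical throughout: the `Lane` enum, the `DECISION_ENDINGS` labels, and `pick_lane` are illustrative names, not part of any existing tool.

```python
from enum import Enum
from typing import Optional


class Lane(Enum):
    TEXT = "text"
    BUTTON = "button"


# Illustrative labels for endings that are owner decisions under this gate.
DECISION_ENDINGS = {
    "choose",
    "confirm",
    "approve",
    "accept_reject",
    "pass_fail",
    "rerun_stop",
}


def pick_lane(expected_ending: Optional[str]) -> Lane:
    """Choose the reply lane at workflow start from the predicted ending."""
    if expected_ending in DECISION_ENDINGS:
        return Lane.BUTTON
    return Lane.TEXT


# A regression test that will end in a pass/fail verdict is routed up front.
lane_for_regression = pick_lane("pass_fail")
lane_for_status_note = pick_lane(None)
```

Under this reading, a run whose ending is predictably a verdict never enters the text lane at all, so there is no late switch that could leak a plain-text closing.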

### Externalized Silent Long-Task Gate

If a long-task is started and it will **not naturally produce an immediate next user-visible message**, it is a silent long-task and must not rely only on assistant memory.

A silent long-task must be externalized at startup by defining or binding:
1. the **first forced checkpoint trigger** (time, stage, or event)
2. what to report if the task is **not yet finished** by that checkpoint
3. how to downgrade status if there is **no new evidence** (`paused` / `blocked`)
4. how final owner handoff will work if a user decision is expected
5. whether an actual external trigger should be bound (for example cron/reminder) or whether the task must remain non-silent

If no externalized checkpoint mechanism exists, the task must **not** be launched as silent. It must stay in immediate follow-up mode instead.

This applies to any silent long-task pattern, including but not limited to:
- research
- investigation
- debugging
- delegation / waiting on subagents
- background execution
- staged analysis
- long-running verification
- full tests / regression tests
- any "I’ll go do this and report back" workflow
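The five startup items can be captured as a small record that is checked before a task is allowed to go silent. The sketch below is one possible reading, not an existing implementation: `CheckpointPlan`, `missing_items`, `launch_mode`, and the field names are hypothetical, and it assumes a task with no bound external trigger must stay non-silent.

```python
from dataclasses import dataclass
from typing import Optional

ALLOWED_DOWNGRADES = ("paused", "blocked")  # statuses named in item 3


@dataclass
class CheckpointPlan:
    """The five items a silent long-task defines or binds at startup."""

    first_trigger: Optional[str] = None       # 1. time, stage, or event
    unfinished_report: Optional[str] = None   # 2. what to report if not done
    no_evidence_status: Optional[str] = None  # 3. "paused" or "blocked"
    owner_handoff: Optional[str] = None       # 4. e.g. "inline buttons", or "none"
    external_trigger: Optional[str] = None    # 5. e.g. "cron"; None = nothing bound


def missing_items(plan: CheckpointPlan) -> list[str]:
    """Return the names of plan items that are not yet defined."""
    required = ("first_trigger", "unfinished_report", "owner_handoff", "external_trigger")
    missing = [name for name in required if not getattr(plan, name)]
    if plan.no_evidence_status not in ALLOWED_DOWNGRADES:
        missing.append("no_evidence_status")
    return missing


def launch_mode(plan: CheckpointPlan) -> str:
    """Allow a silent launch only when every item is externalized."""
    return "silent" if not missing_items(plan) else "immediate_follow_up"


plan = CheckpointPlan(
    first_trigger="30 minutes after start",
    unfinished_report="stage reached and what is still running",
    no_evidence_status="paused",
    owner_handoff="inline buttons",
    external_trigger="cron",
)
mode = launch_mode(plan)
```

An incomplete plan falls back to `immediate_follow_up`, which mirrors the rule that a task without an externalized checkpoint mechanism must not be launched as silent.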

### Failure rule

If a silent long-task was started without an externalized checkpoint path, and the user later has to ask "why is there no update?", treat that as a workflow failure / checkpoint-lost condition.

### Button-driven test rule

If a test or validation flow is known in advance to end in a Telegram pass/fail, accept/reject, or rerun/stop decision, do not start that test in the ordinary text-reply lane.

Instead, the test must be treated as a **button-driven flow** from the beginning:
1. use normal text only for internal progress while no user decision handoff is needed
2. once the flow is designed around a final owner verdict, prepare the ending as a `message`-tool button handoff
3. for short regression tests whose whole purpose is to verify button closure behavior, prefer opening and closing through the button path itself rather than narrating a long plain-text test body first

This rule exists because repeatedly "planning to use buttons at the end" still leaks plain text first.

### Forbidden behavior

These are violations when used as the closing interaction on Telegram:
- `1 / 2 / 3`
- `A / B / C`
- `請回我 1` ("please reply 1")
- `如果你要,我可以...` ("if you want, I can...")
- `要不要我繼續?` ("do you want me to continue?")
- saying buttons will be used, but not actually sending them
- sending explanation first, and only later sending buttons after being corrected
- running a full test/report in plain text even though the result is obviously heading toward a pass/fail owner decision
- starting a silent long-task without any explicit externalized checkpoint path
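The literal phrases in the list above lend themselves to a simple pre-send check on the last line of a reply. The following is a rough sketch only: `FORBIDDEN_CLOSINGS`, `closing_line`, and `has_forbidden_closing` are hypothetical names, the regular expressions are one guess at how to match the listed strings, and the behavioral items (such as promising buttons without sending them) cannot be detected from text alone.

```python
import re

# Patterns for the literal closing phrases listed above (illustrative only).
FORBIDDEN_CLOSINGS = [
    re.compile(r"\b1\s*/\s*2(\s*/\s*3)?\b"),  # numbered menu: 1 / 2 / 3
    re.compile(r"\bA\s*/\s*B(\s*/\s*C)?\b"),  # lettered menu: A / B / C
    re.compile(r"請回我\s*\d"),                # "please reply 1"
    re.compile(r"如果你要"),                   # "if you want, I can..."
    re.compile(r"要不要我繼續"),               # "do you want me to continue?"
]


def closing_line(reply: str) -> str:
    """Return the last non-empty line of a reply, or '' if there is none."""
    lines = [line.strip() for line in reply.splitlines() if line.strip()]
    return lines[-1] if lines else ""


def has_forbidden_closing(reply: str) -> bool:
    """True if the closing line matches any forbidden text pattern."""
    last = closing_line(reply)
    return any(pattern.search(last) for pattern in FORBIDDEN_CLOSINGS)


flagged = has_forbidden_closing("Test finished.\n請回我 1")
clean = has_forbidden_closing("All checks passed and the report is saved.")
```

A check like this only catches the textual patterns; it complements, and does not replace, sending the real button message first.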

### Required interpretation

The gate applies to:
- long-task checkpoints
- test progress summaries
- approval requests
- next-step choices
- accept / rerun / stop style decisions
- pass / fail verdict requests
- silent long-task launches

### Violation standard

If the assistant reaches a user-decision closure and no real inline buttons were delivered first, treat it as a workflow violation even if the reply mentioned buttons in text.

If a silent long-task goes dark because no externalized checkpoint path was defined, treat it as a workflow violation even if the task later resumes.

### Corrective rule

If this violation happens:
- acknowledge the violation plainly
- immediately send the real button message or status recovery update
- record the lesson into workflow / memory if it exposed a missing rule