Clarify externalized silent long-task policy

Eve
2026-04-22 14:46:11 +08:00
parent 83be99a6bb
commit 52f7f0a557
3 changed files with 26 additions and 17 deletions

@@ -31,9 +31,18 @@ This especially applies to:
If the endpoint is predictably a user decision, the assistant should structure the run so that the final user-facing handoff is already prepared as a button interaction.
### Silent Long-Task Checkpoint Gate
### Externalized Silent Long-Task Gate
If a long-task is started and it will **not naturally produce an immediate next user-visible message**, it must define a forced reporting checkpoint at startup.
If a long-task is started and it will **not naturally produce an immediate next user-visible message**, it is a silent long-task and must not rely only on assistant memory.
A silent long-task must be externalized at startup by defining or binding:
1. the **first forced checkpoint trigger** (time, stage, or event)
2. what to report if the task is **not yet finished** by that checkpoint
3. how to downgrade status if there is **no new evidence** (`paused` / `blocked`)
4. how final owner handoff will work if a user decision is expected
5. whether an actual external trigger should be bound (for example cron/reminder) or whether the task must remain non-silent
If no externalized checkpoint mechanism exists, the task must **not** be launched as silent. It must stay in immediate follow-up mode instead.
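
The gate above could be sketched roughly as follows. This is only an illustration of one reading of the rule, not part of the policy itself: `CheckpointPlan`, `LaunchMode`, and `decide_launch_mode` are hypothetical names, and treating a missing external trigger as "must remain non-silent" is an assumption drawn from item 5.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class LaunchMode(Enum):
    SILENT = "silent"
    IMMEDIATE_FOLLOW_UP = "immediate_follow_up"


@dataclass
class CheckpointPlan:
    """Hypothetical record of the five items a silent long-task defines at startup."""
    first_trigger: Optional[str] = None       # 1. time, stage, or event
    unfinished_report: Optional[str] = None   # 2. what to report if not yet finished
    no_evidence_status: Optional[str] = None  # 3. "paused" or "blocked"
    owner_handoff: Optional[str] = None       # 4. needed only if a user decision is expected
    external_trigger: Optional[str] = None    # 5. e.g. "cron" or "reminder"; None means not bound


def decide_launch_mode(plan: Optional[CheckpointPlan],
                       user_decision_expected: bool) -> LaunchMode:
    """Return SILENT only when every required item is externalized."""
    if plan is None:
        # No externalized checkpoint mechanism exists at all.
        return LaunchMode.IMMEDIATE_FOLLOW_UP
    if not plan.first_trigger or not plan.unfinished_report:
        return LaunchMode.IMMEDIATE_FOLLOW_UP
    if plan.no_evidence_status not in ("paused", "blocked"):
        return LaunchMode.IMMEDIATE_FOLLOW_UP
    if user_decision_expected and not plan.owner_handoff:
        return LaunchMode.IMMEDIATE_FOLLOW_UP
    if not plan.external_trigger:
        # Assumption: without a bound trigger the task stays non-silent.
        return LaunchMode.IMMEDIATE_FOLLOW_UP
    return LaunchMode.SILENT
```

Under this sketch, a plan that names a trigger, an unfinished-state report, a downgrade status, and a bound reminder may launch as silent, while any missing item falls back to immediate follow-up mode.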
This applies to any silent long-task pattern, including but not limited to:
- research
@@ -46,15 +55,9 @@ This applies to any silent long-task pattern, including but not limited to:
- full tests / regression tests
- any "I'll go do this and report back" workflow
At startup, the task must define:
1. the **first checkpoint trigger** (time, stage, or event)
2. what to report if the task is **not yet finished** by that checkpoint
3. how to downgrade status if there is **no new evidence** (`paused` / `blocked`)
4. how final owner handoff will work if a user decision is expected
### Failure rule
If a silent long-task was started without a forced reporting checkpoint, and the user later has to ask "why is there no update?", treat that as a workflow failure / checkpoint-lost condition.
If a silent long-task was started without an externalized checkpoint path, and the user later has to ask "why is there no update?", treat that as a workflow failure / checkpoint-lost condition.
### Button-driven test rule
@@ -78,7 +81,7 @@ These are violations when used as the closing interaction on Telegram:
- saying buttons will be used, but not actually sending them
- sending explanation first, and only later sending buttons after being corrected
- running a full test/report in plain text even though the result is obviously heading toward a pass/fail owner decision
- starting a silent long-task without any explicit forced reporting checkpoint
- starting a silent long-task without any explicit externalized checkpoint path
### Required interpretation
@@ -95,7 +98,7 @@ The gate applies to:
If the assistant reaches a user-decision closure and no real inline buttons were delivered first, treat it as a workflow violation even if the reply mentioned buttons in text.
If a silent long-task goes dark because no forced reporting checkpoint was defined, treat it as a workflow violation even if the task later resumes.
If a silent long-task goes dark because no externalized checkpoint path was defined, treat it as a workflow violation even if the task later resumes.
### Corrective rule