---
name: long-task-governor
description: Use when a request is not ordinary single-turn chat and needs task governance, checkpoint discipline, state management, or anti-stall rules
---

# Long-Task Governor

Use this skill whenever work is **not ordinary general chat**.

## Core Rule

If the request cannot be fully completed within a single conversational reply, because it needs follow-up work, external waiting, task state, staged execution, or owner-visible progress tracking, treat it as a **long task**.

In short:

- **ordinary general chat** → answer directly
- **everything else** → enter long-task governance

## What this skill governs

This skill provides the workflow layer for:
- decision split: `general_chat` vs `long_task`
- long-task state machine
- checkpoint structure
- no-fake-progress enforcement
- stop-clock / anti-stall handling

This skill is intentionally implemented as a **maintainable external workflow layer**, not a core OpenClaw runtime patch.

---

## 1. General chat vs long task

### `general_chat`
Only classify as ordinary chat when **all** are true:
- can be fully answered now
- no follow-up work
- no external waiting
- no repo / file / logs / system inspection needed
- no task state needed
- no checkpoint needed
- no subagent needed
- no "half-done" intermediate state

### `long_task`
If any condition above is false, treat the work as `long_task`.

Examples:
- inspect logs and report back
- compare two implementations after reading files
- modify code or config
- deploy or verify
- delegate to subagents
- anything that needs checkpointing or future follow-up
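
The decision split above can be sketched as a single all-or-nothing check. This is a minimal illustration, not the wrapper's actual logic: the flag names are hypothetical, one per bullet in the `general_chat` checklist, and a missing flag is treated the same as a false one.

```javascript
// Hypothetical flag names; each mirrors one bullet in the `general_chat` checklist.
const GENERAL_CHAT_CONDITIONS = [
  "answerableNow",
  "noFollowUpWork",
  "noExternalWaiting",
  "noInspectionNeeded", // repo / file / logs / system
  "noTaskStateNeeded",
  "noCheckpointNeeded",
  "noSubagentNeeded",
  "noHalfDoneState",
];

// Returns "general_chat" only when every condition is explicitly true.
// A false or missing flag falls through to "long_task".
function classifyRequest(flags) {
  const allTrue = GENERAL_CHAT_CONDITIONS.every((name) => flags[name] === true);
  return allTrue ? "general_chat" : "long_task";
}

// Example: a quick question, then the same request once it needs log inspection.
const quickQuestion = Object.fromEntries(
  GENERAL_CHAT_CONDITIONS.map((name) => [name, true])
);
console.log(classifyRequest(quickQuestion));
console.log(classifyRequest({ ...quickQuestion, noInspectionNeeded: false }));
```

Defaulting to `long_task` when a flag is unknown matches the rule that only clearly ordinary chat skips governance.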

---

## 2. Required state machine

A governed long task may only be in one of these states:
- `active`
- `waiting_user`
- `blocked`
- `paused`
- `pending_verification`

Do not use vague state words like:
- ongoing
- still working
- processing
- in progress

### Meaning

**active**
- there is a real, concrete next action happening now
- not just waiting for time to pass
- not just waiting for the next reminder

**waiting_user**
- the task cannot reasonably continue until the user answers or decides something

**blocked**
- a concrete blocker prevents progress
- the blocker and unblock condition must be explicit

**paused**
- the task is intentionally not advancing
- stop pretending it is still actively moving

**pending_verification**
- implementation or investigation reached a meaningful checkpoint
- the assistant cannot declare the task complete on its own authority
- waiting for verification / owner judgement
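
The state rules above could be enforced with a small validator along these lines. This is a sketch under assumptions: `current_step`, `waiting_on`, and `blocker` follow the task-record field names used in this skill, while `unblock_condition` is a hypothetical field added here only to illustrate the "explicit unblock condition" rule.

```javascript
const ALLOWED_STATES = ["active", "waiting_user", "blocked", "paused", "pending_verification"];
const VAGUE_STATE_WORDS = ["ongoing", "still working", "processing", "in progress"];

// Returns a list of problems; an empty list means the status is acceptable.
function validateStatus(record) {
  const problems = [];
  const status = String(record.status ?? "").trim().toLowerCase();

  if (VAGUE_STATE_WORDS.includes(status)) {
    problems.push(`"${status}" is a vague state word; use one of: ${ALLOWED_STATES.join(", ")}`);
  } else if (!ALLOWED_STATES.includes(status)) {
    problems.push(`unknown status "${status}"`);
  }

  // `active` needs a concrete action that is happening now.
  if (status === "active" && !record.current_step) {
    problems.push("active requires a concrete current action");
  }
  // `waiting_user` should say what the user is expected to answer or decide.
  if (status === "waiting_user" && !record.waiting_on) {
    problems.push("waiting_user requires waiting_on");
  }
  // `blocked` needs both the blocker and the (hypothetical) unblock_condition field.
  if (status === "blocked" && !(record.blocker && record.unblock_condition)) {
    problems.push("blocked requires an explicit blocker and unblock condition");
  }
  return problems;
}

console.log(validateStatus({ status: "in progress" }));
```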

---

## 3. Minimal long-task record

When entering long-task governance, create or update a record with at least:
- `task_name`
- `status`
- `current_step`
- `next_step`
- `last_milestone_at`
- `next_report_condition`
- `waiting_on`
- `blocker`
- `evidence_count`

Use the template file `task-record-template.md` in this skill directory.
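
As an illustration of the minimal record, the sketch below builds one in memory and checks that every field is present. The field names come from the list above; the value types, the `null` defaults, the ISO timestamp, and the choice to start in `active` are assumptions, and the real layout lives in `task-record-template.md`.

```javascript
const REQUIRED_FIELDS = [
  "task_name",
  "status",
  "current_step",
  "next_step",
  "last_milestone_at",
  "next_report_condition",
  "waiting_on",
  "blocker",
  "evidence_count",
];

// Builds a fresh record. Starting in `active` assumes work begins immediately.
function createTaskRecord(taskName, currentStep, nextStep, nextReportCondition) {
  return {
    task_name: taskName,
    status: "active",
    current_step: currentStep,
    next_step: nextStep,
    last_milestone_at: new Date().toISOString(), // assumed format
    next_report_condition: nextReportCondition,
    waiting_on: null,
    blocker: null,
    evidence_count: 0,
  };
}

// Lists required fields that are absent from a record.
function missingFields(record) {
  return REQUIRED_FIELDS.filter((field) => !(field in record));
}

const record = createTaskRecord(
  "compare-implementations",
  "reading both source files",
  "write comparison summary",
  "after the comparison summary is drafted"
);
console.log(missingFields(record));
```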

If using the wrapper MVP (`scripts/long_task_governor_wrapper.mjs`), consume these fields as the first machine-readable pass before applying the human-facing workflow:
- `classification`
- `silentCandidate`
- `taskRecord`
- `silentLaunchOk`
- `handoff.mode`

---

## 4. Checkpoint protocol

Every long-task checkpoint must contain exactly these five sections:
- 目前狀態 (current status)
- 本段完成 (completed in this segment)
- 下一步 (next step)
- 下次回報條件 (next report condition)
- 是否需要您介入 (whether your intervention is needed)

Checkpoint is **not** completion.
After any checkpoint, the task must clearly remain in one of:
- `active`
- `waiting_user`
- `blocked`
- `paused`
- `pending_verification`

Never send a checkpoint and then silently stall.

Use `checkpoint-template.md` in this skill directory.
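
A checkpoint renderer could enforce both rules at once: exactly five sections, and an explicit governed state. The sketch below is an assumption-laden illustration. The section labels are the ones listed above, but the English keys and the `label: text` layout are hypothetical, and `checkpoint-template.md` remains the authoritative format.

```javascript
// [hypothetical English key, section label used in the checkpoint]
const CHECKPOINT_SECTIONS = [
  ["currentStatus", "目前狀態"],
  ["completedThisSegment", "本段完成"],
  ["nextStep", "下一步"],
  ["nextReportCondition", "下次回報條件"],
  ["needsOwnerInput", "是否需要您介入"],
];
const ALLOWED_STATES = ["active", "waiting_user", "blocked", "paused", "pending_verification"];

// Renders a checkpoint, refusing anything that is not exactly five sections
// plus one of the governed states.
function renderCheckpoint(status, sections) {
  if (!ALLOWED_STATES.includes(status)) {
    throw new Error(`checkpoint must leave the task in one of: ${ALLOWED_STATES.join(", ")}`);
  }
  const expectedKeys = CHECKPOINT_SECTIONS.map(([key]) => key);
  const missing = expectedKeys.filter((key) => !sections[key]);
  const extra = Object.keys(sections).filter((key) => !expectedKeys.includes(key));
  if (missing.length > 0 || extra.length > 0) {
    throw new Error(`checkpoint needs exactly five sections (missing: ${missing}; extra: ${extra})`);
  }
  return CHECKPOINT_SECTIONS.map(([key, label]) => {
    // The status section always carries the explicit state.
    const text = key === "currentStatus" ? `[${status}] ${sections[key]}` : sections[key];
    return `${label}: ${text}`;
  }).join("\n");
}

const sample = {
  currentStatus: "waiting for the owner to review the diff",
  completedThisSegment: "patched the config loader and reran the fixture tests",
  nextStep: "merge after review",
  nextReportCondition: "when the review result arrives",
  needsOwnerInput: "yes, please review the diff",
};
console.log(renderCheckpoint("pending_verification", sample));
```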

---

## 5. Silent long-task governance

If a long-task will not naturally emit an immediate next user-visible message, treat it as a **silent long-task**.

A silent long-task must define at startup:
- the first forced checkpoint trigger
- what to report if not yet complete
- what status transition to use if there is no new evidence
- how final owner handoff will happen if a decision is expected
- whether an externalized checkpoint mechanism is actually bound, or whether the task must remain non-silent

Silent long-tasks must not rely on memory, intention, or implied future follow-up.
If the user later has to ask where the update went, the flow is considered failed.

If no externalized checkpoint path can be created safely, do not launch the task in silent mode.
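
The startup requirements above amount to a launch gate. The following sketch shows one way to express it; every field name on `plan` is hypothetical, and the real launch decision should follow the runbooks referenced in this section rather than this code.

```javascript
// Decides whether a task may launch silently.
// All `plan` field names are hypothetical, one per startup requirement.
function decideLaunchMode(plan) {
  const required = [
    "firstForcedCheckpointTrigger",
    "reportIfIncomplete",
    "statusIfNoNewEvidence",
  ];
  // A handoff path is only required when an owner decision is expected.
  if (plan.decisionExpected) required.push("finalHandoffPath");

  const missing = required.filter((field) => !plan[field]);
  if (missing.length > 0) {
    return { mode: "non_silent", reason: `undefined at startup: ${missing.join(", ")}` };
  }
  // Memory or intention does not count; the mechanism must actually be bound.
  if (plan.externalCheckpointBound !== true) {
    return { mode: "non_silent", reason: "no externalized checkpoint mechanism is bound" };
  }
  return { mode: "silent", reason: "all startup commitments defined and externalized" };
}

console.log(
  decideLaunchMode({
    firstForcedCheckpointTrigger: "after the first verification run",
    reportIfIncomplete: "partial findings plus remaining steps",
    statusIfNoNewEvidence: "paused",
    decisionExpected: false,
    externalCheckpointBound: false,
  })
);
```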

For reminder payload design, use:
- `docs/runbooks/silent-long-task-reminder-contract.md`

For launch decisions, use:
- `docs/runbooks/silent-long-task-decision-tree.md`
- `docs/runbooks/silent-long-task-launch-template.md`

If using the wrapper MVP:
- `silentCandidate = true` means you must evaluate silent launch rules
- `silentLaunchOk = false` means you must not proceed as silent
- `handoff.mode = button_path` means prepare Telegram button-path early
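
The three wrapper signals above can be turned into an explicit action list. This sketch assumes the wrapper output is a plain object with `silentCandidate`, `silentLaunchOk`, and a nested `handoff.mode`; the exact output shape of `scripts/long_task_governor_wrapper.mjs` is not shown here, and the action labels are hypothetical.

```javascript
// Maps wrapper output fields to required actions.
// The output shape and the action labels are assumptions for illustration.
function interpretWrapperOutput(output) {
  const actions = [];
  if (output.silentCandidate === true) {
    actions.push("evaluate_silent_launch_rules");
  }
  if (output.silentLaunchOk === false) {
    actions.push("do_not_proceed_as_silent");
  }
  if (output.handoff?.mode === "button_path") {
    actions.push("prepare_telegram_button_path_early");
  }
  return actions;
}

console.log(
  interpretWrapperOutput({
    classification: "long_task",
    silentCandidate: true,
    silentLaunchOk: false,
    handoff: { mode: "button_path" },
  })
);
```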

### Verification notes for this slice

Keep verification lightweight and reproducible:
- **wrapper fixture runner**: run `node scripts/test_long_task_governor_wrapper.mjs`
- **force-recall smoke test**: run `node scripts/test_force_recall_long_task_preflight.mjs`
- record the command output in the task checkpoint or PR notes when this skill changes

### Current non-goals for this slice

This slice documents decisioning and preflight guidance only. It does **not** yet include:
- auto-send messages
- bound cron / reminder wiring
- gateway-level blocking or hard enforcement beyond the current preflight wording

---

## 6. No-fake-progress rule

Only count as real progress if there is at least one of:
- new file change
- new verification output
- new decision or conclusion
- blocker state changed
- new external result / reply

The following are **not progress**:
- only updating timestamps
- only responding to reminders
- repeating "still working"
- status sync without new facts

If you only have status sync, say it is status sync. Do not dress it up as progress.
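
The rule above reduces to counting real evidence and labelling everything else honestly. A minimal sketch follows, assuming each update is summarized as a list of event-kind strings; the kind names are hypothetical labels for the bullets above.

```javascript
// Hypothetical event kinds, one per "real progress" bullet.
const REAL_EVIDENCE_KINDS = [
  "file_change",
  "verification_output",
  "decision_or_conclusion",
  "blocker_state_change",
  "external_result",
];

// Labels an update as "progress" only when it carries real evidence.
// Timestamp bumps, reminder replies, and repeated "still working" notes
// are not in the evidence list, so they fall through to "status_sync".
function classifyUpdate(eventKinds) {
  const evidence = eventKinds.filter((kind) => REAL_EVIDENCE_KINDS.includes(kind));
  return evidence.length > 0
    ? { kind: "progress", evidenceCount: evidence.length }
    : { kind: "status_sync", evidenceCount: 0 };
}

console.log(classifyUpdate(["timestamp_update", "reminder_response"]));
console.log(classifyUpdate(["timestamp_update", "verification_output"]));
```

The returned `evidenceCount` could feed the `evidence_count` field of the task record.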

---

## 7. Stop-clock gate

If repeated checkpoints show no new evidence, do **not** keep the task cosmetically active.
You must choose one:
- `paused`
- `blocked`
- explicit request for new direction
- stop periodic progress claims

`active` requires a concrete ongoing action.
Without that, default to `paused`.
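
The gate above can be sketched as a status downgrade. The `hasConcreteAction` and `staleCheckpoints` fields are hypothetical, and the threshold of two evidence-free checkpoints is an assumption, since the rule only says "repeated".

```javascript
// Returns the status a task should carry after the stop-clock check.
// `maxStale` (how many evidence-free checkpoints count as "repeated")
// is an assumed default.
function stopClockStatus(task, maxStale = 2) {
  // Only `active` can be cosmetic; other states are already honest.
  if (task.status !== "active") return task.status;

  const stalled = !task.hasConcreteAction || task.staleCheckpoints >= maxStale;
  if (!stalled) return "active";

  // With a known blocker the honest state is `blocked`; otherwise default to `paused`.
  return task.blocker ? "blocked" : "paused";
}

console.log(stopClockStatus({ status: "active", hasConcreteAction: false, staleCheckpoints: 0 }));
```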

---

## 8. How to use this skill in practice

When this skill applies:
1. decide if the request is ordinary chat or long task
2. if available, run the wrapper MVP first as the machine-readable classification/bootstrap layer
3. if long task, create/update a task record
4. if the task is silent, define the first forced checkpoint before proceeding
5. if the task is silent, externalize the checkpoint path or keep the task non-silent
6. if reporting progress, use the 5-part checkpoint structure
7. before claiming progress, check for real evidence
8. if no evidence and no concrete action, stop the clock
9. if the run is clearly heading toward a user pass/fail or accept/reject judgement on Telegram, prepare a button-path before the final handoff
10. if the entire test itself exists to validate Telegram decision closure, run it as a button-driven flow rather than a normal long plain-text report

---

## 9. Integration guidance

This skill should be paired with:
- the current session `WORKFLOW.md`
- optional kanban sync
- optional Lobby record
- subagent use when needed

But this skill itself is the governance layer and should remain independently maintainable.

### Telegram interaction guard

When operating under long-task governance on Telegram:
- do not end checkpoints, test progress, or next-step decisions with plain-text menus like `1 / 2 / 3` or `A / B / C`
- if a choice is genuinely required, use Telegram inline buttons
- if buttons are required, send them first via the `message` tool rather than first producing a normal text reply
- if the workflow can already predict the final handoff is a user judgement, move to a button-path before the final closing paragraph
- if the test's whole point is to validate button closure, prefer a button-driven flow from the outset
- if no real choice is needed, execute the most reasonable next step directly
- if the assistant accidentally emits a plain-text choice menu, or says buttons will be used without actually sending them first, treat that as a workflow violation and convert the lesson into a permanent rule

This prevents governed long-task flows from degrading back into ambiguous text-only decision gates.
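
A draft message could be screened for the plain-text menu pattern before it is sent. The check below is only a heuristic sketch: the regular expression and function name are hypothetical, it inspects just the last non-empty line, and it can misfire on short slash-separated text such as `w / h / d`.

```javascript
// Matches a slash-separated run of at least three single-character options,
// such as "1 / 2 / 3" or "A / B / C". Heuristic only.
const PLAIN_TEXT_MENU = /(?:^|\s)(?:[A-Za-z0-9]\s*\/\s*){2,}[A-Za-z0-9](?=$|[\s.?!)])/;

// Returns true when the last non-empty line of a draft looks like a text menu.
function endsWithPlainTextMenu(message) {
  const lines = message
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  if (lines.length === 0) return false;
  return PLAIN_TEXT_MENU.test(lines[lines.length - 1]);
}

console.log(endsWithPlainTextMenu("Checkpoint done.\nReply 1 / 2 / 3"));
console.log(endsWithPlainTextMenu("Updated src/a/b/c.js and reran tests."));
```

A draft that trips this check would be a candidate for conversion to inline buttons before anything is sent.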

---

## 10. Success criteria

This skill is working correctly when:
- non-chat work always enters a governed state
- silent long-tasks never go dark without a predeclared and externalized checkpoint path
- checkpoints no longer cause silent stalls
- no-evidence updates are not mislabeled as progress
- stalled work becomes `paused` / `blocked` instead of fake-`active`
- owner can always tell the real state of the work