feat: sync auto-next obligation gate hardening
114
README.md
@@ -2,78 +2,102 @@

## Chinese Description

This repo is a focused workflow export extracted from a larger OpenClaw workspace. Its topics are:

This repo currently focuses on two results directly related to continuity:

- **approved plan continuity hard-gate**
- **dispatch receipt binding**
- **anti-blackhole / completion-delivery watchdog groundwork**
- **auto-next obligation gate**

The goal is to stop two classes of problems from recurring:

The goal is to prevent the following two failures:

1. **continuity failure / auto-next break**
2. **subagent anti-blackhole / fake timeout**
- the task is complete
- the next step is known
- but the next task was never actually dispatched
- yet the flow is still treated as a normal closeout

2. **task-boundary stop / verbal-only continuation**
- within the same approved plan, auto-next should have happened
- but the main agent stopped at the task boundary
- and substituted a checkpoint / verbal report / session metadata for a real dispatch

## Completed So Far

### A. Continuity hard-gate

- continuity evaluator
- dispatch receipt binding groundwork
- `derivedAction` continuity binding
- minimal field validation in the receipt validator
- `derivedAction` / `nextDerivedAction` included in the continuity decision
- `dry_run_dispatch` must not pass as a real receipt
- minimal tightening of fake receipt authority
- hook integration wired in
- fake receipts must not be let through
- hook integration wired into `hooks/force-recall/handler.ts`

### B. Anti-blackhole watchdog recovery

- watchdog status recompute
- minimal recovery-decision loop:
  - `fetch_history`
  - `respawn`
  - `blocked`
- owner-visible reporting payload
- scenario matrix tests

### B. Auto-next obligation gate

- new failure reason: `missing_auto_next_dispatch`
- within the same approved plan, if:
  - the current task is complete
  - the next task is known
  - `sameApprovedPlan=true`
  - `taskBoundaryStop=true`
  - the closure is not `waiting_user` / `blocked` / `pending_verification`
  - `highRiskStop` is not set
  - and there is no real next dispatch receipt
  - ⇒ fail immediately; do not stop at the boundary waiting for the owner to say "continue" again
- receipt linkage hardening: the receipt must now match the required next-task handoff, rather than passing merely because it exists
- new minimal linkage field: `nextTaskId`
- checkpoint / session metadata / stale receipt / dry-run planner intent must not pass as auto-next dispatch proof

## Validation Status

- `node scripts/test_approved_plan_continuity_gate.mjs` → `17 passed / 0 failed`
- `node scripts/test_force_recall_long_task_preflight.mjs` → PASS
- `node --check hooks/force-recall/handler.ts` → PASS
- `node --check scripts/approved_plan_continuity_gate.mjs` → PASS
- `node --check scripts/approved_plan_dispatch_binding.mjs` → PASS

## Current Limitations

- continuity is still mostly a prompt-level hard-gate integration
- watchdog recovery is currently accepted as a decision / reporting / test slice, not as live integration
- enforcement is still mostly confined to the continuity / force-recall path, not all entry points.
- the upstream evidence for `sameApprovedPlan` can still be hardened.
- the continuity plugin MVP is still being productized and has not yet been packaged as a plugin that other OpenClaw installations can install directly.

## Suggested Next Steps

1. continuity runtime enforcement hardening
2. watchdog live recovery integration
3. escalation / receipt contract hardening

## Next Steps

1. final continuity review
2. return to the continuity plugin MVP
3. extract the current continuity core into a plugin MVP that is installable, configurable, testable, and usable by following the bilingual README

---

## English Description

This repository is a focused export from a larger OpenClaw workspace covering:

This repository currently focuses on two continuity-related hardening slices:

- **approved plan continuity hard-gate**
- **anti-blackhole / completion-delivery watchdog recovery**
- **auto-next obligation gate**

It prevents two core failure classes:

1. **continuity failure / auto-next break**
2. **task-boundary stop disguised as progress**

## Current State

### A. Continuity hard-gate

- continuity evaluator
- dispatch receipt binding groundwork
- `derivedAction` continuity binding
- `dry_run_dispatch` no longer accepted as a real receipt
- fake receipt authority tightened
- hook integration present
- minimum receipt validation
- `derivedAction` / `nextDerivedAction` continuity handling
- `dry_run_dispatch` rejected as real receipt
- fake receipt rejected
- hook integration in `hooks/force-recall/handler.ts`

### B. Anti-blackhole watchdog recovery

- watchdog status recompute
- minimal recovery-decision loop:
  - `fetch_history`
  - `respawn`
  - `blocked`
- owner-visible reporting payload
- scenario matrix tests

### B. Auto-next obligation gate

- explicit failure reason: `missing_auto_next_dispatch`
- task-boundary stop is now treated as continuity failure when same-plan auto-next is obligatory
- receipt linkage hardening via `nextTaskId`
- checkpoint / session metadata / stale receipt / dry-run intent can no longer stand in for real auto-next dispatch proof

## Validation

- continuity gate tests passing
- force-recall preflight passing
- syntax checks passing

## Current Limitations

- continuity remains prompt-level rather than engine-level
- watchdog recovery is validated as a decision/reporting/test slice, not live execution integration

## Suggested Next Steps

1. continuity runtime enforcement hardening
2. watchdog live recovery integration
3. escalation / receipt contract hardening

- scoped mainly to the continuity / force-recall path
- upstream `sameApprovedPlan` evidence can still be hardened further
- plugin packaging is still pending

195
docs/plans/2026-04-24-auto-next-obligation-gate.md
Normal file
@@ -0,0 +1,195 @@

# Auto-Next Obligation Gate Implementation Plan

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

**Goal:** Enforce that an approved plan may not stop at a task boundary when the current task is complete and the next task is already known, unless the closure is explicitly `waiting_user`, `blocked`, `pending_verification`, or a separately declared high-risk stop point. In every other case the flow must auto-dispatch the next task, and any stop is a continuity failure.

**Architecture:** Extend the current approved-plan continuity gate from a passive “missing dispatch receipt” detector into an obligation gate that evaluates task-boundary stops as first-class failures. Keep the design minimal: preserve the current receipt-based truth model, add explicit stop-intent / high-risk-stop metadata, fail first in tests, then wire the hook path so dry-run planner intent is no longer enough when the next task is deterministically known. Design the slices so the same evaluator can later be extracted into the continuity plugin MVP without changing the behavior contract.

**Tech Stack:** Node.js ESM scripts, TypeScript hook integration, JSON input/output envelopes, file-backed dispatch receipts, script-level tests, continuity plugin MVP compatibility layer

---

## Context Baseline

The current repo already has a partial continuity hard gate:

- `scripts/approved_plan_continuity_gate.mjs`
  - fails only when `taskState=complete` + next action known + no valid `dispatchReceipt` + closure not in legal terminal states
- `scripts/approved_plan_dispatch_binding.mjs`
  - writes receipt files once a dispatch is actually bound
- `hooks/force-recall/handler.ts`
  - builds continuity input from wrapper/planner state and injects the continuity block
- `scripts/test_approved_plan_continuity_gate.mjs`
  - already covers missing receipt, fake receipt, valid receipt, and legal terminal states
- `docs/plans/2026-04-24-continuity-plugin-mvp.md`
  - already assumes this continuity behavior will later be extracted into a plugin

The remaining gap is narrower and more specific:

1. The current gate says “don’t close if a known next action exists but no real receipt exists.”
2. But it does not yet model the stronger obligation: **if the next task in the same approved plan is already known and not blocked by an allowed stop condition, the system must auto-next dispatch instead of pausing at the boundary.**
3. This means the failure is not only “missing receipt,” but also **“stopped at a task boundary when auto-next was obligatory.”**
4. We need a minimal extension that preserves existing receipt truth, avoids speculative dispatch, and remains compatible with continuity-plugin extraction.

## Target Behavior Contract

When all of the following are true:

- current workflow is inside the same approved plan
- current task is complete
- the next task is known / derivable as a concrete next task
- closure state is not `waiting_user`, `blocked`, or `pending_verification`
- no explicit high-risk stop point is active

Then:

- the system must not stop at the task boundary
- the execution layer must auto-dispatch the next task
- a real dispatch receipt must exist for the next task handoff
- otherwise the reply/hook path must produce a continuity failure

When any of the following are true, auto-next is not obligatory:

- closure state is `waiting_user`
- closure state is `blocked`
- closure state is `pending_verification`
- an explicit high-risk stop point is active
- no next task is known
- current task is not complete
- plan scope is absent or ambiguous
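
The contract above can be read as a single boolean predicate. The following is a minimal sketch under stated assumptions, not the gate's actual source: the helper name `isAutoNextObligatory` and the constant `LEGAL_CLOSURES` are hypothetical, and the input is assumed to be a flat object using the field names that appear elsewhere in this plan.

```javascript
// Hypothetical helper: illustrates the behavior contract, not the committed gate code.
const LEGAL_CLOSURES = new Set(['waiting_user', 'blocked', 'pending_verification']);

function isAutoNextObligatory(input) {
  return input.sameApprovedPlan === true             // plan scope must be explicit, not ambiguous
    && input.taskState === 'complete'                // current task is complete
    && input.nextTaskKnown === true                  // a concrete next task is known
    && !LEGAL_CLOSURES.has(input.replyClosureState)  // closure is not a legal stop state
    && input.highRiskStop !== true;                  // no explicit high-risk stop point
}

const base = {
  sameApprovedPlan: true,
  taskState: 'complete',
  nextTaskKnown: true,
  replyClosureState: 'completed',
  highRiskStop: false,
};

console.log(isAutoNextObligatory(base));                                      // true
console.log(isAutoNextObligatory({ ...base, replyClosureState: 'blocked' })); // false
console.log(isAutoNextObligatory({ ...base, nextTaskKnown: false }));         // false
```

Each exemption in the "not obligatory" list maps to exactly one conjunct becoming false, which keeps the rule easy to test one condition at a time.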

## Required New Concepts

To keep the design minimal, introduce only the following new concepts:

- `nextTaskKnown`: boolean or derivable fact that the next task in the same approved plan is known
- `sameApprovedPlan`: boolean proving the next task belongs to the same approved plan, not merely a generic next action
- `taskBoundaryStop`: boolean indicating the system is trying to end the current reply at a completed-task boundary instead of dispatching onward
- `highRiskStop`: boolean indicating an allowed explicit stop point outside the normal legal closure states
- `autoNextObligatory`: derived evaluator result when auto-next must happen now
- `reason=missing_auto_next_dispatch` (or equivalent canonical reason) for the new failure mode

Do not widen this into a generalized workflow engine or arbitrary planner ontology in this slice.
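
One way to keep these concepts machine-checkable is to coerce each new input flag to a strict boolean before evaluation. The sketch below is an illustration only: `normalizeContinuityFlags` is a hypothetical name, and treating anything other than a literal `true` as `false` is an assumed conservative default, not a documented requirement.

```javascript
// Hypothetical normalizer: coerce the four new input flags to strict booleans.
// Anything other than a literal `true` (missing, null, strings) becomes false.
function normalizeContinuityFlags(payload) {
  const flag = (name) => payload?.[name] === true;
  return {
    nextTaskKnown: flag('nextTaskKnown'),
    sameApprovedPlan: flag('sameApprovedPlan'),
    taskBoundaryStop: flag('taskBoundaryStop'),
    highRiskStop: flag('highRiskStop'),
  };
}

const flags = normalizeContinuityFlags({
  nextTaskKnown: true,
  sameApprovedPlan: 'yes', // not a literal boolean, so it is treated as false
  taskBoundaryStop: true,
});

console.log(JSON.stringify(flags));
// {"nextTaskKnown":true,"sameApprovedPlan":false,"taskBoundaryStop":true,"highRiskStop":false}
```

With this default, a missing `highRiskStop` never grants an exemption and a missing `sameApprovedPlan` never creates an obligation, so absent metadata cannot silently change the verdict in either direction.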

## Current Gap

- current continuity gate checks a known next action, but it does not specifically require that the next task is the next task in the same approved plan
- current hook can surface planner-derived action from dry-run planning, but planner intent is not a real dispatch and does not prove continuity actually happened
- current dispatch binding writes receipts once dispatch is actually bound, but the gate does not yet express "must auto-dispatch now" as its own obligation at the task boundary
- current legal terminal states are hard-coded and do not include explicit `highRiskStop` metadata

## Non-goals

- generalized multi-plan scheduling
- speculative dispatch when the next task is ambiguous
- removing current receipt validation
- implementing continuity-plugin extraction in this slice

## Canonical Task-Boundary Stop Scenario

This is the scenario the implementation must lock down:

1. Approved plan has ordered tasks, e.g. Task 8 -> Task 9.
2. Task 8 just completed.
3. Task 9 is already known from the same approved plan.
4. The agent emits a normal closeout / handoff / “next I can continue with Task 9” style response.
5. No real auto-dispatch receipt exists for Task 9.
6. Closure is not `waiting_user`, `blocked`, or `pending_verification`.
7. No high-risk stop point is active.

Expected outcome:

- continuity gate fails
- hook output explicitly forbids stopping at this task boundary
- system must route to auto-next dispatch path or continuity failure path
- dry-run planner intent alone does not satisfy the obligation

---

## Verification Record

### Commands run

```bash
node --check hooks/force-recall/handler.ts
node --check scripts/approved_plan_continuity_gate.mjs
node --check scripts/approved_plan_dispatch_binding.mjs
node scripts/test_approved_plan_continuity_gate.mjs
node scripts/test_force_recall_long_task_preflight.mjs
```

### Result summary

- `node --check hooks/force-recall/handler.ts` ✅
- `node --check scripts/approved_plan_continuity_gate.mjs` ✅
- `node --check scripts/approved_plan_dispatch_binding.mjs` ✅
- `node scripts/test_approved_plan_continuity_gate.mjs` ✅ `17/17 passed`
- `node scripts/test_force_recall_long_task_preflight.mjs` ✅

### What was hardened in this slice

- continuity evaluator now rejects receipts that do not match the required `planId`, `currentTask`, and expected next dispatch action
- minimal receipt linkage field `nextTaskId` was added so the evaluator can distinguish the required next-task dispatch from a stale or unrelated receipt
- continuity tests now fail when the receipt links to the wrong next task
- continuity tests now fail when a receipt only contains checkpoint/session-style metadata instead of real dispatch linkage
- hook preflight verification still confirms that dry-run planner intent alone does not satisfy continuity, and that the failure reason remains `missing_auto_next_dispatch`

### Deliberately deferred

- stronger upstream source-of-truth for `sameApprovedPlan`
- broader non-`force-recall` entry-point enforcement
- continuity plugin extraction work

---

## Minimal Enforcement Design Summary

The enforcement should stay intentionally small:

1. **Keep receipt truth model**
   - a real dispatch receipt remains the pass proof
   - planner intent alone is not proof

2. **Add one stronger evaluator branch**
   - when the next task in the same approved plan is known and the current reply is stopping at a completed-task boundary, auto-next becomes obligatory
   - missing receipt in this branch is a dedicated continuity failure

3. **Allow only narrow exemptions**
   - `waiting_user`
   - `blocked`
   - `pending_verification`
   - `highRiskStop=true`

4. **Keep hook integration thin**
   - hook computes structured booleans
   - evaluator makes the decision
   - hook renders the reason-specific block

5. **Preserve plugin extraction path**
   - no hook-only business logic
   - no receipt-store / evaluator coupling
   - no prompt-only policy with no machine-checkable input
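
The layering in items 4 and 5 can be sketched as three small functions. This is an illustration under stated assumptions rather than the committed hook or evaluator: `computeFlags`, `evaluate`, and `renderBlock` are hypothetical names, the rendered strings are placeholders, and the receipt check is reduced to a null test where the real gate validates receipt fields.

```javascript
const LEGAL_CLOSURES = new Set(['waiting_user', 'blocked', 'pending_verification']);

// Hook side (hypothetical): derive structured facts only, no policy decisions.
function computeFlags(wrapperResult) {
  return {
    taskComplete: wrapperResult?.taskState === 'complete',
    nextTaskKnown: wrapperResult?.nextTaskKnown === true,
    sameApprovedPlan: wrapperResult?.sameApprovedPlan === true,
    taskBoundaryStop: wrapperResult?.taskBoundaryStop === true,
    highRiskStop: wrapperResult?.highRiskStop === true,
    closure: wrapperResult?.replyClosureState ?? null,
    // Simplification: the real gate validates receipt contents, not just presence.
    hasReceipt: wrapperResult?.dispatchReceipt != null,
  };
}

// Evaluator side (hypothetical): the only place where pass/fail is decided.
function evaluate(f) {
  const obligatory = f.taskComplete && f.nextTaskKnown && f.sameApprovedPlan
    && f.taskBoundaryStop && !LEGAL_CLOSURES.has(f.closure) && !f.highRiskStop;
  return obligatory && !f.hasReceipt
    ? { ok: false, reason: 'missing_auto_next_dispatch' }
    : { ok: true, reason: null };
}

// Hook side again (hypothetical): render a reason-specific message from the result.
function renderBlock(result) {
  return result.ok ? 'continuity: ok' : `continuity: FAILED (${result.reason})`;
}

const out = renderBlock(evaluate(computeFlags({
  taskState: 'complete',
  nextTaskKnown: true,
  sameApprovedPlan: true,
  taskBoundaryStop: true,
  replyClosureState: 'completed',
  dispatchReceipt: null,
})));

console.log(out); // continuity: FAILED (missing_auto_next_dispatch)
```

Because `evaluate` takes only plain booleans and a closure string, it has no dependency on the hook or on the receipt store, which is the property the plugin extraction path relies on.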

## Acceptance Criteria

- [x] A completed task in the same approved plan cannot stop at a boundary when the next task is known unless an allowed exemption applies.
- [x] The continuity evaluator emits a dedicated failure for missing required auto-next dispatch.
- [x] A real dispatch receipt is still required; dry-run planner output alone cannot pass.
- [x] Legal closure states `waiting_user`, `blocked`, and `pending_verification` still pass unchanged.
- [x] Explicit `highRiskStop` bypass is supported and test-covered.
- [x] Hook output clearly explains the auto-next obligation failure.
- [x] Script-level continuity tests pass.
- [x] Hook smoke tests pass.
- [ ] The plan documents how this behavior migrates cleanly into the continuity plugin MVP.

## Risks / Open Questions

1. The current hook may not yet expose a strong enough source of truth for `sameApprovedPlan`; if so, one narrow upstream metadata field may be needed.
2. `highRiskStop` may not currently exist in structured input, so the first implementation may need a conservative default of `false` until an upstream gate can set it explicitly.
3. Receipt schema may still need one future compatibility pass if downstream writers have not yet been upgraded to emit `nextTaskId` everywhere continuity depends on same-plan auto-next proof.
4. This slice deliberately does not solve non-hook entry points or general workflow orchestration.

## Status

pending verification / reviewer checked

@@ -35,6 +35,10 @@
- Use this field to state whether the reply closed under a dispatch-linked continuation path or some separately defined terminal closure state.
- This field is defined here as a receipt field only; legal closure states and gate enforcement are defined in later tasks.

### `nextTaskId`

- The identifier of the required next task when continuity depends on a same-plan auto-next transition.
- Use this field only to prove that the receipt links to the exact next task that had to be dispatched.
- This field is the minimal hardening field for next-task linkage; it prevents unrelated dispatches, checkpoints, or stale receipts from spoofing a continuity pass.
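
To illustrate what next-task linkage buys, the sketch below compares a receipt's `nextTaskId` against the task that had to be dispatched. It is a simplified illustration: `receiptLinksToNextTask` is a hypothetical helper, the receipt objects are made-up examples, and the check is deliberately stricter and narrower than a full receipt validator.

```javascript
// Hypothetical linkage check: true only when the receipt names the exact required next task.
function receiptLinksToNextTask(receipt, expectedNextTaskId) {
  if (receipt == null || typeof receipt !== 'object') return false;
  return typeof receipt.nextTaskId === 'string'
    && receipt.nextTaskId.length > 0
    && receipt.nextTaskId === expectedNextTaskId;
}

// Illustrative receipts (values are invented for this example).
const fresh = { planId: 'plan-1', currentTask: 'task-8', nextTaskId: 'task-9' };
const stale = { planId: 'plan-1', currentTask: 'task-7', nextTaskId: 'task-8' };
const checkpointOnly = { planId: 'plan-1', sessionKey: 'abc' }; // no dispatch linkage at all

console.log(receiptLinksToNextTask(fresh, 'task-9'));          // true
console.log(receiptLinksToNextTask(stale, 'task-9'));          // false
console.log(receiptLinksToNextTask(checkpointOnly, 'task-9')); // false
```

The stale receipt is well-formed and belongs to the right plan, which is why presence alone is not enough: only the `nextTaskId` comparison separates it from the receipt that proves the required handoff.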

## Legal terminal states

88
docs/runbooks/auto-next-obligation-gate.md
Normal file
@@ -0,0 +1,88 @@

# Auto-Next Obligation Gate

## Purpose

This runbook defines the approved-plan continuity rule that a workflow may not stop at a completed-task boundary when the next task in the same approved plan is already known and continuation is still allowed.

## When auto-next is obligatory

Auto-next is obligatory when all of the following are true:

- the current workflow is inside the same approved plan
- the current task is complete
- the next task is known
- the system is attempting a task-boundary stop instead of continuing execution
- reply closure state is not `waiting_user`
- reply closure state is not `blocked`
- reply closure state is not `pending_verification`
- `highRiskStop` is not active

In this state, the system must auto-dispatch the next task and record a real dispatch receipt. A dry-run planner result or stated intent to continue is not enough.

## Legal non-auto-next closures

The following are legal non-auto-next closures even when a next task exists:

- `waiting_user`
- `blocked`
- `pending_verification`

These states are the only normal closure states that can stop without auto-next dispatch.

## Allowed non-closure exception

The following explicit exception may bypass auto-next obligation without using the normal legal terminal closure states:

- `highRiskStop`

`highRiskStop` means the workflow is intentionally stopping at an explicit high-risk stop point and therefore does not have to auto-dispatch the next task yet.

## Forbidden behavior

The following combination is forbidden:

- completed task
- next task known
- same approved plan
- normal closeout or handoff language
- no real dispatch receipt for the next task
- no legal closure state
- no `highRiskStop`

A completed task in the same approved plan must not end with “I can continue with the next task” style closeout unless the next task has actually been dispatched.

Checkpoint artifacts, session keys, and verbal or plain-text status updates are not substitutes for a real auto-next dispatch. A checkpoint may preserve state, but it does not prove that the required next task was actually dispatched.

## Canonical failure condition

If all of the following are true:

- task is complete
- next task is known
- next task belongs to the same approved plan
- the system is stopping at a task boundary
- no valid dispatch receipt exists
- closure is not `waiting_user`, `blocked`, or `pending_verification`
- `highRiskStop` is false

Then the continuity gate must fail and treat the stop as an auto-next obligation violation.

## Canonical failure table

| Task complete | Next task known | Same approved plan | Boundary stop | Receipt | Closure / exception | Expected |
| --- | --- | --- | --- | --- | --- | --- |
| yes | yes | yes | yes | no | completed closure | FAIL |
| yes | yes | yes | yes | valid receipt | completed closure | PASS |
| yes | yes | yes | yes | no | `waiting_user` | PASS |
| yes | yes | yes | yes | no | `blocked` | PASS |
| yes | yes | yes | yes | no | `pending_verification` | PASS |
| yes | yes | yes | yes | no | `highRiskStop` | PASS |
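
The table above can be encoded as a small table-driven check. The sketch is illustrative only: `expectedVerdict` and the row field names are hypothetical, and the `highRiskStop` row is modeled here as a `completed` closure with `highRiskStop: true`, which is an assumption about how that exception is represented in the input.

```javascript
const LEGAL = new Set(['waiting_user', 'blocked', 'pending_verification']);

// Hypothetical evaluator covering only the columns of the canonical failure table.
function expectedVerdict(row) {
  const obligatory = row.taskComplete && row.nextTaskKnown && row.sameApprovedPlan
    && row.boundaryStop && !LEGAL.has(row.closure) && !row.highRiskStop;
  return obligatory && !row.receiptValid ? 'FAIL' : 'PASS';
}

// The first four columns are "yes" in every row of the table.
const common = { taskComplete: true, nextTaskKnown: true, sameApprovedPlan: true, boundaryStop: true };

const rows = [
  { ...common, receiptValid: false, closure: 'completed', highRiskStop: false, expected: 'FAIL' },
  { ...common, receiptValid: true, closure: 'completed', highRiskStop: false, expected: 'PASS' },
  { ...common, receiptValid: false, closure: 'waiting_user', highRiskStop: false, expected: 'PASS' },
  { ...common, receiptValid: false, closure: 'blocked', highRiskStop: false, expected: 'PASS' },
  { ...common, receiptValid: false, closure: 'pending_verification', highRiskStop: false, expected: 'PASS' },
  { ...common, receiptValid: false, closure: 'completed', highRiskStop: true, expected: 'PASS' },
];

for (const row of rows) {
  const actual = expectedVerdict(row);
  const mark = actual === row.expected ? 'ok' : 'MISMATCH';
  console.log(`${row.closure} receipt=${row.receiptValid} highRiskStop=${row.highRiskStop} -> ${actual} (${mark})`);
}
```

Keeping the rows as data makes it cheap to add a case whenever a new exemption or closure state is introduced.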

## Notes for implementation

- The obligation applies only when the next task is known within the same approved plan.
- A generic next action is not enough unless it proves the same approved plan task transition.
- A real dispatch receipt remains the source of truth for whether auto-next actually happened.
- Receipt linkage should include the required next-task identity when the evaluator needs to distinguish a real next-task dispatch from a stale or unrelated dispatch.
- Checkpoint/session metadata alone must not satisfy the receipt proof.
- This rule is intentionally minimal so it can later move into the continuity plugin without changing the behavior contract.

@@ -356,6 +356,11 @@ function buildApprovedPlanContinuityInput(wrapperResult: any, autoChainPlanResul
    : (wrapperResult?.handoff?.mode === "button_path" ? "waiting_user" : "completed");

  const dispatchReceipt = wrapperResult?.dispatchReceipt ?? null;
  const nextTaskKnown = wrapperResult?.nextTaskKnown === true
    || (plannerDerivedAction != null && typeof autoChainPlanResult?.derivedAction === 'string' && autoChainPlanResult.derivedAction !== 'none');
  const sameApprovedPlan = wrapperResult?.sameApprovedPlan === true || plannerDerivedAction != null;
  const taskBoundaryStop = wrapperResult?.taskBoundaryStop === true || replyClosureState === 'completed';
  const highRiskStop = wrapperResult?.highRiskStop === true;

  return {
    planId: wrapperResult?.planId ?? "hook-preflight-approved-plan",
@@ -364,6 +369,10 @@ function buildApprovedPlanContinuityInput(wrapperResult: any, autoChainPlanResul
    nextDerivedAction,
    replyClosureState,
    dispatchReceipt,
    nextTaskKnown,
    sameApprovedPlan,
    taskBoundaryStop,
    highRiskStop,
  };
}

@@ -387,8 +396,13 @@ function buildApprovedPlanContinuityBlock(result: ApprovedPlanContinuityResult |

  if (result.ok === false) {
    lines.push("- HARD_GATE: Do not close out this reply as normal completion.");
    if (result.reason === 'missing_auto_next_dispatch') {
      lines.push("- HARD_GATE: Do not stop at this completed-task boundary.");
      lines.push("- HARD_GATE: Auto-dispatch the next task in the same approved plan, unless waiting_user, blocked, pending_verification, or high-risk stop applies.");
    } else {
      lines.push("- HARD_GATE: Route back to continuity failure until a real next dispatch receipt exists, unless closure state is waiting_user, blocked, or pending_verification.");
    }
  }

  lines.push("[/APPROVED_PLAN_CONTINUITY_GATE]", "");
  return lines.join("\n");

@@ -11,6 +11,10 @@ function isObject(value) {
  return value != null && typeof value === 'object' && !Array.isArray(value);
}

function normalizeAction(action) {
  return JSON.stringify(action ?? null);
}

function hasValidDispatchReceipt(receipt) {
  if (!isObject(receipt)) return false;
  if (!isNonEmptyString(receipt.planId)) return false;
@@ -20,6 +24,27 @@ function hasValidDispatchReceipt(receipt) {
  return true;
}

function receiptMatchesPayload(payload, receipt) {
  if (!hasValidDispatchReceipt(receipt)) return false;

  const expectedPlanId = payload?.planId;
  if (isNonEmptyString(expectedPlanId) && receipt.planId !== expectedPlanId) return false;

  const expectedCurrentTask = payload?.currentTask;
  if (isNonEmptyString(expectedCurrentTask) && receipt.currentTask !== expectedCurrentTask) return false;

  const expectedNextTask = payload?.nextTaskId ?? payload?.nextTaskKey ?? null;
  const receiptNextTask = receipt?.nextTaskId ?? receipt?.nextTaskKey ?? null;
  if (isNonEmptyString(expectedNextTask) && receiptNextTask !== expectedNextTask) return false;

  const expectedNextAction = payload?.nextDerivedAction ?? payload?.derivedAction ?? null;
  if (expectedNextAction != null && normalizeAction(receipt.nextDerivedAction) !== normalizeAction(expectedNextAction)) {
    return false;
  }

  return true;
}

function parseArgs(argv) {
  let inputPath = null;
  let compact = false;
@@ -76,11 +101,39 @@ function evaluateContinuity(payload) {
  const taskComplete = payload?.taskState === 'complete';
  const nextAction = payload?.nextDerivedAction ?? payload?.derivedAction ?? null;
  const nextActionKnown = nextAction != null;
  const hasDispatchReceipt = hasValidDispatchReceipt(payload?.dispatchReceipt ?? null);
  const explicitNextTaskKnown = payload?.nextTaskKnown === true;
  const sameApprovedPlan = payload?.sameApprovedPlan === true;
  const taskBoundaryStop = payload?.taskBoundaryStop === true;
  const highRiskStop = payload?.highRiskStop === true;
  const closureState = payload?.replyClosureState ?? null;
  const isLegalTerminalState = LEGAL_TERMINAL_STATES.has(closureState);
  const hasDispatchReceipt = receiptMatchesPayload(payload, payload?.dispatchReceipt ?? null);
  const autoNextObligatory = taskComplete
    && explicitNextTaskKnown
    && sameApprovedPlan
    && taskBoundaryStop
    && !isLegalTerminalState
    && !highRiskStop;

  if (taskComplete && nextActionKnown && !hasDispatchReceipt && !isLegalTerminalState) {
  if (autoNextObligatory && !hasDispatchReceipt) {
    return {
      ok: false,
      status: 'continuity_failure',
      verdict: 'continuity_failure',
      reason: 'missing_auto_next_dispatch',
    };
  }

  if (taskComplete && nextActionKnown && !hasDispatchReceipt && !isLegalTerminalState && !highRiskStop && !('sameApprovedPlan' in (payload ?? {}))) {
    return {
      ok: false,
      status: 'continuity_failure',
      verdict: 'continuity_failure',
      reason: 'missing_dispatch_receipt',
    };
  }

  if (taskComplete && nextActionKnown && !hasDispatchReceipt && !isLegalTerminalState && !highRiskStop && sameApprovedPlan && !taskBoundaryStop && !explicitNextTaskKnown) {
    return {
      ok: false,
      status: 'continuity_failure',
@@ -122,5 +175,4 @@ const response = {
  },
};

process.stdout.write(`${JSON.stringify(response)}
`);
process.stdout.write(`${JSON.stringify(response)}\n`);

@@ -81,6 +81,7 @@ function buildReceipt(payload) {
  const receipt = {
    planId: payload?.planId ?? null,
    currentTask: payload?.currentTask ?? null,
    nextTaskId: payload?.nextTaskId ?? null,
    nextDerivedAction: nextAction,
    dispatchedAt: payload?.dispatchedAt ?? null,
    dispatchRunId: payload?.dispatchRunId ?? null,
@@ -97,6 +98,7 @@ function validateReceipt(receipt) {
  for (const field of [
    'planId',
    'currentTask',
    'nextTaskId',
    'nextDerivedAction',
    'dispatchedAt',
    'dispatchRunId',

@@ -168,6 +168,288 @@ const tests = [
|
||||
}
|
||||
},
|
||||
},
|
||||
{
|
||||
name: 'auto-next obligation: fails when approved plan stops at completed-task boundary without auto-next dispatch',
|
||||
run() {
|
||||
const fixture = createFixture({
|
||||
'input.json': {
|
||||
planId: 'plan-auto-next-core',
|
||||
currentTask: 'task-8',
|
||||
taskState: 'complete',
|
||||
nextTaskKnown: true,
|
||||
sameApprovedPlan: true,
|
||||
taskBoundaryStop: true,
|
||||
nextTaskId: 'task-9',
|
||||
nextDerivedAction: {
|
||||
type: 'message_subagent',
|
||||
task: 'continue with task-9',
|
||||
},
|
||||
replyClosureState: 'completed',
|
||||
highRiskStop: false,
|
||||
dispatchReceipt: null,
|
||||
},
|
||||
});
|
||||
|
||||
try {
|
||||
const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
|
||||
if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
|
||||
if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
|
||||
if (result.json.ok !== false) throw new Error(`expected auto-next continuity failure ok=false, got ${JSON.stringify(result.json)}`);
|
||||
if (result.json.verdict !== 'continuity_failure') throw new Error(`expected verdict=continuity_failure, got ${JSON.stringify(result.json.verdict)}`);
|
||||
if (result.json.reason !== 'missing_auto_next_dispatch') throw new Error(`expected reason=missing_auto_next_dispatch, got ${JSON.stringify(result.json.reason)}`);
|
||||
} finally {
|
||||
fixture.cleanup();
|
||||
}
|
||||
},
|
||||
},
|
||||
  {
    name: 'auto-next obligation: fails when only dry-run derived action exists at completed-task boundary',
    run() {
      const fixture = createFixture({
        'input.json': {
          planId: 'plan-auto-next-dry-run-only',
          currentTask: 'task-8b',
          taskState: 'complete',
          nextTaskKnown: true,
          sameApprovedPlan: true,
          taskBoundaryStop: true,
          nextTaskId: 'task-9b',
          derivedAction: {
            type: 'message_subagent',
            task: 'continue with task-9b',
          },
          replyClosureState: 'completed',
          highRiskStop: false,
          dispatchReceipt: null,
        },
      });

      try {
        const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
        if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
        if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
        if (result.json.ok !== false) throw new Error(`expected auto-next continuity failure ok=false, got ${JSON.stringify(result.json)}`);
        if (result.json.verdict !== 'continuity_failure') throw new Error(`expected verdict=continuity_failure, got ${JSON.stringify(result.json.verdict)}`);
        if (result.json.reason !== 'missing_auto_next_dispatch') throw new Error(`expected reason=missing_auto_next_dispatch, got ${JSON.stringify(result.json.reason)}`);
      } finally {
        fixture.cleanup();
      }
    },
  },
  {
    name: 'auto-next obligation: passes when explicit high-risk stop is active',
    run() {
      const fixture = createFixture({
        'input.json': {
          planId: 'plan-auto-next-high-risk-stop',
          currentTask: 'task-8c',
          taskState: 'complete',
          nextTaskKnown: true,
          sameApprovedPlan: true,
          taskBoundaryStop: true,
          nextTaskId: 'task-9c',
          nextDerivedAction: {
            type: 'message_subagent',
            task: 'continue with task-9c',
          },
          replyClosureState: 'completed',
          highRiskStop: true,
          dispatchReceipt: null,
        },
      });

      try {
        const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
        if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
        if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
        if (result.json.ok !== true) throw new Error(`expected continuity pass ok=true when highRiskStop=true, got ${JSON.stringify(result.json)}`);
      } finally {
        fixture.cleanup();
      }
    },
  },
  {
    name: 'auto-next obligation: passes when next task is not known',
    run() {
      const fixture = createFixture({
        'input.json': {
          planId: 'plan-auto-next-unknown-next-task',
          currentTask: 'task-8d',
          taskState: 'complete',
          nextTaskKnown: false,
          sameApprovedPlan: true,
          taskBoundaryStop: true,
          replyClosureState: 'completed',
          highRiskStop: false,
          dispatchReceipt: null,
        },
      });

      try {
        const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
        if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
        if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
        if (result.json.ok !== true) throw new Error(`expected pass when nextTaskKnown=false, got ${JSON.stringify(result.json)}`);
      } finally {
        fixture.cleanup();
      }
    },
  },
  {
    name: 'auto-next obligation: passes when next action is not in the same approved plan',
    run() {
      const fixture = createFixture({
        'input.json': {
          planId: 'plan-auto-next-other-plan',
          currentTask: 'task-8e',
          taskState: 'complete',
          nextTaskKnown: true,
          sameApprovedPlan: false,
          taskBoundaryStop: true,
          nextTaskId: 'task-other',
          nextDerivedAction: {
            type: 'message_subagent',
            task: 'continue with unrelated task',
          },
          replyClosureState: 'completed',
          highRiskStop: false,
          dispatchReceipt: null,
        },
      });

      try {
        const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
        if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
        if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
        if (result.json.ok !== true) throw new Error(`expected pass when sameApprovedPlan=false, got ${JSON.stringify(result.json)}`);
      } finally {
        fixture.cleanup();
      }
    },
  },
  {
    name: 'auto-next obligation: fails when receipt exists but next-task linkage is stale or mismatched',
    run() {
      const fixture = createFixture({
        'input.json': {
          planId: 'plan-auto-next-linkage-mismatch',
          currentTask: 'task-8f',
          taskState: 'complete',
          nextTaskKnown: true,
          sameApprovedPlan: true,
          taskBoundaryStop: true,
          nextTaskId: 'task-9f',
          nextDerivedAction: {
            type: 'message_subagent',
            task: 'continue with task-9f',
          },
          replyClosureState: 'completed',
          highRiskStop: false,
          dispatchReceipt: {
            planId: 'plan-auto-next-linkage-mismatch',
            currentTask: 'task-8f',
            nextTaskId: 'task-10f',
            nextDerivedAction: {
              type: 'message_subagent',
              task: 'continue with task-10f',
            },
            dispatchedAt: '2026-04-24T16:00:00+08:00',
          },
        },
      });

      try {
        const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
        if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
        if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
        if (result.json.ok !== false) throw new Error(`expected linkage mismatch to fail, got ${JSON.stringify(result.json)}`);
        if (result.json.reason !== 'missing_auto_next_dispatch') throw new Error(`expected linkage mismatch reason=missing_auto_next_dispatch, got ${JSON.stringify(result.json.reason)}`);
      } finally {
        fixture.cleanup();
      }
    },
  },
  {
    name: 'auto-next obligation: passes when receipt links to the required next task',
    run() {
      const fixture = createFixture({
        'input.json': {
          planId: 'plan-auto-next-linkage-match',
          currentTask: 'task-8g',
          taskState: 'complete',
          nextTaskKnown: true,
          sameApprovedPlan: true,
          taskBoundaryStop: true,
          nextTaskId: 'task-9g',
          nextDerivedAction: {
            type: 'message_subagent',
            task: 'continue with task-9g',
          },
          replyClosureState: 'completed',
          highRiskStop: false,
          dispatchReceipt: {
            planId: 'plan-auto-next-linkage-match',
            currentTask: 'task-8g',
            nextTaskId: 'task-9g',
            nextDerivedAction: {
              type: 'message_subagent',
              task: 'continue with task-9g',
            },
            dispatchedAt: '2026-04-24T16:05:00+08:00',
          },
        },
      });

      try {
        const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
        if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
        if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
        if (result.json.ok !== true) throw new Error(`expected linkage-matched receipt to pass, got ${JSON.stringify(result.json)}`);
      } finally {
        fixture.cleanup();
      }
    },
  },
  {
    name: 'auto-next obligation: fails when receipt only proves checkpoint/session metadata without actual dispatch linkage',
    run() {
      const fixture = createFixture({
        'input.json': {
          planId: 'plan-auto-next-checkpoint-spoof',
          currentTask: 'task-8h',
          taskState: 'complete',
          nextTaskKnown: true,
          sameApprovedPlan: true,
          taskBoundaryStop: true,
          nextTaskId: 'task-9h',
          nextDerivedAction: {
            type: 'message_subagent',
            task: 'continue with task-9h',
          },
          replyClosureState: 'completed',
          highRiskStop: false,
          dispatchReceipt: {
            planId: 'plan-auto-next-checkpoint-spoof',
            currentTask: 'task-8h',
            nextTaskId: 'task-9h',
            checkpointPath: 'checkpoints/task-8h.json',
            sessionKey: 'task-8h',
            dispatchedAt: '2026-04-24T16:10:00+08:00',
          },
        },
      });

      try {
        const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
        if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
        if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
        if (result.json.ok !== false) throw new Error(`expected checkpoint-only receipt to fail, got ${JSON.stringify(result.json)}`);
        if (result.json.reason !== 'missing_auto_next_dispatch') throw new Error(`expected checkpoint-only reason=missing_auto_next_dispatch, got ${JSON.stringify(result.json.reason)}`);
      } finally {
        fixture.cleanup();
      }
    },
  },
  {
    name: 'continuity: fails when dispatchReceipt is a fake non-null object without minimum receipt fields',
    run() {
@@ -188,35 +470,17 @@ const tests = [
      });

      try {
        const result = runGate({
          args: ['--compact', '--input', fixture.path('input.json')],
        });

        if (result.status !== 0 && result.status !== null) {
          throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
        }

        if (!result.json || typeof result.json !== 'object') {
          throw new Error(`expected JSON output\nstdout=${result.stdout}`);
        }

        if (result.json.ok !== false) {
          throw new Error(`expected continuity failure ok=false for fake dispatch receipt, got ${JSON.stringify(result.json)}`);
        }

        if (result.json.verdict !== 'continuity_failure') {
          throw new Error(`expected verdict=continuity_failure for fake dispatch receipt, got ${JSON.stringify(result.json.verdict)}`);
        }

        if (result.json.reason !== 'missing_dispatch_receipt') {
          throw new Error(`expected reason=missing_dispatch_receipt for fake dispatch receipt, got ${JSON.stringify(result.json.reason)}`);
        }
        const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
        if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
        if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
        if (result.json.ok !== false) throw new Error(`expected continuity failure ok=false for fake dispatch receipt, got ${JSON.stringify(result.json)}`);
        if (result.json.verdict !== 'continuity_failure') throw new Error(`expected verdict=continuity_failure for fake dispatch receipt, got ${JSON.stringify(result.json.verdict)}`);
        if (result.json.reason !== 'missing_dispatch_receipt') throw new Error(`expected reason=missing_dispatch_receipt for fake dispatch receipt, got ${JSON.stringify(result.json.reason)}`);
      } finally {
        fixture.cleanup();
      }
    },
  },

  {
    name: 'continuity: passes when task is complete, next action is known, and a dispatch receipt already exists',
    run() {
@@ -243,27 +507,15 @@ const tests = [
      });

      try {
        const result = runGate({
          args: ['--compact', '--input', fixture.path('input.json')],
        });

        if (result.status !== 0 && result.status !== null) {
          throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
        }

        if (!result.json || typeof result.json !== 'object') {
          throw new Error(`expected JSON output\nstdout=${result.stdout}`);
        }

        if (result.json.ok !== true) {
          throw new Error(`expected continuity pass ok=true when dispatch receipt exists, got ${JSON.stringify(result.json)}`);
        }
        const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
        if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
        if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
        if (result.json.ok !== true) throw new Error(`expected continuity pass ok=true when dispatch receipt exists, got ${JSON.stringify(result.json)}`);
      } finally {
        fixture.cleanup();
      }
    },
  },

  {
    name: 'continuity: passes when planner returns derivedAction and a bound dispatch receipt already exists',
    run() {
@@ -290,27 +542,15 @@ const tests = [
      });

      try {
        const result = runGate({
          args: ['--compact', '--input', fixture.path('input.json')],
        });

        if (result.status !== 0 && result.status !== null) {
          throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
        }

        if (!result.json || typeof result.json !== 'object') {
          throw new Error(`expected JSON output\nstdout=${result.stdout}`);
        }

        if (result.json.ok !== true) {
          throw new Error(`expected continuity pass ok=true when derivedAction has bound dispatch receipt, got ${JSON.stringify(result.json)}`);
        }
        const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
        if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
        if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
        if (result.json.ok !== true) throw new Error(`expected continuity pass ok=true when derivedAction has bound dispatch receipt, got ${JSON.stringify(result.json)}`);
      } finally {
        fixture.cleanup();
      }
    },
  },

  {
    name: 'continuity: passes when task is complete, next action is known, no dispatch receipt exists, and closure is waiting_user',
    run() {
@@ -329,27 +569,15 @@ const tests = [
      });

      try {
        const result = runGate({
          args: ['--compact', '--input', fixture.path('input.json')],
        });

        if (result.status !== 0 && result.status !== null) {
          throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
        }

        if (!result.json || typeof result.json !== 'object') {
          throw new Error(`expected JSON output\nstdout=${result.stdout}`);
        }

        if (result.json.ok !== true) {
          throw new Error(`expected continuity pass ok=true when closure is waiting_user, got ${JSON.stringify(result.json)}`);
        }
        const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
        if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
        if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
        if (result.json.ok !== true) throw new Error(`expected continuity pass ok=true when closure is waiting_user, got ${JSON.stringify(result.json)}`);
      } finally {
        fixture.cleanup();
      }
    },
  },

  {
    name: 'continuity: passes when task is complete, next action is known, no dispatch receipt exists, and closure is pending_verification',
    run() {
@@ -368,27 +596,15 @@ const tests = [
      });

      try {
        const result = runGate({
          args: ['--compact', '--input', fixture.path('input.json')],
        });

        if (result.status !== 0 && result.status !== null) {
          throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
        }

        if (!result.json || typeof result.json !== 'object') {
          throw new Error(`expected JSON output\nstdout=${result.stdout}`);
        }

        if (result.json.ok !== true) {
          throw new Error(`expected continuity pass ok=true when closure is pending_verification, got ${JSON.stringify(result.json)}`);
        }
        const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
        if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
        if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
        if (result.json.ok !== true) throw new Error(`expected continuity pass ok=true when closure is pending_verification, got ${JSON.stringify(result.json)}`);
      } finally {
        fixture.cleanup();
      }
    },
  },

  {
    name: 'continuity: passes when task is complete, next action is known, no dispatch receipt exists, and closure is blocked',
    run() {
@@ -407,21 +623,10 @@ const tests = [
      });

      try {
        const result = runGate({
          args: ['--compact', '--input', fixture.path('input.json')],
        });

        if (result.status !== 0 && result.status !== null) {
          throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
        }

        if (!result.json || typeof result.json !== 'object') {
          throw new Error(`expected JSON output\nstdout=${result.stdout}`);
        }

        if (result.json.ok !== true) {
          throw new Error(`expected continuity pass ok=true when closure is blocked, got ${JSON.stringify(result.json)}`);
        }
        const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
        if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
        if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
        if (result.json.ok !== true) throw new Error(`expected continuity pass ok=true when closure is blocked, got ${JSON.stringify(result.json)}`);
      } finally {
        fixture.cleanup();
      }

@@ -312,8 +312,10 @@ async function main() {
  assert.match(passInjected, /\[APPROVED_PLAN_CONTINUITY_GATE\]/, 'hook pass-path should emit approved-plan continuity gate block');
  assert.match(passInjected, /status=continuity_failure/, 'hook pass-path should fail continuity when planner only returns dry-run dispatch without a real receipt');
  assert.match(passInjected, /verdict=continuity_failure/, 'hook pass-path should expose continuity failure verdict when no real dispatch receipt exists');
  assert.match(passInjected, /reason=missing_dispatch_receipt/, 'hook pass-path should require a real dispatch receipt instead of treating dry-run dispatch as one');
  assert.match(passInjected, /Route back to continuity failure until a real next dispatch receipt exists/, 'hook pass-path should hard-gate normal closeout until a real receipt exists');
  assert.match(passInjected, /reason=missing_auto_next_dispatch/, 'hook pass-path should require auto-next dispatch proof instead of treating dry-run dispatch as enough');
  assert.match(passInjected, /Do not stop at this completed-task boundary/, 'hook pass-path should explicitly forbid stopping at the completed-task boundary');
  assert.match(passInjected, /Auto-dispatch the next task in the same approved plan, unless waiting_user, blocked, pending_verification, or high-risk stop applies/, 'hook pass-path should explain the auto-next obligation exceptions');
  assert.match(passInjected, /Do not stop at this completed-task boundary/, 'hook pass-path should hard-gate the completed-task boundary');
  assert.doesNotMatch(passInjected, /\[APPROVED_PLAN_CONTINUITY_GATE\][\s\S]*status=pass/, 'hook pass-path should not let approved-plan continuity pass on dry-run dispatch alone');

  const failInjected = await withPatchedWrapper(buildWrapperScript({