feat: harden auto-next continuity receipt linkage
@@ -107,532 +107,39 @@ Expected outcome:
 
 ---
 
-### Task 1: Capture the auto-next obligation contract in docs first
+## Verification Record
 
-**Files:**
-- Create: `docs/runbooks/auto-next-obligation-gate.md`
-- Read: `docs/runbooks/approved-plan-continuity.md`
-- Read: `docs/plans/2026-04-24-approved-plan-continuity-hard-gate.md`
-- Read: `docs/plans/2026-04-24-continuity-plugin-mvp.md`
+### Commands run
 
-**Step 1: Write the behavior contract only**
-Document, in exact terms:
-- when auto-next is obligatory
-- legal non-auto-next closures:
-  - `waiting_user`
-  - `blocked`
-  - `pending_verification`
-- allowed non-closure exception:
-  - explicit `highRiskStop`
-- forbidden behavior:
-  - completed task + known next task + same approved plan + normal closeout without dispatch
-
-**Step 2: Add a single canonical failure table**
-Include rows for:
-- complete + next task known + no receipt + completed closure => FAIL
-- complete + next task known + valid receipt => PASS
-- complete + next task known + waiting_user => PASS
-- complete + next task known + blocked => PASS
-- complete + next task known + pending_verification => PASS
-- complete + next task known + highRiskStop => PASS
-
-**Step 3: Verify the file exists and key phrases are present**
-Run:
-```bash
-grep -n "auto-next\|highRiskStop\|waiting_user\|pending_verification\|same approved plan" docs/runbooks/auto-next-obligation-gate.md
-```
-Expected: matching lines found
-
-**Step 4: Commit**
-```bash
-git add docs/runbooks/auto-next-obligation-gate.md
-git commit -m "docs: define auto-next obligation gate contract"
-```
-
-### Task 2: Record the exact continuity / hook / dispatch gap before code changes
-
-**Files:**
-- Modify: `docs/plans/2026-04-24-auto-next-obligation-gate.md`
-- Read: `scripts/approved_plan_continuity_gate.mjs`
-- Read: `scripts/approved_plan_dispatch_binding.mjs`
-- Read: `hooks/force-recall/handler.ts`
-- Read: `scripts/test_approved_plan_continuity_gate.mjs`
-
-**Step 1: Add a “Current Gap” section with exact bullets**
-Capture these facts:
-- current continuity gate checks known next action, not specifically same-plan next task
-- current hook can surface planner-derived action but dry-run planner intent is not real dispatch
-- current dispatch binding writes receipts, but the gate does not yet express “must auto-dispatch now” as its own obligation
-- current legal terminal states are hard-coded and do not include high-risk stop metadata
-
-**Step 2: Add a “Non-goals” section**
-Explicitly exclude:
-- generalized multi-plan scheduling
-- speculative dispatch when next task is ambiguous
-- removing current receipt validation
-- implementing plugin extraction in this slice
-
-**Step 3: Verify plan text now contains both sections**
-Run:
-```bash
-grep -n "Current Gap\|Non-goals" docs/plans/2026-04-24-auto-next-obligation-gate.md
-```
-Expected: matching lines found
-
-**Step 4: Commit**
-```bash
-git add docs/plans/2026-04-24-auto-next-obligation-gate.md
-git commit -m "docs: capture current auto-next continuity gap"
-```
-
-### Task 3: Add fail-first test for the core task-boundary stop scenario
-
-**Files:**
-- Modify: `scripts/test_approved_plan_continuity_gate.mjs`
-- Test: `scripts/approved_plan_continuity_gate.mjs`
-
-**Step 1: Write one failing test case for the exact forbidden stop**
-Add a case with input shaped like:
-```json
-{
-  "planId": "plan-auto-next-core",
-  "currentTask": "task-8",
-  "taskState": "complete",
-  "nextTaskKnown": true,
-  "sameApprovedPlan": true,
-  "taskBoundaryStop": true,
-  "nextDerivedAction": {
-    "type": "message_subagent",
-    "task": "continue with task-9"
-  },
-  "replyClosureState": "completed",
-  "highRiskStop": false,
-  "dispatchReceipt": null
-}
-```
-Expected assertion:
-- `ok === false`
-- `verdict === 'continuity_failure'`
-- `reason === 'missing_auto_next_dispatch'` (or the canonical reason you choose for this feature)
-
-**Step 2: Run the test suite to verify failure**
-Run:
-```bash
-node scripts/test_approved_plan_continuity_gate.mjs
-```
-Expected: FAIL because the new failure mode does not exist yet
-
-**Step 3: Commit the failing test only**
-```bash
-git add scripts/test_approved_plan_continuity_gate.mjs
-git commit -m "test: fail when approved plan stops at task boundary without auto-next dispatch"
-```
-
-### Task 4: Add fail-first test proving dry-run planner intent is still not enough
-
-**Files:**
-- Modify: `scripts/test_approved_plan_continuity_gate.mjs`
-- Read: `hooks/force-recall/handler.ts`
-
-**Step 1: Add a second failing test**
-Case:
-- `taskState='complete'`
-- `nextTaskKnown=true`
-- `sameApprovedPlan=true`
-- `taskBoundaryStop=true`
-- `derivedAction` exists
-- `dispatchReceipt=null`
-- `replyClosureState='completed'`
-- `highRiskStop=false`
-
-Expected:
-- still FAIL
-- still same continuity failure reason
-
-This locks the rule that planner-derived next intent is not itself a pass.
-
-**Step 2: Run tests to verify failure**
-Run:
-```bash
-node scripts/test_approved_plan_continuity_gate.mjs
-```
-Expected: FAIL
-
-**Step 3: Commit**
-```bash
-git add scripts/test_approved_plan_continuity_gate.mjs
-git commit -m "test: fail auto-next obligation when only dry-run derived action exists"
-```
-
-### Task 5: Add pass-path test for explicit high-risk stop
-
-**Files:**
-- Modify: `scripts/test_approved_plan_continuity_gate.mjs`
-- Test: `scripts/approved_plan_continuity_gate.mjs`
-
-**Step 1: Add a pass test**
-Case:
-- task complete
-- next task known
-- same approved plan
-- task boundary stop true
-- no dispatch receipt
-- closure state `completed`
-- `highRiskStop=true`
-
-Expected:
-- `ok === true`
-- `verdict === 'pass'`
-
-**Step 2: Run tests to verify it fails before implementation**
-Run:
-```bash
-node scripts/test_approved_plan_continuity_gate.mjs
-```
-Expected: FAIL until evaluator understands `highRiskStop`
-
-**Step 3: Commit**
-```bash
-git add scripts/test_approved_plan_continuity_gate.mjs
-git commit -m "test: allow explicit high-risk stop to bypass auto-next obligation"
-```
-
-### Task 6: Add neutral-path tests so the gate stays minimal
-
-**Files:**
-- Modify: `scripts/test_approved_plan_continuity_gate.mjs`
-
-**Step 1: Add a pass test for ambiguous next task**
-Case:
-- `taskState='complete'`
-- `nextTaskKnown=false`
-- `sameApprovedPlan=true`
-- `taskBoundaryStop=true`
-- no dispatch receipt
-- closure state `completed`
-
-Expected:
-- PASS, because auto-next is not obligatory when the next task is not known
-
-**Step 2: Add a pass test for different-plan / unknown-plan next action**
-Case:
-- next action exists
-- `sameApprovedPlan=false`
-- no receipt
-- closure state `completed`
-
-Expected:
-- PASS or falls back to old behavior only if no same-plan next-task obligation is active
-
-**Step 3: Run tests to verify current behavior does not satisfy them yet**
-Run:
-```bash
-node scripts/test_approved_plan_continuity_gate.mjs
-```
-Expected: FAIL or mixed results; note exact mismatch in implementation comments if needed
-
-**Step 4: Commit**
-```bash
-git add scripts/test_approved_plan_continuity_gate.mjs
-git commit -m "test: add neutral auto-next obligation coverage for unknown or out-of-plan next task"
-```
-
-### Task 7: Extend the continuity gate with minimal obligation logic
-
-**Files:**
-- Modify: `scripts/approved_plan_continuity_gate.mjs`
-- Modify: `scripts/test_approved_plan_continuity_gate.mjs`
-
-**Step 1: Add the smallest possible derived booleans**
-Implement helpers like:
-```js
-const taskComplete = payload?.taskState === 'complete';
-const nextAction = payload?.nextDerivedAction ?? payload?.derivedAction ?? null;
-const nextTaskKnown = payload?.nextTaskKnown === true;
-const sameApprovedPlan = payload?.sameApprovedPlan === true;
-const taskBoundaryStop = payload?.taskBoundaryStop === true;
-const highRiskStop = payload?.highRiskStop === true;
-const hasDispatchReceipt = hasValidDispatchReceipt(payload?.dispatchReceipt ?? null);
-const closureState = payload?.replyClosureState ?? null;
-const isLegalTerminalState = LEGAL_TERMINAL_STATES.has(closureState);
-const autoNextObligatory = taskComplete && nextTaskKnown && sameApprovedPlan && taskBoundaryStop && !isLegalTerminalState && !highRiskStop;
-```
-
-**Step 2: Add the new failure rule before the generic pass path**
-Minimal rule:
-- if `autoNextObligatory` and no valid dispatch receipt exists => fail with:
-  - `ok: false`
-  - `status: 'continuity_failure'`
-  - `verdict: 'continuity_failure'`
-  - `reason: 'missing_auto_next_dispatch'`
-
-Keep the existing generic receipt failure behavior for legacy cases that are not strict same-plan task-boundary obligation cases, unless tests prove it should collapse into the new reason.
-
-**Step 3: Run the continuity gate tests**
-Run:
-```bash
-node scripts/test_approved_plan_continuity_gate.mjs
-```
-Expected: PASS for all old and new cases
-
-**Step 4: Commit**
-```bash
-git add scripts/approved_plan_continuity_gate.mjs scripts/test_approved_plan_continuity_gate.mjs
-git commit -m "feat: enforce auto-next obligation at approved plan task boundaries"
-```
-
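Reviewer note: the obligation rule described in Task 7 can be sketched as a small pure function. This is only an illustration under stated assumptions, not the code in `scripts/approved_plan_continuity_gate.mjs`: the contents of `LEGAL_TERMINAL_STATES` are taken from the three legal closures named in the plan, and `looksLikeReceipt` and `evaluateAutoNext` are hypothetical stand-ins for the real receipt validator and evaluator.

```javascript
// Assumed contents, based on the legal closures listed in the plan.
const LEGAL_TERMINAL_STATES = new Set(['waiting_user', 'blocked', 'pending_verification']);

// Hypothetical stand-in for the real receipt validator: only checks for a non-empty planId.
function looksLikeReceipt(receipt) {
  return receipt != null && typeof receipt === 'object' && !Array.isArray(receipt) &&
    typeof receipt.planId === 'string' && receipt.planId.length > 0;
}

// Hypothetical evaluator: fails only in the strict same-plan task-boundary case.
function evaluateAutoNext(payload) {
  const taskComplete = payload?.taskState === 'complete';
  const nextTaskKnown = payload?.nextTaskKnown === true;
  const sameApprovedPlan = payload?.sameApprovedPlan === true;
  const taskBoundaryStop = payload?.taskBoundaryStop === true;
  const highRiskStop = payload?.highRiskStop === true;
  const closureState = payload?.replyClosureState ?? null;

  const autoNextObligatory = taskComplete && nextTaskKnown && sameApprovedPlan &&
    taskBoundaryStop && !LEGAL_TERMINAL_STATES.has(closureState) && !highRiskStop;

  if (autoNextObligatory && !looksLikeReceipt(payload?.dispatchReceipt ?? null)) {
    return {
      ok: false,
      status: 'continuity_failure',
      verdict: 'continuity_failure',
      reason: 'missing_auto_next_dispatch',
    };
  }
  return { ok: true, verdict: 'pass' };
}

// The forbidden stop from Task 3: complete, next task known, same plan, no receipt.
const forbiddenStop = {
  planId: 'plan-auto-next-core',
  currentTask: 'task-8',
  taskState: 'complete',
  nextTaskKnown: true,
  sameApprovedPlan: true,
  taskBoundaryStop: true,
  replyClosureState: 'completed',
  highRiskStop: false,
  dispatchReceipt: null,
};

console.log(evaluateAutoNext(forbiddenStop));
console.log(evaluateAutoNext({ ...forbiddenStop, highRiskStop: true }));
```

Every flag is compared with `=== true`, so a missing or loosely typed field never creates an obligation by accident; that matches the plan's intent to keep the gate minimal.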
-### Task 8: Add dispatch-binding test that the next task receipt is mandatory proof
-
-**Files:**
-- Modify: `scripts/test_approved_plan_continuity_gate.mjs`
-- Modify: `scripts/test_force_recall_long_task_preflight.mjs`
-- Read: `scripts/approved_plan_dispatch_binding.mjs`
-
-**Step 1: Add a test that a receipt with wrong task linkage does not satisfy auto-next**
-Case:
-- current task is `task-8`
-- next task known is `task-9`
-- receipt exists but links to stale or mismatched action/task context
-
-Expected:
-- FAIL
-- reason remains an auto-next continuity failure
-
-If current receipt schema lacks enough linkage to assert this exactly, capture that as an explicit schema gap in comments and lock at least plan/task equality on currently available fields.
-
-**Step 2: Add a preflight hook assertion**
-Expected hook output should still fail when planner says the next task is known but no real bound receipt for that next dispatch exists.
-
-**Step 3: Run both suites**
-Run:
-```bash
-node scripts/test_approved_plan_continuity_gate.mjs
-node scripts/test_force_recall_long_task_preflight.mjs
-```
-Expected: FAIL before hook/input wiring lands, or PASS only for the script-level side if hook has not been updated yet
-
-**Step 4: Commit**
-```bash
-git add scripts/test_approved_plan_continuity_gate.mjs scripts/test_force_recall_long_task_preflight.mjs
-git commit -m "test: require real next-task dispatch proof for auto-next obligation"
-```
-
-### Task 9: Add explicit hook input fields for task-boundary obligation
-
-**Files:**
-- Modify: `hooks/force-recall/handler.ts`
-- Read: `scripts/plan_long_task_auto_chain.mjs`
-- Read: `scripts/approved_plan_continuity_gate.mjs`
-
-**Step 1: Add a focused builder for auto-next obligation fields**
-Extend `buildApprovedPlanContinuityInput(...)` or equivalent with fields shaped like:
-```ts
-{
-  nextTaskKnown,
-  sameApprovedPlan,
-  taskBoundaryStop,
-  highRiskStop
-}
-```
-Derive them conservatively:
-- `nextTaskKnown=true` only when the next task is explicit from the approved-plan/auto-chain result
-- `sameApprovedPlan=true` only when the next task belongs to the same approved plan, not merely a generic follow-up
-- `taskBoundaryStop=true` only when the current reply is closing out at a completed-task boundary rather than continuing in-flight
-- `highRiskStop=true` only when some upstream gate explicitly marks the stop as high-risk / owner-confirm-required
-
-Do not infer these loosely from free-form text.
-
-**Step 2: Preserve current legal closure fallback behavior**
-Keep:
-- `waiting_user` for button-path handoff
-- `completed` as normal fallback
-
-But ensure those defaults do not accidentally mask auto-next obligation cases.
-
-**Step 3: Syntax-check the hook file**
-Run:
-```bash
-node --check hooks/force-recall/handler.ts
-```
-Expected: PASS
-
-**Step 4: Commit**
-```bash
-git add hooks/force-recall/handler.ts
-git commit -m "feat: feed auto-next obligation metadata into continuity hook input"
-```
-
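Reviewer note: the conservative derivation asked for in Task 9 might look roughly like the sketch below. Everything about the input shape is an assumption made for illustration: `autoChain`, `approvedPlanId`, `reply.closingCompletedTask`, and `upstreamGate` are hypothetical names, and the real `buildApprovedPlanContinuityInput(...)` in `hooks/force-recall/handler.ts` may be structured quite differently.

```javascript
// Hypothetical builder: each flag becomes true only on explicit structured evidence.
function buildAutoNextObligationFields({ autoChain, approvedPlanId, reply, upstreamGate } = {}) {
  // The next task counts as known only when the auto-chain result names it explicitly.
  const nextTaskId =
    typeof autoChain?.nextTaskId === 'string' && autoChain.nextTaskId.length > 0
      ? autoChain.nextTaskId
      : null;
  const nextTaskKnown = nextTaskId !== null;

  // Same plan requires an exact plan id match, not merely "some follow-up exists".
  const sameApprovedPlan =
    nextTaskKnown &&
    typeof approvedPlanId === 'string' &&
    approvedPlanId.length > 0 &&
    autoChain?.planId === approvedPlanId;

  // Boundary stop and high-risk stop come from explicit booleans, never from reply text.
  const taskBoundaryStop = reply?.closingCompletedTask === true;
  const highRiskStop = upstreamGate?.highRiskStop === true;

  return { nextTaskKnown, sameApprovedPlan, taskBoundaryStop, highRiskStop };
}

console.log(buildAutoNextObligationFields({
  autoChain: { planId: 'plan-auto-next-core', nextTaskId: 'task-9' },
  approvedPlanId: 'plan-auto-next-core',
  reply: { closingCompletedTask: true },
}));
```

Because free-form text such as "I can continue with task-9" is never inspected, a reply that only talks about the next task leaves all four flags `false`.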
-### Task 10: Upgrade hook messaging so the failure is explicit
-
-**Files:**
-- Modify: `hooks/force-recall/handler.ts`
-- Modify: `scripts/test_force_recall_long_task_preflight.mjs`
-
-**Step 1: Add fail-first assertion for the new reason in hook output**
-Expect the injected block to include something equivalent to:
-- `reason=missing_auto_next_dispatch`
-- “Do not stop at this completed-task boundary.”
-- “Auto-dispatch the next task in the same approved plan, unless waiting_user / blocked / pending_verification / high-risk stop applies.”
-
-**Step 2: Implement the smallest wording update**
-In the continuity block builder, add a branch for the new reason so the prompt block distinguishes:
-- generic missing dispatch receipt
-- auto-next obligation failure at a task boundary
-
-**Step 3: Run hook smoke tests**
-Run:
-```bash
-node scripts/test_force_recall_long_task_preflight.mjs
-```
-Expected: PASS
-
-**Step 4: Commit**
-```bash
-git add hooks/force-recall/handler.ts scripts/test_force_recall_long_task_preflight.mjs
-git commit -m "feat: surface explicit auto-next obligation failure in force-recall hook"
-```
-
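Reviewer note: the wording branch from Task 10 could be as small as the sketch below. The two quoted sentences and `reason=missing_auto_next_dispatch` come from the plan; the function name `buildContinuityBlock`, the `[continuity]` prefix, the generic reason `missing_dispatch_receipt`, and the generic wording are assumptions made for illustration only.

```javascript
// Hypothetical block builder: one extra branch for the auto-next reason.
function buildContinuityBlock(result) {
  const lines = [
    `[continuity] verdict=${result.verdict}`,
    `reason=${result.reason}`,
  ];
  if (result.reason === 'missing_auto_next_dispatch') {
    // Wording taken from the plan's expected hook output.
    lines.push('Do not stop at this completed-task boundary.');
    lines.push(
      'Auto-dispatch the next task in the same approved plan, unless ' +
      'waiting_user / blocked / pending_verification / high-risk stop applies.'
    );
  } else {
    // Assumed wording for the generic missing-receipt case.
    lines.push('A real dispatch receipt is required before closing out.');
  }
  return lines.join('\n');
}

console.log(buildContinuityBlock({
  verdict: 'continuity_failure',
  reason: 'missing_auto_next_dispatch',
}));
```

Keeping the branch keyed on the evaluator's `reason` string leaves the decision in the evaluator and the hook limited to presentation, which is the separation the plugin compatibility notes in Task 12 ask for.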
-### Task 11: Add minimal receipt-linkage hardening only if required by tests
-
-**Files:**
-- Modify: `scripts/approved_plan_dispatch_binding.mjs`
-- Modify: `scripts/test_approved_plan_continuity_gate.mjs`
-- Modify: `docs/runbooks/approved-plan-continuity.md`
-- Modify: `docs/runbooks/auto-next-obligation-gate.md`
-
-**Step 1: Only if needed, add one additional linkage field**
-If the new tests cannot reliably distinguish “some dispatch happened” from “the required next task was dispatched,” add only one minimal extra receipt field such as:
-- `nextTaskId`
-or
-- `nextTaskKey`
-
-Do not redesign the whole receipt schema.
-
-**Step 2: Add fail-first + pass-path tests for that field**
-- stale/missing linkage => FAIL
-- correct linkage => PASS
-
-**Step 3: Re-run targeted tests**
-Run:
-```bash
-node scripts/test_approved_plan_continuity_gate.mjs
-```
-Expected: PASS
-
-**Step 4: Commit**
-```bash
-git add scripts/approved_plan_dispatch_binding.mjs scripts/test_approved_plan_continuity_gate.mjs docs/runbooks/approved-plan-continuity.md docs/runbooks/auto-next-obligation-gate.md
-git commit -m "feat: harden dispatch receipt linkage for auto-next obligation"
-```
-
-### Task 12: Add continuity-plugin MVP compatibility notes before extraction
-
-**Files:**
-- Modify: `docs/plans/2026-04-24-auto-next-obligation-gate.md`
-- Modify: `docs/plans/2026-04-24-continuity-plugin-mvp.md`
-- Read: `plugins/continuity/` if present, otherwise keep as future note only
-
-**Step 1: Add an explicit “Plugin MVP Compatibility” section**
-Document these compatibility constraints:
-- auto-next obligation must remain a pure evaluator rule, not hook-only string logic
-- high-risk stop must become a config/input flag, not a prompt convention
-- same-plan next-task proof must be representable in plugin evaluator input
-- receipt validation and receipt storage remain separable from evaluator logic
-- legacy script envelopes must remain bridgeable during extraction
-
-**Step 2: Add the expected plugin module seams**
-List future homes:
-- evaluator rule -> `plugins/continuity/src/continuity/evaluator.mjs`
-- receipt linkage validation -> `plugins/continuity/src/continuity/receipt-validator.mjs`
-- hook wording -> `plugins/continuity/src/adapters/force-recall.mjs`
-
-**Step 3: Verify both plan docs mention `highRiskStop` and `auto-next`**
-Run:
-```bash
-grep -n "highRiskStop\|auto-next" docs/plans/2026-04-24-auto-next-obligation-gate.md docs/plans/2026-04-24-continuity-plugin-mvp.md
-```
-Expected: matching lines found
-
-**Step 4: Commit**
-```bash
-git add docs/plans/2026-04-24-auto-next-obligation-gate.md docs/plans/2026-04-24-continuity-plugin-mvp.md
-git commit -m "docs: capture continuity plugin compatibility for auto-next obligation"
-```
-
-### Task 13: Run the focused verification bundle
-
-**Files:**
-- Verify: `scripts/approved_plan_continuity_gate.mjs`
-- Verify: `scripts/test_approved_plan_continuity_gate.mjs`
-- Verify: `scripts/test_force_recall_long_task_preflight.mjs`
-- Verify: `hooks/force-recall/handler.ts`
-
-**Step 1: Run continuity gate suite**
-Run:
-```bash
-node scripts/test_approved_plan_continuity_gate.mjs
-```
-Expected: PASS
-
-**Step 2: Run hook smoke suite**
-Run:
-```bash
-node scripts/test_force_recall_long_task_preflight.mjs
-```
-Expected: PASS
-
-**Step 3: Run syntax check**
-Run:
 ```bash
 node --check hooks/force-recall/handler.ts
 node --check scripts/approved_plan_continuity_gate.mjs
 node --check scripts/approved_plan_dispatch_binding.mjs
-```
-Expected: PASS
-
-**Step 4: Record exact verification output in the plan tail**
-Include:
-- exact commands
-- PASS summary
-- any deliberately deferred cases
-
-**Step 5: Commit**
-```bash
-git add docs/plans/2026-04-24-auto-next-obligation-gate.md
-git commit -m "chore: verify auto-next obligation gate slice"
+node scripts/test_approved_plan_continuity_gate.mjs
+node scripts/test_force_recall_long_task_preflight.mjs
 ```
 
-### Task 14: Final acceptance checklist and handoff state
+### Result summary
 
-**Files:**
-- Modify: `docs/plans/2026-04-24-auto-next-obligation-gate.md`
+- `node --check hooks/force-recall/handler.ts` ✅
+- `node --check scripts/approved_plan_continuity_gate.mjs` ✅
+- `node --check scripts/approved_plan_dispatch_binding.mjs` ✅
+- `node scripts/test_approved_plan_continuity_gate.mjs` ✅ `17/17 passed`
+- `node scripts/test_force_recall_long_task_preflight.mjs` ✅
 
-**Step 1: Add the acceptance checklist**
-Mark the plan acceptable only when all items are true:
-- completed-task boundary stop without auto-next now fails
-- dry-run planner intent alone does not satisfy continuity
-- legal closure states still pass
-- explicit high-risk stop passes
-- same-plan next-task obligation is distinguished from generic next-action wording
-- hook output surfaces the new failure clearly
-- plugin extraction compatibility is documented
-- tests pass
+### What was hardened in this slice
 
-**Step 2: Add explicit remaining risks**
-Include at least:
-- current hook may still need better upstream proof for `sameApprovedPlan`
-- high-risk stop source of truth may not yet exist and may need one future metadata slice
-- receipt schema may need exactly one extra linkage field if stale receipts can spoof pass conditions
-- non-`force-recall` entry points remain out of scope for this slice
+- continuity evaluator now rejects receipts that do not match the required `planId`, `currentTask`, and expected next dispatch action
+- minimal receipt linkage field `nextTaskId` was added so the evaluator can distinguish the required next-task dispatch from a stale or unrelated receipt
+- continuity tests now fail when the receipt links to the wrong next task
+- continuity tests now fail when a receipt only contains checkpoint/session-style metadata instead of real dispatch linkage
+- hook preflight verification still confirms that dry-run planner intent alone does not satisfy continuity, and that the failure reason remains `missing_auto_next_dispatch`
 
-**Step 3: Leave status as pending verification**
-Do not mark implementation complete in the plan; leave it as a verified-ready handoff target.
+### Deliberately deferred
 
-**Step 4: Commit**
-```bash
-git add docs/plans/2026-04-24-auto-next-obligation-gate.md
-git commit -m "docs: finalize auto-next obligation gate implementation plan"
-```
+- stronger upstream source-of-truth for `sameApprovedPlan`
+- broader non-`force-recall` entry-point enforcement
+- continuity plugin extraction work
 
 ---
 
@@ -666,36 +173,23 @@ The enforcement should stay intentionally small:
 
 ## Acceptance Criteria
 
-- [ ] A completed task in the same approved plan cannot stop at a boundary when the next task is known unless an allowed exemption applies.
-- [ ] The continuity evaluator emits a dedicated failure for missing required auto-next dispatch.
-- [ ] A real dispatch receipt is still required; dry-run planner output alone cannot pass.
-- [ ] Legal closure states `waiting_user`, `blocked`, `pending_verification` still pass unchanged.
-- [ ] Explicit `highRiskStop` bypass is supported and test-covered.
-- [ ] Hook output clearly explains the auto-next obligation failure.
-- [ ] Script-level continuity tests pass.
-- [ ] Hook smoke tests pass.
+- [x] A completed task in the same approved plan cannot stop at a boundary when the next task is known unless an allowed exemption applies.
+- [x] The continuity evaluator emits a dedicated failure for missing required auto-next dispatch.
+- [x] A real dispatch receipt is still required; dry-run planner output alone cannot pass.
+- [x] Legal closure states `waiting_user`, `blocked`, `pending_verification` still pass unchanged.
+- [x] Explicit `highRiskStop` bypass is supported and test-covered.
+- [x] Hook output clearly explains the auto-next obligation failure.
+- [x] Script-level continuity tests pass.
+- [x] Hook smoke tests pass.
 - [ ] The plan documents how this behavior migrates cleanly into the continuity plugin MVP.
 
 ## Risks / Open Questions
 
 1. The current hook may not yet expose a strong enough source of truth for `sameApprovedPlan`; if so, one narrow upstream metadata field may be needed.
 2. `highRiskStop` may not currently exist in structured input, so the first implementation may need a conservative default of `false` until an upstream gate can set it explicitly.
-3. If current receipt shape cannot prove the required next-task linkage, one minimal receipt field should be added instead of broad schema redesign.
+3. Receipt schema may still need one future compatibility pass if downstream writers have not yet been upgraded to emit `nextTaskId` everywhere continuity depends on same-plan auto-next proof.
 4. This slice deliberately does not solve non-hook entry points or general workflow orchestration.
 
-Plan complete and saved to `docs/plans/2026-04-24-auto-next-obligation-gate.md`. Two execution options:
+## Status
 
-**1. Subagent-Driven (this session)** - I dispatch fresh subagent per task, review between tasks, fast iteration
-
-**2. Parallel Session (separate)** - Open new session with executing-plans, batch execution with checkpoints
-
-**Which approach?**
-
-**If Subagent-Driven chosen:**
-- **REQUIRED SUB-SKILL:** Use superpowers:subagent-driven-development
-- Stay in this session
-- Fresh subagent per task + code review
-
-**If Parallel Session chosen:**
-- Guide them to open new session in worktree
-- **REQUIRED SUB-SKILL:** New session uses superpowers:executing-plans
+pending verification / reviewer checked

@@ -35,6 +35,10 @@
 - Use this field to state whether the reply closed under a dispatch-linked continuation path or some separately defined terminal closure state.
 - This field is defined here as a receipt field only; legal closure states and gate enforcement are defined in later tasks.
 
+### `nextTaskId`
+- The identifier of the required next task when continuity depends on a same-plan auto-next transition.
+- Use this field only to prove that the receipt links to the exact next task that had to be dispatched.
+- This field is the minimal hardening field for next-task linkage; it prevents unrelated dispatches, checkpoints, or stale receipts from spoofing continuity pass.
 
 ## Legal terminal states
 

@@ -51,6 +51,8 @@ The following behavior is forbidden:
|
|||||||
|
|
||||||
A completed task in the same approved plan must not end with “I can continue with the next task” style closeout unless the next task has actually been dispatched.
|
A completed task in the same approved plan must not end with “I can continue with the next task” style closeout unless the next task has actually been dispatched.
|
||||||
|
|
||||||
|
Checkpoint artifacts, session keys, or oral/plain-text status updates are not substitutes for a real auto-next dispatch. A checkpoint may preserve state, but it does not prove that the required next task was actually dispatched.
|
||||||
|
|
||||||
## Canonical failure condition
|
## Canonical failure condition
|
||||||
|
|
||||||
If all of the following are true:
|
If all of the following are true:
|
||||||
@@ -81,4 +83,6 @@ Then the continuity gate must fail and treat the stop as an auto-next obligation
|
|||||||
- The obligation applies only when the next task is known within the same approved plan.
|
- The obligation applies only when the next task is known within the same approved plan.
|
||||||
- A generic next action is not enough unless it proves the same approved plan task transition.
|
- A generic next action is not enough unless it proves the same approved plan task transition.
|
||||||
- A real dispatch receipt remains the source of truth for whether auto-next actually happened.
|
- A real dispatch receipt remains the source of truth for whether auto-next actually happened.
|
||||||
|
- Receipt linkage should include the required next-task identity when the evaluator needs to distinguish a real next-task dispatch from a stale or unrelated dispatch.
|
||||||
|
- Checkpoint/session metadata alone must not satisfy the receipt proof.
|
||||||
- This rule is intentionally minimal so it can later move into the continuity plugin without changing the behavior contract.
|
- This rule is intentionally minimal so it can later move into the continuity plugin without changing the behavior contract.
|
||||||
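The runbook notes above can be read as one decision rule. What follows is a minimal sketch of that rule, not the gate's actual implementation: `evaluateAutoNext` and its flat argument shape are hypothetical, the contents of `LEGAL_TERMINAL_STATES` are assumed from the three legal closures the plan names, and the `taskBoundaryStop` input used by the real evaluator is left out for brevity. The `verdict` and `reason` strings are the ones asserted in the test hunks.

```javascript
// Assumed contents: the three legal non-auto-next closures named in the plan.
const LEGAL_TERMINAL_STATES = new Set(['waiting_user', 'blocked', 'pending_verification']);

// Hypothetical evaluator (illustrative names and return shape).
// `receiptLinked` stands for "a dispatch receipt exists AND links to the required next task".
function evaluateAutoNext({ taskComplete, nextTaskKnown, sameApprovedPlan, closureState, highRiskStop, receiptLinked }) {
  // The obligation only exists for a completed task with a known next task in the same approved plan.
  const obligatory = taskComplete === true && nextTaskKnown === true && sameApprovedPlan === true;
  if (!obligatory) return { ok: true, reason: null };

  // Explicit high-risk stop is the allowed non-closure exception.
  if (highRiskStop === true) return { ok: true, reason: null };

  // Legal terminal states may end the turn without a dispatch.
  if (LEGAL_TERMINAL_STATES.has(closureState)) return { ok: true, reason: null };

  // Otherwise only a real, correctly linked dispatch receipt satisfies the gate.
  if (receiptLinked === true) return { ok: true, reason: null };

  return { ok: false, verdict: 'continuity_failure', reason: 'missing_auto_next_dispatch' };
}

const base = {
  taskComplete: true,
  nextTaskKnown: true,
  sameApprovedPlan: true,
  closureState: 'completed',
  highRiskStop: false,
  receiptLinked: false,
};

console.log(evaluateAutoNext(base).ok);                                  // false
console.log(evaluateAutoNext({ ...base, receiptLinked: true }).ok);      // true
console.log(evaluateAutoNext({ ...base, closureState: 'blocked' }).ok);  // true
```

Collapsing the receipt proof into a single `receiptLinked` boolean keeps the obligation rule separate from the question of how linkage is proven, which is the part this commit hardens.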
|
|||||||
@@ -11,6 +11,10 @@ function isObject(value) {
|
|||||||
return value != null && typeof value === 'object' && !Array.isArray(value);
|
return value != null && typeof value === 'object' && !Array.isArray(value);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
function normalizeAction(action) {
|
||||||
|
return JSON.stringify(action ?? null);
|
||||||
|
}
|
||||||
|
|
||||||
function hasValidDispatchReceipt(receipt) {
|
function hasValidDispatchReceipt(receipt) {
|
||||||
if (!isObject(receipt)) return false;
|
if (!isObject(receipt)) return false;
|
||||||
if (!isNonEmptyString(receipt.planId)) return false;
|
if (!isNonEmptyString(receipt.planId)) return false;
|
||||||
@@ -20,6 +24,27 @@ function hasValidDispatchReceipt(receipt) {
|
|||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
function receiptMatchesPayload(payload, receipt) {
|
||||||
|
if (!hasValidDispatchReceipt(receipt)) return false;
|
||||||
|
|
||||||
|
const expectedPlanId = payload?.planId;
|
||||||
|
if (isNonEmptyString(expectedPlanId) && receipt.planId !== expectedPlanId) return false;
|
||||||
|
|
||||||
|
const expectedCurrentTask = payload?.currentTask;
|
||||||
|
if (isNonEmptyString(expectedCurrentTask) && receipt.currentTask !== expectedCurrentTask) return false;
|
||||||
|
|
||||||
|
const expectedNextTask = payload?.nextTaskId ?? payload?.nextTaskKey ?? null;
|
||||||
|
const receiptNextTask = receipt?.nextTaskId ?? receipt?.nextTaskKey ?? null;
|
||||||
|
if (isNonEmptyString(expectedNextTask) && receiptNextTask !== expectedNextTask) return false;
|
||||||
|
|
||||||
|
const expectedNextAction = payload?.nextDerivedAction ?? payload?.derivedAction ?? null;
|
||||||
|
if (expectedNextAction != null && normalizeAction(receipt.nextDerivedAction) !== normalizeAction(expectedNextAction)) {
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
|
return true;
|
||||||
|
}
|
||||||
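One property of the comparison in `receiptMatchesPayload` is worth keeping in mind: `normalizeAction` relies on `JSON.stringify`, which serializes object keys in insertion order, so two actions with the same fields in a different order compare as unequal. Whether that strictness is intended is not stated in the commit. The sketch below only illustrates the behavior; `stableStringify` is a hypothetical helper, not part of the gate, and it assumes plain JSON data (no functions, symbols, or cyclic references).

```javascript
// Hypothetical helper: serializes with object keys sorted, so key order does not affect equality.
// Assumes plain JSON-compatible input.
function stableStringify(value) {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(',')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const keys = Object.keys(value)
      .filter((key) => value[key] !== undefined) // mirror JSON.stringify, which omits undefined properties
      .sort();
    return `{${keys.map((key) => `${JSON.stringify(key)}:${stableStringify(value[key])}`).join(',')}}`;
  }
  // Primitives; undefined and null both serialize as "null", like normalizeAction does.
  return JSON.stringify(value ?? null);
}

const expectedAction = { type: 'message_subagent', task: 'continue with task-9' };
const reorderedAction = { task: 'continue with task-9', type: 'message_subagent' };

// Insertion-order comparison, as in normalizeAction: the reordered action does not match.
console.log(JSON.stringify(expectedAction) === JSON.stringify(reorderedAction)); // false

// Key-sorted comparison: the same fields match regardless of order.
console.log(stableStringify(expectedAction) === stableStringify(reorderedAction)); // true
```

If receipts are always produced by `buildReceipt` from the same payload object, the insertion-order comparison is likely sufficient; the sorted variant only matters when receipts can be re-serialized by another producer.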
|
|
||||||
function parseArgs(argv) {
|
function parseArgs(argv) {
|
||||||
let inputPath = null;
|
let inputPath = null;
|
||||||
let compact = false;
|
let compact = false;
|
||||||
@@ -80,9 +105,9 @@ function evaluateContinuity(payload) {
|
|||||||
const sameApprovedPlan = payload?.sameApprovedPlan === true;
|
const sameApprovedPlan = payload?.sameApprovedPlan === true;
|
||||||
const taskBoundaryStop = payload?.taskBoundaryStop === true;
|
const taskBoundaryStop = payload?.taskBoundaryStop === true;
|
||||||
const highRiskStop = payload?.highRiskStop === true;
|
const highRiskStop = payload?.highRiskStop === true;
|
||||||
const hasDispatchReceipt = hasValidDispatchReceipt(payload?.dispatchReceipt ?? null);
|
|
||||||
const closureState = payload?.replyClosureState ?? null;
|
const closureState = payload?.replyClosureState ?? null;
|
||||||
const isLegalTerminalState = LEGAL_TERMINAL_STATES.has(closureState);
|
const isLegalTerminalState = LEGAL_TERMINAL_STATES.has(closureState);
|
||||||
|
const hasDispatchReceipt = receiptMatchesPayload(payload, payload?.dispatchReceipt ?? null);
|
||||||
const autoNextObligatory = taskComplete
|
const autoNextObligatory = taskComplete
|
||||||
&& explicitNextTaskKnown
|
&& explicitNextTaskKnown
|
||||||
&& sameApprovedPlan
|
&& sameApprovedPlan
|
||||||
@@ -150,5 +175,4 @@ const response = {
|
|||||||
},
|
},
|
||||||
};
|
};
|
||||||
|
|
||||||
process.stdout.write(`${JSON.stringify(response)}
|
process.stdout.write(`${JSON.stringify(response)}\n`);
|
||||||
`);
|
|
||||||
|
|||||||
@@ -81,6 +81,7 @@ function buildReceipt(payload) {
|
|||||||
const receipt = {
|
const receipt = {
|
||||||
planId: payload?.planId ?? null,
|
planId: payload?.planId ?? null,
|
||||||
currentTask: payload?.currentTask ?? null,
|
currentTask: payload?.currentTask ?? null,
|
||||||
|
nextTaskId: payload?.nextTaskId ?? null,
|
||||||
nextDerivedAction: nextAction,
|
nextDerivedAction: nextAction,
|
||||||
dispatchedAt: payload?.dispatchedAt ?? null,
|
dispatchedAt: payload?.dispatchedAt ?? null,
|
||||||
dispatchRunId: payload?.dispatchRunId ?? null,
|
dispatchRunId: payload?.dispatchRunId ?? null,
|
||||||
@@ -97,6 +98,7 @@ function validateReceipt(receipt) {
|
|||||||
for (const field of [
|
for (const field of [
|
||||||
'planId',
|
'planId',
|
||||||
'currentTask',
|
'currentTask',
|
||||||
|
'nextTaskId',
|
||||||
'nextDerivedAction',
|
'nextDerivedAction',
|
||||||
'dispatchedAt',
|
'dispatchedAt',
|
||||||
'dispatchRunId',
|
'dispatchRunId',
|
||||||
|
|||||||
@@ -179,6 +179,7 @@ const tests = [
|
|||||||
nextTaskKnown: true,
|
nextTaskKnown: true,
|
||||||
sameApprovedPlan: true,
|
sameApprovedPlan: true,
|
||||||
taskBoundaryStop: true,
|
taskBoundaryStop: true,
|
||||||
|
nextTaskId: 'task-9',
|
||||||
nextDerivedAction: {
|
nextDerivedAction: {
|
||||||
type: 'message_subagent',
|
type: 'message_subagent',
|
||||||
task: 'continue with task-9',
|
task: 'continue with task-9',
|
||||||
@@ -190,31 +191,12 @@ const tests = [
|
|||||||
});
|
});
|
||||||
|
|
||||||
try {
|
try {
|
||||||
const result = runGate({
|
const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
|
||||||
args: ['--compact', '--input', fixture.path('input.json')],
|
if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
|
||||||
});
|
if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
|
||||||
|
if (result.json.ok !== false) throw new Error(`expected auto-next continuity failure ok=false, got ${JSON.stringify(result.json)}`);
|
||||||
if (result.status !== 0 && result.status !== null) {
|
if (result.json.verdict !== 'continuity_failure') throw new Error(`expected verdict=continuity_failure, got ${JSON.stringify(result.json.verdict)}`);
|
||||||
throw new Error(`expected controlled execution, got status=${result.status}
|
if (result.json.reason !== 'missing_auto_next_dispatch') throw new Error(`expected reason=missing_auto_next_dispatch, got ${JSON.stringify(result.json.reason)}`);
|
||||||
${result.stderr || result.stdout}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (!result.json || typeof result.json !== 'object') {
|
|
||||||
throw new Error(`expected JSON output
|
|
||||||
stdout=${result.stdout}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (result.json.ok !== false) {
|
|
||||||
throw new Error(`expected auto-next continuity failure ok=false, got ${JSON.stringify(result.json)}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (result.json.verdict !== 'continuity_failure') {
|
|
||||||
throw new Error(`expected verdict=continuity_failure, got ${JSON.stringify(result.json.verdict)}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (result.json.reason !== 'missing_auto_next_dispatch') {
|
|
||||||
throw new Error(`expected reason=missing_auto_next_dispatch, got ${JSON.stringify(result.json.reason)}`);
|
|
||||||
}
|
|
||||||
} finally {
|
} finally {
|
||||||
fixture.cleanup();
|
fixture.cleanup();
|
||||||
}
|
}
|
||||||
@@ -231,6 +213,7 @@ stdout=${result.stdout}`);
|
|||||||
nextTaskKnown: true,
|
nextTaskKnown: true,
|
||||||
sameApprovedPlan: true,
|
sameApprovedPlan: true,
|
||||||
taskBoundaryStop: true,
|
taskBoundaryStop: true,
|
||||||
|
nextTaskId: 'task-9b',
|
||||||
derivedAction: {
|
derivedAction: {
|
||||||
type: 'message_subagent',
|
type: 'message_subagent',
|
||||||
task: 'continue with task-9b',
|
task: 'continue with task-9b',
|
||||||
@@ -242,31 +225,12 @@ stdout=${result.stdout}`);
|
|||||||
});
|
});
|
||||||
|
|
||||||
try {
|
try {
|
||||||
const result = runGate({
|
const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
|
||||||
args: ['--compact', '--input', fixture.path('input.json')],
|
if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
|
||||||
});
|
if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
|
||||||
|
if (result.json.ok !== false) throw new Error(`expected auto-next continuity failure ok=false, got ${JSON.stringify(result.json)}`);
|
||||||
if (result.status !== 0 && result.status !== null) {
|
if (result.json.verdict !== 'continuity_failure') throw new Error(`expected verdict=continuity_failure, got ${JSON.stringify(result.json.verdict)}`);
|
||||||
throw new Error(`expected controlled execution, got status=${result.status}
|
if (result.json.reason !== 'missing_auto_next_dispatch') throw new Error(`expected reason=missing_auto_next_dispatch, got ${JSON.stringify(result.json.reason)}`);
|
||||||
${result.stderr || result.stdout}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (!result.json || typeof result.json !== 'object') {
|
|
||||||
throw new Error(`expected JSON output
|
|
||||||
stdout=${result.stdout}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (result.json.ok !== false) {
|
|
||||||
throw new Error(`expected auto-next continuity failure ok=false, got ${JSON.stringify(result.json)}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (result.json.verdict !== 'continuity_failure') {
|
|
||||||
throw new Error(`expected verdict=continuity_failure, got ${JSON.stringify(result.json.verdict)}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (result.json.reason !== 'missing_auto_next_dispatch') {
|
|
||||||
throw new Error(`expected reason=missing_auto_next_dispatch, got ${JSON.stringify(result.json.reason)}`);
|
|
||||||
}
|
|
||||||
} finally {
|
} finally {
|
||||||
fixture.cleanup();
|
fixture.cleanup();
|
||||||
}
|
}
|
||||||
@@ -283,6 +247,7 @@ stdout=${result.stdout}`);
|
|||||||
nextTaskKnown: true,
|
nextTaskKnown: true,
|
||||||
sameApprovedPlan: true,
|
sameApprovedPlan: true,
|
||||||
taskBoundaryStop: true,
|
taskBoundaryStop: true,
|
||||||
|
nextTaskId: 'task-9c',
|
||||||
nextDerivedAction: {
|
nextDerivedAction: {
|
||||||
type: 'message_subagent',
|
type: 'message_subagent',
|
||||||
task: 'continue with task-9c',
|
task: 'continue with task-9c',
|
||||||
@@ -294,23 +259,10 @@ stdout=${result.stdout}`);
|
|||||||
});
|
});
|
||||||
|
|
||||||
try {
|
try {
|
||||||
const result = runGate({
|
const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
|
||||||
args: ['--compact', '--input', fixture.path('input.json')],
|
if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
|
||||||
});
|
if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
|
||||||
|
if (result.json.ok !== true) throw new Error(`expected continuity pass ok=true when highRiskStop=true, got ${JSON.stringify(result.json)}`);
|
||||||
if (result.status !== 0 && result.status !== null) {
|
|
||||||
throw new Error(`expected controlled execution, got status=${result.status}
|
|
||||||
${result.stderr || result.stdout}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (!result.json || typeof result.json !== 'object') {
|
|
||||||
throw new Error(`expected JSON output
|
|
||||||
stdout=${result.stdout}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (result.json.ok !== true) {
|
|
||||||
throw new Error(`expected continuity pass ok=true when highRiskStop=true, got ${JSON.stringify(result.json)}`);
|
|
||||||
}
|
|
||||||
} finally {
|
} finally {
|
||||||
fixture.cleanup();
|
fixture.cleanup();
|
||||||
}
|
}
|
||||||
@@ -334,23 +286,10 @@ stdout=${result.stdout}`);
|
|||||||
});
|
});
|
||||||
|
|
||||||
try {
|
try {
|
||||||
const result = runGate({
|
const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
|
||||||
args: ['--compact', '--input', fixture.path('input.json')],
|
if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
|
||||||
});
|
if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
|
||||||
|
if (result.json.ok !== true) throw new Error(`expected pass when nextTaskKnown=false, got ${JSON.stringify(result.json)}`);
|
||||||
if (result.status !== 0 && result.status !== null) {
|
|
||||||
throw new Error(`expected controlled execution, got status=${result.status}
|
|
||||||
${result.stderr || result.stdout}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (!result.json || typeof result.json !== 'object') {
|
|
||||||
throw new Error(`expected JSON output
|
|
||||||
stdout=${result.stdout}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (result.json.ok !== true) {
|
|
||||||
throw new Error(`expected pass when nextTaskKnown=false, got ${JSON.stringify(result.json)}`);
|
|
||||||
}
|
|
||||||
} finally {
|
} finally {
|
||||||
fixture.cleanup();
|
fixture.cleanup();
|
||||||
}
|
}
|
||||||
@@ -367,6 +306,7 @@ stdout=${result.stdout}`);
|
|||||||
nextTaskKnown: true,
|
nextTaskKnown: true,
|
||||||
sameApprovedPlan: false,
|
sameApprovedPlan: false,
|
||||||
taskBoundaryStop: true,
|
taskBoundaryStop: true,
|
||||||
|
nextTaskId: 'task-other',
|
||||||
nextDerivedAction: {
|
nextDerivedAction: {
|
||||||
type: 'message_subagent',
|
type: 'message_subagent',
|
||||||
task: 'continue with unrelated task',
|
task: 'continue with unrelated task',
|
||||||
@@ -378,23 +318,133 @@ stdout=${result.stdout}`);
|
|||||||
});
|
});
|
||||||
|
|
||||||
try {
|
try {
|
||||||
const result = runGate({
|
const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
|
||||||
args: ['--compact', '--input', fixture.path('input.json')],
|
if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
|
||||||
|
if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
|
||||||
|
if (result.json.ok !== true) throw new Error(`expected pass when sameApprovedPlan=false, got ${JSON.stringify(result.json)}`);
|
||||||
|
} finally {
|
||||||
|
fixture.cleanup();
|
||||||
|
}
|
||||||
|
},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
name: 'auto-next obligation: fails when receipt exists but next-task linkage is stale or mismatched',
|
||||||
|
run() {
|
||||||
|
const fixture = createFixture({
|
||||||
|
'input.json': {
|
||||||
|
planId: 'plan-auto-next-linkage-mismatch',
|
||||||
|
currentTask: 'task-8f',
|
||||||
|
taskState: 'complete',
|
||||||
|
nextTaskKnown: true,
|
||||||
|
sameApprovedPlan: true,
|
||||||
|
taskBoundaryStop: true,
|
||||||
|
nextTaskId: 'task-9f',
|
||||||
|
nextDerivedAction: {
|
||||||
|
type: 'message_subagent',
|
||||||
|
task: 'continue with task-9f',
|
||||||
|
},
|
||||||
|
replyClosureState: 'completed',
|
||||||
|
highRiskStop: false,
|
||||||
|
dispatchReceipt: {
|
||||||
|
planId: 'plan-auto-next-linkage-mismatch',
|
||||||
|
currentTask: 'task-8f',
|
||||||
|
nextTaskId: 'task-10f',
|
||||||
|
nextDerivedAction: {
|
||||||
|
type: 'message_subagent',
|
||||||
|
task: 'continue with task-10f',
|
||||||
|
},
|
||||||
|
dispatchedAt: '2026-04-24T16:00:00+08:00',
|
||||||
|
},
|
||||||
|
},
|
||||||
});
|
});
|
||||||
|
|
||||||
if (result.status !== 0 && result.status !== null) {
|
try {
|
||||||
throw new Error(`expected controlled execution, got status=${result.status}
|
const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
|
||||||
${result.stderr || result.stdout}`);
|
if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
|
||||||
|
if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
|
||||||
|
if (result.json.ok !== false) throw new Error(`expected linkage mismatch to fail, got ${JSON.stringify(result.json)}`);
|
||||||
|
if (result.json.reason !== 'missing_auto_next_dispatch') throw new Error(`expected linkage mismatch reason=missing_auto_next_dispatch, got ${JSON.stringify(result.json.reason)}`);
|
||||||
|
} finally {
|
||||||
|
fixture.cleanup();
|
||||||
}
|
}
|
||||||
|
},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
name: 'auto-next obligation: passes when receipt links to the required next task',
|
||||||
|
run() {
|
||||||
|
const fixture = createFixture({
|
||||||
|
'input.json': {
|
||||||
|
planId: 'plan-auto-next-linkage-match',
|
||||||
|
currentTask: 'task-8g',
|
||||||
|
taskState: 'complete',
|
||||||
|
nextTaskKnown: true,
|
||||||
|
sameApprovedPlan: true,
|
||||||
|
taskBoundaryStop: true,
|
||||||
|
nextTaskId: 'task-9g',
|
||||||
|
nextDerivedAction: {
|
||||||
|
type: 'message_subagent',
|
||||||
|
task: 'continue with task-9g',
|
||||||
|
},
|
||||||
|
replyClosureState: 'completed',
|
||||||
|
highRiskStop: false,
|
||||||
|
dispatchReceipt: {
|
||||||
|
planId: 'plan-auto-next-linkage-match',
|
||||||
|
currentTask: 'task-8g',
|
||||||
|
nextTaskId: 'task-9g',
|
||||||
|
nextDerivedAction: {
|
||||||
|
type: 'message_subagent',
|
||||||
|
task: 'continue with task-9g',
|
||||||
|
},
|
||||||
|
dispatchedAt: '2026-04-24T16:05:00+08:00',
|
||||||
|
},
|
||||||
|
},
|
||||||
|
});
|
||||||
|
|
||||||
if (!result.json || typeof result.json !== 'object') {
|
try {
|
||||||
throw new Error(`expected JSON output
|
const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
|
||||||
stdout=${result.stdout}`);
|
if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
|
||||||
|
if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
|
||||||
|
if (result.json.ok !== true) throw new Error(`expected linkage-matched receipt to pass, got ${JSON.stringify(result.json)}`);
|
||||||
|
} finally {
|
||||||
|
fixture.cleanup();
|
||||||
}
|
}
|
||||||
|
},
|
||||||
|
},
|
||||||
|
{
|
||||||
|
name: 'auto-next obligation: fails when receipt only proves checkpoint/session metadata without actual dispatch linkage',
|
||||||
|
run() {
|
||||||
|
const fixture = createFixture({
|
||||||
|
'input.json': {
|
||||||
|
planId: 'plan-auto-next-checkpoint-spoof',
|
||||||
|
currentTask: 'task-8h',
|
||||||
|
taskState: 'complete',
|
||||||
|
nextTaskKnown: true,
|
||||||
|
sameApprovedPlan: true,
|
||||||
|
taskBoundaryStop: true,
|
||||||
|
nextTaskId: 'task-9h',
|
||||||
|
nextDerivedAction: {
|
||||||
|
type: 'message_subagent',
|
||||||
|
task: 'continue with task-9h',
|
||||||
|
},
|
||||||
|
replyClosureState: 'completed',
|
||||||
|
highRiskStop: false,
|
||||||
|
dispatchReceipt: {
|
||||||
|
planId: 'plan-auto-next-checkpoint-spoof',
|
||||||
|
currentTask: 'task-8h',
|
||||||
|
nextTaskId: 'task-9h',
|
||||||
|
checkpointPath: 'checkpoints/task-8h.json',
|
||||||
|
sessionKey: 'task-8h',
|
||||||
|
dispatchedAt: '2026-04-24T16:10:00+08:00',
|
||||||
|
},
|
||||||
|
},
|
||||||
|
});
|
||||||
|
|
||||||
if (result.json.ok !== true) {
|
try {
|
||||||
throw new Error(`expected pass when sameApprovedPlan=false, got ${JSON.stringify(result.json)}`);
|
const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
|
||||||
}
|
if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
|
||||||
|
if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
|
||||||
|
if (result.json.ok !== false) throw new Error(`expected checkpoint-only receipt to fail, got ${JSON.stringify(result.json)}`);
|
||||||
|
if (result.json.reason !== 'missing_auto_next_dispatch') throw new Error(`expected checkpoint-only reason=missing_auto_next_dispatch, got ${JSON.stringify(result.json.reason)}`);
|
||||||
} finally {
|
} finally {
|
||||||
fixture.cleanup();
|
fixture.cleanup();
|
||||||
}
|
}
|
||||||
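The checkpoint-spoof case above can be traced by hand. The sketch below reproduces only the action-comparison step of `receiptMatchesPayload` to show why a receipt carrying `checkpointPath` and `sessionKey` but no `nextDerivedAction` cannot prove linkage. `actionLinkageHolds` is a hypothetical name for that isolated step; the structural checks in `hasValidDispatchReceipt` are partly elided in the hunk and may reject the receipt even earlier.

```javascript
// Same body as normalizeAction in the gate hunk: undefined and null both serialize to "null".
function normalizeAction(action) {
  return JSON.stringify(action ?? null);
}

// Hypothetical isolation of the action-comparison step from receiptMatchesPayload.
function actionLinkageHolds(payload, receipt) {
  const expected = payload?.nextDerivedAction ?? payload?.derivedAction ?? null;
  if (expected == null) return true; // nothing to compare against
  return normalizeAction(receipt?.nextDerivedAction) === normalizeAction(expected);
}

const payload = {
  nextTaskId: 'task-9h',
  nextDerivedAction: { type: 'message_subagent', task: 'continue with task-9h' },
};

// Receipt shaped like the spoof fixture: checkpoint/session metadata, no dispatched action.
const checkpointOnlyReceipt = {
  nextTaskId: 'task-9h',
  checkpointPath: 'checkpoints/task-8h.json',
  sessionKey: 'task-8h',
};

// Receipt that records the action that was actually dispatched.
const linkedReceipt = {
  nextTaskId: 'task-9h',
  nextDerivedAction: { type: 'message_subagent', task: 'continue with task-9h' },
};

console.log(actionLinkageHolds(payload, checkpointOnlyReceipt)); // false
console.log(actionLinkageHolds(payload, linkedReceipt));         // true
```

The spoofed receipt matches on `nextTaskId`, so the task-identity check alone would not catch it; the missing action is what breaks the proof in this sketch.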
@@ -420,35 +470,17 @@ stdout=${result.stdout}`);
|
|||||||
});
|
});
|
||||||
|
|
||||||
try {
|
try {
|
||||||
const result = runGate({
|
const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
|
||||||
args: ['--compact', '--input', fixture.path('input.json')],
|
if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
|
||||||
});
|
if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
|
||||||
|
if (result.json.ok !== false) throw new Error(`expected continuity failure ok=false for fake dispatch receipt, got ${JSON.stringify(result.json)}`);
|
||||||
if (result.status !== 0 && result.status !== null) {
|
if (result.json.verdict !== 'continuity_failure') throw new Error(`expected verdict=continuity_failure for fake dispatch receipt, got ${JSON.stringify(result.json.verdict)}`);
|
||||||
throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
|
if (result.json.reason !== 'missing_dispatch_receipt') throw new Error(`expected reason=missing_dispatch_receipt for fake dispatch receipt, got ${JSON.stringify(result.json.reason)}`);
|
||||||
}
|
|
||||||
|
|
||||||
if (!result.json || typeof result.json !== 'object') {
|
|
||||||
throw new Error(`expected JSON output\nstdout=${result.stdout}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (result.json.ok !== false) {
|
|
||||||
throw new Error(`expected continuity failure ok=false for fake dispatch receipt, got ${JSON.stringify(result.json)}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (result.json.verdict !== 'continuity_failure') {
|
|
||||||
throw new Error(`expected verdict=continuity_failure for fake dispatch receipt, got ${JSON.stringify(result.json.verdict)}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (result.json.reason !== 'missing_dispatch_receipt') {
|
|
||||||
throw new Error(`expected reason=missing_dispatch_receipt for fake dispatch receipt, got ${JSON.stringify(result.json.reason)}`);
|
|
||||||
}
|
|
||||||
} finally {
|
} finally {
|
||||||
fixture.cleanup();
|
fixture.cleanup();
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
},
|
},
|
||||||
|
|
||||||
{
|
{
|
||||||
name: 'continuity: passes when task is complete, next action is known, and a dispatch receipt already exists',
|
name: 'continuity: passes when task is complete, next action is known, and a dispatch receipt already exists',
|
||||||
run() {
|
run() {
|
||||||
@@ -475,27 +507,15 @@ stdout=${result.stdout}`);
|
|||||||
});
|
});
|
||||||
|
|
||||||
try {
|
try {
|
||||||
const result = runGate({
|
const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
|
||||||
args: ['--compact', '--input', fixture.path('input.json')],
|
if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
|
||||||
});
|
if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
|
||||||
|
if (result.json.ok !== true) throw new Error(`expected continuity pass ok=true when dispatch receipt exists, got ${JSON.stringify(result.json)}`);
|
||||||
if (result.status !== 0 && result.status !== null) {
|
|
||||||
throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (!result.json || typeof result.json !== 'object') {
|
|
||||||
throw new Error(`expected JSON output\nstdout=${result.stdout}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (result.json.ok !== true) {
|
|
||||||
throw new Error(`expected continuity pass ok=true when dispatch receipt exists, got ${JSON.stringify(result.json)}`);
|
|
||||||
}
|
|
||||||
} finally {
|
} finally {
|
||||||
fixture.cleanup();
|
fixture.cleanup();
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
},
|
},
|
||||||
|
|
||||||
{
|
{
|
||||||
name: 'continuity: passes when planner returns derivedAction and a bound dispatch receipt already exists',
|
name: 'continuity: passes when planner returns derivedAction and a bound dispatch receipt already exists',
|
||||||
run() {
|
run() {
|
||||||
@@ -522,27 +542,15 @@ stdout=${result.stdout}`);
|
|||||||
});
|
});
|
||||||
|
|
||||||
try {
|
try {
|
||||||
const result = runGate({
|
const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
|
||||||
args: ['--compact', '--input', fixture.path('input.json')],
|
if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
|
||||||
});
|
if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
|
||||||
|
if (result.json.ok !== true) throw new Error(`expected continuity pass ok=true when derivedAction has bound dispatch receipt, got ${JSON.stringify(result.json)}`);
|
||||||
if (result.status !== 0 && result.status !== null) {
|
|
||||||
throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (!result.json || typeof result.json !== 'object') {
|
|
||||||
throw new Error(`expected JSON output\nstdout=${result.stdout}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (result.json.ok !== true) {
|
|
||||||
throw new Error(`expected continuity pass ok=true when derivedAction has bound dispatch receipt, got ${JSON.stringify(result.json)}`);
|
|
||||||
}
|
|
||||||
} finally {
|
} finally {
|
||||||
fixture.cleanup();
|
fixture.cleanup();
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
},
|
},
|
||||||
|
|
||||||
{
|
{
|
||||||
name: 'continuity: passes when task is complete, next action is known, no dispatch receipt exists, and closure is waiting_user',
|
name: 'continuity: passes when task is complete, next action is known, no dispatch receipt exists, and closure is waiting_user',
|
||||||
run() {
|
run() {
|
||||||
@@ -561,27 +569,15 @@ stdout=${result.stdout}`);
|
|||||||
});
|
});
|
||||||
|
|
||||||
try {
|
try {
|
||||||
const result = runGate({
|
const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
|
||||||
args: ['--compact', '--input', fixture.path('input.json')],
|
if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
|
||||||
});
|
if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
|
||||||
|
if (result.json.ok !== true) throw new Error(`expected continuity pass ok=true when closure is waiting_user, got ${JSON.stringify(result.json)}`);
|
||||||
if (result.status !== 0 && result.status !== null) {
|
|
||||||
throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (!result.json || typeof result.json !== 'object') {
|
|
||||||
throw new Error(`expected JSON output\nstdout=${result.stdout}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (result.json.ok !== true) {
|
|
||||||
throw new Error(`expected continuity pass ok=true when closure is waiting_user, got ${JSON.stringify(result.json)}`);
|
|
||||||
}
|
|
||||||
} finally {
|
} finally {
|
||||||
fixture.cleanup();
|
fixture.cleanup();
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
},
|
},
|
||||||
|
|
||||||
{
|
{
|
||||||
name: 'continuity: passes when task is complete, next action is known, no dispatch receipt exists, and closure is pending_verification',
|
name: 'continuity: passes when task is complete, next action is known, no dispatch receipt exists, and closure is pending_verification',
|
||||||
run() {
|
run() {
|
||||||
@@ -600,27 +596,15 @@ stdout=${result.stdout}`);
|
|||||||
});
|
});
|
||||||
|
|
||||||
try {
|
try {
|
||||||
const result = runGate({
|
const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
|
||||||
args: ['--compact', '--input', fixture.path('input.json')],
|
if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
|
||||||
});
|
if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
|
||||||
|
if (result.json.ok !== true) throw new Error(`expected continuity pass ok=true when closure is pending_verification, got ${JSON.stringify(result.json)}`);
|
||||||
if (result.status !== 0 && result.status !== null) {
|
|
||||||
throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (!result.json || typeof result.json !== 'object') {
|
|
||||||
throw new Error(`expected JSON output\nstdout=${result.stdout}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (result.json.ok !== true) {
|
|
||||||
throw new Error(`expected continuity pass ok=true when closure is pending_verification, got ${JSON.stringify(result.json)}`);
|
|
||||||
}
|
|
||||||
} finally {
|
} finally {
|
||||||
fixture.cleanup();
|
fixture.cleanup();
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
},
|
},
|
||||||
|
|
||||||
{
|
{
|
||||||
name: 'continuity: passes when task is complete, next action is known, no dispatch receipt exists, and closure is blocked',
|
name: 'continuity: passes when task is complete, next action is known, no dispatch receipt exists, and closure is blocked',
|
||||||
run() {
|
run() {
|
||||||
@@ -639,21 +623,10 @@ stdout=${result.stdout}`);
|
|||||||
});
|
});
|
||||||
|
|
||||||
try {
|
try {
|
||||||
const result = runGate({
|
const result = runGate({ args: ['--compact', '--input', fixture.path('input.json')] });
|
||||||
args: ['--compact', '--input', fixture.path('input.json')],
|
if (result.status !== 0 && result.status !== null) throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
|
||||||
});
|
if (!result.json || typeof result.json !== 'object') throw new Error(`expected JSON output\nstdout=${result.stdout}`);
|
||||||
|
if (result.json.ok !== true) throw new Error(`expected continuity pass ok=true when closure is blocked, got ${JSON.stringify(result.json)}`);
|
||||||
if (result.status !== 0 && result.status !== null) {
|
|
||||||
throw new Error(`expected controlled execution, got status=${result.status}\n${result.stderr || result.stdout}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (!result.json || typeof result.json !== 'object') {
|
|
||||||
throw new Error(`expected JSON output\nstdout=${result.stdout}`);
|
|
||||||
}
|
|
||||||
|
|
||||||
if (result.json.ok !== true) {
|
|
||||||
throw new Error(`expected continuity pass ok=true when closure is blocked, got ${JSON.stringify(result.json)}`);
|
|
||||||
}
|
|
||||||
} finally {
|
} finally {
|
||||||
fixture.cleanup();
|
fixture.cleanup();
|
||||||
}
|
}
|
||||||
|
|||||||