# Watchdog B v3 notification layer

Single source of truth for owner-facing policy: `~/.config/openclaw/watchdog-b.env`

Runtime auto-detection source: `scripts/openclaw_runtime_probe.py`

This directory now contains:

- `check_openclaw_state.sh` — tri-state checker (`running` / `stalled` / `idle`)
- `run_watchdog_b.sh` — dispatcher + notification runner
- `notify_watchdog_b.py` — minimal notification integration layer

## Configuration source

Configuration is resolved in this order (first match wins):
1. process env already set by caller/systemd
2. `WATCHDOG_B_CONFIG_FILE` if set
3. fallback `~/.config/openclaw/watchdog-b.env`
4. code defaults

This means both `run_watchdog_b.sh` and `notify_watchdog_b.py` can be invoked manually and still resolve the same owner-facing channel / target / mode / wording.
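
The priority order above can be sketched roughly as follows. This is a minimal illustration, not the bundled implementation: `parse_env_file`, `resolve_config`, and the `CODE_DEFAULTS` table are hypothetical names, and the parser assumes plain `KEY=VALUE` lines.

```python
import os
from pathlib import Path

# Fallback config location named in this README.
DEFAULT_CONFIG = Path.home() / ".config" / "openclaw" / "watchdog-b.env"

# Hypothetical subset of code defaults (values taken from this README).
CODE_DEFAULTS = {
    "WATCHDOG_B_RUNNING_REPORT_MODE": "manual",
    "WATCHDOG_B_RUNNING_REPORT_MIN_INTERVAL_SECONDS": "3600",
}


def parse_env_file(path):
    """Parse simple KEY=VALUE lines; skip blanks, comments, and malformed lines."""
    values = {}
    path = Path(path)
    if not path.is_file():
        return values
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_config(environ=None):
    """Merge sources so that process env beats the config file, which beats defaults."""
    environ = dict(os.environ if environ is None else environ)
    # Steps 2 and 3: explicit config file if set, otherwise the fallback path.
    config_path = environ.get("WATCHDOG_B_CONFIG_FILE") or DEFAULT_CONFIG
    merged = dict(CODE_DEFAULTS)                 # step 4: code defaults
    merged.update(parse_env_file(config_path))   # steps 2/3: config file
    merged.update(                               # step 1: process env wins
        {k: v for k, v in environ.items() if k.startswith("WATCHDOG_B_")}
    )
    return merged
```

Merging from lowest to highest priority keeps the function short and makes the winning source for each key easy to reason about.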

For Node/OpenClaw runtime paths, the bundled scripts now resolve in this order:
1. explicit env overrides: `WATCHDOG_B_NODE_BIN`, `WATCHDOG_B_OPENCLAW_MJS`, `WATCHDOG_B_OPENCLAW_ENTRY`
2. PATH lookup for `node` and `openclaw`
3. common install roots scan: nvm, pnpm global, npm-global, `/usr/local`, `/usr`, Volta-style trees
4. fail with an operator-facing error that tells you which env vars to set manually
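
As a rough sketch of this lookup order for the `node` binary only: the function name `resolve_node_bin` and the candidate directory list are assumptions made for illustration, and the real probe may scan different roots or validate overrides differently.

```python
import os
import shutil
from pathlib import Path

# Hypothetical scan list; the actual probe's install roots may differ.
CANDIDATE_NODE_DIRS = [
    Path.home() / ".volta" / "bin",
    Path("/usr/local/bin"),
    Path("/usr/bin"),
]


def resolve_node_bin(environ=None, candidate_dirs=CANDIDATE_NODE_DIRS):
    """Return a path to node using override, then PATH, then a directory scan."""
    environ = os.environ if environ is None else environ

    # 1. Explicit env override is returned as-is (no validation in this sketch).
    override = environ.get("WATCHDOG_B_NODE_BIN")
    if override:
        return override

    # 2. PATH lookup.
    found = shutil.which("node", path=environ.get("PATH"))
    if found:
        return found

    # 3. Scan common install directories for an executable named node.
    for directory in candidate_dirs:
        candidate = Path(directory) / "node"
        if candidate.is_file() and os.access(candidate, os.X_OK):
            return str(candidate)

    # 4. Fail with a message that names the env var to set manually.
    raise FileNotFoundError(
        "could not locate node; set WATCHDOG_B_NODE_BIN to the node binary path"
    )
```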

Repo template to copy from:
- `ops/systemd/user/watchdog-b.env.example`

Install example:
```bash
mkdir -p ~/.config/openclaw
cp ~/.openclaw/workspace/ops/systemd/user/watchdog-b.env.example ~/.config/openclaw/watchdog-b.env
$EDITOR ~/.config/openclaw/watchdog-b.env
```

## Notification strategy

### 1) `running`
Default: **manual / queue-ready only**.

Why:
- a healthy runtime every 10 minutes should not spam Eric
- owner-facing reporting should remain explicit and auditable

Behavior:
- `WATCHDOG_B_RUNNING_REPORT_MODE=manual` (default)
  - does not create external messages
  - returns a concrete hint for how to enable queue creation
- `WATCHDOG_B_RUNNING_REPORT_MODE=enqueue`
  - creates a real pending owner report in `~/.clawteam/owner-reports/pending/`
  - does **not** auto-send it
- `WATCHDOG_B_RUNNING_REPORT_MODE=enqueue-and-drain`
  - creates a pending owner report and immediately delivers that exact pending item through `owner_report_driver.py`
  - the send path is a direct OpenClaw Discord send to the env-configured owner-facing destination
  - this keeps queue/audit semantics without depending on the wrapper watchdog's destination/visibility behavior

Throttle:
- `WATCHDOG_B_RUNNING_REPORT_MIN_INTERVAL_SECONDS` default `3600`
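
One way to picture how the three modes and the throttle interact is the sketch below. It is illustrative only: `plan_running_report` and the action labels are hypothetical, and it assumes the throttle applies to the two enqueue modes but not to `manual`.

```python
import time

VALID_RUNNING_MODES = ("manual", "enqueue", "enqueue-and-drain")


def plan_running_report(mode, last_report_ts, now=None, min_interval=3600):
    """Return a list of action labels for one `running` observation."""
    now = time.time() if now is None else now
    if mode not in VALID_RUNNING_MODES:
        raise ValueError(f"unknown WATCHDOG_B_RUNNING_REPORT_MODE: {mode!r}")

    # manual: no external message, only a hint on how to enable queue creation.
    if mode == "manual":
        return ["hint"]

    # Assumed throttle: skip if the last report is newer than the minimum interval.
    if last_report_ts is not None and now - last_report_ts < min_interval:
        return ["throttled"]

    actions = ["enqueue"]
    if mode == "enqueue-and-drain":
        # Deliver the exact item that was just enqueued.
        actions.append("deliver")
    return actions
```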

### 2) `stalled`
Default: **nudge main agent first**, then escalate to Eric only after repetition.

Behavior:
- call internal OpenClaw agent route:
  - `node .../openclaw.mjs agent --agent main --message ...`
- maintain local notify state under `state/watchdog-b/notify-state.json`
- after repeated observations (`WATCHDOG_B_STALLED_OWNER_ESCALATION_AFTER`, default `2`), enqueue an owner report

Throttle:
- `WATCHDOG_B_STALLED_NUDGE_MIN_INTERVAL_SECONDS` default `900`

Owner escalation mode:
- `WATCHDOG_B_STALLED_OWNER_MODE=escalate` (default)
- `WATCHDOG_B_STALLED_OWNER_MODE=always`
- `WATCHDOG_B_STALLED_OWNER_MODE=never`

Owner delivery mode after enqueue:
- `WATCHDOG_B_OWNER_DELIVERY_MODE=enqueue-only` (default)
- `WATCHDOG_B_OWNER_DELIVERY_MODE=direct-discord`

When `direct-discord` is enabled, watchdog-b still enqueues first, then directly delivers that same pending report via `owner_report_driver.py` to the env-configured Discord target.
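
The nudge-then-escalate flow for `stalled` might look roughly like this. Everything here is a sketch under assumptions: the state keys `stalled_count` and `last_nudge_ts` are hypothetical, escalation is assumed to trigger once the count reaches the threshold, and `always` / `never` are interpreted as "enqueue on every observation" and "never enqueue".

```python
def plan_stalled_actions(state, now, owner_mode="escalate",
                         nudge_min_interval=900, escalation_after=2):
    """Update `state` in place and return action labels for one stalled observation.

    `state` is a plain dict standing in for notify-state.json (hypothetical schema).
    """
    state["stalled_count"] = state.get("stalled_count", 0) + 1
    actions = []

    # Nudge the main agent unless the previous nudge is too recent.
    last_nudge = state.get("last_nudge_ts")
    if last_nudge is None or now - last_nudge >= nudge_min_interval:
        actions.append("nudge-main-agent")
        state["last_nudge_ts"] = now

    # Owner escalation, interpreted per WATCHDOG_B_STALLED_OWNER_MODE.
    if owner_mode == "always":
        actions.append("enqueue-owner-report")
    elif owner_mode == "escalate" and state["stalled_count"] >= escalation_after:
        actions.append("enqueue-owner-report")
    # owner_mode == "never": no owner report is enqueued.

    return actions
```

With the defaults, the first stalled observation only nudges the main agent, and a second observation inside the 900-second window enqueues an owner report without nudging again.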

### 3) `idle`
Default: same pattern as stalled, but slower.

Behavior:
- nudge main agent first
- only escalate to Eric after repeated idle detections

Throttle:
- `WATCHDOG_B_IDLE_NUDGE_MIN_INTERVAL_SECONDS` default `1800`

Owner escalation threshold:
- `WATCHDOG_B_IDLE_OWNER_ESCALATION_AFTER` default `2`

## Owner-facing message style

The owner-facing Discord message is compact and conclusion-first:
- first line: headline (`🔔 [watchdog-b] <worker>`)
- second line: concise conclusion with emoji, e.g. `✅ 主程序仍在運行` ("main process is still running")
- third line: actionable next step, prefixed with `→`
- last line: compact technical metadata (`task=... | status=... | progress=... | source=...`)

Style knobs live in `watchdog-b.env`:
- `WATCHDOG_B_RUNNING_EMOJI`, `WATCHDOG_B_RUNNING_SUMMARY`
- `WATCHDOG_B_STALLED_EMOJI`, `WATCHDOG_B_STALLED_SUMMARY`
- `WATCHDOG_B_IDLE_EMOJI`, `WATCHDOG_B_IDLE_SUMMARY`
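
The four-line owner-facing layout described in this document (headline, conclusion, next step, metadata) could be rendered along these lines. The function `render_owner_message` and the exact spacing between emoji and summary are assumptions; the emoji and summary would come from the `WATCHDOG_B_*_EMOJI` / `WATCHDOG_B_*_SUMMARY` knobs.

```python
def render_owner_message(worker, emoji, summary, next_step, meta):
    """Build a compact, conclusion-first message from its four parts.

    `meta` is an ordered mapping of metadata fields such as task/status/progress/source.
    """
    headline = f"🔔 [watchdog-b] {worker}"
    conclusion = f"{emoji} {summary}"
    action = f"→ {next_step}"
    metadata = " | ".join(f"{key}={value}" for key, value in meta.items())
    return "\n".join([headline, conclusion, action, metadata])
```

A call with illustrative values:

```python
print(render_owner_message(
    worker="main",
    emoji="✅",
    summary="Main process is still running",
    next_step="No action needed",
    meta={"task": "t1", "status": "running"},
))
```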

## Safety defaults

- `WATCHDOG_B_NOTIFY_DRY_RUN=1` by default in `run_watchdog_b.sh`
- owner-facing send path keeps the existing `owner-reporting-system` queue/artifact/audit flow, but direct Discord delivery no longer depends on the cron wrapper's default destination semantics
- local state tracks last send time / count to reduce spam
- a failed notifier does not crash the dispatcher; it emits a warning and preserves artifacts

## Key artifacts

Under `state/watchdog-b/`:

- `last-output.txt` — rendered dispatcher output
- `last-notify-output.txt` — notifier JSON result
- `last-state.txt` — last state
- `history.tsv` — state history
- `notify-state.json` — throttle / repetition tracking
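
For readers who want to inspect or script against `notify-state.json`, a tolerant load plus atomic save is one reasonable pattern. This is a generic sketch, not the notifier's actual code: the helper names are hypothetical and no particular JSON schema is assumed.

```python
import json
import os
import tempfile
from pathlib import Path


def load_notify_state(path):
    """Return the stored state, or an empty dict if the file is missing or corrupt."""
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


def save_notify_state(path, state):
    """Write state to a temp file in the same directory, then rename it into place."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle, indent=2, sort_keys=True)
        # os.replace is atomic when source and target are on the same filesystem.
        os.replace(tmp_name, path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Writing through a temporary file means an interrupted run leaves the previous state intact rather than a half-written JSON document.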

## Manual test examples

### Dry-run running path
```bash
WATCHDOG_B_NOTIFY_DRY_RUN=1 \
WATCHDOG_B_RUNNING_REPORT_MODE=manual \
./scripts/watchdog-b/notify_watchdog_b.py --state running --dry-run
```

### Real queue creation for running (no send)
```bash
WATCHDOG_B_NOTIFY_DRY_RUN=0 \
WATCHDOG_B_RUNNING_REPORT_MODE=enqueue \
./scripts/watchdog-b/notify_watchdog_b.py --state running
ls -l ~/.clawteam/owner-reports/pending
```

### Runtime probe only
```bash
python3 ./scripts/watchdog-b/openclaw_runtime_probe.py --pretty
```

### Single-shot enqueue + direct Discord delivery
```bash
WATCHDOG_B_NOTIFY_DRY_RUN=0 \
WATCHDOG_B_RUNNING_REPORT_MODE=enqueue-and-drain \
./scripts/watchdog-b/notify_watchdog_b.py --state running
```

### Dry-run stalled nudge
```bash
WATCHDOG_B_NOTIFY_DRY_RUN=1 \
./scripts/watchdog-b/notify_watchdog_b.py --state stalled --dry-run
```

If runtime auto-detection fails on a host with a custom install layout, set one or more of:
- `WATCHDOG_B_NODE_BIN`
- `WATCHDOG_B_OPENCLAW_MJS`
- `WATCHDOG_B_OPENCLAW_ENTRY`

### Full dispatcher dry-run with fixture overrides
```bash
OPENCLAW_PID_FILE=$PWD/tests/fixtures/watchdog-b/running/host-runtime/openclaw.pid \
OPENCLAW_LOG_FILE=$PWD/tests/fixtures/watchdog-b/running/logs/openclaw.log \
WATCHDOG_B_ARTIFACT_DIR=$PWD/state/watchdog-b-test-running-v3 \
WATCHDOG_B_NOTIFY_DRY_RUN=1 \
./scripts/watchdog-b/run_watchdog_b.sh
```

## What is truly wired vs not

### Truly wired now
- state detection
- notifier invocation from dispatcher
- main-agent internal nudge command construction and execution path
- owner-report queue creation via existing producer
- optional direct delivery of an enqueued owner report through `owner_report_driver.py`
- throttling / repetition state persisted locally

### Still conditional / not claimed as universally proven
- successful main-agent wake-up depends on local OpenClaw CLI/runtime being callable from this environment
- successful owner-facing delivery still depends on valid local OpenClaw Discord routing on the host
- direct watchdog-b owner delivery now targets `channel:1480577550445969541` by default and bypasses the wrapper watchdog destination logic
- the cron wrapper remains available for generic queue draining, but watchdog-b no longer needs to rely on it for single-shot owner-facing verification