# Watchdog B v3 notification layer

Single source of truth for owner-facing policy: `~/.config/openclaw/watchdog-b.env`

Runtime auto-detection source: `scripts/openclaw_runtime_probe.py`

This directory now contains:

- `check_openclaw_state.sh`: tri-state checker (`running` / `stalled` / `idle`)
- `run_watchdog_b.sh`: dispatcher + notification runner
- `notify_watchdog_b.py`: minimal notification integration layer

## Configuration source

Priority order is now:
1. process env already set by caller/systemd
2. `WATCHDOG_B_CONFIG_FILE` if set
3. fallback `~/.config/openclaw/watchdog-b.env`
4. code defaults

This means both `run_watchdog_b.sh` and `notify_watchdog_b.py` can be invoked manually and still resolve the same owner-facing channel / target / mode / wording.

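The priority order above can be sketched as a small lookup. This is an illustration only, not the code in `notify_watchdog_b.py`: the helper names `parse_env_file` and `resolve_setting`, the `CODE_DEFAULTS` table, and the choice to consult the fallback file even when `WATCHDOG_B_CONFIG_FILE` is set are all assumptions.

```python
import os
from pathlib import Path

# Hypothetical defaults table; the two values mirror defaults stated in this README.
CODE_DEFAULTS = {
    "WATCHDOG_B_RUNNING_REPORT_MODE": "manual",
    "WATCHDOG_B_RUNNING_REPORT_MIN_INTERVAL_SECONDS": "3600",
}
FALLBACK_CONFIG = "~/.config/openclaw/watchdog-b.env"


def parse_env_file(path):
    """Parse simple KEY=VALUE lines; a missing file yields an empty dict."""
    values = {}
    file_path = Path(path).expanduser()
    if not file_path.is_file():
        return values
    for raw in file_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values


def resolve_setting(name, environ=None, fallback=FALLBACK_CONFIG):
    """Return (value, source) following the documented priority order."""
    environ = os.environ if environ is None else environ
    # 1. process env already set by caller/systemd
    if name in environ:
        return environ[name], "env"
    # 2. explicit config file, if WATCHDOG_B_CONFIG_FILE is set
    explicit = environ.get("WATCHDOG_B_CONFIG_FILE")
    if explicit:
        values = parse_env_file(explicit)
        if name in values:
            return values[name], "config-file"
    # 3. fallback file in the user's config directory
    values = parse_env_file(fallback)
    if name in values:
        return values[name], "fallback-file"
    # 4. code defaults
    return CODE_DEFAULTS.get(name), "default"
```

Returning the source alongside the value makes it easy to log where a setting came from when a manual invocation and a systemd invocation disagree.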
For Node/OpenClaw runtime paths, the bundled scripts now resolve in this order:
1. explicit env overrides: `WATCHDOG_B_NODE_BIN`, `WATCHDOG_B_OPENCLAW_MJS`, `WATCHDOG_B_OPENCLAW_ENTRY`
2. PATH lookup for `node` and `openclaw`
3. common install roots scan: nvm, pnpm global, npm-global, `/usr/local`, `/usr`, Volta-style trees
4. fail with an operator-facing error that tells you which env vars to set manually

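As a rough sketch of that resolution order for the `node` binary alone: the function name `resolve_node_bin` and the `extra_roots` parameter are hypothetical, the real probe in `openclaw_runtime_probe.py` may behave differently, and the concrete directories scanned for each install root are left to the caller because this README does not list them.

```python
import os
import shutil
from pathlib import Path


def resolve_node_bin(environ=None, extra_roots=()):
    """Resolve a node binary: env override, then PATH, then install roots."""
    environ = os.environ if environ is None else environ

    # 1. explicit env override wins (returned as-is, not validated here)
    override = environ.get("WATCHDOG_B_NODE_BIN")
    if override:
        return override

    # 2. PATH lookup
    found = shutil.which("node", path=environ.get("PATH"))
    if found:
        return found

    # 3. scan caller-supplied install roots for <root>/bin/node
    for root in extra_roots:
        candidate = Path(root).expanduser() / "bin" / "node"
        if candidate.is_file() and os.access(candidate, os.X_OK):
            return str(candidate)

    # 4. operator-facing error naming the variable to set manually
    raise RuntimeError(
        "node not found via PATH or install roots; "
        "set WATCHDOG_B_NODE_BIN to the absolute path of the node binary"
    )
```

A caller might pass roots such as `/usr/local` and `/usr`, which are named in the list above; nvm, pnpm, and Volta directories vary by host and would need to be supplied explicitly.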
Repo template to copy from:
- `ops/systemd/user/watchdog-b.env.example`

Install example:
```bash
mkdir -p ~/.config/openclaw
cp ~/.openclaw/workspace/ops/systemd/user/watchdog-b.env.example ~/.config/openclaw/watchdog-b.env
$EDITOR ~/.config/openclaw/watchdog-b.env
```

## Notification strategy

### 1) `running`
Default: **manual / queue-ready only**.

Why:
- a healthy runtime every 10 minutes should not spam Eric
- owner-facing reporting should remain explicit and auditable

Behavior:
- `WATCHDOG_B_RUNNING_REPORT_MODE=manual` (default)
  - does not create external messages
  - returns a concrete hint for how to enable queue creation
- `WATCHDOG_B_RUNNING_REPORT_MODE=enqueue`
  - creates a real pending owner report in `~/.clawteam/owner-reports/pending/`
  - does **not** auto-send it
- `WATCHDOG_B_RUNNING_REPORT_MODE=enqueue-and-drain`
  - creates a pending owner report and immediately delivers that exact pending item through `owner_report_driver.py`
  - send path is direct OpenClaw Discord send using the env-configured owner-facing destination
  - this keeps queue/audit semantics but avoids depending on the wrapper watchdog's destination/visibility behavior

Throttle:
- `WATCHDOG_B_RUNNING_REPORT_MIN_INTERVAL_SECONDS` default `3600`

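The interaction between the report mode and the throttle can be pictured with the sketch below. It assumes the throttle gates only the two queue-creating modes and that `manual` always returns just a hint; the function names and the action labels (`hint`, `throttled`, `enqueue`, `deliver`) are made up for illustration and are not the notifier's actual output.

```python
import time

VALID_MODES = ("manual", "enqueue", "enqueue-and-drain")


def throttle_open(last_sent_at, min_interval_seconds, now=None):
    """True if no report was sent yet or the minimum interval has elapsed."""
    now = time.time() if now is None else now
    return last_sent_at is None or (now - last_sent_at) >= min_interval_seconds


def plan_running_report(mode, last_sent_at, min_interval_seconds=3600, now=None):
    """Return the list of (hypothetical) actions for a `running` observation."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown WATCHDOG_B_RUNNING_REPORT_MODE: {mode!r}")
    if mode == "manual":
        # No external message; only tell the operator how to enable the queue.
        return ["hint"]
    if not throttle_open(last_sent_at, min_interval_seconds, now):
        return ["throttled"]
    if mode == "enqueue":
        # Pending report is created but not sent.
        return ["enqueue"]
    # enqueue-and-drain: create the pending report, then deliver that same item.
    return ["enqueue", "deliver"]
```

With the default interval of `3600` seconds, a checker that fires every 10 minutes would produce at most one queued report per hour under this reading.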
### 2) `stalled`
Default: **nudge main agent first**, then escalate to Eric only after repetition.

Behavior:
- calls the internal OpenClaw agent route:
  - `node .../openclaw.mjs agent --agent main --message ...`
- maintains local notify state under `state/watchdog-b/notify-state.json`
- after repeated observations (`WATCHDOG_B_STALLED_OWNER_ESCALATION_AFTER`, default `2`), enqueues an owner report

Throttle:
- `WATCHDOG_B_STALLED_NUDGE_MIN_INTERVAL_SECONDS` default `900`

Owner escalation mode:
- `WATCHDOG_B_STALLED_OWNER_MODE=escalate` (default)
- `WATCHDOG_B_STALLED_OWNER_MODE=always`
- `WATCHDOG_B_STALLED_OWNER_MODE=never`

Owner delivery mode after enqueue:
- `WATCHDOG_B_OWNER_DELIVERY_MODE=enqueue-only` (default)
- `WATCHDOG_B_OWNER_DELIVERY_MODE=direct-discord`

When `direct-discord` is enabled, watchdog-b still enqueues first, then directly delivers that same pending report via `owner_report_driver.py` to the env-configured Discord target.

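One way to read the escalation knobs together is the decision function below. It is a sketch under assumptions: `always` and `never` are interpreted from their names alone, the threshold comparison is taken to be "count greater than or equal to `WATCHDOG_B_STALLED_OWNER_ESCALATION_AFTER`", the nudge throttle is left out, and the function name and action labels are hypothetical.

```python
def decide_stalled_actions(
    consecutive_count,
    owner_mode="escalate",
    escalation_after=2,
    delivery_mode="enqueue-only",
):
    """Return the ordered (hypothetical) actions for a `stalled` observation."""
    # The main agent is always nudged first.
    actions = ["nudge-main-agent"]

    if owner_mode == "never":
        escalate = False
    elif owner_mode == "always":
        escalate = True
    elif owner_mode == "escalate":
        # Assumed comparison: escalate once the repetition count hits the threshold.
        escalate = consecutive_count >= escalation_after
    else:
        raise ValueError(f"unknown WATCHDOG_B_STALLED_OWNER_MODE: {owner_mode!r}")

    if escalate:
        # Enqueue always comes first, even when direct delivery is enabled.
        actions.append("enqueue-owner-report")
        if delivery_mode == "direct-discord":
            actions.append("deliver-via-owner-report-driver")
    return actions
```

Under these assumptions the defaults nudge only on the first stalled observation and add a queued owner report from the second one onward.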
## Owner-facing message style

The owner-facing Discord message is now compact and conclusion-first:
- first line: headline (`🔔 [watchdog-b] <worker>`)
- second line: concise conclusion with emoji, e.g. `✅ 主程序仍在運行` ("main program is still running")
- third line: actionable next step, prefixed with `→`
- last line: compact technical metadata (`task=... | status=... | progress=... | source=...`)

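The four-line layout described above could be assembled along these lines. The function name `format_owner_message` and its parameters are illustrative assumptions, and any lines the real notifier places between the third and last line are not modelled.

```python
def format_owner_message(worker, emoji, summary, next_step,
                         task, status, progress, source):
    """Build a compact, conclusion-first owner-facing message."""
    lines = [
        f"🔔 [watchdog-b] {worker}",   # headline
        f"{emoji} {summary}",          # concise conclusion
        f"→ {next_step}",              # actionable next step
        # compact technical metadata on the last line
        f"task={task} | status={status} | progress={progress} | source={source}",
    ]
    return "\n".join(lines)
```

In this sketch `emoji` and `summary` would come from the per-state style settings such as `WATCHDOG_B_RUNNING_EMOJI` and `WATCHDOG_B_RUNNING_SUMMARY`.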
Style knobs live in `watchdog-b.env`:
- `WATCHDOG_B_RUNNING_EMOJI`, `WATCHDOG_B_RUNNING_SUMMARY`
- `WATCHDOG_B_STALLED_EMOJI`, `WATCHDOG_B_STALLED_SUMMARY`
- `WATCHDOG_B_IDLE_EMOJI`, `WATCHDOG_B_IDLE_SUMMARY`

### 3) `idle`
Default: same pattern as stalled, but slower.

Behavior:
- nudge main agent first
- only escalate to Eric after repeated idle detections

Throttle:
- `WATCHDOG_B_IDLE_NUDGE_MIN_INTERVAL_SECONDS` default `1800`

Owner escalation threshold:
- `WATCHDOG_B_IDLE_OWNER_ESCALATION_AFTER` default `2`

## Safety defaults

- `WATCHDOG_B_NOTIFY_DRY_RUN=1` by default in `run_watchdog_b.sh`
- the owner-facing send path keeps the existing `owner-reporting-system` queue/artifact/audit flow; direct Discord delivery no longer depends on the cron wrapper's default destination semantics
- local state tracks last send time / count to reduce spam
- a failed notifier does not crash the dispatcher; the dispatcher emits a warning and preserves artifacts

## Key artifacts

Under `state/watchdog-b/`:

- `last-output.txt`: rendered dispatcher output
- `last-notify-output.txt`: notifier JSON result
- `last-state.txt`: last state
- `history.tsv`: state history
- `notify-state.json`: throttle / repetition tracking

## Manual test examples

### Dry-run running path
```bash
WATCHDOG_B_NOTIFY_DRY_RUN=1 \
WATCHDOG_B_RUNNING_REPORT_MODE=manual \
./scripts/watchdog-b/notify_watchdog_b.py --state running --dry-run
```

### Real queue creation for running (no send)
```bash
WATCHDOG_B_NOTIFY_DRY_RUN=0 \
WATCHDOG_B_RUNNING_REPORT_MODE=enqueue \
./scripts/watchdog-b/notify_watchdog_b.py --state running
ls -l ~/.clawteam/owner-reports/pending
```

### Runtime probe only
```bash
python3 ./scripts/watchdog-b/openclaw_runtime_probe.py --pretty
```

### Single-shot enqueue + direct Discord delivery
```bash
WATCHDOG_B_NOTIFY_DRY_RUN=0 \
WATCHDOG_B_RUNNING_REPORT_MODE=enqueue-and-drain \
./scripts/watchdog-b/notify_watchdog_b.py --state running
```

### Dry-run stalled nudge
```bash
WATCHDOG_B_NOTIFY_DRY_RUN=1 \
./scripts/watchdog-b/notify_watchdog_b.py --state stalled --dry-run
```

If runtime auto-detection fails on a host with a custom install layout, set one or more of:
- `WATCHDOG_B_NODE_BIN`
- `WATCHDOG_B_OPENCLAW_MJS`
- `WATCHDOG_B_OPENCLAW_ENTRY`

### Full dispatcher dry-run with fixture overrides
```bash
OPENCLAW_PID_FILE=$PWD/tests/fixtures/watchdog-b/running/host-runtime/openclaw.pid \
OPENCLAW_LOG_FILE=$PWD/tests/fixtures/watchdog-b/running/logs/openclaw.log \
WATCHDOG_B_ARTIFACT_DIR=$PWD/state/watchdog-b-test-running-v3 \
WATCHDOG_B_NOTIFY_DRY_RUN=1 \
./scripts/watchdog-b/run_watchdog_b.sh
```

## What is truly wired vs not

### Truly wired now
- state detection
- notifier invocation from dispatcher
- main-agent internal nudge command construction and execution path
- owner-report queue creation via existing producer
- optional direct delivery of an enqueued owner report through `owner_report_driver.py`
- throttling / repetition state persisted locally

### Still conditional / not claimed as universally proven
- successful main-agent wake-up depends on local OpenClaw CLI/runtime being callable from this environment
- successful owner-facing delivery still depends on valid local OpenClaw Discord routing on the host
- direct watchdog-b owner delivery now targets `channel:1480577550445969541` by default and bypasses the wrapper watchdog destination logic
- the cron wrapper remains available for generic queue draining, but watchdog-b no longer needs to rely on it for single-shot owner-facing verification