Initial import of watchdog-discord-route skill
274
SKILL.md
Normal file
@@ -0,0 +1,274 @@
---
name: watchdog-discord-route
description: Install, reset, verify, or operate the OpenClaw watchdog-b owner-facing Discord route. Use when setting up or repairing watchdog-b -> owner-report -> Discord delivery, cleaning old watchdog test residue, enabling the systemd --user timer, running end-to-end verification, or adjusting the Discord-facing notification path/target/template.
---

# Watchdog Discord Route

Use this skill when the task is about the **watchdog-b owner-facing notification path to Discord**.

This skill covers four recurring jobs:

1. **Reset / clean** old watchdog test residue without deleting live audit assets.
2. **Verify** the end-to-end path from watchdog-b to Discord.
3. **Install / enable** the live watchdog schedule.
4. **Repair / adjust** the Discord-facing route, target, or message format.

## What the current canonical path is

Preferred owner-facing path:

`watchdog-b -> notify_watchdog_b.py -> owner_report_producer.py -> owner_report_driver.py -> OpenClaw Discord send -> sent archive`

For watchdog-b single-shot owner-facing delivery, prefer the **direct Discord driver path** over relying on the generic wrapper watchdog's destination semantics.

The current default Discord target in this workspace is a validated example; the portable skill should treat the target as host-local configuration.

For portable installs, set:

- `WATCHDOG_B_OWNER_REPORT_TARGET=channel:REPLACE_ME`

## Bundled skill resources

This skill now carries a portable bundle under its own directory.
Prefer these bundled files first when adapting or reusing on another host:

### scripts/
- `scripts/check_openclaw_state.sh`
- `scripts/notify_watchdog_b.py`
- `scripts/run_watchdog_b.sh`
- `scripts/verify_watchdog_b_e2e.sh`
- `scripts/owner_report_consumer.py`
- `scripts/owner_report_producer.py`
- `scripts/owner_report_driver.py`
- `scripts/install_watchdog_bundle.sh`
- `scripts/bootstrap_watchdog_bundle.sh`
- `scripts/openclaw-watchdog-b.service`
- `scripts/openclaw-watchdog-b.timer`
- `scripts/openclaw_runtime_probe.py`
- `scripts/watchdog-b.env.example`

### references/
- `references/watchdog-b-readme.md`
- `references/owner-reporting-system.md`
- `references/owner-report-operator-manual.md`

If working in this workspace, you may still inspect the live workspace files too, but the bundled skill files are the portable baseline.

## When resetting / cleaning

Before touching anything, inventory:

- `state/watchdog-b/`
- `state/watchdog-b-test-*`
- `state/watchdog-b-verify-e2e/`
- `state/archive/`
- `~/.clawteam/owner-reports/pending/`
- `~/.clawteam/owner-reports/sent/`
- user crontab entries related to owner-report/watchdog
- `~/.config/systemd/user/openclaw-watchdog-b.*`

Rules:

- **Do not delete** `~/.clawteam/owner-reports/sent/` history unless explicitly asked.
- **Do not delete** live `state/watchdog-b/notify-state.json` unless the user explicitly wants a hard reset.
- Prefer **archiving** old `state/watchdog-b-test-*` into `state/archive/<timestamp>/`.
- Distinguish clearly between:
  - live state
  - old test residue
  - repo templates
  - live installed units
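
The archiving rule above could be sketched roughly as follows. This is a minimal illustration, not the bundled tooling: the helper name `archive_test_residue` and the timestamp format are assumptions, and only the `state/watchdog-b-test-*` and `state/archive/<timestamp>/` paths come from this skill.

```python
from __future__ import annotations

import shutil
import time
from pathlib import Path


def archive_test_residue(state_dir: Path, timestamp: str | None = None) -> list[Path]:
    """Move state/watchdog-b-test-* directories into state/archive/<timestamp>/.

    Hypothetical helper. Live state (state/watchdog-b/) and verification runs
    (state/watchdog-b-verify-e2e/) do not match the glob, so they are left alone.
    """
    stamp = timestamp or time.strftime("%Y%m%dT%H%M%S")  # assumed timestamp format
    residue = sorted(p for p in state_dir.glob("watchdog-b-test-*") if p.is_dir())
    if not residue:
        return []  # nothing to archive, so no empty archive directory is created

    archive_dir = state_dir / "archive" / stamp
    archive_dir.mkdir(parents=True, exist_ok=True)

    moved = []
    for path in residue:
        target = archive_dir / path.name
        shutil.move(str(path), str(target))  # move, never delete
        moved.append(target)
    return moved
```

Returning the list of moved paths makes it easy to attach them as evidence when reporting what a cleanup changed.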

## When doing end-to-end verification

Default verification script:

- bundled: `scripts/verify_watchdog_b_e2e.sh`
- workspace live path: `scripts/watchdog-b/verify_watchdog_b_e2e.sh`

This should be treated as the **single-source verification path** unless there is a specific reason to bypass it.

Minimum success evidence:

- Discord send success with a new message id
- pending report created then moved to sent
- sent file exists under `~/.clawteam/owner-reports/sent/`
- verification artifacts under `state/watchdog-b-verify-e2e/<run-id>/`

Useful artifacts:

- `verify.log`
- `run-output.txt`
- `queue-before.txt`
- `queue-after.txt`
- `sent-head.txt`
- `state/notify-state.json`

Do not claim human-visible success unless either:

- the user confirms visibility, or
- you can read back the message from the exact Discord channel and match the message id/content.
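
As a rough aid for the artifact list above, a small check like the following could report which artifacts are missing from a run directory. The helper name is hypothetical, and resolving the artifact names relative to `state/watchdog-b-verify-e2e/<run-id>/` is an assumption about the layout, not something the verification script is confirmed to do.

```python
from __future__ import annotations

from pathlib import Path

# Names copied from the "Useful artifacts" list; their location relative to
# the run directory is an assumption.
EXPECTED_ARTIFACTS = (
    "verify.log",
    "run-output.txt",
    "queue-before.txt",
    "queue-after.txt",
    "sent-head.txt",
    "state/notify-state.json",
)


def missing_artifacts(run_dir: Path) -> list[str]:
    """Return the expected artifact names that are not regular files in run_dir."""
    return [name for name in EXPECTED_ARTIFACTS if not (run_dir / name).is_file()]
```

An empty result only shows that the files exist; it says nothing about whether the message was delivered or seen.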

## When installing the live schedule

Preferred scheduler: **systemd --user timer**.

Before claiming a host is ready, prefer to run the bundled bootstrap checker first:

- `scripts/bootstrap_watchdog_bundle.sh`

When installing or refreshing the portable bundle into live paths, prefer the bundled installer:

- `scripts/install_watchdog_bundle.sh --install-env-example`

Use bootstrap when you need a quick host-readiness answer without changing the host.
Use install when you need to copy the bundled scripts/service/timer/env example into the live workspace and user config paths.

Use the systemd --user timer when:

- `systemd --user` is available
- linger/user services are supported
- you want journal/status/list-timers visibility

Live install paths:

- `~/.config/systemd/user/openclaw-watchdog-b.service`
- `~/.config/systemd/user/openclaw-watchdog-b.timer`
- `~/.config/openclaw/watchdog-b.env`

Portable install sources carried by this skill:

- `scripts/check_openclaw_state.sh`
- `scripts/run_watchdog_b.sh`
- `scripts/notify_watchdog_b.py`
- `scripts/owner_report_consumer.py`
- `scripts/owner_report_producer.py`
- `scripts/owner_report_driver.py`
- `scripts/install_watchdog_bundle.sh`
- `scripts/bootstrap_watchdog_bundle.sh`
- `scripts/openclaw-watchdog-b.service`
- `scripts/openclaw-watchdog-b.timer`
- `scripts/openclaw_runtime_probe.py`
- `scripts/watchdog-b.env.example`

Expected environment for live route:

- `WATCHDOG_B_NOTIFY_DRY_RUN=0`
- `WATCHDOG_B_RUNNING_REPORT_MODE=enqueue-and-drain`
- `WATCHDOG_B_OWNER_DELIVERY_MODE=direct-discord`
- `WATCHDOG_B_OWNER_REPORT_CHANNEL=discord`
- `WATCHDOG_B_OWNER_REPORT_TARGET=channel:REPLACE_ME`
- optional: `WATCHDOG_B_MAIN_AGENT_ID=<valid-agent-id-on-that-host>`
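
To make the expected environment above checkable, here is a minimal sketch that parses a `watchdog-b.env`-style file and lists deviations. The function names are hypothetical, and the parser assumes plain `KEY=VALUE` lines with optional quotes; the real file may allow more than that.

```python
from __future__ import annotations

# Expected values copied from the list above.
EXPECTED_LIVE_ENV = {
    "WATCHDOG_B_NOTIFY_DRY_RUN": "0",
    "WATCHDOG_B_RUNNING_REPORT_MODE": "enqueue-and-drain",
    "WATCHDOG_B_OWNER_DELIVERY_MODE": "direct-discord",
    "WATCHDOG_B_OWNER_REPORT_CHANNEL": "discord",
}


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments (assumed format)."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_live_env(env: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means the live route looks configured."""
    problems = []
    for key, expected in EXPECTED_LIVE_ENV.items():
        actual = env.get(key)
        if actual != expected:
            problems.append(f"{key}: expected {expected!r}, got {actual!r}")
    target = env.get("WATCHDOG_B_OWNER_REPORT_TARGET", "")
    if not target or target.endswith("REPLACE_ME"):
        problems.append("WATCHDOG_B_OWNER_REPORT_TARGET: replace the placeholder with a real target")
    return problems
```

The check deliberately ignores `WATCHDOG_B_MAIN_AGENT_ID`, since leaving it unset is a valid configuration.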

Minimum installation verification:

- `systemctl --user daemon-reload`
- `systemctl --user enable --now openclaw-watchdog-b.timer`
- `systemctl --user start openclaw-watchdog-b.service`
- `systemctl --user status openclaw-watchdog-b.timer --no-pager`
- `systemctl --user status openclaw-watchdog-b.service --no-pager`
- `systemctl --user list-timers --all | rg openclaw-watchdog-b`
- `journalctl --user -u openclaw-watchdog-b.service -n 50 --no-pager`

## When repairing Discord delivery

Check these in order:

1. Is the target channel correct?
   - Validate the host-local configured form such as `channel:<discord_channel_id>`
2. Is the send path using direct driver delivery or the generic wrapper?
   - For watchdog-b owner-facing single-shot delivery, prefer direct driver delivery.
3. If stalled/idle tries to nudge a main agent, is `WATCHDOG_B_MAIN_AGENT_ID` set to a valid agent id on that host?
   - If not, leave it unset so main-agent nudge is skipped instead of failing on `Unknown agent id`.
4. Does the message actually appear in channel readback?
5. Is the message merely transport-accepted, or human-visible?

Be precise in language:

- "transport accepted" = send returned success / message id
- "channel-readable" = message appears in channel readback
- "human-confirmed" = Eric says he saw it

Do not collapse these into one claim.
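
One way to keep the three claims separate in tooling or reports is to derive each from its own piece of evidence. The sketch below is only illustrative; the function name and parameters are hypothetical, and only the three quoted labels come from this skill.

```python
from __future__ import annotations


def supported_claims(message_id: str | None, seen_in_readback: bool, owner_confirmed: bool) -> list[str]:
    """Return only the delivery claims that the available evidence supports.

    Each claim is backed by a separate input, so a send result alone can
    never be reported as channel-readable or human-confirmed.
    """
    claims = []
    if message_id:
        claims.append("transport accepted")  # send returned success / message id
    if seen_in_readback:
        claims.append("channel-readable")  # message found in channel readback
    if owner_confirmed:
        claims.append("human-confirmed")  # the owner said he saw it
    return claims
```

Reporting the list as-is, rather than a single "delivered" flag, keeps the distinction visible to whoever reads the result.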

## Message formatting preference

Eric reported that a message being visible does not necessarily make it prominent.

When adjusting Discord-facing watchdog messages, prefer:

- a short first line
- the conclusion first
- minimal raw diagnostic fields in the user-facing body
- machine/audit detail kept in artifacts, not in the first visible lines

Preferred shape:

- `🔔 WATCHDOG|<結論>` (headline; `<結論>` is the conclusion)
- `任務:...` (task)
- `結論:...` (conclusion)
- `你現在不用做事` ("no action needed from you right now") or `需要你介入:...` ("your intervention is needed: ...")

## Suggested execution pattern

For most tasks in this skill:

1. Inventory live state and installed schedule.
2. Decide whether the task is:
   - reset/cleanup
   - route repair
   - schedule install
   - e2e verify
   - portability / migration
3. Preserve audit assets.
4. Prefer the bundled skill scripts as the reusable baseline.
5. Use direct driver Discord route for owner-facing verification.
6. Attach evidence paths and exact commands before claiming success.

## Portability note

This skill can now be copied to another OpenClaw workspace, but portability still depends on the target host having:

- a working OpenClaw install
- Discord routing available
- a valid destination channel/target configured in `watchdog-b.env`
- systemd --user if live scheduling is desired

When moving to another host, the recommended order is:

1. Run `scripts/install_watchdog_bundle.sh --install-env-example` to populate live paths from the bundled copies.
2. If `~/.config/openclaw/watchdog-b.env` does not exist, create it from the example:
   - `mkdir -p ~/.config/openclaw`
   - `cp ~/.config/openclaw/watchdog-b.env.example ~/.config/openclaw/watchdog-b.env`
3. Edit `~/.config/openclaw/watchdog-b.env` and set at least:
   - `WATCHDOG_B_OWNER_REPORT_TARGET=channel:YOUR_DISCORD_CHANNEL_ID`
4. Run `scripts/bootstrap_watchdog_bundle.sh`.
5. Bootstrap should now validate the installed live bundle under `scripts/watchdog-b/` and should not require a separate `owner-reporting-system/` live tree.
6. Re-run bootstrap until it passes, then enable/start systemd --user units.

When moving to another host, runtime detection now follows this order:

1. explicit env overrides: `WATCHDOG_B_NODE_BIN`, `WATCHDOG_B_OPENCLAW_MJS`, `WATCHDOG_B_OPENCLAW_ENTRY`
2. PATH discovery: `node`, then `openclaw`-adjacent install roots
3. common install roots scan: nvm, pnpm global, npm-global, `/usr/local`, `/usr`, and Volta-style locations
4. hard failure with a message telling the operator which env vars to set manually
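
The detection order above can be illustrated for the Node binary alone. This is a simplified sketch under stated assumptions, not the logic of `openclaw_runtime_probe.py`: the function and exception names are hypothetical, the `<root>/bin/node` layout is assumed, and the OpenClaw entry lookup is omitted.

```python
from __future__ import annotations

import os
import shutil
from pathlib import Path
from typing import Callable, Iterable, Mapping, Optional


class RuntimeNotFound(RuntimeError):
    """Raised when no usable Node binary can be located."""


def resolve_node_bin(
    env: Mapping[str, str],
    scan_roots: Iterable[Path] = (),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    # 1. An explicit override always wins.
    override = env.get("WATCHDOG_B_NODE_BIN")
    if override:
        return override
    # 2. PATH discovery.
    found = which("node")
    if found:
        return found
    # 3. Scan common install roots (assumed layout: <root>/bin/node).
    for root in scan_roots:
        candidate = Path(root) / "bin" / "node"
        if candidate.is_file() and os.access(candidate, os.X_OK):
            return str(candidate)
    # 4. Hard failure that names the variables the operator can set.
    raise RuntimeNotFound(
        "node not found; set WATCHDOG_B_NODE_BIN "
        "(and WATCHDOG_B_OPENCLAW_MJS / WATCHDOG_B_OPENCLAW_ENTRY if needed)"
    )
```

Passing `which` as a parameter keeps the function testable on hosts where `node` happens to be installed.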

So on most hosts you should only need to update:

- `WATCHDOG_B_OWNER_REPORT_TARGET`
- `WATCHDOG_B_WORKSPACE` if the workspace is not the default `~/.openclaw/workspace`

Only set `WATCHDOG_B_NODE_BIN` / `WATCHDOG_B_OPENCLAW_ENTRY` / `WATCHDOG_B_OPENCLAW_MJS` when auto-detection fails or you intentionally want to pin a non-default runtime.

If another agent reports that `~/.config/openclaw/watchdog-b.env` does not exist, the correct response is to create it from `watchdog-b.env.example` before expecting bootstrap to pass.

## Success checklist

Do not say done unless you have all applicable evidence:

- changed file paths
- exact command(s) run
- status output or send result
- sent archive path when queue/driver is involved
- channel/message evidence when claiming delivery
- rollback instructions when you changed live scheduling
242
references/owner-report-operator-manual.md
Normal file
@@ -0,0 +1,242 @@
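# Owner Report Operator Manual

## Purpose
The owner-report system turns "checkpoints that need proactive reporting" into a notification pipeline that is observable, verifiable, and re-sendable.

It is a good fit for:
- long-running, multi-step tasks
- progress reports from ClawTeam / workers / background processes
- situations where a verbal promise to "report back later" is not enough

It does not aim to be a complex event platform; the goal is **simple, reliable, and never reporting a failure as success**.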

## Architecture
Minimal pipeline:

General queue drain:
`producer -> pending/*.md -> watchdog -> driver -> Discord -> sent/*.md`

watchdog-b single-shot direct delivery:
`producer -> pending/*.md -> driver -> Discord channel:1480577550445969541 -> sent/*.md`

Components:
- `scripts/owner_report_producer.py`
  - writes explicit checkpoint fields to `~/.clawteam/owner-reports/pending/<report_id>.md`
- `scripts/owner_report_consumer.py`
  - reads a pending report and converts it to standard JSON
- `scripts/owner_report_driver.py`
  - calls the external send command; a report is **moved to `sent/` only when the send succeeds**
- `scripts/owner_report_watchdog.py`
  - scans pending once per run; by default processes 1 report, oldest first
- `scripts/run_owner_report_watchdog.sh`
  - local wrapper that pins the send command / target / max-count, for use by cron

Directories:
- pending: `~/.clawteam/owner-reports/pending/`
- sent: `~/.clawteam/owner-reports/sent/`
- cron log: `/opt/workspace_auditing_report/logs/owner_report_watchdog_cron.out`

## When to use
Use it for:
- multi-step tasks that span time
- tasks with clear checkpoints / status changes
- work that uses ClawTeam, subagents, watchdog, or cron
- tasks where Eric has explicitly asked that no report be missed

Do not use it for:
- short Q&A that can be answered in one reply
- small changes and low-risk single-step operations
- ordinary work that needs no proactive notification

## Normal flow
1. A checkpoint worth reporting occurs in the task
2. `owner_report_producer.py` creates a pending report
3. cron runs `run_owner_report_watchdog.sh` every minute
4. The watchdog picks the oldest report from `pending/`
5. The driver sends it using `OWNER_REPORT_SEND_CMD`
6. On success: the report moves to `sent/`
7. On failure: the report stays in `pending/` and the run stops

### Failure semantics
- **A failed send is never archived**
- **The backlog is processed oldest-first**
- **Processing stops at the first failure**, so later successes are not mistaken for overall health
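
The failure semantics above (oldest-first, archive only on success, stop at the first failure) can be sketched as a single drain pass. This is an illustration, not the code of `owner_report_watchdog.py`: the function name, the `send` callback, and ordering by file modification time are assumptions.

```python
from __future__ import annotations

import shutil
from pathlib import Path
from typing import Callable


def drain_pending(
    pending_dir: Path,
    sent_dir: Path,
    send: Callable[[Path], bool],
    max_count: int = 1,
) -> list[Path]:
    """Run one oldest-first pass over pending/ and return the archived paths."""
    sent_dir.mkdir(parents=True, exist_ok=True)
    # "Oldest" is taken to mean oldest mtime, with the name as a tie-breaker (assumption).
    reports = sorted(pending_dir.glob("*.md"), key=lambda p: (p.stat().st_mtime, p.name))

    delivered = []
    for report in reports[:max_count]:
        if not send(report):
            # The failed report stays in pending/ and nothing after it is attempted.
            break
        target = sent_dir / report.name
        shutil.move(str(report), str(target))  # archive only after a successful send
        delivered.append(target)
    return delivered
```

Because the loop breaks instead of skipping, a report that keeps failing blocks everything behind it, which matches the documented behavior.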

## Common commands
### 1) Inspect the queue
```bash
ls -l ~/.clawteam/owner-reports/pending
ls -l ~/.clawteam/owner-reports/sent
```

### 2) Manually create a report
```bash
cd /home/chchang/.openclaw/workspace/owner-reporting-system/scripts
uv run python owner_report_producer.py \
  --team clawteam \
  --worker backend-a \
  --task-id example-task \
  --progress 80% \
  --done 'export complete' \
  --next 'wait for aggregation' \
  --status normal \
  --source checkpoint-complete
```

### 3) producer dry-run
```bash
uv run python owner_report_producer.py \
  --team clawteam \
  --worker backend-a \
  --task-id example-task \
  --progress 80% \
  --done 'export complete' \
  --next 'wait for aggregation' \
  --status normal \
  --dry-run
```

### 4) watchdog dry-run
```bash
/home/chchang/.openclaw/workspace/owner-reporting-system/scripts/run_owner_report_watchdog.sh --dry-run
```

### 5) Run the watchdog manually right now
```bash
/home/chchang/.openclaw/workspace/owner-reporting-system/scripts/run_owner_report_watchdog.sh
```

### 6) Process more backlog in one run
```bash
OWNER_REPORT_MAX_COUNT=20 \
/home/chchang/.openclaw/workspace/owner-reporting-system/scripts/run_owner_report_watchdog.sh
```

### 7) Inspect cron
```bash
crontab -l
```

### 8) Read the watchdog log
```bash
tail -n 200 /opt/workspace_auditing_report/logs/owner_report_watchdog_cron.out
```

## Debugging checklist
When a report was not sent, check in this order:

1. **Is there a file in pending?**
   ```bash
   ls -l ~/.clawteam/owner-reports/pending
   ```
2. **Is the report content sane?**
   ```bash
   cd /home/chchang/.openclaw/workspace/owner-reporting-system/scripts
   uv run python owner_report_consumer.py <report_id_or_path>
   ```
3. **Does the driver dry-run work?**
   ```bash
   uv run python owner_report_driver.py <report_id_or_path> --dry-run
   ```
4. **Which report does the watchdog dry-run pick?**
   ```bash
   ./run_owner_report_watchdog.sh --dry-run
   ```
5. **Is the cron entry present?**
   ```bash
   crontab -l
   ```
6. **Does the cron log show errors?**
   ```bash
   tail -n 200 /opt/workspace_auditing_report/logs/owner_report_watchdog_cron.out
   ```
7. **Do Node and the OpenClaw entry exist?**
   - `run_owner_report_watchdog.sh` checks:
     - `NODE_BIN`
     - `OPENCLAW_ENTRY`
8. **Can the send command actually send?**
   - Both the wrapper and the watchdog-b direct path rely on `OWNER_REPORT_SEND_CMD` / `--send-cmd`
   - The current default owner-facing target is Discord `channel:1480577550445969541`
   - For a watchdog-b single-shot verification, you can call `owner_report_driver.py <report_id> --send-cmd '...message send --channel discord --target '\''channel:1480577550445969541'\'' --message "$OWNER_REPORT_MESSAGE"'` directly
9. **If the backlog is stuck**
   - Fix the oldest failing report first; the watchdog is oldest-first and stops at the first failure

## Cron / watchdog behavior
Current state on this host:
- schedule: `* * * * *`
- command: `run_owner_report_watchdog.sh`
- target: Discord `channel:1480577550445969541` by default
- max backlog per run: `5` by default

The wrapper:
1. pins `OWNER_REPORT_SEND_CMD`
2. pins the owner-facing channel/target (defaults: `OWNER_REPORT_CHANNEL=discord`, `OWNER_REPORT_TARGET=channel:1480577550445969541`; both can be overridden)
3. calls `owner_report_watchdog.py --max-count "$OWNER_REPORT_MAX_COUNT"`

Properties of the watchdog itself:
- not a daemon
- not resident; it does not stay running between runs
- no retry / backoff
- one scan pass per run
- processes 1 report by default; the wrapper raises the default to 5
- `--all` drains the entire current backlog
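
## Caveats
- This is not a message bus, nor a full job system
- There is no built-in retry / dedupe / database
- If the oldest pending report keeps failing, everything behind it is blocked
- `sent/` means "the send command succeeded"; it does not guarantee that a human has read the message
- The producer depends on explicit field input; it does not interpret arbitrary logs on its own
- cron / PATH / node version problems are reduced as far as possible by pinning Node and the OpenClaw entry in the wrapper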

## Examples
### Example A: general task checkpoint
```bash
cd /home/chchang/.openclaw/workspace/owner-reporting-system/scripts
uv run python owner_report_producer.py \
  --team general-task \
  --worker alice \
  --task-id manual-checkpoint-1 \
  --progress 50% \
  --done 'phase 1 complete' \
  --next 'waiting for phase 2 results' \
  --status normal \
  --source manual-checkpoint
```

### Example B: verify only, do not send
```bash
/home/chchang/.openclaw/workspace/owner-reporting-system/scripts/run_owner_report_watchdog.sh --dry-run
```

### Example C: watchdog-b single-shot direct delivery to the owner-facing Discord
First use the probe to find the local runtime:
```bash
python3 /home/chchang/.openclaw/workspace/skills/watchdog-discord-route/scripts/openclaw_runtime_probe.py --pretty
```

Then substitute the detected `node` / `dist/entry.js`:
```bash
cd /home/chchang/.openclaw/workspace/owner-reporting-system/scripts
python3 owner_report_driver.py <report_id> \
  --send-cmd '"<detected-node>" "<detected-entry.js>" message send --channel discord --target '\''channel:1480577550445969541'\'' --message "$OWNER_REPORT_MESSAGE"'
```

### Example D: temporarily send to a different channel / target
```bash
OWNER_REPORT_CHANNEL=telegram \
OWNER_REPORT_TARGET=123456789 \
/home/chchang/.openclaw/workspace/owner-reporting-system/scripts/run_owner_report_watchdog.sh
```

### Example E: drain the backlog quickly
```bash
OWNER_REPORT_MAX_COUNT=10 \
/home/chchang/.openclaw/workspace/owner-reporting-system/scripts/run_owner_report_watchdog.sh
```

## Operator summary
If you remember only three things:
1. **A report must land in `pending/` before there is anything to send**
2. **A report moves to `sent/` only when the send succeeds**
3. **When a notification is missing, check pending / the cron log / the oldest failing report first**
80
references/owner-reporting-system.md
Normal file
@@ -0,0 +1,80 @@
# Owner Reporting System

This is a global owner-facing proactive reporting workflow; it does not belong to any one project.

Its purpose is to turn long-running, multi-step work where no report may be missed into a notification pipeline that is observable, verifiable, and never reports a failure as success.

## Core flow

General queue drain path:
`producer -> pending/*.md -> watchdog -> driver -> Discord -> sent/*.md`

Watchdog-b single-shot direct path:
`producer -> pending/*.md -> driver -> Discord channel:1480577550445969541 -> sent/*.md`

Components:
- `scripts/owner_report_producer.py`
- `scripts/owner_report_consumer.py`
- `scripts/owner_report_driver.py`
- `scripts/owner_report_watchdog.py`
- `scripts/run_owner_report_watchdog.sh`
- `OWNER_REPORT_OPERATOR_MANUAL.md`

## Scope

Applies to:
- ClawTeam / subagent / background-process checkpoints
- multi-step technical tasks
- assignments that explicitly require no missed reports
- notification pipelines that need oldest-first / success-only archive / stop-on-failure semantics

Does not apply to:
- one-off short Q&A
- small changes that need no proactive notification
- low-risk tasks that can be answered in one reply

## Queue paths

- pending: `~/.clawteam/owner-reports/pending/`
- sent: `~/.clawteam/owner-reports/sent/`

## Local integration

On this host, the user crontab currently runs the watchdog wrapper once per minute:

- wrapper: `/home/chchang/.openclaw/workspace/owner-reporting-system/scripts/run_owner_report_watchdog.sh`
- log: `/opt/workspace_auditing_report/logs/owner_report_watchdog_cron.out`
- default target: `OWNER_REPORT_CHANNEL=discord` + `OWNER_REPORT_TARGET=channel:1480577550445969541` by default
- backlog per run: `OWNER_REPORT_MAX_COUNT=5` by default

In addition, watchdog-b owner-facing single-shot verification can now go directly through `owner_report_driver.py`, without relying on the wrapper watchdog's target/display semantics.

## Common commands

```bash
# produce one checkpoint report
cd /home/chchang/.openclaw/workspace/owner-reporting-system/scripts
uv run python owner_report_producer.py \
  --team general-task \
  --worker alice \
  --task-id example-task \
  --progress 50% \
  --done 'phase 1 complete' \
  --next 'waiting for phase 2 results' \
  --status normal \
  --source manual-checkpoint

# dry-run watchdog
/home/chchang/.openclaw/workspace/owner-reporting-system/scripts/run_owner_report_watchdog.sh --dry-run

# process backlog immediately
OWNER_REPORT_MAX_COUNT=20 \
/home/chchang/.openclaw/workspace/owner-reporting-system/scripts/run_owner_report_watchdog.sh

# temporarily override destination
OWNER_REPORT_CHANNEL=telegram \
OWNER_REPORT_TARGET=864811879 \
/home/chchang/.openclaw/workspace/owner-reporting-system/scripts/run_owner_report_watchdog.sh --dry-run
```

For fuller coverage of operation, debugging, and failure semantics, see `OWNER_REPORT_OPERATOR_MANUAL.md`.
192
references/watchdog-b-readme.md
Normal file
@@ -0,0 +1,192 @@
# Watchdog B v3 notification layer

Single source of truth for owner-facing policy: `~/.config/openclaw/watchdog-b.env`

Runtime auto-detection source: `scripts/openclaw_runtime_probe.py`

This directory now contains:

- `check_openclaw_state.sh`: tri-state checker (`running` / `stalled` / `idle`)
- `run_watchdog_b.sh`: dispatcher + notification runner
- `notify_watchdog_b.py`: minimal notification integration layer

## Configuration source

Priority order is now:
1. process env already set by caller/systemd
2. `WATCHDOG_B_CONFIG_FILE` if set
3. fallback `~/.config/openclaw/watchdog-b.env`
4. code defaults
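
The priority order above could be sketched as follows. This is a minimal illustration rather than the logic in `run_watchdog_b.sh` or `notify_watchdog_b.py`: the function names are hypothetical, and treating an empty `WATCHDOG_B_CONFIG_FILE` as unset is an assumption.

```python
from __future__ import annotations

from pathlib import Path
from typing import Mapping, Optional

FALLBACK_CONFIG = Path("~/.config/openclaw/watchdog-b.env")


def pick_config_file(process_env: Mapping[str, str]) -> Path:
    """Choose the config file: WATCHDOG_B_CONFIG_FILE if set, else the fallback path."""
    explicit = process_env.get("WATCHDOG_B_CONFIG_FILE")
    if explicit:  # an empty value is treated as unset (assumption)
        return Path(explicit).expanduser()
    return FALLBACK_CONFIG.expanduser()


def resolve_setting(
    name: str,
    process_env: Mapping[str, str],
    file_values: Mapping[str, str],
    code_defaults: Mapping[str, str],
) -> Optional[str]:
    """Resolve one setting: process env first, then the config file, then code defaults."""
    if name in process_env:
        return process_env[name]
    if name in file_values:
        return file_values[name]
    return code_defaults.get(name)
```

Here `file_values` stands for the already-parsed contents of whichever file `pick_config_file` selected.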

This means both `run_watchdog_b.sh` and `notify_watchdog_b.py` can be invoked manually and still resolve the same owner-facing channel / target / mode / wording.

For Node/OpenClaw runtime paths, the bundled scripts now resolve in this order:
1. explicit env overrides: `WATCHDOG_B_NODE_BIN`, `WATCHDOG_B_OPENCLAW_MJS`, `WATCHDOG_B_OPENCLAW_ENTRY`
2. PATH lookup for `node` and `openclaw`
3. common install roots scan: nvm, pnpm global, npm-global, `/usr/local`, `/usr`, Volta-style trees
4. fail with an operator-facing error that tells you which env vars to set manually

Repo template to copy from:
- `ops/systemd/user/watchdog-b.env.example`

Install example:
```bash
mkdir -p ~/.config/openclaw
cp ~/.openclaw/workspace/ops/systemd/user/watchdog-b.env.example ~/.config/openclaw/watchdog-b.env
$EDITOR ~/.config/openclaw/watchdog-b.env
```

## Notification strategy

### 1) `running`
Default: **manual / queue-ready only**.

Why:
- a healthy runtime every 10 minutes should not spam Eric
- owner-facing reporting should remain explicit and auditable

Behavior:
- `WATCHDOG_B_RUNNING_REPORT_MODE=manual` (default)
  - does not create external messages
  - returns a concrete hint for how to enable queue creation
- `WATCHDOG_B_RUNNING_REPORT_MODE=enqueue`
  - creates a real pending owner report in `~/.clawteam/owner-reports/pending/`
  - does **not** auto-send it
- `WATCHDOG_B_RUNNING_REPORT_MODE=enqueue-and-drain`
  - creates a pending owner report and immediately delivers that exact pending item through `owner_report_driver.py`
  - send path is direct OpenClaw Discord send using the env-configured owner-facing destination
  - this keeps queue/audit semantics but avoids depending on the wrapper watchdog's destination/visibility behavior

Throttle:
- `WATCHDOG_B_RUNNING_REPORT_MIN_INTERVAL_SECONDS` default `3600`

### 2) `stalled`
Default: **nudge main agent first**, then escalate to Eric only after repetition.

Behavior:
- call internal OpenClaw agent route:
  - `node .../openclaw.mjs agent --agent main --message ...`
- maintain local notify state under `state/watchdog-b/notify-state.json`
- after repeated observations (`WATCHDOG_B_STALLED_OWNER_ESCALATION_AFTER`, default `2`), enqueue an owner report

Throttle:
- `WATCHDOG_B_STALLED_NUDGE_MIN_INTERVAL_SECONDS` default `900`

Owner escalation mode:
- `WATCHDOG_B_STALLED_OWNER_MODE=escalate` (default)
- `WATCHDOG_B_STALLED_OWNER_MODE=always`
- `WATCHDOG_B_STALLED_OWNER_MODE=never`

Owner delivery mode after enqueue:
- `WATCHDOG_B_OWNER_DELIVERY_MODE=enqueue-only` (default)
- `WATCHDOG_B_OWNER_DELIVERY_MODE=direct-discord`

When `direct-discord` is enabled, watchdog-b still enqueues first, then directly delivers that same pending report via `owner_report_driver.py` to the env-configured Discord target.
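
The stalled-state policy above (throttled nudge first, owner escalation after repetition) might be modeled like this. It is a sketch under stated assumptions, not the notifier's code: the function name is hypothetical, and reading "after repeated observations" as `seen_count >= escalation_after` is a guess at the exact comparison. The defaults `900` and `2` are the documented ones.

```python
from __future__ import annotations


def decide_stalled_actions(
    seen_count: int,
    last_nudge_at: float | None,
    now: float,
    *,
    nudge_min_interval: float = 900,  # WATCHDOG_B_STALLED_NUDGE_MIN_INTERVAL_SECONDS
    escalation_after: int = 2,  # WATCHDOG_B_STALLED_OWNER_ESCALATION_AFTER
    owner_mode: str = "escalate",  # WATCHDOG_B_STALLED_OWNER_MODE
) -> dict[str, bool]:
    """Decide whether to nudge the main agent and whether to enqueue an owner report."""
    # Nudge unless the previous nudge is more recent than the throttle interval.
    nudge = last_nudge_at is None or (now - last_nudge_at) >= nudge_min_interval

    if owner_mode == "always":
        owner = True
    elif owner_mode == "never":
        owner = False
    else:
        # "escalate": only after the state has been observed repeatedly (assumed >=).
        owner = seen_count >= escalation_after

    return {"nudge_main_agent": nudge, "enqueue_owner_report": owner}
```

In this model the counters and timestamps would come from `state/watchdog-b/notify-state.json`, whose actual field names are not shown here.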

## Owner-facing message style

Owner-facing Discord message is now compact and conclusion-first:
- first line: headline (`🔔 [watchdog-b] <worker>`)
- second line: concise conclusion with emoji, e.g. `✅ 主程序仍在運行` ("main process is still running")
- third line: actionable next step, prefixed with `→`
- last line: compact technical metadata (`task=... | status=... | progress=... | source=...`)

Style knobs live in `watchdog-b.env`:
- `WATCHDOG_B_RUNNING_EMOJI`, `WATCHDOG_B_RUNNING_SUMMARY`
- `WATCHDOG_B_STALLED_EMOJI`, `WATCHDOG_B_STALLED_SUMMARY`
- `WATCHDOG_B_IDLE_EMOJI`, `WATCHDOG_B_IDLE_SUMMARY`
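
The four-line layout described above can be illustrated with a small renderer. The function name and parameters are hypothetical; only the line order, the `🔔 [watchdog-b]` headline, the `→` prefix, and the metadata field names come from this README.

```python
def render_owner_message(
    worker: str,
    emoji: str,
    summary: str,
    next_step: str,
    task: str,
    status: str,
    progress: str,
    source: str,
) -> str:
    """Render a compact, conclusion-first owner-facing message."""
    lines = [
        f"🔔 [watchdog-b] {worker}",  # headline
        f"{emoji} {summary}",  # concise conclusion
        f"→ {next_step}",  # actionable next step
        f"task={task} | status={status} | progress={progress} | source={source}",
    ]
    return "\n".join(lines)
```

In this sketch `emoji` and `summary` would be filled from the per-state style knobs, for example `WATCHDOG_B_RUNNING_EMOJI` and `WATCHDOG_B_RUNNING_SUMMARY`.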

### 3) `idle`
Default: same pattern as stalled, but slower.

Behavior:
- nudge main agent first
- only escalate to Eric after repeated idle detections

Throttle:
- `WATCHDOG_B_IDLE_NUDGE_MIN_INTERVAL_SECONDS` default `1800`

Owner escalation threshold:
- `WATCHDOG_B_IDLE_OWNER_ESCALATION_AFTER` default `2`

## Safety defaults

- `WATCHDOG_B_NOTIFY_DRY_RUN=1` by default in `run_watchdog_b.sh`
- owner-facing send path keeps the existing `owner-reporting-system` queue/artifact/audit flow, but direct Discord delivery no longer depends on the cron wrapper's default destination semantics
- local state tracks last send time / count to reduce spam
- a failed notifier does not crash the dispatcher; it emits warning + preserves artifacts

## Key artifacts

Under `state/watchdog-b/`:

- `last-output.txt`: rendered dispatcher output
- `last-notify-output.txt`: notifier JSON result
- `last-state.txt`: last state
- `history.tsv`: state history
- `notify-state.json`: throttle / repetition tracking
|
||||||
|
|
||||||
|
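For a quick look at the throttle state, `notify-state.json` can be summarized with a few lines of Python. This is a minimal sketch: the `events`, `send_count`, and `last_sent_at` keys follow what `notify_watchdog_b.py` writes in this commit, while `summarize_notify_state` itself is a hypothetical helper.

```python
import json
from pathlib import Path


def summarize_notify_state(path):
    """Return {state: (send_count, last_sent_at)} read from notify-state.json."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return {
        state: (int(bucket.get("send_count", 0)), bucket.get("last_sent_at"))
        for state, bucket in data.get("events", {}).items()
    }
```

Calling it with `state/watchdog-b/notify-state.json` shows, per state, how many notifications were sent and when the last one went out.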
## Manual test examples

### Dry-run running path

```bash
WATCHDOG_B_NOTIFY_DRY_RUN=1 \
WATCHDOG_B_RUNNING_REPORT_MODE=manual \
./scripts/watchdog-b/notify_watchdog_b.py --state running --dry-run
```

### Real queue creation for running (no send)

```bash
WATCHDOG_B_NOTIFY_DRY_RUN=0 \
WATCHDOG_B_RUNNING_REPORT_MODE=enqueue \
./scripts/watchdog-b/notify_watchdog_b.py --state running
ls -l ~/.clawteam/owner-reports/pending
```

### Runtime probe only

```bash
python3 ./scripts/watchdog-b/openclaw_runtime_probe.py --pretty
```

### Single-shot enqueue + direct Discord delivery

```bash
WATCHDOG_B_NOTIFY_DRY_RUN=0 \
WATCHDOG_B_RUNNING_REPORT_MODE=enqueue-and-drain \
./scripts/watchdog-b/notify_watchdog_b.py --state running
```

### Dry-run stalled nudge

```bash
WATCHDOG_B_NOTIFY_DRY_RUN=1 \
./scripts/watchdog-b/notify_watchdog_b.py --state stalled --dry-run
```

If runtime auto-detection fails on a host with a custom install layout, set one or more of:

- `WATCHDOG_B_NODE_BIN`
- `WATCHDOG_B_OPENCLAW_MJS`
- `WATCHDOG_B_OPENCLAW_ENTRY`

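Before setting these overrides permanently, it can help to check that each one points at something that exists. The sketch below is a hypothetical pre-flight helper (`missing_overrides` is not part of the bundle); it only checks path existence and does not replace the bundled runtime probe.

```python
import os
from pathlib import Path

# Override variables named in this document.
OVERRIDE_VARS = (
    "WATCHDOG_B_NODE_BIN",
    "WATCHDOG_B_OPENCLAW_MJS",
    "WATCHDOG_B_OPENCLAW_ENTRY",
)


def missing_overrides(env=None):
    """List override variables that are unset or do not point at an existing path."""
    env = os.environ if env is None else env
    missing = []
    for name in OVERRIDE_VARS:
        value = env.get(name, "")
        if not value or not Path(value).expanduser().exists():
            missing.append(name)
    return missing
```

An empty result means all three overrides resolve to existing paths; any names returned still need to be set or corrected.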
### Full dispatcher dry-run with fixture overrides

```bash
OPENCLAW_PID_FILE=$PWD/tests/fixtures/watchdog-b/running/host-runtime/openclaw.pid \
OPENCLAW_LOG_FILE=$PWD/tests/fixtures/watchdog-b/running/logs/openclaw.log \
WATCHDOG_B_ARTIFACT_DIR=$PWD/state/watchdog-b-test-running-v3 \
WATCHDOG_B_NOTIFY_DRY_RUN=1 \
./scripts/watchdog-b/run_watchdog_b.sh
```

## What is truly wired vs not

### Truly wired now

- state detection
- notifier invocation from the dispatcher
- main-agent internal nudge command construction and execution path
- owner-report queue creation via the existing producer
- optional direct delivery of an enqueued owner report through `owner_report_driver.py`
- throttling / repetition state persisted locally

### Still conditional / not claimed as universally proven

- successful main-agent wake-up depends on the local OpenClaw CLI/runtime being callable from this environment
- successful owner-facing delivery still depends on valid local OpenClaw Discord routing on the host
- direct watchdog-b owner delivery targets `WATCHDOG_B_OWNER_REPORT_TARGET` (`channel:1480577550445969541` in this workspace; the bundled notifier otherwise falls back to the `channel:REPLACE_ME` placeholder) and bypasses the wrapper watchdog destination logic
- the cron wrapper remains available for generic queue draining, but watchdog-b no longer needs to rely on it for single-shot owner-facing verification
BIN
scripts/__pycache__/notify_watchdog_b.cpython-312.pyc
Normal file
Binary file not shown.
BIN
scripts/__pycache__/openclaw_runtime_probe.cpython-312.pyc
Normal file
Binary file not shown.
BIN
scripts/__pycache__/owner_report_consumer.cpython-312.pyc
Normal file
Binary file not shown.
BIN
scripts/__pycache__/owner_report_driver.cpython-312.pyc
Normal file
Binary file not shown.
BIN
scripts/__pycache__/owner_report_producer.cpython-312.pyc
Normal file
Binary file not shown.
174
scripts/bootstrap_watchdog_bundle.sh
Executable file
@@ -0,0 +1,174 @@
#!/usr/bin/env bash
set -euo pipefail

SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd)"
SKILL_DIR="$(cd -- "$SCRIPT_DIR/.." && pwd)"
HOME_DIR="${HOME:?HOME is required}"
WORKSPACE_DEFAULT="$HOME_DIR/.openclaw/workspace"
WORKSPACE="${WATCHDOG_B_WORKSPACE:-$WORKSPACE_DEFAULT}"
LIVE_SCRIPT_DIR="${WATCHDOG_B_LIVE_SCRIPT_DIR:-$WORKSPACE/scripts/watchdog-b}"
SYSTEMD_USER_DIR="${WATCHDOG_B_SYSTEMD_USER_DIR:-$HOME_DIR/.config/systemd/user}"
CONFIG_DIR="${WATCHDOG_B_CONFIG_DIR:-$HOME_DIR/.config/openclaw}"
CONFIG_FILE="${WATCHDOG_B_CONFIG_FILE:-$CONFIG_DIR/watchdog-b.env}"
PROBE_SCRIPT="${WATCHDOG_B_RUNTIME_PROBE:-$SCRIPT_DIR/openclaw_runtime_probe.py}"
NODE_BIN_RAW="${WATCHDOG_B_NODE_BIN:-}"
OPENCLAW_MJS="${WATCHDOG_B_OPENCLAW_MJS:-}"
OPENCLAW_ENTRY="${WATCHDOG_B_OPENCLAW_ENTRY:-}"
OWNER_REPORT_PRODUCER="${WATCHDOG_B_OWNER_PRODUCER:-$LIVE_SCRIPT_DIR/owner_report_producer.py}"
OWNER_REPORT_DRIVER="${WATCHDOG_B_OWNER_DRIVER:-$LIVE_SCRIPT_DIR/owner_report_driver.py}"
OWNER_REPORT_CONSUMER_DEFAULT="$LIVE_SCRIPT_DIR/owner_report_consumer.py"
OWNER_REPORT_CONSUMER="${WATCHDOG_B_OWNER_REPORT_CONSUMER:-$OWNER_REPORT_CONSUMER_DEFAULT}"
FAILURES=0

pass() { echo "[PASS] $*"; }
warn() { echo "[WARN] $*"; }
fail() { echo "[FAIL] $*"; FAILURES=$((FAILURES+1)); }

check_exists() {
  local path="$1" label="$2"
  if [[ -e "$path" ]]; then
    pass "$label: $path"
  else
    fail "$label missing: $path"
  fi
}

check_exec_path() {
  local raw="$1" label="$2"
  local resolved=""
  if [[ "$raw" == */* ]]; then
    resolved="$raw"
    if [[ -x "$resolved" ]]; then
      pass "$label executable: $resolved"
    else
      fail "$label not executable: $resolved"
    fi
    return
  fi
  if resolved="$(command -v "$raw" 2>/dev/null)"; then
    pass "$label on PATH: $resolved"
  else
    fail "$label not found on PATH: $raw"
  fi
}

check_systemd_user() {
  if ! command -v systemctl >/dev/null 2>&1; then
    fail "systemctl not found"
    return
  fi
  if systemctl --user --version >/dev/null 2>&1; then
    pass "systemd --user command available"
  else
    fail "systemd --user unavailable"
  fi
  if systemctl --user show-environment >/dev/null 2>&1; then
    pass "systemd --user bus reachable"
  else
    warn "systemd --user bus not reachable in current session"
  fi
}

check_env_target() {
  if [[ ! -f "$CONFIG_FILE" ]]; then
    warn "config file not present yet: $CONFIG_FILE"
    return
  fi
  local target=""
  target="$(awk -F= '/^WATCHDOG_B_OWNER_REPORT_TARGET=/{print $2}' "$CONFIG_FILE" | tail -n 1 | tr -d '[:space:]' || true)"
  if [[ -z "$target" ]]; then
    fail "WATCHDOG_B_OWNER_REPORT_TARGET missing in $CONFIG_FILE"
  elif [[ "$target" == "channel:REPLACE_ME" ]]; then
    fail "WATCHDOG_B_OWNER_REPORT_TARGET still placeholder in $CONFIG_FILE"
  elif [[ "$target" == channel:* || "$target" == user:* ]]; then
    pass "WATCHDOG_B_OWNER_REPORT_TARGET looks configured: $target"
  else
    warn "WATCHDOG_B_OWNER_REPORT_TARGET present but format is unusual: $target"
  fi
}

probe_runtime() {
  if [[ ! -f "$PROBE_SCRIPT" ]]; then
    fail "runtime probe missing: $PROBE_SCRIPT"
    return
  fi

  local probe_output=""
  if ! probe_output="$(python3 "$PROBE_SCRIPT" --shell 2>/dev/null)"; then
    fail "runtime probe failed; set WATCHDOG_B_NODE_BIN / WATCHDOG_B_OPENCLAW_MJS / WATCHDOG_B_OPENCLAW_ENTRY explicitly"
    return
  fi

  while IFS='=' read -r key value; do
    case "$key" in
      WATCHDOG_B_NODE_BIN) NODE_BIN_RAW="$value" ;;
      WATCHDOG_B_OPENCLAW_MJS) OPENCLAW_MJS="$value" ;;
      WATCHDOG_B_OPENCLAW_ENTRY) OPENCLAW_ENTRY="$value" ;;
    esac
  done <<< "$probe_output"

  pass "runtime probe resolved node/openclaw paths"
}

check_message_cli() {
  probe_runtime
  if [[ -n "$OPENCLAW_ENTRY" && -f "$OPENCLAW_ENTRY" ]]; then
    pass "openclaw entry present: $OPENCLAW_ENTRY"
  else
    fail "openclaw entry missing: ${OPENCLAW_ENTRY:-<unset>}"
  fi
  if [[ -n "$OPENCLAW_MJS" && -f "$OPENCLAW_MJS" ]]; then
    pass "openclaw mjs present: $OPENCLAW_MJS"
  else
    fail "openclaw mjs missing: ${OPENCLAW_MJS:-<unset>}"
  fi
}

echo "watchdog-discord-route bootstrap"
echo "- skill_dir: $SKILL_DIR"
echo "- workspace: $WORKSPACE"
echo "- live_script_dir: $LIVE_SCRIPT_DIR"
echo "- systemd_user_dir: $SYSTEMD_USER_DIR"
echo "- config_file: $CONFIG_FILE"

echo
echo "[bundle]"
check_exists "$SCRIPT_DIR/check_openclaw_state.sh" "bundled checker"
check_exists "$SCRIPT_DIR/run_watchdog_b.sh" "bundled runner"
check_exists "$SCRIPT_DIR/notify_watchdog_b.py" "bundled notifier"
check_exists "$SCRIPT_DIR/openclaw_runtime_probe.py" "bundled runtime probe"
check_exists "$SCRIPT_DIR/openclaw-watchdog-b.service" "bundled service"
check_exists "$SCRIPT_DIR/openclaw-watchdog-b.timer" "bundled timer"
check_exists "$SCRIPT_DIR/watchdog-b.env.example" "bundled env example"

echo
echo "[workspace/live paths]"
check_exists "$WORKSPACE" "workspace"
check_exists "$LIVE_SCRIPT_DIR" "live script dir"
check_exists "$OWNER_REPORT_CONSUMER" "live owner_report_consumer.py"
check_exists "$OWNER_REPORT_PRODUCER" "live owner_report_producer.py"
check_exists "$OWNER_REPORT_DRIVER" "live owner_report_driver.py"

echo
echo "[runtime]"
check_message_cli
if [[ -n "$NODE_BIN_RAW" ]]; then
  check_exec_path "$NODE_BIN_RAW" "node"
else
  fail "node runtime unresolved"
fi
check_exec_path "python3" "python3"
check_systemd_user

echo
echo "[discord-route minimal config]"
check_env_target

if [[ $FAILURES -gt 0 ]]; then
  echo
  fail "bootstrap failed with $FAILURES issue(s)"
  exit 1
fi

echo
pass "bootstrap checks passed"
68
scripts/check_openclaw_state.sh
Executable file
@@ -0,0 +1,68 @@
#!/usr/bin/env bash
set -euo pipefail

# Watchdog B MVP tri-state checker for OpenClaw main runtime.
# Output (stdout): exactly one token: running | stalled | idle
#
# Heuristic (MVP):
# - If openclaw.pid exists and process is alive => running unless logs are stale.
# - If process alive but log file hasn't changed for STALL_AFTER_SECONDS => stalled.
# - Otherwise => idle.
#
# Future extension point:
# - Replace/augment log-freshness with real main-agent session/ledger signals.

# Defaults derive from the workspace (same default as the other bundle scripts)
# instead of a host-specific absolute home directory.
WORKSPACE="${WATCHDOG_B_WORKSPACE:-$HOME/.openclaw/workspace}"
PID_FILE_DEFAULT="${OPENCLAW_PID_FILE:-$WORKSPACE/host-runtime/openclaw.pid}"
LOG_FILE_DEFAULT="${OPENCLAW_LOG_FILE:-$WORKSPACE/logs/openclaw.log}"

STALL_AFTER_SECONDS="${STALL_AFTER_SECONDS:-1200}" # 20 minutes default
NOW_EPOCH="$(date +%s)"

pid_file="$PID_FILE_DEFAULT"
log_file="$LOG_FILE_DEFAULT"

get_mtime_epoch() {
  # GNU stat: %Y; BSD stat: -f %m
  local path="$1"
  if stat -c %Y "$path" >/dev/null 2>&1; then
    stat -c %Y "$path"
  else
    stat -f %m "$path"
  fi
}

proc_alive() {
  local pid="$1"
  [[ -n "$pid" ]] || return 1
  [[ "$pid" =~ ^[0-9]+$ ]] || return 1
  kill -0 "$pid" >/dev/null 2>&1
}

# No pid file => idle
if [[ ! -f "$pid_file" ]]; then
  echo "idle"
  exit 0
fi

pid="$(tr -d ' \t\n\r' < "$pid_file" || true)"

# PID file exists but process not alive => idle
if ! proc_alive "$pid"; then
  echo "idle"
  exit 0
fi

# Process alive. If no log file, assume running (can't assess stall)
if [[ ! -f "$log_file" ]]; then
  echo "running"
  exit 0
fi

log_mtime="$(get_mtime_epoch "$log_file")"
age=$(( NOW_EPOCH - log_mtime ))

if (( age > STALL_AFTER_SECONDS )); then
  echo "stalled"
else
  echo "running"
fi
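The tri-state heuristic in `check_openclaw_state.sh` can be restated as a pure function, which makes the decision order easier to test without a live process. This is a sketch only: `classify_state` is a hypothetical name, and the caller is assumed to have already resolved whether the PID is alive and how old the log file is.

```python
def classify_state(pid_alive, log_age_seconds, stall_after_seconds=1200):
    """Mirror the shell heuristic: return 'running', 'stalled', or 'idle'."""
    if not pid_alive:
        # Missing PID file or dead process both count as idle.
        return "idle"
    if log_age_seconds is None:
        # No log file: a stall cannot be assessed, so assume running.
        return "running"
    return "stalled" if log_age_seconds > stall_after_seconds else "running"
```

As in the shell script, a log exactly `stall_after_seconds` old still counts as running; only a strictly older log is reported as stalled.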
136
scripts/install_watchdog_bundle.sh
Executable file
@@ -0,0 +1,136 @@
#!/usr/bin/env bash
set -euo pipefail

SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd)"
SKILL_DIR="$(cd -- "$SCRIPT_DIR/.." && pwd)"
HOME_DIR="${HOME:?HOME is required}"
WORKSPACE_DEFAULT="$HOME_DIR/.openclaw/workspace"
WORKSPACE="${WATCHDOG_B_WORKSPACE:-$WORKSPACE_DEFAULT}"
SYSTEMD_USER_DIR="${WATCHDOG_B_SYSTEMD_USER_DIR:-$HOME_DIR/.config/systemd/user}"
CONFIG_DIR="${WATCHDOG_B_CONFIG_DIR:-$HOME_DIR/.config/openclaw}"
LIVE_SCRIPT_DIR="${WATCHDOG_B_LIVE_SCRIPT_DIR:-$WORKSPACE/scripts/watchdog-b}"
INSTALL_ENV_EXAMPLE=0
FORCE=0

usage() {
  cat <<EOF
Usage: $(basename "$0") [options]

Install bundled watchdog-discord-route assets into live paths.

Options:
  --workspace PATH          Target workspace (default: $WORKSPACE_DEFAULT)
  --systemd-user-dir PATH   Target systemd --user unit dir (default: ~/.config/systemd/user)
  --config-dir PATH         Target config dir (default: ~/.config/openclaw)
  --live-script-dir PATH    Target live watchdog script dir (default: <workspace>/scripts/watchdog-b)
  --install-env-example     Also install watchdog-b.env.example to <config-dir>/watchdog-b.env.example
  --force                   Overwrite existing files in live paths
  -h, --help                Show this help
EOF
}

LIVE_SCRIPT_DIR_EXPLICIT=0
while [[ $# -gt 0 ]]; do
  case "$1" in
    --workspace)
      WORKSPACE="$2"; shift 2 ;;
    --systemd-user-dir)
      SYSTEMD_USER_DIR="$2"; shift 2 ;;
    --config-dir)
      CONFIG_DIR="$2"; shift 2 ;;
    --live-script-dir)
      LIVE_SCRIPT_DIR="$2"; LIVE_SCRIPT_DIR_EXPLICIT=1; shift 2 ;;
    --install-env-example)
      INSTALL_ENV_EXAMPLE=1; shift ;;
    --force)
      FORCE=1; shift ;;
    -h|--help)
      usage; exit 0 ;;
    *)
      echo "unknown argument: $1" >&2
      usage >&2
      exit 2 ;;
  esac
done

# Re-derive the live script dir from the final workspace unless it was set
# explicitly, so --workspace also moves the <workspace>/scripts/watchdog-b default.
if [[ "$LIVE_SCRIPT_DIR_EXPLICIT" != "1" && -z "${WATCHDOG_B_LIVE_SCRIPT_DIR:-}" ]]; then
  LIVE_SCRIPT_DIR="$WORKSPACE/scripts/watchdog-b"
fi

mkdir -p "$LIVE_SCRIPT_DIR" "$SYSTEMD_USER_DIR" "$CONFIG_DIR"

copy_file() {
  local src="$1"
  local dest="$2"
  if [[ -e "$dest" && "$FORCE" != "1" ]]; then
    echo "skip existing: $dest"
    return 0
  fi
  install -m 0644 "$src" "$dest"
  echo "installed: $dest"
}

copy_exec() {
  local src="$1"
  local dest="$2"
  if [[ -e "$dest" && "$FORCE" != "1" ]]; then
    echo "skip existing: $dest"
    return 0
  fi
  install -m 0755 "$src" "$dest"
  echo "installed: $dest"
}

render_service() {
  local src="$SCRIPT_DIR/openclaw-watchdog-b.service"
  local dest="$SYSTEMD_USER_DIR/openclaw-watchdog-b.service"
  if [[ -e "$dest" && "$FORCE" != "1" ]]; then
    echo "skip existing: $dest"
    return 0
  fi
  # Substitute the most specific path first; otherwise the workspace rule
  # rewrites its prefix and the live-script-dir rule can never match.
  sed \
    -e "s#%h/.openclaw/workspace/scripts/watchdog-b#${LIVE_SCRIPT_DIR//\#/\\#}#g" \
    -e "s#%h/.openclaw/workspace#${WORKSPACE//\#/\\#}#g" \
    -e "s#%h/.config/openclaw#${CONFIG_DIR//\#/\\#}#g" \
    "$src" > "$dest"
  chmod 0644 "$dest"
  echo "installed: $dest"
}

copy_exec "$SCRIPT_DIR/check_openclaw_state.sh" "$LIVE_SCRIPT_DIR/check_openclaw_state.sh"
copy_exec "$SCRIPT_DIR/run_watchdog_b.sh" "$LIVE_SCRIPT_DIR/run_watchdog_b.sh"
copy_exec "$SCRIPT_DIR/verify_watchdog_b_e2e.sh" "$LIVE_SCRIPT_DIR/verify_watchdog_b_e2e.sh"
copy_exec "$SCRIPT_DIR/notify_watchdog_b.py" "$LIVE_SCRIPT_DIR/notify_watchdog_b.py"
copy_exec "$SCRIPT_DIR/openclaw_runtime_probe.py" "$LIVE_SCRIPT_DIR/openclaw_runtime_probe.py"
copy_file "$SCRIPT_DIR/owner_report_consumer.py" "$LIVE_SCRIPT_DIR/owner_report_consumer.py"
copy_file "$SCRIPT_DIR/owner_report_driver.py" "$LIVE_SCRIPT_DIR/owner_report_driver.py"
copy_file "$SCRIPT_DIR/owner_report_producer.py" "$LIVE_SCRIPT_DIR/owner_report_producer.py"
copy_file "$SCRIPT_DIR/openclaw-watchdog-b.timer" "$SYSTEMD_USER_DIR/openclaw-watchdog-b.timer"
render_service

if [[ "$INSTALL_ENV_EXAMPLE" == "1" ]]; then
  copy_file "$SCRIPT_DIR/watchdog-b.env.example" "$CONFIG_DIR/watchdog-b.env.example"
fi

cat <<EOF

Install summary
- skill_dir: $SKILL_DIR
- workspace: $WORKSPACE
- live_script_dir: $LIVE_SCRIPT_DIR
- systemd_user_dir: $SYSTEMD_USER_DIR
- config_dir: $CONFIG_DIR

Operator install order
1. Install bundle files:
   ./scripts/install_watchdog_bundle.sh --install-env-example
2. Create live env if missing:
   mkdir -p "$CONFIG_DIR"
   cp "$CONFIG_DIR/watchdog-b.env.example" "$CONFIG_DIR/watchdog-b.env"
3. Edit live env and set at least:
   WATCHDOG_B_OWNER_REPORT_TARGET=channel:YOUR_DISCORD_CHANNEL_ID
4. Run bootstrap:
   ./scripts/bootstrap_watchdog_bundle.sh
5. Only after bootstrap passes:
   systemctl --user daemon-reload
   systemctl --user enable --now openclaw-watchdog-b.timer

Notes
- If $CONFIG_DIR/watchdog-b.env does not exist, bootstrap warns about the missing config until you create it.
- The env example is intentionally installed as watchdog-b.env.example first; copy it to watchdog-b.env, then edit the copy.
EOF
467
scripts/notify_watchdog_b.py
Executable file
@@ -0,0 +1,467 @@
#!/usr/bin/env python3
from __future__ import annotations

import argparse
import json
import os
import shutil
import subprocess
import sys
from datetime import datetime, timezone
from pathlib import Path
from typing import Any

SCRIPT_DIR = Path(__file__).resolve().parent
SKILL_DIR = SCRIPT_DIR.parent
WORKSPACE = Path(os.environ.get("WATCHDOG_B_WORKSPACE", str(Path.home() / ".openclaw" / "workspace")))
CONFIG_FILE = Path(os.environ.get("WATCHDOG_B_CONFIG_FILE", str(Path.home() / ".config" / "openclaw" / "watchdog-b.env")))
LIVE_SCRIPT_DIR = Path(os.environ.get("WATCHDOG_B_LIVE_SCRIPT_DIR", str(WORKSPACE / "scripts" / "watchdog-b")))


def load_env_file(path: Path) -> None:
    if not path.exists():
        return
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        if not key:
            continue
        value = value.strip()
        if (value.startswith('"') and value.endswith('"')) or (value.startswith("'") and value.endswith("'")):
            value = value[1:-1]
        os.environ.setdefault(key, value)


load_env_file(CONFIG_FILE)

STATE_DIR = Path(os.environ.get("WATCHDOG_B_ARTIFACT_DIR", str(WORKSPACE / "state" / "watchdog-b")))
NOTIFY_STATE_PATH = STATE_DIR / "notify-state.json"
OWNER_PRODUCER = Path(os.environ.get("WATCHDOG_B_OWNER_PRODUCER", str(SCRIPT_DIR / "owner_report_producer.py")))
OWNER_DRIVER = Path(os.environ.get("WATCHDOG_B_OWNER_DRIVER", str(SCRIPT_DIR / "owner_report_driver.py")))
PYTHON_BIN = os.environ.get("WATCHDOG_B_PYTHON_BIN", sys.executable or "python3")
WATCHDOG_OWNER_REPORT_CHANNEL = os.environ.get("WATCHDOG_B_OWNER_REPORT_CHANNEL", "discord")
WATCHDOG_OWNER_REPORT_TARGET = os.environ.get("WATCHDOG_B_OWNER_REPORT_TARGET", "channel:REPLACE_ME")
WATCHDOG_MAIN_AGENT_ID = os.environ.get("WATCHDOG_B_MAIN_AGENT_ID", "").strip()
HOSTNAME = os.uname().nodename
UTC = timezone.utc
RUNTIME_PROBE = Path(os.environ.get("WATCHDOG_B_RUNTIME_PROBE", str(SCRIPT_DIR / "openclaw_runtime_probe.py")))
RUNTIME_CACHE: dict[str, Path] | None = None

DEFAULTS = {
    "running_min_interval_seconds": 3600,
    "stalled_nudge_min_interval_seconds": 900,
    "idle_nudge_min_interval_seconds": 1800,
    "stalled_owner_escalation_after": 2,
    "idle_owner_escalation_after": 2,
}


def now_iso() -> str:
    return datetime.now().astimezone().isoformat(timespec="seconds")


def path_or_none(value: str | None) -> Path | None:
    if not value:
        return None
    return Path(value).expanduser()


def detect_runtime_paths() -> dict[str, Path]:
    global RUNTIME_CACHE
    if RUNTIME_CACHE is not None:
        return RUNTIME_CACHE

    node_bin = path_or_none(os.environ.get("WATCHDOG_B_NODE_BIN"))
    openclaw_mjs = path_or_none(os.environ.get("WATCHDOG_B_OPENCLAW_MJS"))
    openclaw_entry = path_or_none(os.environ.get("WATCHDOG_B_OPENCLAW_ENTRY"))

    # 1) Explicit overrides win when all three are valid.
    if node_bin and node_bin.exists() and os.access(node_bin, os.X_OK) and openclaw_mjs and openclaw_mjs.is_file() and openclaw_entry and openclaw_entry.is_file():
        RUNTIME_CACHE = {
            "node": node_bin,
            "openclaw_mjs": openclaw_mjs,
            "openclaw_entry": openclaw_entry,
        }
        return RUNTIME_CACHE

    # 2) Otherwise ask the bundled runtime probe.
    if RUNTIME_PROBE.exists():
        proc = subprocess.run([PYTHON_BIN, str(RUNTIME_PROBE)], text=True, capture_output=True)
        if proc.returncode == 0:
            payload = json.loads(proc.stdout)
            detected = payload.get("detected", {})
            RUNTIME_CACHE = {
                "node": Path(detected["node"]),
                "openclaw_mjs": Path(detected["openclaw_mjs"]),
                "openclaw_entry": Path(detected["openclaw_entry"]),
            }
            return RUNTIME_CACHE

    # 3) Last resort: node from PATH combined with whatever overrides were given.
    node_which = shutil.which("node")
    if node_which:
        node_bin = Path(node_which)

    missing = []
    if not node_bin or not node_bin.exists():
        missing.append("WATCHDOG_B_NODE_BIN")
    if not openclaw_mjs or not openclaw_mjs.is_file():
        missing.append("WATCHDOG_B_OPENCLAW_MJS")
    if not openclaw_entry or not openclaw_entry.is_file():
        missing.append("WATCHDOG_B_OPENCLAW_ENTRY")
    if not missing:
        # Everything resolved after the PATH fallback: return it instead of
        # raising an error with an empty "Missing:" list.
        RUNTIME_CACHE = {
            "node": node_bin,
            "openclaw_mjs": openclaw_mjs,
            "openclaw_entry": openclaw_entry,
        }
        return RUNTIME_CACHE
    raise RuntimeError(
        "Unable to auto-detect watchdog runtime paths. Missing: " + ", ".join(missing)
    )


def load_state() -> dict[str, Any]:
    if NOTIFY_STATE_PATH.exists():
        try:
            return json.loads(NOTIFY_STATE_PATH.read_text(encoding="utf-8"))
        except Exception:
            pass
    return {"events": {}}


def save_state(data: dict[str, Any]) -> None:
    STATE_DIR.mkdir(parents=True, exist_ok=True)
    NOTIFY_STATE_PATH.write_text(json.dumps(data, ensure_ascii=False, indent=2) + "\n", encoding="utf-8")


def event_bucket(state: str) -> dict[str, Any]:
    data = load_state()
    events = data.setdefault("events", {})
    bucket = events.setdefault(state, {})
    return data


def get_bucket(data: dict[str, Any], state: str) -> dict[str, Any]:
    events = data.setdefault("events", {})
    return events.setdefault(state, {})


def should_send(bucket: dict[str, Any], min_interval_seconds: int, timestamp: datetime) -> tuple[bool, str]:
    last_sent = bucket.get("last_sent_at")
    if not last_sent:
        return True, "first-send"
    try:
        prev = datetime.fromisoformat(last_sent)
    except Exception:
        return True, "state-corrupt-reset"
    elapsed = (timestamp - prev).total_seconds()
    if elapsed >= min_interval_seconds:
        return True, f"interval-ok:{int(elapsed)}s"
    return False, f"throttled:{int(elapsed)}s<{min_interval_seconds}s"


def mark_sent(bucket: dict[str, Any], channel: str, timestamp: str, detail: dict[str, Any] | None = None) -> None:
    bucket["last_sent_at"] = timestamp
    bucket["last_channel"] = channel
    bucket["send_count"] = int(bucket.get("send_count", 0)) + 1
    bucket["last_detail"] = detail or {}


def build_owner_message(state: str, timestamp: str, detail: str) -> dict[str, str]:
    emoji_default = {
        "running": "✅",
        "stalled": "⚠️",
        "idle": "🛑",
    }
    summary_default = {
        "running": "主程序仍在運行",  # "main program is still running"
        "stalled": "主程序疑似卡住",  # "main program appears to be stuck"
        "idle": "主程序目前未運行",  # "main program is currently not running"
    }
    progress_default = {
        "running": "running",
        "stalled": "stalled",
        "idle": "idle",
    }
    status_default = {
        "running": "normal",
        "stalled": "needs-attention",
        "idle": "needs-attention",
    }
    source_default = {
        "running": "watchdog-b-running",
        "stalled": "watchdog-b-stalled-escalation",
        "idle": "watchdog-b-idle-escalation",
    }
    detail_default = {
        "running": f"checked_at={timestamp} host={HOSTNAME}",
        "stalled": f"checked_at={timestamp} host={HOSTNAME}; stale activity detected while process still looked alive",
        "idle": f"checked_at={timestamp} host={HOSTNAME}; no active main runtime detected",
    }
    # Every field can be overridden per state via WATCHDOG_B_<STATE>_* variables.
    return {
        "progress": os.environ.get(f"WATCHDOG_B_{state.upper()}_PROGRESS_LABEL", progress_default[state]),
        "done": f"{os.environ.get(f'WATCHDOG_B_{state.upper()}_EMOJI', emoji_default[state])} {os.environ.get(f'WATCHDOG_B_{state.upper()}_SUMMARY', summary_default[state])}",
        "next": detail or os.environ.get(f"WATCHDOG_B_{state.upper()}_DETAIL", detail_default[state]),
        "status": os.environ.get(f"WATCHDOG_B_{state.upper()}_STATUS", status_default[state]),
        "source": os.environ.get(f"WATCHDOG_B_{state.upper()}_SOURCE", source_default[state]),
    }


def enqueue_owner_report(*, state: str, timestamp: str, dry_run: bool, detail: str) -> dict[str, Any]:
    msg = build_owner_message(state, timestamp, detail)
    report_id = f"watchdog-b-{state}-{datetime.now(UTC).strftime('%Y%m%dT%H%M%SZ')}"
    cmd = [
        PYTHON_BIN,
        str(OWNER_PRODUCER),
        "--team",
        "watchdog-b",
        "--worker",
        HOSTNAME,
        "--task-id",
        f"openclaw-main-{state}",
        "--progress",
        msg["progress"],
        "--done",
        msg["done"],
        "--next",
        msg["next"],
        "--status",
        msg["status"],
        "--source",
        msg["source"],
        "--report-id",
        report_id,
    ]
    if dry_run:
        cmd.append("--dry-run")
    proc = subprocess.run(cmd, text=True, capture_output=True)
    result = {
        "kind": "owner-report-enqueue",
        "ok": proc.returncode == 0,
        "command": cmd,
        "exit_code": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "report_id": report_id,
        "dry_run": dry_run,
    }
    if proc.returncode == 0 and not dry_run:
        result["pending_path"] = str(Path.home() / ".clawteam" / "owner-reports" / "pending" / f"{report_id}.md")
    return result


def build_owner_send_cmd() -> str:
    runtime = detect_runtime_paths()
    return (
        f'"{runtime["node"]}" "{runtime["openclaw_entry"]}" message send '
        f'--channel {WATCHDOG_OWNER_REPORT_CHANNEL} '
        f"--target '{WATCHDOG_OWNER_REPORT_TARGET}' "
        f'--message "$OWNER_REPORT_MESSAGE"'
    )


def deliver_owner_report(*, report_id: str, dry_run: bool) -> dict[str, Any]:
    send_cmd = build_owner_send_cmd()
    cmd = [PYTHON_BIN, str(OWNER_DRIVER), report_id, "--send-cmd", send_cmd]
    if dry_run:
        cmd.append("--dry-run")
    proc = subprocess.run(cmd, text=True, capture_output=True)
    return {
        "kind": "owner-report-direct-delivery",
        "ok": proc.returncode == 0,
        "command": cmd,
        "send_cmd": send_cmd,
        "exit_code": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "dry_run": dry_run,
        "report_id": report_id,
        "target_channel": WATCHDOG_OWNER_REPORT_CHANNEL,
        "target": WATCHDOG_OWNER_REPORT_TARGET,
    }


def call_main_agent(*, state: str, timestamp: str, dry_run: bool) -> dict[str, Any]:
    message = (
        f"[watchdog-b][{state}] {timestamp}\n"
        f"Host: {HOSTNAME}\n"
        f"Please confirm current task state, whether progress is blocked, and whether owner-facing escalation is needed."
    )
    if not WATCHDOG_MAIN_AGENT_ID:
        return {
            "kind": "main-agent-nudge",
            "ok": True,
            "skipped": True,
            "reason": "WATCHDOG_B_MAIN_AGENT_ID not configured",
            "dry_run": dry_run,
            "message": message,
        }
    try:
        runtime = detect_runtime_paths()
    except Exception as exc:
        return {
            "kind": "main-agent-nudge",
            "ok": False,
            "dry_run": dry_run,
            "error": str(exc),
            "message": message,
        }
    cmd = [
        str(runtime["node"]),
        str(runtime["openclaw_mjs"]),
        "agent",
        "--agent",
        WATCHDOG_MAIN_AGENT_ID,
        "--message",
        message,
        "--timeout",
        os.environ.get("WATCHDOG_B_MAIN_AGENT_TIMEOUT", "120"),
    ]
    if dry_run:
        return {"kind": "main-agent-nudge", "ok": True, "dry_run": True, "command": cmd, "message": message}
    try:
        proc = subprocess.run(cmd, text=True, capture_output=True, timeout=int(os.environ.get("WATCHDOG_B_MAIN_AGENT_TIMEOUT", "120")) + 10)
        return {
            "kind": "main-agent-nudge",
            "ok": proc.returncode == 0,
            "dry_run": False,
            "command": cmd,
            "exit_code": proc.returncode,
            "stdout": proc.stdout,
            "stderr": proc.stderr,
            "message": message,
        }
    except subprocess.TimeoutExpired as e:
        return {
            "kind": "main-agent-nudge",
            "ok": False,
            "dry_run": False,
            "command": cmd,
            "timeout": True,
            "stdout": e.stdout,
            "stderr": e.stderr,
            "message": message,
        }


def maybe_running_report(data: dict[str, Any], bucket: dict[str, Any], timestamp: str, dry_run: bool) -> dict[str, Any]:
|
||||||
|
mode = os.environ.get("WATCHDOG_B_RUNNING_REPORT_MODE", "manual").lower()
|
||||||
|
min_interval = int(os.environ.get("WATCHDOG_B_RUNNING_REPORT_MIN_INTERVAL_SECONDS", str(DEFAULTS["running_min_interval_seconds"])))
|
||||||
|
allowed, reason = should_send(bucket, min_interval, datetime.fromisoformat(timestamp))
|
||||||
|
result: dict[str, Any] = {
|
||||||
|
"state": "running",
|
||||||
|
"route": "owner-report",
|
||||||
|
"mode": mode,
|
||||||
|
"allowed": allowed,
|
||||||
|
"reason": reason,
|
||||||
|
"dry_run": dry_run,
|
||||||
|
}
|
||||||
|
if mode not in {"manual", "enqueue", "enqueue-and-drain"}:
|
||||||
|
result.update({"ok": False, "error": f"unsupported running mode: {mode}"})
|
||||||
|
return result
|
||||||
|
if mode == "manual":
|
||||||
|
result.update({
|
||||||
|
"ok": True,
|
||||||
|
"action": "manual-only",
|
||||||
|
"hint": "set WATCHDOG_B_RUNNING_REPORT_MODE=enqueue to create a real pending item, or enqueue-and-drain to enqueue and directly deliver it to Discord",
|
||||||
|
})
|
||||||
|
return result
|
||||||
|
if not allowed:
|
||||||
|
result.update({"ok": True, "action": "suppressed"})
|
||||||
|
return result
|
||||||
|
enqueue = enqueue_owner_report(state="running", timestamp=timestamp, dry_run=dry_run, detail="Main runtime alive and log activity fresh.")
|
||||||
|
result["enqueue"] = enqueue
|
||||||
|
result["ok"] = enqueue.get("ok", False)
|
||||||
|
if enqueue.get("ok"):
|
||||||
|
mark_sent(bucket, "owner-report-enqueue", timestamp, {"report_id": enqueue.get("report_id")})
|
||||||
|
if mode == "enqueue-and-drain" and enqueue.get("ok"):
|
||||||
|
deliver = deliver_owner_report(report_id=enqueue.get("report_id"), dry_run=dry_run)
|
||||||
|
result["deliver"] = deliver
|
||||||
|
result["ok"] = result["ok"] and deliver.get("ok", False)
|
||||||
|
if deliver.get("ok"):
|
||||||
|
mark_sent(bucket, "owner-report-direct-delivery", timestamp, {"report_id": enqueue.get("report_id")})
|
||||||
|
return result
|
||||||
|
|
||||||
|
|
||||||
|
def maybe_nudge_and_escalate(data: dict[str, Any], bucket: dict[str, Any], *, state: str, timestamp: str, dry_run: bool) -> dict[str, Any]:
|
||||||
|
is_stalled = state == "stalled"
|
||||||
|
nudge_min = int(os.environ.get(
|
||||||
|
"WATCHDOG_B_STALLED_NUDGE_MIN_INTERVAL_SECONDS" if is_stalled else "WATCHDOG_B_IDLE_NUDGE_MIN_INTERVAL_SECONDS",
|
||||||
|
str(DEFAULTS["stalled_nudge_min_interval_seconds"] if is_stalled else DEFAULTS["idle_nudge_min_interval_seconds"]),
|
||||||
|
))
|
||||||
|
escalation_after = int(os.environ.get(
|
||||||
|
"WATCHDOG_B_STALLED_OWNER_ESCALATION_AFTER" if is_stalled else "WATCHDOG_B_IDLE_OWNER_ESCALATION_AFTER",
|
||||||
|
str(DEFAULTS["stalled_owner_escalation_after"] if is_stalled else DEFAULTS["idle_owner_escalation_after"]),
|
||||||
|
))
|
||||||
|
owner_mode = os.environ.get(
|
||||||
|
"WATCHDOG_B_STALLED_OWNER_MODE" if is_stalled else "WATCHDOG_B_IDLE_OWNER_MODE",
|
||||||
|
"escalate",
|
||||||
|
).lower()
|
||||||
|
|
||||||
|
bucket["seen_count"] = int(bucket.get("seen_count", 0)) + 1
|
||||||
|
allowed, reason = should_send(bucket, nudge_min, datetime.fromisoformat(timestamp))
|
||||||
|
result: dict[str, Any] = {
|
||||||
|
"state": state,
|
||||||
|
"route": "main-agent-then-owner",
|
||||||
|
"allowed": allowed,
|
||||||
|
"reason": reason,
|
||||||
|
"seen_count": bucket["seen_count"],
|
||||||
|
"owner_mode": owner_mode,
|
||||||
|
"dry_run": dry_run,
|
||||||
|
}
|
||||||
|
|
||||||
|
if allowed:
|
||||||
|
nudge = call_main_agent(state=state, timestamp=timestamp, dry_run=dry_run)
|
||||||
|
result["main_agent_nudge"] = nudge
|
||||||
|
if nudge.get("ok"):
|
||||||
|
mark_sent(bucket, "main-agent", timestamp, {"state": state})
|
||||||
|
result["ok"] = nudge.get("ok", False)
|
||||||
|
else:
|
||||||
|
result.update({"ok": True, "action": "nudge-suppressed"})
|
||||||
|
|
||||||
|
should_escalate = owner_mode in {"always", "escalate"} and bucket["seen_count"] >= escalation_after
|
||||||
|
if owner_mode == "never":
|
||||||
|
should_escalate = False
|
||||||
|
|
||||||
|
if should_escalate:
|
||||||
|
owner_allowed, owner_reason = should_send(bucket, nudge_min, datetime.fromisoformat(timestamp))
|
||||||
|
result["owner_escalation_gate"] = {"allowed": owner_allowed, "reason": owner_reason, "threshold": escalation_after}
|
||||||
|
if owner_allowed:
|
||||||
|
detail = "Main agent was nudged repeatedly; please review whether manual intervention is needed."
|
||||||
|
enqueue = enqueue_owner_report(state=state, timestamp=timestamp, dry_run=dry_run, detail=detail)
|
||||||
|
result["owner_enqueue"] = enqueue
|
||||||
|
result["ok"] = result.get("ok", True) and enqueue.get("ok", False)
|
||||||
|
if enqueue.get("ok"):
|
||||||
|
mark_sent(bucket, "owner-report-enqueue", timestamp, {"report_id": enqueue.get("report_id"), "state": state})
|
||||||
|
owner_delivery_mode = os.environ.get(
|
||||||
|
"WATCHDOG_B_OWNER_DELIVERY_MODE",
|
||||||
|
"enqueue-only",
|
||||||
|
).lower()
|
||||||
|
result["owner_delivery_mode"] = owner_delivery_mode
|
||||||
|
if owner_delivery_mode == "direct-discord":
|
||||||
|
deliver = deliver_owner_report(report_id=enqueue.get("report_id"), dry_run=dry_run)
|
||||||
|
result["owner_deliver"] = deliver
|
||||||
|
result["ok"] = result.get("ok", True) and deliver.get("ok", False)
|
||||||
|
if deliver.get("ok"):
|
||||||
|
mark_sent(bucket, "owner-report-direct-delivery", timestamp, {"report_id": enqueue.get("report_id"), "state": state})
|
||||||
|
return result
|
||||||
|
|
||||||
|
|
||||||
|
def main() -> int:
|
||||||
|
ap = argparse.ArgumentParser(description="Notification layer for watchdog-b")
|
||||||
|
ap.add_argument("--state", required=True, choices=["running", "stalled", "idle"])
|
||||||
|
ap.add_argument("--timestamp", default=now_iso())
|
||||||
|
ap.add_argument("--dry-run", action="store_true")
|
||||||
|
args = ap.parse_args()
|
||||||
|
|
||||||
|
data = load_state()
|
||||||
|
bucket = get_bucket(data, args.state)
|
||||||
|
|
||||||
|
if args.state == "running":
|
||||||
|
result = maybe_running_report(data, bucket, args.timestamp, args.dry_run)
|
||||||
|
else:
|
||||||
|
result = maybe_nudge_and_escalate(data, bucket, state=args.state, timestamp=args.timestamp, dry_run=args.dry_run)
|
||||||
|
|
||||||
|
bucket["last_seen_at"] = args.timestamp
|
||||||
|
bucket["last_result"] = result
|
||||||
|
save_state(data)
|
||||||
|
print(json.dumps(result, ensure_ascii=False, indent=2))
|
||||||
|
return 0 if result.get("ok", False) else 1
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
raise SystemExit(main())
|
||||||
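The owner-escalation decision in `maybe_nudge_and_escalate` reduces to a small predicate on `owner_mode` and `seen_count`. The following is a minimal sketch of that gate only, not part of the committed script; the helper name `should_escalate_to_owner` is hypothetical, and the min-interval check done by `should_send` is deliberately left out.

```python
def should_escalate_to_owner(owner_mode: str, seen_count: int, escalation_after: int) -> bool:
    """Hypothetical helper mirroring the escalation gate in maybe_nudge_and_escalate."""
    # "never" always wins, regardless of how often the state was seen.
    if owner_mode == "never":
        return False
    # Both "always" and "escalate" still require the seen_count threshold.
    return owner_mode in {"always", "escalate"} and seen_count >= escalation_after


if __name__ == "__main__":
    for mode, seen in [("escalate", 1), ("escalate", 3), ("never", 10), ("always", 2)]:
        print(mode, seen, should_escalate_to_owner(mode, seen, escalation_after=3))
```

Two things stand out when the gate is written this way: `always` is subject to the same threshold as `escalate`, and an unrecognized mode value silently disables escalation.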
17
scripts/openclaw-watchdog-b.service
Normal file
@@ -0,0 +1,17 @@
# Template systemd --user unit for Watchdog B.
# Install to: ~/.config/systemd/user/openclaw-watchdog-b.service
# Optional env file: ~/.config/openclaw/watchdog-b.env

[Unit]
Description=OpenClaw Watchdog B (verified direct Discord owner-facing path)
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
WorkingDirectory=%h/.openclaw/workspace
Environment=WATCHDOG_B_CONFIG_FILE=%h/.config/openclaw/watchdog-b.env
EnvironmentFile=-%h/.config/openclaw/watchdog-b.env
ExecStart=%h/.openclaw/workspace/scripts/watchdog-b/run_watchdog_b.sh
StandardOutput=journal
StandardError=journal
15
scripts/openclaw-watchdog-b.timer
Normal file
@@ -0,0 +1,15 @@
# Template systemd --user timer (DO NOT auto-install).
# Runs every 10 minutes.

[Unit]
Description=Run OpenClaw Watchdog B every 10 minutes

[Timer]
OnCalendar=*:0/10
Persistent=true
# Optional jitter to avoid synchronized runs
RandomizedDelaySec=30
Unit=openclaw-watchdog-b.service

[Install]
WantedBy=timers.target
200
scripts/openclaw_runtime_probe.py
Normal file
@@ -0,0 +1,200 @@
#!/usr/bin/env python3
from __future__ import annotations

import argparse
import json
import os
import shutil
from pathlib import Path
from typing import Iterable

HOME = Path.home()
ENV_KEYS = {
    "node": "WATCHDOG_B_NODE_BIN",
    "openclaw_mjs": "WATCHDOG_B_OPENCLAW_MJS",
    "openclaw_entry": "WATCHDOG_B_OPENCLAW_ENTRY",
}


def dedupe(items: Iterable[Path]) -> list[Path]:
    seen: set[str] = set()
    out: list[Path] = []
    for item in items:
        key = str(item)
        if key in seen:
            continue
        seen.add(key)
        out.append(item)
    return out


def path_candidates() -> tuple[Path | None, list[Path], list[Path]]:
    node_path = shutil.which("node")
    openclaw_path = shutil.which("openclaw")
    node_candidate = Path(node_path).resolve() if node_path else None
    roots: list[Path] = []
    entry_candidates: list[Path] = []
    if openclaw_path:
        op = Path(openclaw_path).resolve()
        roots.extend([
            op.parent.parent / "lib" / "node_modules" / "openclaw",
            op.parent.parent.parent / "lib" / "node_modules" / "openclaw",
        ])
        entry_candidates.append(op.parent.parent / "lib" / "node_modules" / "openclaw" / "dist" / "entry.js")
    if node_candidate:
        roots.append(node_candidate.parent.parent / "lib" / "node_modules" / "openclaw")
    return node_candidate, dedupe(roots), dedupe(entry_candidates)


def common_roots() -> list[Path]:
    roots: list[Path] = []
    nvm_dir = Path(os.environ.get("NVM_DIR", HOME / ".nvm")).expanduser()
    roots.extend([
        HOME / ".nvm" / "versions" / "node",
        nvm_dir / "versions" / "node",
        HOME / ".local" / "share" / "pnpm" / "global",
        HOME / ".npm-global",
        Path("/usr/local"),
        Path("/usr"),
        HOME / ".volta" / "tools" / "image",
    ])
    return dedupe(roots)


def scan_openclaw_install_roots() -> list[Path]:
    candidates: list[Path] = []
    for root in common_roots():
        if not root.exists():
            continue
        if root.name == "node":
            for child in sorted(root.glob("v*/lib/node_modules/openclaw"), reverse=True):
                candidates.append(child)
            continue
        patterns = [
            "lib/node_modules/openclaw",
            "node_modules/openclaw",
            "*/lib/node_modules/openclaw",
            "*/node_modules/openclaw",
        ]
        for pattern in patterns:
            for child in sorted(root.glob(pattern), reverse=True):
                candidates.append(child)
    return dedupe(candidates)


def valid_node(path: Path | None) -> Path | None:
    if path and path.exists() and os.access(path, os.X_OK):
        return path
    return None


def valid_file(path: Path | None) -> Path | None:
    if path and path.is_file():
        return path
    return None


def detect_runtime() -> dict[str, object]:
    result: dict[str, object] = {"ok": False, "detected": {}, "sources": {}, "searched": {}}
    detected: dict[str, str] = {}
    sources: dict[str, str] = {}
    searched: dict[str, list[str]] = {"node": [], "openclaw": []}

    env_node = os.environ.get(ENV_KEYS["node"])
    if env_node:
        searched["node"].append(env_node)
        node = valid_node(Path(env_node).expanduser())
        if node:
            detected["node"] = str(node)
            sources["node"] = f"env:{ENV_KEYS['node']}"
    env_mjs = os.environ.get(ENV_KEYS["openclaw_mjs"])
    if env_mjs:
        searched["openclaw"].append(env_mjs)
        mjs = valid_file(Path(env_mjs).expanduser())
        if mjs:
            detected["openclaw_mjs"] = str(mjs)
            sources["openclaw_mjs"] = f"env:{ENV_KEYS['openclaw_mjs']}"
    env_entry = os.environ.get(ENV_KEYS["openclaw_entry"])
    if env_entry:
        searched["openclaw"].append(env_entry)
        entry = valid_file(Path(env_entry).expanduser())
        if entry:
            detected["openclaw_entry"] = str(entry)
            sources["openclaw_entry"] = f"env:{ENV_KEYS['openclaw_entry']}"

    path_node, path_roots, path_entry_candidates = path_candidates()
    if "node" not in detected and path_node:
        searched["node"].append(str(path_node))
        node = valid_node(path_node)
        if node:
            detected["node"] = str(node)
            sources["node"] = "path:node"

    install_roots = dedupe(path_roots + path_entry_candidates + scan_openclaw_install_roots())
    searched["openclaw"].extend(str(p) for p in install_roots)

    def fill_from_root(root: Path, source: str) -> None:
        if root.is_file():
            candidate_entry = valid_file(root)
            if candidate_entry and candidate_entry.name == "entry.js" and "openclaw_entry" not in detected:
                detected["openclaw_entry"] = str(candidate_entry)
                sources["openclaw_entry"] = source
                root = candidate_entry.parent.parent
            elif candidate_entry and candidate_entry.name == "openclaw.mjs" and "openclaw_mjs" not in detected:
                detected["openclaw_mjs"] = str(candidate_entry)
                sources["openclaw_mjs"] = source
                root = candidate_entry.parent
            else:
                return
        candidate_mjs = valid_file(root / "openclaw.mjs")
        candidate_entry = valid_file(root / "dist" / "entry.js")
        if candidate_mjs and "openclaw_mjs" not in detected:
            detected["openclaw_mjs"] = str(candidate_mjs)
            sources["openclaw_mjs"] = source
        if candidate_entry and "openclaw_entry" not in detected:
            detected["openclaw_entry"] = str(candidate_entry)
            sources["openclaw_entry"] = source

    for root in install_roots:
        source = "path:openclaw" if root in path_roots or root in path_entry_candidates else "scan:common-locations"
        fill_from_root(root, source)
        if all(k in detected for k in ("openclaw_mjs", "openclaw_entry")):
            break

    result["detected"] = detected
    result["sources"] = sources
    result["searched"] = searched
    result["ok"] = all(k in detected for k in ("node", "openclaw_mjs", "openclaw_entry"))
    if not result["ok"]:
        missing = [k for k in ("node", "openclaw_mjs", "openclaw_entry") if k not in detected]
        result["missing"] = missing
        result["error"] = (
            "Could not auto-detect: " + ", ".join(missing) + ". "
            "Set WATCHDOG_B_NODE_BIN / WATCHDOG_B_OPENCLAW_MJS / WATCHDOG_B_OPENCLAW_ENTRY explicitly if this host uses a non-standard install path."
        )
    return result


def main() -> int:
    parser = argparse.ArgumentParser(description="Detect node/openclaw runtime paths for watchdog-b scripts")
    parser.add_argument("--shell", action="store_true", help="print shell export lines")
    parser.add_argument("--pretty", action="store_true", help="pretty-print json")
    args = parser.parse_args()

    result = detect_runtime()
    if args.shell:
        if not result["ok"]:
            print(result["error"], flush=True)
            return 1
        detected = result["detected"]
        print(f'WATCHDOG_B_NODE_BIN={detected["node"]}')
        print(f'WATCHDOG_B_OPENCLAW_MJS={detected["openclaw_mjs"]}')
        print(f'WATCHDOG_B_OPENCLAW_ENTRY={detected["openclaw_entry"]}')
        return 0

    print(json.dumps(result, ensure_ascii=False, indent=2 if args.pretty else None))
    return 0 if result["ok"] else 1


if __name__ == "__main__":
    raise SystemExit(main())
75
scripts/owner_report_consumer.py
Normal file
@@ -0,0 +1,75 @@
#!/usr/bin/env python3
"""Minimal owner-report consumer.

Reads a pending owner report markdown file with simple front-matter-like key/value
lines and emits normalized JSON to stdout.
"""

from __future__ import annotations

import argparse
import json
from pathlib import Path

OWNER_REPORT_ROOT = Path.home() / ".clawteam" / "owner-reports"
PENDING_DIR = OWNER_REPORT_ROOT / "pending"


def parse_pending_report(path: Path) -> dict:
    raw = path.read_text(encoding="utf-8")
    data: dict[str, str] = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        data[key.strip()] = value.strip()

    return {
        "ok": True,
        "path": str(path),
        "filename": path.name,
        "report_id": data.get("report_id") or path.stem,
        "team": data.get("team"),
        "source": data.get("source"),
        "report_kind": data.get("report_kind") or "checkpoint",
        "created_at": data.get("created_at"),
        "message": _unquote(data.get("message", "")),
        "raw": data,
    }


def _unquote(value: str) -> str:
    value = value.strip()
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return value[1:-1]
    return value


def resolve_input(name_or_path: str) -> Path:
    p = Path(name_or_path).expanduser()
    if p.exists():
        return p
    candidate = PENDING_DIR / name_or_path
    if candidate.exists():
        return candidate
    if not candidate.suffix:
        md_candidate = candidate.with_suffix(".md")
        if md_candidate.exists():
            return md_candidate
    raise FileNotFoundError(f"pending report not found: {name_or_path}")


def main() -> int:
    ap = argparse.ArgumentParser(description="Emit JSON for a pending owner report")
    ap.add_argument("report", help="Pending report path, filename, or report_id")
    args = ap.parse_args()

    path = resolve_input(args.report)
    payload = parse_pending_report(path)
    print(json.dumps(payload, ensure_ascii=False, indent=2))
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
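One detail of the consumer worth checking on a real host: `scripts/owner_report_producer.py` in this commit writes the `message` field with `json.dumps`, while `_unquote` only strips the outer double quotes. The sketch below, which is not part of the commit, shows what that means for a multi-line message. `unquote_like_consumer` copies the committed behaviour; `unquote_json` is a hypothetical alternative that decodes the JSON string instead. Whether the literal `\n` matters depends on how the downstream send command renders it, which this commit does not show.

```python
import json


def unquote_like_consumer(value: str) -> str:
    """Same behaviour as _unquote: strip one pair of outer double quotes."""
    value = value.strip()
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return value[1:-1]
    return value


def unquote_json(value: str) -> str:
    """Hypothetical alternative: decode the JSON string, else fall back to quote stripping."""
    try:
        decoded = json.loads(value)
    except json.JSONDecodeError:
        return unquote_like_consumer(value)
    return decoded if isinstance(decoded, str) else unquote_like_consumer(value)


if __name__ == "__main__":
    original = "headline\ndone"
    # The producer emits one "key: value" line per field.
    line = "message: " + json.dumps(original, ensure_ascii=False)
    _, raw_value = line.split(":", 1)
    print(repr(unquote_like_consumer(raw_value)))  # backslash-n stays as two characters
    print(repr(unquote_json(raw_value)))           # real newline restored
```

With quote stripping alone, escape sequences produced by `json.dumps` (newlines, embedded quotes) reach the send command in their escaped form.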
118
scripts/owner_report_driver.py
Normal file
@@ -0,0 +1,118 @@
#!/usr/bin/env python3
"""Minimal owner-report driver.

Consumes one pending owner report, calls an external send command, and only moves
it to sent/ after the send command succeeds.

This is a deliberately small manual driver for debugging the owner-report chain.
It does not watch directories, retry, or send anything by itself.
"""

from __future__ import annotations

import argparse
import json
import os
import subprocess
from pathlib import Path

from owner_report_consumer import OWNER_REPORT_ROOT, PENDING_DIR, parse_pending_report, resolve_input

SENT_DIR = OWNER_REPORT_ROOT / "sent"


def _build_send_env(payload: dict) -> dict[str, str]:
    env = os.environ.copy()
    env.update(
        {
            "OWNER_REPORT_JSON": json.dumps(payload, ensure_ascii=False),
            "OWNER_REPORT_ID": str(payload.get("report_id") or ""),
            "OWNER_REPORT_TEAM": str(payload.get("team") or ""),
            "OWNER_REPORT_SOURCE": str(payload.get("source") or ""),
            "OWNER_REPORT_KIND": str(payload.get("report_kind") or "checkpoint"),
            "OWNER_REPORT_CREATED_AT": str(payload.get("created_at") or ""),
            "OWNER_REPORT_MESSAGE": str(payload.get("message") or ""),
            "OWNER_REPORT_PATH": str(payload.get("path") or ""),
        }
    )
    return env


def _sent_path(src: Path) -> Path:
    SENT_DIR.mkdir(parents=True, exist_ok=True)
    return SENT_DIR / src.name


def _finalize_successful_send(src: Path) -> dict[str, object]:
    dest = _sent_path(src)
    if src.exists():
        src.rename(dest)
        return {"moved": True, "already_archived": False, "final_path": str(dest)}

    if dest.exists():
        return {"moved": False, "already_archived": True, "final_path": str(dest)}

    raise FileNotFoundError(
        f"successful send completed but pending report disappeared before archiving: pending={src} sent={dest}"
    )


def main() -> int:
    ap = argparse.ArgumentParser(description="Send one pending owner report via external command")
    ap.add_argument("report", help="Pending report path, filename, or report_id")
    ap.add_argument(
        "--send-cmd",
        help="Shell command used to send the report. Can also come from OWNER_REPORT_SEND_CMD.",
    )
    ap.add_argument("--dry-run", action="store_true", help="Print what would be sent and do not move files")
    args = ap.parse_args()

    src = resolve_input(args.report)
    payload = parse_pending_report(src)

    send_cmd = args.send_cmd or os.environ.get("OWNER_REPORT_SEND_CMD")
    if not send_cmd and not args.dry_run:
        raise SystemExit("missing send command: use --send-cmd or OWNER_REPORT_SEND_CMD")

    if args.dry_run:
        print(json.dumps({
            "ok": True,
            "dry_run": True,
            "action": "would_send",
            "pending_path": str(src),
            "sent_path": str(_sent_path(src)),
            "payload": payload,
            "send_cmd": send_cmd,
        }, ensure_ascii=False, indent=2))
        return 0

    proc = subprocess.run(
        ["bash", "-lc", send_cmd],
        text=True,
        capture_output=True,
        env=_build_send_env(payload),
    )

    result = {
        "ok": proc.returncode == 0,
        "dry_run": False,
        "pending_path": str(src),
        "sent_path": str(_sent_path(src)),
        "send_cmd": send_cmd,
        "exit_code": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "payload": payload,
    }

    if proc.returncode != 0:
        print(json.dumps(result, ensure_ascii=False, indent=2))
        return proc.returncode

    result.update(_finalize_successful_send(src))
    print(json.dumps(result, ensure_ascii=False, indent=2))
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
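The archive step in the driver is written to tolerate being reached twice for the same report: the first call moves the pending file, a second call finds it already in `sent/` and reports that instead of failing. A minimal sketch of that behaviour in a temporary directory follows. It is not part of the commit; `archive_after_send` is a hypothetical stand-in for `_finalize_successful_send` that takes the sent directory as a parameter so it never touches `~/.clawteam`.

```python
import tempfile
from pathlib import Path


def archive_after_send(src: Path, sent_dir: Path) -> dict[str, object]:
    """Hypothetical stand-in for _finalize_successful_send with an explicit sent_dir."""
    sent_dir.mkdir(parents=True, exist_ok=True)
    dest = sent_dir / src.name
    if src.exists():
        # Normal case: the pending file is still there, so move it to sent/.
        src.rename(dest)
        return {"moved": True, "already_archived": False, "final_path": str(dest)}
    if dest.exists():
        # Repeat case: an earlier run already archived this report.
        return {"moved": False, "already_archived": True, "final_path": str(dest)}
    raise FileNotFoundError(f"pending={src} sent={dest}")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        pending = root / "pending"
        pending.mkdir()
        report = pending / "demo.md"
        report.write_text("report_id: demo\n", encoding="utf-8")
        first = archive_after_send(report, root / "sent")
        second = archive_after_send(report, root / "sent")
        print(first["moved"], second["already_archived"])
```

Because the move happens only after the send command exits with status 0, a failed send leaves the report in `pending/` for a later manual retry.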
143
scripts/owner_report_producer.py
Normal file
@@ -0,0 +1,143 @@
#!/usr/bin/env python3
"""Minimal owner-report producer for ClawTeam-style worker checkpoints.

Writes ~/.clawteam/owner-reports/pending/<report_id>.md using explicit checkpoint
fields and a human-readable message suitable for direct Telegram delivery.

This intentionally stays tiny:
- no daemon
- no event bus
- no parser for arbitrary logs
- just explicit fields in -> pending markdown out
"""

from __future__ import annotations

import argparse
import json
import re
from datetime import datetime, timezone
from pathlib import Path

from owner_report_consumer import OWNER_REPORT_ROOT

PENDING_DIR = OWNER_REPORT_ROOT / "pending"


def _slug(value: str) -> str:
    slug = re.sub(r"[^a-zA-Z0-9._-]+", "-", value.strip()).strip("-._")
    return slug or "report"


def _now_iso() -> str:
    return datetime.now().astimezone().isoformat(timespec="seconds")


def build_message(*, team: str, worker: str, task_id: str, progress: str, done: str, next_step: str, status: str, source: str | None, report_kind: str) -> str:
    headline = f"🔔 [{team}] {worker}"
    if report_kind == "leader-final":
        headline = f"✅ [{team}] final"

    lines = [
        headline,
        done,
    ]

    if next_step.strip():
        lines.append(f"→ {next_step}")

    tech = [
        f"task={task_id}",
        f"status={status}",
        f"progress={progress}",
    ]
    if source:
        tech.append(f"source={source}")
    lines.append(" | ".join(tech))
    return "\n".join(lines)


def build_report_body(*, report_id: str, team: str, worker: str, task_id: str, progress: str, done: str, next_step: str, status: str, source: str | None, created_at: str, message: str, report_kind: str) -> str:
    fields: list[tuple[str, str | None]] = [
        ("report_id", report_id),
        ("team", team),
        ("worker", worker),
        ("task_id", task_id),
        ("progress", progress),
        ("done", done),
        ("next", next_step),
        ("status", status),
        ("report_kind", report_kind),
        ("source", source),
        ("created_at", created_at),
        ("message", json.dumps(message, ensure_ascii=False)),
    ]
    return "\n".join(f"{k}: {v}" for k, v in fields if v is not None) + "\n"


def main() -> int:
    ap = argparse.ArgumentParser(description="Create one pending owner report from explicit checkpoint fields")
    ap.add_argument("--team", required=True)
    ap.add_argument("--worker", required=True)
    ap.add_argument("--task-id", required=True)
    ap.add_argument("--progress", required=True)
    ap.add_argument("--done", required=True)
    ap.add_argument("--next", dest="next_step", required=True)
    ap.add_argument("--status", required=True)
    ap.add_argument("--source")
    ap.add_argument("--report-kind", choices=["checkpoint", "leader-final"], default="checkpoint")
    ap.add_argument("--report-id", help="Optional explicit report_id / filename stem")
    ap.add_argument("--created-at", default=_now_iso())
    ap.add_argument("--dry-run", action="store_true")
    args = ap.parse_args()

    report_id = args.report_id or f"{_slug(args.team)}-{_slug(args.worker)}-{_slug(args.task_id)}-{_slug(args.report_kind)}"
    message = build_message(
        team=args.team,
        worker=args.worker,
        task_id=args.task_id,
        progress=args.progress,
        done=args.done,
        next_step=args.next_step,
        status=args.status,
        source=args.source,
        report_kind=args.report_kind,
    )
    body = build_report_body(
        report_id=report_id,
        team=args.team,
        worker=args.worker,
        task_id=args.task_id,
        progress=args.progress,
        done=args.done,
        next_step=args.next_step,
        status=args.status,
        source=args.source,
        created_at=args.created_at,
        message=message,
        report_kind=args.report_kind,
    )

    path = PENDING_DIR / f"{report_id}.md"

    result = {
        "ok": True,
        "report_id": report_id,
        "path": str(path),
        "message": message,
        "dry_run": args.dry_run,
    }

    if args.dry_run:
        result["body"] = body
        print(json.dumps(result, ensure_ascii=False, indent=2))
        return 0

    PENDING_DIR.mkdir(parents=True, exist_ok=True)
    path.write_text(body, encoding="utf-8")
    print(json.dumps(result, ensure_ascii=False, indent=2))
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
141
scripts/run_watchdog_b.sh
Executable file
@@ -0,0 +1,141 @@
#!/usr/bin/env bash
set -euo pipefail

# Watchdog B v2 dispatcher/runner.
# Unified entrypoint for timer/service/manual runs.
#
# Flow:
# 1) Call check_openclaw_state.sh to get one of: running | stalled | idle
# 2) Emit a human-readable action template for the detected state
# 3) Invoke the notification layer (dry-run/manual by default, configurable)
# 4) Persist rendered output for local verification / future integrations
#
# Notification behavior is intentionally conservative:
# - running: defaults to a manual/queue-ready owner report path
# - stalled/idle: nudge main agent first, then optionally escalate to owner report
# - outbound owner messaging reuses the existing owner-reporting-system queue

SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd)"
SKILL_DIR="$(cd -- "$SCRIPT_DIR/.." && pwd)"
WATCHDOG_B_CONFIG_FILE_DEFAULT="$HOME/.config/openclaw/watchdog-b.env"
WATCHDOG_B_CONFIG_FILE="${WATCHDOG_B_CONFIG_FILE:-$WATCHDOG_B_CONFIG_FILE_DEFAULT}"
if [[ -f "$WATCHDOG_B_CONFIG_FILE" ]]; then
  set -a
  # shellcheck disable=SC1090
  . "$WATCHDOG_B_CONFIG_FILE"
  set +a
fi

WORKSPACE_DEFAULT="$HOME/.openclaw/workspace"
WORKSPACE_DIR="${WATCHDOG_B_WORKSPACE:-$WORKSPACE_DEFAULT}"
CHECKER="${WATCHDOG_B_CHECKER:-$SCRIPT_DIR/check_openclaw_state.sh}"
ARTIFACT_DIR="${WATCHDOG_B_ARTIFACT_DIR:-$WORKSPACE_DIR/state/watchdog-b}"
TIMESTAMP="$(date '+%Y-%m-%dT%H:%M:%S%z')"
HOSTNAME_VALUE="$(hostname 2>/dev/null || echo unknown-host)"
NOTIFIER="${WATCHDOG_B_NOTIFIER:-$SCRIPT_DIR/notify_watchdog_b.py}"
NOTIFY_DRY_RUN="${WATCHDOG_B_NOTIFY_DRY_RUN:-1}"

mkdir -p "$ARTIFACT_DIR"

if [[ ! -x "$CHECKER" ]]; then
  echo "watchdog-b error: checker not executable: $CHECKER" >&2
  exit 1
fi

# Quote the checker path so a path containing spaces is not word-split.
STATE="$("$CHECKER")"

emit_running() {
  cat <<EOF
WATCHDOG_B_STATE=running
WATCHDOG_B_TIMESTAMP=$TIMESTAMP
WATCHDOG_B_HOST=$HOSTNAME_VALUE
WATCHDOG_B_ACTION=progress-template
WATCHDOG_B_TEMPLATE_BEGIN
[watchdog-b][running] OpenClaw main runtime appears active.
Suggested future progress-report message template:
- Status: still running
- Checked at: $TIMESTAMP
- Host: $HOSTNAME_VALUE
- Summary: main runtime is alive and log activity is fresh.
- Next step: if desired, attach latest task/progress snapshot before sending.
WATCHDOG_B_TEMPLATE_END
WATCHDOG_B_NEXT_HOOK=progress_report_stub
EOF
}

emit_stalled() {
  cat <<EOF
WATCHDOG_B_STATE=stalled
WATCHDOG_B_TIMESTAMP=$TIMESTAMP
WATCHDOG_B_HOST=$HOSTNAME_VALUE
WATCHDOG_B_ACTION=nudge-template
WATCHDOG_B_TEMPLATE_BEGIN
[watchdog-b][stalled] OpenClaw main runtime looks alive but may be stuck.
Suggested future nudge/escalation template:
- Audience: main agent and/or Eric
- Checked at: $TIMESTAMP
- Host: $HOSTNAME_VALUE
- Observation: process is alive, but activity log appears stale beyond threshold.
- Suggested ask: please confirm current task state, unblock reason, or whether intervention is needed.
WATCHDOG_B_TEMPLATE_END
WATCHDOG_B_NEXT_HOOK=stalled_nudge_stub
EOF
}

emit_idle() {
  cat <<EOF
WATCHDOG_B_STATE=idle
WATCHDOG_B_TIMESTAMP=$TIMESTAMP
WATCHDOG_B_HOST=$HOSTNAME_VALUE
WATCHDOG_B_ACTION=idle-template
WATCHDOG_B_TEMPLATE_BEGIN
[watchdog-b][idle] OpenClaw main runtime does not appear to be actively running.
Suggested future reminder template:
- Audience: main agent and/or Eric
- Checked at: $TIMESTAMP
- Host: $HOSTNAME_VALUE
- Observation: no live runtime detected from pid/log heuristic.
- Suggested ask: confirm whether the runtime should be started, ignored, or left idle.
WATCHDOG_B_TEMPLATE_END
WATCHDOG_B_NEXT_HOOK=idle_reminder_stub
EOF
}

case "$STATE" in
  running)
    OUTPUT="$(emit_running)"
    ;;
  stalled)
    OUTPUT="$(emit_stalled)"
    ;;
  idle)
    OUTPUT="$(emit_idle)"
    ;;
  *)
    echo "watchdog-b error: unexpected state from checker: $STATE" >&2
    exit 2
    ;;
esac

printf '%s\n' "$OUTPUT"

NOTIFY_OUTPUT=""
if [[ -x "$NOTIFIER" ]]; then
  NOTIFY_CMD=("$NOTIFIER" --state "$STATE" --timestamp "$TIMESTAMP")
  if [[ "$NOTIFY_DRY_RUN" == "1" ]]; then
    NOTIFY_CMD+=(--dry-run)
  fi
  if NOTIFY_OUTPUT="$(WATCHDOG_B_ARTIFACT_DIR="$ARTIFACT_DIR" "${NOTIFY_CMD[@]}" 2>&1)"; then
    printf '%s\n' "$NOTIFY_OUTPUT"
  else
    printf '%s\n' "$NOTIFY_OUTPUT"
    echo "watchdog-b warning: notifier returned non-zero for state=$STATE" >&2
  fi
else
  echo "watchdog-b warning: notifier not executable: $NOTIFIER" >&2
fi

printf '%s\n' "$OUTPUT" > "$ARTIFACT_DIR/last-output.txt"
printf '%s\n' "$NOTIFY_OUTPUT" > "$ARTIFACT_DIR/last-notify-output.txt"
printf '%s\t%s\n' "$TIMESTAMP" "$STATE" >> "$ARTIFACT_DIR/history.tsv"
printf '%s\n' "$STATE" > "$ARTIFACT_DIR/last-state.txt"
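The runner above writes its rendered output to `last-output.txt` as `WATCHDOG_B_*=value` lines wrapped around a template delimited by `WATCHDOG_B_TEMPLATE_BEGIN` and `WATCHDOG_B_TEMPLATE_END`. A downstream consumer could split the two parts roughly as follows. This is a sketch under the assumption that the file contains only what the `emit_*` functions print; `parse_runner_output` is an illustrative name and not part of the bundle.

```python
from typing import Dict, List, Tuple

BEGIN = "WATCHDOG_B_TEMPLATE_BEGIN"
END = "WATCHDOG_B_TEMPLATE_END"


def parse_runner_output(text: str) -> Tuple[Dict[str, str], List[str]]:
    """Split runner output into KEY=VALUE fields and the template lines."""
    fields: Dict[str, str] = {}
    template: List[str] = []
    in_template = False
    for line in text.splitlines():
        if line == BEGIN:
            in_template = True
        elif line == END:
            in_template = False
        elif in_template:
            # Everything between the markers is human-readable template text.
            template.append(line)
        elif line.startswith("WATCHDOG_B_") and "=" in line:
            key, _, value = line.partition("=")
            fields[key] = value
    return fields, template


if __name__ == "__main__":
    sample = "\n".join(
        [
            "WATCHDOG_B_STATE=stalled",
            "WATCHDOG_B_ACTION=nudge-template",
            BEGIN,
            "[watchdog-b][stalled] OpenClaw main runtime looks alive but may be stuck.",
            "- Audience: main agent and/or Eric",
            END,
            "WATCHDOG_B_NEXT_HOOK=stalled_nudge_stub",
        ]
    )
    parsed_fields, parsed_template = parse_runner_output(sample)
    print(parsed_fields["WATCHDOG_B_STATE"], len(parsed_template))
```

Matching the markers on exact line equality keeps template lines that merely mention `WATCHDOG_B_` from being misread as fields.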
65
scripts/verify_watchdog_b_e2e.sh
Executable file
@@ -0,0 +1,65 @@
#!/usr/bin/env bash
set -euo pipefail

# End-to-end check: run watchdog-b against a fixture runtime with dry-run
# disabled, then confirm the resulting owner report reached the sent archive.

SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd)"
WORKSPACE="$(cd -- "$SCRIPT_DIR/../.." && pwd)"
ARTIFACT_ROOT="${WATCHDOG_B_VERIFY_ROOT:-$WORKSPACE/state/watchdog-b-verify-e2e}"
RUN_ID="${RUN_ID:-$(date +%Y%m%dT%H%M%S)}"
RUN_DIR="$ARTIFACT_ROOT/$RUN_ID"
FIXTURE_DIR="$RUN_DIR/fixture"
LOG="$RUN_DIR/verify.log"
STATE_DIR="$RUN_DIR/state"
QUEUE_SNAPSHOT="$RUN_DIR/queue-before.txt"
QUEUE_AFTER="$RUN_DIR/queue-after.txt"
mkdir -p "$FIXTURE_DIR/host-runtime" "$FIXTURE_DIR/logs" "$STATE_DIR" "$RUN_DIR"

# Mirror all further stdout/stderr into the per-run log.
exec > >(tee -a "$LOG") 2>&1

echo "[verify] run_id=$RUN_ID"
echo "[verify] workspace=$WORKSPACE"
date -Iseconds

echo "[verify] snapshot owner-report queue before"
find "$HOME/.clawteam/owner-reports" -maxdepth 2 -type f | sort > "$QUEUE_SNAPSHOT" || true

# Fixture: a background sleep stands in for the runtime PID and the log file is
# freshly touched, so the checker should report "running".
sleep 180 &
FAKE_PID=$!
trap 'kill "$FAKE_PID" 2>/dev/null || true' EXIT
printf '%s\n' "$FAKE_PID" > "$FIXTURE_DIR/host-runtime/openclaw.pid"
touch "$FIXTURE_DIR/logs/openclaw.log"

echo "[verify] run watchdog-b direct E2E (enqueue + direct delivery)"
OPENCLAW_PID_FILE="$FIXTURE_DIR/host-runtime/openclaw.pid" \
OPENCLAW_LOG_FILE="$FIXTURE_DIR/logs/openclaw.log" \
STALL_AFTER_SECONDS=1200 \
WATCHDOG_B_ARTIFACT_DIR="$STATE_DIR" \
WATCHDOG_B_NOTIFY_DRY_RUN=0 \
WATCHDOG_B_RUNNING_REPORT_MODE=enqueue-and-drain \
WATCHDOG_B_RUNNING_REPORT_MIN_INTERVAL_SECONDS=0 \
"$WORKSPACE/scripts/watchdog-b/run_watchdog_b.sh" | tee "$RUN_DIR/run-output.txt"

echo "[verify] snapshot owner-report queue after"
find "$HOME/.clawteam/owner-reports" -maxdepth 2 -type f | sort > "$QUEUE_AFTER" || true

echo "[verify] summarize"
# Read the report id of the last "running" enqueue from the notifier state file.
REPORT_ID="$(python3 - <<'PY' "$STATE_DIR/notify-state.json"
import json,sys
p=sys.argv[1]
with open(p,'r',encoding='utf-8') as f:
    data=json.load(f)
print(data['events']['running']['last_result']['enqueue']['report_id'])
PY
)"

echo "REPORT_ID=$REPORT_ID" | tee "$RUN_DIR/result.env"
SENT_PATH="$HOME/.clawteam/owner-reports/sent/$REPORT_ID.md"
echo "SENT_PATH=$SENT_PATH" | tee -a "$RUN_DIR/result.env"
if [[ ! -f "$SENT_PATH" ]]; then
  echo "[verify] ERROR: sent file missing: $SENT_PATH" >&2
  exit 1
fi

echo "[verify] sent file found"
sed -n '1,120p' "$SENT_PATH" | tee "$RUN_DIR/sent-head.txt"

echo "[verify] done"
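The verification script above records `queue-before.txt` and `queue-after.txt` but does not compare them. One way to see which owner-report files a run produced is to diff the two snapshots. The sketch below assumes each snapshot holds one path per line, as the `find ... | sort` output does; `read_snapshot` and `new_entries` are illustrative names.

```python
import tempfile
from pathlib import Path
from typing import List, Set


def read_snapshot(path: Path) -> Set[str]:
    """Return the non-empty lines of a snapshot, or an empty set if it is missing."""
    if not path.exists():
        return set()
    return {line for line in path.read_text(encoding="utf-8").splitlines() if line}


def new_entries(before: Path, after: Path) -> List[str]:
    """List paths present in the 'after' snapshot but absent from 'before'."""
    return sorted(read_snapshot(after) - read_snapshot(before))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        run_dir = Path(tmp)
        # Example paths only; real snapshots list files under the owner-reports tree.
        (run_dir / "queue-before.txt").write_text("/q/sent/old.md\n", encoding="utf-8")
        (run_dir / "queue-after.txt").write_text(
            "/q/sent/new.md\n/q/sent/old.md\n", encoding="utf-8"
        )
        for entry in new_entries(run_dir / "queue-before.txt", run_dir / "queue-after.txt"):
            print("new:", entry)
```

A missing snapshot is treated as empty here because the script tolerates a failing `find` with `|| true`, which can leave a snapshot file empty.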
40
scripts/watchdog-b.env.example
Normal file
@@ -0,0 +1,40 @@
# Single source of truth for watchdog-b owner-facing policy.
# Preferred location: ~/.config/openclaw/watchdog-b.env
# Can also be loaded manually by:
#   WATCHDOG_B_CONFIG_FILE=... ./scripts/watchdog-b/run_watchdog_b.sh
#   WATCHDOG_B_CONFIG_FILE=... ./scripts/watchdog-b/notify_watchdog_b.py --state running

# --- delivery / runtime policy ---
WATCHDOG_B_NOTIFY_DRY_RUN=0
WATCHDOG_B_RUNNING_REPORT_MODE=enqueue-and-drain
WATCHDOG_B_RUNNING_REPORT_MIN_INTERVAL_SECONDS=3600
WATCHDOG_B_OWNER_DELIVERY_MODE=direct-discord
WATCHDOG_B_OWNER_REPORT_CHANNEL=discord
WATCHDOG_B_OWNER_REPORT_TARGET=channel:REPLACE_ME

# --- non-running escalation policy ---
# Set this only if the host actually has a valid OpenClaw agent id to nudge.
# If left unset, stalled/idle paths skip main-agent nudge and can still escalate owner-facing reports.
# WATCHDOG_B_MAIN_AGENT_ID=main
# WATCHDOG_B_STALLED_OWNER_MODE=escalate
# WATCHDOG_B_IDLE_OWNER_MODE=escalate
# WATCHDOG_B_STALLED_OWNER_ESCALATION_AFTER=2
# WATCHDOG_B_IDLE_OWNER_ESCALATION_AFTER=2
# WATCHDOG_B_STALLED_NUDGE_MIN_INTERVAL_SECONDS=900
# WATCHDOG_B_IDLE_NUDGE_MIN_INTERVAL_SECONDS=1800

# --- owner-facing message style ---
# English glosses for the default summaries below:
#   running: "main program is still running"
#   stalled: "main program appears to be stuck"
#   idle:    "main program is currently not running"
WATCHDOG_B_RUNNING_EMOJI=✅
WATCHDOG_B_RUNNING_SUMMARY=主程序仍在運行
WATCHDOG_B_STALLED_EMOJI=⚠️
WATCHDOG_B_STALLED_SUMMARY=主程序疑似卡住
WATCHDOG_B_IDLE_EMOJI=🛑
WATCHDOG_B_IDLE_SUMMARY=主程序目前未運行

# Optional overrides for the compact technical line.
# WATCHDOG_B_RUNNING_PROGRESS_LABEL=running
# WATCHDOG_B_STALLED_PROGRESS_LABEL=stalled
# WATCHDOG_B_IDLE_PROGRESS_LABEL=idle
# WATCHDOG_B_RUNNING_STATUS=normal
# WATCHDOG_B_STALLED_STATUS=needs-attention
# WATCHDOG_B_IDLE_STATUS=needs-attention
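Before enabling the timer, it can help to check that a copied env file no longer carries the `channel:REPLACE_ME` placeholder and that dry-run is actually switched off. The sketch below does that with a deliberately simple `KEY=VALUE` parser. It is not a full shell parser, and `parse_env` and `config_problems` are illustrative names that are not part of the bundle.

```python
from typing import Dict, List

PLACEHOLDER_SUFFIX = "REPLACE_ME"


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and '#' comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip optional surrounding quotes; no other shell syntax is handled.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def config_problems(values: Dict[str, str]) -> List[str]:
    """Return human-readable problems that would block live delivery."""
    problems: List[str] = []
    target = values.get("WATCHDOG_B_OWNER_REPORT_TARGET", "")
    if not target or target.endswith(PLACEHOLDER_SUFFIX):
        problems.append("WATCHDOG_B_OWNER_REPORT_TARGET is unset or still a placeholder")
    # run_watchdog_b.sh falls back to dry-run (1) when the variable is unset.
    if values.get("WATCHDOG_B_NOTIFY_DRY_RUN", "1") == "1":
        problems.append("WATCHDOG_B_NOTIFY_DRY_RUN is 1, so the runner passes --dry-run")
    return problems


if __name__ == "__main__":
    sample = """\
# --- delivery / runtime policy ---
WATCHDOG_B_NOTIFY_DRY_RUN=0
WATCHDOG_B_OWNER_REPORT_TARGET=channel:REPLACE_ME
# WATCHDOG_B_MAIN_AGENT_ID=main
"""
    for problem in config_problems(parse_env(sample)):
        print("config problem:", problem)
```

Variables exported in the calling environment are not visible to this check, so it only describes what the file itself would contribute.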