From 5fb6c28a9b4bc1bb43ce13dd44533ba76a4c3c70 Mon Sep 17 00:00:00 2001
From: Eve
Date: Tue, 21 Apr 2026 18:41:40 +0800
Subject: [PATCH] chore: disable long-task watchdog

---
 CRAYFISH_STATUS.md         | 308 +++++++++++++++++++++++++++++++++++++
 memory/watchdog-state.json |  38 +++++
 2 files changed, 346 insertions(+)
 create mode 100644 CRAYFISH_STATUS.md
 create mode 100644 memory/watchdog-state.json

diff --git a/CRAYFISH_STATUS.md b/CRAYFISH_STATUS.md
new file mode 100644
index 0000000..e2a75ef
--- /dev/null
+++ b/CRAYFISH_STATUS.md
@@ -0,0 +1,308 @@
+# Crayfish Status (Alice)
+
+> Whether a task is "done" is decided by Eric, the manager; this file only tracks assignments and progress. (Pending Verification)
+
+---
+
+## In progress
+- [ ] backend evolution meeting
+- [ ] Discord agent meeting relay diagnosis
+- [ ] Check static Wiki update status
+- [ ] Mattermost docker-compose deployment + NPM reverse proxy
+- [ ] Diagnose kanban.service 200/CHDIR
+- [ ] KAL research: hermes memory lcm / hermes-lcm plugin
+- [ ] Set up hourly rsnapshot backups of `.openclaw` / `.hermes`
+- [ ] Strengthen checkpoint state-machine rules
+- [ ] checkpoint flow drill test
+- [ ] Research and trial-install paperclip
+- [ ] Diagnose and fix approval flow issues
+- [ ] Upgrade long task rules to a watchdog patrol mechanism
+- [ ] Investigate watchdog reports not delivered to the current conversation
+
+### Details
+#### Synology temperature dashboard
+
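+Click for details
+**Status**: M2 Verified (dashboard deployed and running; SMS switched to Infobip, and the short-message copy was confirmed delivered in a live test; awaiting the manager's confirmation of completion)
+**Lobby record**: telegram:-5079485182 msgId=2387 / msgId=2483
+**Entry point**:
+**Service**: `systemctl --user status synology-temp-dashboard.service`
+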
+
+#### Remove deer-flow
+
+Click for details
+**Status**: M1 Assigned
+**Lobby record**: telegram:-5079485182 msgId=2691
+
+
+#### Investigate software installation permissions
+
+Click for details
+**Status**: M1 Assigned
+**Lobby record**: telegram:-5079485182 msgId=2686
+
+
+#### Install gh CLI
+
+Click for details
+**Status**: M2 Verified (gh 2.45.0 installed; awaiting the manager's confirmation)
+**Lobby record**: telegram:-5079485182 msgId=2699 / msgId=2701
+
+
+#### backend evolution meeting
+
+Click for details
+**Status**: M1 Assigned
+**Lobby record**: telegram:-5079485182 msgId=2738
+
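+
+#### Discord agent meeting relay diagnosis
+
+Click for details
+**Status**: M2 Verified (root cause and viable options identified; awaiting the manager's confirmation)
+**Lobby record**: telegram:-5079485182 msgId=2765
+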
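+
+#### Migration to Discord-first work mode (cancelled)
+
+Click for details
+**Status**: M1 Assigned
+**Lobby record**: telegram:-5079485182 msgId=2772
+**Note**: The manager has decided to convene meetings directly from the Discord side instead; this path is no longer being pursued
+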
+
+#### Adjust gpt-5.4 context and Synology dashboard
+
+Click for details
+**Status**: M1 Assigned
+**Lobby record**: telegram:-5079485182 msgId=2804
+**Details**:
+- `cowbay/gpt-5.4` contextWindow has been set to `1048576`
+- Synology dashboard now has a high-temperature alert card (reads `WARNING_TEMPER_MAX`)
+
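+
+#### Restructure kanban
+
+Click for details
+**Status**: M1 Assigned
+**Lobby record**: telegram:-5079485182 msgId=2831 / msgId=2869 / msgId=2939
+**Goal**: Keep only task names in the main list; move details into each task's body; tidy up status fields, naming, and the archive section
+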
+
+#### Check static Wiki update status
+
+Click for details
+**Status**: M1 Assigned
+**Lobby record**: telegram:-5079485182 msgId=2831
+
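+
+#### Mattermost docker-compose deployment + NPM reverse proxy
+
+Click for details
+**Status**: M1 Assigned
+**Goal**: Deploy Mattermost on this machine with docker-compose (including Postgres, persistent volumes, and an upgradable workflow), and configure a reverse proxy through the existing Nginx Proxy Manager (NPM) (including WebSocket / headers / upload size).
+**Environment inventory**: NPM already runs on this machine at the Docker layer (80/443 in use; admin UI 8181→81); 8065 and host 5432 are not in use.
+**Needed from the manager**: The public domain (e.g. chat.ai.cowbay.org) and whether to use Let's Encrypt.
+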
+
+#### Review visual suggestions for the Synology dashboard
+
+Click for details
+**Status**: M1 Assigned
+**Lobby record**: telegram:-5079485182 msgId=2835
+
+
+#### Diagnose kanban.service 200/CHDIR
+
+Click for details
+**Status**: M1 Assigned
+**Goal**: Read-only diagnosis of why `kanban.service` shows `status=200/CHDIR`; check the unit's `WorkingDirectory`, `ExecStart`, permissions, and recent journal entries.
+**Current blocker**: Sending the Lobby record, as Lobby-First requires, failed: `Telegram send failed: chat not found (chat_id=-5079485182)`.
+
+
+#### KAL research: hermes memory lcm / hermes-lcm plugin
+
+Click for details
+**Status**: M1 Assigned
+**Lobby record**: telegram:-5079485182 msgId=1762
+**Goal**: Following the KAL intake, research the `hermes memory lcm` / `hermes-lcm` plugin and understand its design rationale and how it actually works.
+**Intake**:
+- topic: `hermes memory lcm`
+- seed: `https://github.com/stephenschoettler/hermes-lcm`
+- goal: `the design rationale of this plugin and how it actually works`
+- scope: `none`
+**Lead investigator**: subagent research (runId=bcf46305-9ef8-4ded-80eb-26c9e6483c6e)
+**Peer review**: subagent reviewer (runId=b322bc44-c5d6-49cb-9bb0-f6f89f390c04)
+
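+
+#### Set up hourly rsnapshot backups of `.openclaw` / `.hermes`
+
+Click for details
+**Status**: M2 Verified (rsnapshot has its own dedicated config and `hourly.0` was created successfully by hand; the automatic schedule still needs final confirmation; awaiting the manager's decision)
+**Lobby record**: telegram:-5079485182 msgId=1982 / msgId=1997 / msgId=2038
+**Goal**: Install and configure rsnapshot on this machine to back up `/home/alice/.openclaw` and `/home/alice/.hermes` to `/opt/agents_backup` every hour, keeping only 12 hourly snapshots.
+**Lead investigator**: subagent rsnapshot-backup-setup-retry (runId=283a2d90-9a7f-4a7b-8d0f-a823c88d50ec)
+**Peer review**: to be assigned
+**Verified**:
+- Dedicated config file: `/home/alice/.config/rsnapshot-agents.conf`
+- `retain hourly 12`
+- `rsnapshot configtest`: Syntax OK
+- Created: `/opt/agents_backup/hourly.0/openclaw`
+- Created: `/opt/agents_backup/hourly.0/hermes`
+**Outstanding**: Final rollout of the automatic schedule (cron) and read-back confirmation
+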
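+
+#### Strengthen checkpoint state-machine rules
+
+Click for details
+**Status**: M1 Assigned
+**Lobby record**: telegram:-5079485182 msgId=2126
+**Note**: The manager has approved rolling out this rule; the goal is to fix the frequent stalls after a checkpoint
+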
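+
+#### checkpoint flow drill test
+
+Click for details
+**Status**: M1 Assigned
+**Lobby record**: telegram:-5079485182 msgId=2145
+**Note**: The manager approved a small long task test; the acceptance criterion is dispatch → checkpoint → continue execution → wrap-up report, with no stall after the checkpoint along the way
+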
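+
+#### Research and trial-install paperclip
+
+Click for details
+**Status**: M1 Assigned (switched to the docker-compose path)
+**Lobby record**: telegram:-5079485182 msgId=2167 / msgId=2199
+**Note**: The manager has instructed that the native installation route be stopped; deploy and verify in an isolated directory using the docker-compose files in the repo instead
+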
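+
+#### Diagnose and fix approval flow issues
+
+Click for details
+**Status**: M1 Assigned
+**Lobby record**: telegram:-5079485182 msgId=2225
+**Note**: The manager has instructed that the approval problem be solved first; the goal is to find the root cause of, and workable fixes for, exec/docker commands frequently requiring approval and approvals expiring too easily
+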
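+
+#### Upgrade long task rules to a watchdog patrol mechanism
+
+Click for details
+**Status**: M1 Assigned (2026-04-21: recurring watchdog shut down per the manager's instruction)
+**Lobby record**: telegram:-5079485182 msgId=2283 / msgId=2300 / msgId=2465
+**Note**: The manager approved rewriting the long task / checkpoint rules and then building an external watchdog MVP (state file + recurring cron); later, per the manager's latest instruction, `long-task-watchdog-10m` was disabled and `watchdog-implementation-watchdog` was set to paused.
+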
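+
+#### Investigate watchdog reports not delivered to the current conversation
+
+Click for details
+**Status**: M1 Assigned
+**Lobby record**: telegram:-5079485182 msgId=2372
+**Note**: The manager asked for clarification of why, after the watchdog fired and a checkpoint was generated, the result was `delivered=false / not-delivered` and nothing reached this Telegram conversation
+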
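+
+#### Implement the first three Synology dashboard visual improvements
+
+Click for details
+**Status**: M1 Assigned
+**Lobby record**: telegram:-5079485182 msgId=2850
+**Details**:
+- Unified button / input / label styles
+- Header + status pill
+- Temperature card status color rules
+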
+
+---
+
+## Trash
+- [ ] TG / OpenClaw no-response diagnosis
+- [ ] Forum v1 / openclaw-bot-review
+- [ ] /start diag:1149
+- [ ] compaction trigger diagnosis
+
+### Details
+#### TG / OpenClaw no-response diagnosis
+
+Click for details
+**Status**: M1 Assigned
+**Lobby record**: telegram:-5079485182 msgId=1614
+**Lead investigator**: subagent backend (runId=79a799b2-6ce1-4ee0-bd37-a8136447effb)
+**Peer review**: subagent test (runId=f43d56d7-976b-4a04-b49e-678f6ae6b211)
+
+
+#### Forum v1 / openclaw-bot-review
+
+Click for details
+**Status**: M2 Verified (functional; awaiting the manager's instruction on next steps)
+**Entry point**:
+- Example PR:
+- System service: `systemctl --user status openclaw-bot-review.service`
+**PORT**=3077 / HOSTNAME=0.0.0.0
+**Verified**: GET /api/forum/pulls/1 ✅
+
+
+#### /start diag:1149
+
+Click for details
+**Status**: M1 Assigned
+**Lobby record**: telegram:-5079485182 msgId=1403 (supplement msgId=1412: already unresponsive before the 13:28 restart)
+**Lead investigator**: subagent backend (runId=f1c4cd0c-b39b-4931-8436-1001cf595525)
+**Peer review**: subagent docs (runId=1f441f2d-9909-49b0-aadd-b37a40e16429)
+**Additional assignee**: subagent diag-no-response-primary (runId=13f08bee-4c95-4949-8dea-4733e003406f)
+
+
+#### compaction trigger diagnosis
+
+Click for details
+**Status**: M1 Assigned
+**Lead investigator**: subagent diag-compaction-trigger-primary (runId=6eb2bf3f-88bb-4e40-990a-221b27073646)
+**Lead investigator**: subagent backend (runId=db9407a6-a69c-4d4f-b751-bc0f543b80e9)
+**Peer review**: subagent docs (runId=390d7e08-1041-4d2a-a4d0-c97781f00871)
+
+
+---
+- [ ] Migration to Discord-first work mode (cancelled)
+
+## Archived for reference
+- [x] agent forum / gitea verification records
+- [x] no-response / compaction historical review records
+
+### Notes
+**Lobby record**: telegram:-5079485182 msgId=1669 / msgId=1432 / msgId=1170
+- GITEA_BASE_URL=https://gitea.cowbay.org
+- GITEA_FORUM_REPO=openclaw/agent-forum
+**Configured**: GITEA_TOKEN / FORUM_ADMIN_KEY
+**Verified**:
+- GET /api/forum/pulls ✅
+- GET /api/forum/pulls/1/labels ✅
+- GET /api/forum/pulls/1/comments ✅
+**Peer review**: subagent diag-compaction-trigger-peerreview (runId=9d52596d-1d95-4406-bb6c-ef1f25159a16)
+**Peer review**: subagent diag-no-response-peerreview (runId=70a13b2e-c4b6-4c7e-8131-a0da0190a37b)
+**Additional assignee**: subagent diag-no-response-pre-restart (runId=ed5f2b90-560d-479f-9dc9-6da8346a6fc8)
+---
+- [x] Investigate software installation permissions
+- [x] Repair existing memory management
+- [x] Review visual suggestions for the Synology dashboard
+- [x] Remove deer-flow
+- [x] Install gh CLI
+
+## Done
+- [x] Deploy AstrBot Docker Telegram
+
+### Notes
+#### Repair existing memory management
+
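+Click for details
+**Description**: Repair the existing memory management (without replacing the package): fix the openclaw-mem0 slot/config problem, restore the built-in memory ingest/index, and verify write/index/query
+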
+
+#### Deploy AstrBot Docker Telegram
+
+Click for details
+**Description**: Set up AstrBot with Docker / docker-compose, connect it to a Telegram Bot, and start with an isolated deployment and basic verification
+
+
+- [x] Implement the first three Synology dashboard visual improvements
+- [x] Restructure kanban
+- [x] Adjust gpt-5.4 context and Synology dashboard
+- [x] Synology temperature dashboard
diff --git a/memory/watchdog-state.json b/memory/watchdog-state.json
new file mode 100644
index 0000000..0b39089
--- /dev/null
+++ b/memory/watchdog-state.json
@@ -0,0 +1,38 @@
+{
+  "version": 7,
+  "watchdogs": [
+    {
+      "id": "paperclip-bootstrap-watchdog",
+      "task": "paperclip docker-compose bootstrap/onboarding",
+      "status": "paused",
+      "ownerSession": "main-telegram-eric",
+      "channel": "telegram",
+      "target": "864811879",
+      "intervalMinutes": 10,
+      "startedAt": "2026-04-21T05:50:00+08:00",
+      "lastMilestoneAt": "2026-04-21T13:37:11+08:00",
+      "lastAlertAt": "2026-04-21T14:03:45+08:00",
+      "notes": "paperclip docker-compose is running successfully; the follow-up paperclip bootstrap/onboarding is paused for now so that fixing the watchdog mechanism itself takes priority."
+    },
+    {
+      "id": "watchdog-implementation-watchdog",
+      "task": "Repair and close the loop on the external watchdog mechanism",
+      "status": "paused",
+      "ownerSession": "main-telegram-eric",
+      "ownerSessionKey": "agent:coder:main",
+      "ownerAgentId": "coder",
+      "channel": "telegram",
+      "target": "864811879",
+      "reportChannel": "telegram",
+      "reportTarget": "864811879",
+      "intervalMinutes": 10,
+      "startedAt": "2026-04-21T14:04:00+08:00",
+      "lastMilestoneAt": "2026-04-21T14:50:00+08:00",
+      "lastAlertAt": "2026-04-21T18:33:00+08:00",
+      "lastObservedActivityAt": "2026-04-21T17:35:58.341+08:00",
+      "lastNudgeAt": "2026-04-21T18:33:00+08:00",
+      "escalationPolicy": "nudge-owner-then-report",
+      "notes": "long-task watchdog shut down per manager Eric's instruction: the recurring cron is disabled and this watchdog entry is set to paused; it no longer nudges or reports automatically."
+    }
+  ]
+}
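
Reviewer note: the patch does not show how `memory/watchdog-state.json` is consumed, so the following is only a minimal sketch of how a checker might honor the `paused` status set in this commit. The function name `due_watchdogs`, the rule that any status other than `"paused"` may fire, and the use of `lastAlertAt` plus `intervalMinutes` as the due test are all assumptions for illustration, not the actual watchdog implementation.

```python
import json
from datetime import datetime, timedelta

# Trimmed copy of the two entries added by this patch (only the fields used below).
STATE_JSON = """
{
  "version": 7,
  "watchdogs": [
    {"id": "paperclip-bootstrap-watchdog", "status": "paused",
     "intervalMinutes": 10, "lastAlertAt": "2026-04-21T14:03:45+08:00"},
    {"id": "watchdog-implementation-watchdog", "status": "paused",
     "intervalMinutes": 10, "lastAlertAt": "2026-04-21T18:33:00+08:00"}
  ]
}
"""


def due_watchdogs(state: dict, now: datetime) -> list[str]:
    """Return ids of watchdogs that are not paused and whose interval has elapsed.

    Assumption: any status other than "paused" means the watchdog may fire.
    """
    due = []
    for wd in state["watchdogs"]:
        if wd["status"] == "paused":
            continue  # paused entries are skipped entirely
        last_alert = datetime.fromisoformat(wd["lastAlertAt"])
        if now - last_alert >= timedelta(minutes=wd["intervalMinutes"]):
            due.append(wd["id"])
    return due


state = json.loads(STATE_JSON)
now = datetime.fromisoformat("2026-04-21T19:00:00+08:00")
print(due_watchdogs(state, now))
```

With both entries paused, as in this commit, the sketch reports nothing as due regardless of how long ago `lastAlertAt` was.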