From 8e3c142ed69bbb82e3dfb031e5e18d10143c6ba1 Mon Sep 17 00:00:00 2001 From: Eric Chang Date: Mon, 23 Sep 2019 12:11:02 +0800 Subject: [PATCH] add install pg_auto_failover with psql 11 in ubuntu 18.04 --- ...pg_auto_failover_in_ubuntu_1804_psql_11.md | 24 +- .../index.html | 1205 ++++++++--------- 2 files changed, 604 insertions(+), 625 deletions(-) diff --git a/content/post/pg_auto_failover_in_ubuntu_1804_psql_11.md b/content/post/pg_auto_failover_in_ubuntu_1804_psql_11.md index 8f0e8ce4..fb1ee460 100644 --- a/content/post/pg_auto_failover_in_ubuntu_1804_psql_11.md +++ b/content/post/pg_auto_failover_in_ubuntu_1804_psql_11.md @@ -62,7 +62,7 @@ sudo apt install postgresql-11-auto-failover -y postgresql 的相關執行檔路徑,預設不會載入到PATH變數中,所以要自己手動增加 直接加入 /etc/environment ,或者是去修改使用者的 .profile 載入正確的 $PATH 都可以 -```` +``` #replace PATH variable in /etc/environment PATH="/usr/lib/postgresql/11/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games" @@ -960,3 +960,25 @@ pgautofailover RUNNING pid 2232, uptime 0:00:04 ``` 搞定,收工!準備寫 ansible playbook! + + +*** +UPDATE + +新增用logrotate 來管理 pgautofailover logfile + +``` +sudo vim /etc/logrotate.d/pgautofailover + +/var/log/supervisor/pgautofail*.log { + daily + rotate 7 + copytruncate + delaycompress # today and yesterday will not compress + compress + missingok + notifempty +} + +``` + diff --git a/public/post/pg_auto_failover_in_ubuntu_1804_psql_11/index.html b/public/post/pg_auto_failover_in_ubuntu_1804_psql_11/index.html index 30d27c32..81aa45db 100644 --- a/public/post/pg_auto_failover_in_ubuntu_1804_psql_11/index.html +++ b/public/post/pg_auto_failover_in_ubuntu_1804_psql_11/index.html @@ -177,263 +177,243 @@ sudo apt install postgresql-11-auto-failover -y

postgresql 的相關執行檔路徑,預設不會載入到PATH變數中,所以要自己手動增加

-

直接加入 /etc/environment ,或者是去修改使用者的 .profile 載入正確的 $PATH 都可以 -```` -#replace PATH variable in /etc/environment -PATH=“/usr/lib/postgresql/11/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games”

+

直接加入 /etc/environment ,或者是去修改使用者的 .profile 載入正確的 $PATH 都可以

+ +
#replace PATH variable in /etc/environment
+PATH="/usr/lib/postgresql/11/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games"
 
-
因為要用 pg_auto_failover 來控制postgresql ,所以要把系統內建的服務先停掉,並且設定開啟不啟動
 
-

sudo systemctl disable postgresql -sudo systemctl stop postgresql

+

因為要用 pg_auto_failover 來控制postgresql ,所以要把系統內建的服務先停掉,並且設定開啟不啟動

-

-為了讓三台機器可以直接用hostname溝通,所以要修改 /etc/hosts
+
sudo systemctl disable postgresql
+sudo systemctl stop postgresql
 
-

#add to /etc/hosts +

為了讓三台機器可以直接用hostname溝通,所以要修改 /etc/hosts

+ +
#add to /etc/hosts
 192.168.11.151 monitor monitor.abc.com
 192.168.11.152 pg-primary pg-primary.abc.com
-192.168.11.153 pg-slave pg-slave.abc.com

+192.168.11.153 pg-slave pg-slave.abc.com +
-
以上完成在三台node上,安裝postgresql/pg_auto_failover 的工作
+

以上完成在三台node上,安裝postgresql/pg_auto_failover 的工作

-*** +
-#### 設定 monitor node +

設定 monitor node

-**在monitor node操作** +

在monitor node操作

-**建立monitor node** +

建立monitor node

+
sudo runuser -l postgres -c "/usr/bin/pg_autoctl create monitor --pgdata /var/lib/postgresql/11/main --pgport 5432 --nodename monitor"
 
-

sudo runuser -l postgres -c “/usr/bin/pg_autoctl create monitor –pgdata /var/lib/postgresql/11/main –pgport 5432 –nodename monitor”

- -
這個步驟會在pg_hba.conf 中,新增一筆紀錄
+

這個步驟會在pg_hba.conf 中,新增一筆紀錄

+
host "pg_auto_failover" "autoctl_node" 192.168.11.0/24 trust # Auto-generated by pg_auto_failover
 
-

host “pg_auto_failover” “autoctl_node” 192.168.11.0/24 trust # Auto-generated by pg_auto_failover

- -

-檢查一下狀態,看看有沒有安裝成功
-
-
+

檢查一下狀態,看看有沒有安裝成功

-

2019-09-19 06:56:09 [administrator@monitor ~]$ sudo runuser -l postgres -c “psql -c ‘\‘du” +

2019-09-19 06:56:09 [administrator@monitor ~]$ sudo runuser -l postgres -c "psql -c '\'du"
                                      List of roles
-  Role name   |                         Attributes                         | Member of
-————–+————————————————————+———–
+  Role name   |                         Attributes                         | Member of 
+--------------+------------------------------------------------------------+-----------
  autoctl      |                                                            | {}
  autoctl_node |                                                            | {}
- postgres     | Superuser, Create role, Create DB, Replication, Bypass RLS | {}

+ postgres | Superuser, Create role, Create DB, Replication, Bypass RLS | {} -

2019-09-19 06:56:15 [administrator@monitor ~]$ sudo runuser -l postgres -c “psql -c ‘\‘l” + +2019-09-19 06:56:15 [administrator@monitor ~]$ sudo runuser -l postgres -c "psql -c '\'l" List of databases - Name | Owner | Encoding | Collate | Ctype | Access privileges
-——————+———-+———–+———+——-+———————– - pg_auto_failover | autoctl | SQL_ASCII | C | C | - postgres | postgres | SQL_ASCII | C | C | + Name | Owner | Encoding | Collate | Ctype | Access privileges +------------------+----------+-----------+---------+-------+----------------------- + pg_auto_failover | autoctl | SQL_ASCII | C | C | + postgres | postgres | SQL_ASCII | C | C | template0 | postgres | SQL_ASCII | C | C | =c/postgres + | | | | | postgres=CTc/postgres template1 | postgres | SQL_ASCII | C | C | =c/postgres + | | | | | postgres=CTc/postgres -(4 rows)

- -

-OK,看起來沒有問題
+(4 rows)
+
-**啟動pg_auto_failover monitor** +

OK,看起來沒有問題

-
+

啟動pg_auto_failover monitor

-

2019-09-19 06:56:19 [administrator@monitor ~]$ sudo runuser -l postgres -c “pg_autoctl run –pgdata /var/lib/postgresql/11/main” -06:57:38 INFO Managing PostgreSQL installation at “/var/lib/postgresql/11/main” -06:57:38 INFO PostgreSQL is running in “/var/lib/postgresql/11/main” on port 5432 -06:57:38 INFO The version of extenstion “pgautofailover” is “1.0” on the monitor +

2019-09-19 06:56:19 [administrator@monitor ~]$ sudo runuser -l postgres -c "pg_autoctl run --pgdata /var/lib/postgresql/11/main"
+06:57:38 INFO  Managing PostgreSQL installation at "/var/lib/postgresql/11/main"
+06:57:38 INFO  PostgreSQL is running in "/var/lib/postgresql/11/main" on port 5432
+06:57:38 INFO  The version of extenstion "pgautofailover" is "1.0" on the monitor
 06:57:38 INFO  pg_auto_failover monitor is ready at postgres://autoctl_node@monitor:5432/pg_auto_failover
-06:57:38 INFO  Contacting the monitor to LISTEN to its events

- -
這個指令會停留在console畫面上,要不就是在最後加上 ```&``` 放去背景執行
+06:57:38 INFO  Contacting the monitor to LISTEN to its events
+
-要不就是用systemd/supervisor 來控制,這個後面再來改 +

這個指令會停留在console畫面上,要不就是在最後加上 & 放去背景執行

-就先放著讓他跑,還可以順便觀察cluster變動時的狀態 +

要不就是用systemd/supervisor 來控制,這個後面再來改

-再開一個terminal 跑底下的指令,產生monitor node 的URI ,在建立maser/slave node 的時候會用到 +

就先放著讓他跑,還可以順便觀察cluster變動時的狀態

-
+

再開一個terminal 跑底下的指令,產生monitor node 的URI ,在建立maser/slave node 的時候會用到

-

2019-09-19 06:58:39 [administrator@monitor ~]$ sudo runuser -l postgres -c “pg_autoctl show uri –pgdata /var/lib/postgresql/11/main” +

2019-09-19 06:58:39 [administrator@monitor ~]$ sudo runuser -l postgres -c "pg_autoctl show uri --pgdata /var/lib/postgresql/11/main"
 postgres://autoctl_node@monitor:5432/pg_auto_failover
-2019-09-19 06:59:01 [administrator@monitor ~]$

- -

-以上完成monitor node 的準備工作
+2019-09-19 06:59:01 [administrator@monitor ~]$
+
-*** +

以上完成monitor node 的準備工作

-#### 設定master/primary node +
-**建立pg_auto_failover master/primary node** +

設定master/primary node

-
+

建立pg_auto_failover master/primary node

-

2019-09-19 15:04:16 [administrator@pg-primary ~]$ sudo runuser -l postgres -c “pg_autoctl create postgres –pgdata /var/lib/postgresql/11/main –pgport 5432 –nodename pg-primary –monitor postgres://autoctl_node@monitor:5432/pg_auto_failover” +

2019-09-19 15:04:16 [administrator@pg-primary ~]$ sudo runuser -l postgres -c "pg_autoctl create postgres --pgdata /var/lib/postgresql/11/main --pgport 5432 --nodename pg-primary --monitor postgres://autoctl_node@monitor:5432/pg_auto_failover"
 15:04:19 INFO  Found pg_ctl for PostgreSQL 11.5 at /usr/lib/postgresql/11/bin/pg_ctl
-15:04:19 INFO  Registered node pg-primary:5432 with id 1 in formation “default”, group 0.
-15:04:19 INFO  Writing keeper init state file at “/var/lib/postgresql/.local/share/pg_autoctl/var/lib/postgresql/11/main/pg_autoctl.init”
-15:04:19 INFO  Successfully registered as “single” to the monitor.
-15:04:21 INFO  Initialising a PostgreSQL cluster at “/var/lib/postgresql/11/main”
+15:04:19 INFO  Registered node pg-primary:5432 with id 1 in formation "default", group 0.
+15:04:19 INFO  Writing keeper init state file at "/var/lib/postgresql/.local/share/pg_autoctl/var/lib/postgresql/11/main/pg_autoctl.init"
+15:04:19 INFO  Successfully registered as "single" to the monitor.
+15:04:21 INFO  Initialising a PostgreSQL cluster at "/var/lib/postgresql/11/main"
 15:04:21 INFO  Postgres is not running, starting postgres
-15:04:21 INFO   /usr/lib/postgresql/11/bin/pg_ctl –pgdata /var/lib/postgresql/11/main –options “-p 5432” –options “-h *” –wait start
+15:04:21 INFO   /usr/lib/postgresql/11/bin/pg_ctl --pgdata /var/lib/postgresql/11/main --options "-p 5432" --options "-h *" --wait start
 15:04:21 INFO  CREATE DATABASE postgres;
-15:04:21 INFO  The database “postgres” already exists, skipping.
-15:04:21 INFO  FSM transition from “init” to “single”: Start as a single node
+15:04:21 INFO  The database "postgres" already exists, skipping.
+15:04:21 INFO  FSM transition from "init" to "single": Start as a single node
 15:04:21 INFO  Initialising postgres as a primary
-15:04:21 INFO  Transition complete: current state is now “single”
+15:04:21 INFO  Transition complete: current state is now "single"
 15:04:21 INFO  Keeper has been succesfully initialized.
-2019-09-19 15:04:21 [administrator@pg-primary ~]$

- -

-**啟動 pg_auto_failover master/primary node**
-
+2019-09-19 15:04:21 [administrator@pg-primary ~]$
 
-

2019-09-19 15:14:34 [administrator@pg-primary ~]$ sudo runuser -l postgres -c “pg_autoctl run –pgdata /var/lib/postgresql/11/main” -15:15:11 INFO Managing PostgreSQL installation at “/var/lib/postgresql/11/main” -15:15:11 INFO The version of extenstion “pgautofailover” is “1.0” on the monitor -15:15:11 INFO pg_autoctl service is starting -15:15:11 INFO Calling node_active for node default/1/0 with current state: single, PostgreSQL is running, sync_state is “”, WAL delta is -1. -15:15:16 INFO Calling node_active for node default/1/0 with current state: single, PostgreSQL is running, sync_state is “”, WAL delta is -1. -15:15:21 INFO Calling node_active for node default/1/0 with current state: single, PostgreSQL is running, sync_state is “”, WAL delta is -1.

- -

-一樣,執行完之後,會停留在畫面上,所以開另一個視窗來執行以下檢查的指令
+

啟動 pg_auto_failover master/primary node

+
2019-09-19 15:14:34 [administrator@pg-primary ~]$ sudo runuser -l postgres -c "pg_autoctl run --pgdata /var/lib/postgresql/11/main"
+15:15:11 INFO  Managing PostgreSQL installation at "/var/lib/postgresql/11/main"
+15:15:11 INFO  The version of extenstion "pgautofailover" is "1.0" on the monitor
+15:15:11 INFO  pg_autoctl service is starting
+15:15:11 INFO  Calling node_active for node default/1/0 with current state: single, PostgreSQL is running, sync_state is "", WAL delta is -1.
+15:15:16 INFO  Calling node_active for node default/1/0 with current state: single, PostgreSQL is running, sync_state is "", WAL delta is -1.
+15:15:21 INFO  Calling node_active for node default/1/0 with current state: single, PostgreSQL is running, sync_state is "", WAL delta is -1.
 
-

2019-09-19 15:17:08 [administrator@pg-primary ~]$ sudo runuser -l postgres -c “pg_autoctl show state –pgdata /var/lib/postgresql/11/main” +

一樣,執行完之後,會停留在畫面上,所以開另一個視窗來執行以下檢查的指令

+ +
2019-09-19 15:17:08 [administrator@pg-primary ~]$ sudo runuser -l postgres -c "pg_autoctl show state --pgdata /var/lib/postgresql/11/main"
       Name |   Port | Group |  Node |     Current State |    Assigned State
-———–+——–+——-+——-+——————-+——————
-pg-primary |   5432 |     0 |     1 |            single |            single

+-----------+--------+-------+-------+-------------------+------------------ +pg-primary | 5432 | 0 | 1 | single | single -

2019-09-19 15:17:16 [administrator@pg-primary ~]$

+2019-09-19 15:17:16 [administrator@pg-primary ~]$ +
-

-以上,master/primary node 設定結束
+

以上,master/primary node 設定結束

-*** +
-####設定 slave/secondary node +

####設定 slave/secondary node

-**建立 pg_auto_failover slave/secondary node** +

建立 pg_auto_failover slave/secondary node

-
- -

2019-09-19 07:04:08 [administrator@pg-slave ~]$ sudo runuser -l postgres -c “pg_autoctl create postgres –pgdata /var/lib/postgresql/11/main –pgport 5432 –nodename pg-slave –monitor postgres://autoctl_node@monitor:5432/pg_auto_failover” +

2019-09-19 07:04:08 [administrator@pg-slave ~]$ sudo runuser -l postgres -c "pg_autoctl create postgres --pgdata /var/lib/postgresql/11/main --pgport 5432 --nodename pg-slave --monitor postgres://autoctl_node@monitor:5432/pg_auto_failover"
 07:19:57 INFO  Found pg_ctl for PostgreSQL 11.5 at /usr/lib/postgresql/11/bin/pg_ctl
-07:19:57 INFO  Registered node pg-slave:5432 with id 2 in formation “default”, group 0.
-07:19:57 INFO  Writing keeper init state file at “/var/lib/postgresql/.local/share/pg_autoctl/var/lib/postgresql/11/main/pg_autoctl.init”
-07:19:57 INFO  Successfully registered as “wait_standby” to the monitor.
-07:19:57 INFO  FSM transition from “init” to “wait_standby”: Start following a primary
-07:19:57 INFO  Transition complete: current state is now “wait_standby”
-07:20:02 INFO  FSM transition from “wait_standby” to “catchingup”: The primary is now ready to accept a standby
+07:19:57 INFO  Registered node pg-slave:5432 with id 2 in formation "default", group 0.
+07:19:57 INFO  Writing keeper init state file at "/var/lib/postgresql/.local/share/pg_autoctl/var/lib/postgresql/11/main/pg_autoctl.init"
+07:19:57 INFO  Successfully registered as "wait_standby" to the monitor.
+07:19:57 INFO  FSM transition from "init" to "wait_standby": Start following a primary
+07:19:57 INFO  Transition complete: current state is now "wait_standby"
+07:20:02 INFO  FSM transition from "wait_standby" to "catchingup": The primary is now ready to accept a standby
 07:20:02 INFO  The primary node returned by the monitor is pg-primary:5432
 07:20:02 INFO  Initialising PostgreSQL as a hot standby
-07:20:02 INFO  Running /usr/lib/postgresql/11/bin/pg_basebackup -w -h pg-primary -p 5432 –pgdata /var/lib/postgresql/11/backup -U pgautofailover_replicator –write-recovery-conf –max-rate 100M –wal-method=stream –slot pgautofailover_standby …
+07:20:02 INFO  Running /usr/lib/postgresql/11/bin/pg_basebackup -w -h pg-primary -p 5432 --pgdata /var/lib/postgresql/11/backup -U pgautofailover_replicator --write-recovery-conf --max-rate 100M --wal-method=stream --slot pgautofailover_standby ...
 07:20:04 INFO  pg_basebackup: initiating base backup, waiting for checkpoint to complete
 pg_basebackup: checkpoint completed
 pg_basebackup: write-ahead log start point: 0/2000028 on timeline 1
 pg_basebackup: starting background WAL receiver
-    0/23751 kB (0%), 0/1 tablespace (…ostgresql/11/backup/backup_label)
-1310323751 kB (55%), 0/1 tablespace (…gresql/11/backup/base/13091/2602)
-2376123761 kB (100%), 0/1 tablespace (…esql/11/backup/global/pg_control)
-2376123761 kB (100%), 11 tablespace
+ 0/23751 kB (0%), 0/1 tablespace (...ostgresql/11/backup/backup_label) +13103/23751 kB (55%), 0/1 tablespace (...gresql/11/backup/base/13091/2602) +23761/23761 kB (100%), 0/1 tablespace (...esql/11/backup/global/pg_control) +23761/23761 kB (100%), 1/1 tablespace pg_basebackup: write-ahead log end point: 0/20000F8 -pg_basebackup: waiting for background process to finish streaming … -pg_basebackup: base backup completed

+pg_basebackup: waiting for background process to finish streaming ... +pg_basebackup: base backup completed -

07:20:04 INFO Postgres is not running, starting postgres -07:20:04 INFO /usr/lib/postgresql/11/bin/pg_ctl –pgdata /var/lib/postgresql/11/main –options “-p 5432” –options “-h *” –wait start +07:20:04 INFO Postgres is not running, starting postgres +07:20:04 INFO /usr/lib/postgresql/11/bin/pg_ctl --pgdata /var/lib/postgresql/11/main --options "-p 5432" --options "-h *" --wait start 07:20:04 INFO PostgreSQL started on port 5432 -07:20:04 INFO Transition complete: current state is now “catchingup” +07:20:04 INFO Transition complete: current state is now "catchingup" 07:20:04 INFO Keeper has been succesfully initialized. -2019-09-19 07:20:04 [administrator@pg-slave ~]$

- -
可以看到在建立slave/secondary node 的時候,就會開始第一次的同步
-
-**啟動 pg_auto_failover slave/secondary node
-
+2019-09-19 07:20:04 [administrator@pg-slave ~]$
 
-

2019-09-19 07:20:04 [administrator@pg-slave ~]$ sudo runuser -l postgres -c “pg_autoctl run –pgdata /var/lib/postgresql/11/main” -07:21:33 INFO Managing PostgreSQL installation at “/var/lib/postgresql/11/main” -07:21:33 INFO The version of extenstion “pgautofailover” is “1.0” on the monitor -07:21:33 INFO pg_autoctl service is starting -07:21:33 INFO Calling node_active for node default/2/0 with current state: catchingup, PostgreSQL is running, sync_state is “”, WAL delta is -1. -07:21:33 INFO FSM transition from “catchingup” to “secondary”: Convinced the monitor that I’m up and running, and eligible for promotion again -07:21:33 INFO Transition complete: current state is now “secondary” -07:21:34 INFO Calling node_active for node default/2/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -07:21:39 INFO Calling node_active for node default/2/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.

+

可以看到在建立slave/secondary node 的時候,就會開始第一次的同步

-

-開新視窗確認狀態
+

**啟動 pg_auto_failover slave/secondary node

+
2019-09-19 07:20:04 [administrator@pg-slave ~]$ sudo runuser -l postgres -c "pg_autoctl run --pgdata /var/lib/postgresql/11/main"
+07:21:33 INFO  Managing PostgreSQL installation at "/var/lib/postgresql/11/main"
+07:21:33 INFO  The version of extenstion "pgautofailover" is "1.0" on the monitor
+07:21:33 INFO  pg_autoctl service is starting
+07:21:33 INFO  Calling node_active for node default/2/0 with current state: catchingup, PostgreSQL is running, sync_state is "", WAL delta is -1.
+07:21:33 INFO  FSM transition from "catchingup" to "secondary": Convinced the monitor that I'm up and running, and eligible for promotion again
+07:21:33 INFO  Transition complete: current state is now "secondary"
+07:21:34 INFO  Calling node_active for node default/2/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+07:21:39 INFO  Calling node_active for node default/2/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
 
-

2019-09-19 07:22:03 [administrator@pg-slave ~]$ sudo runuser -l postgres -c “pg_autoctl show state –pgdata /var/lib/postgresql/11/main” +

開新視窗確認狀態

+ +
2019-09-19 07:22:03 [administrator@pg-slave ~]$ sudo runuser -l postgres -c "pg_autoctl show state --pgdata /var/lib/postgresql/11/main"
       Name |   Port | Group |  Node |     Current State |    Assigned State
-———–+——–+——-+——-+——————-+——————
+-----------+--------+-------+-------+-------------------+------------------
 pg-primary |   5432 |     0 |     1 |           primary |           primary
-  pg-slave |   5432 |     0 |     2 |         secondary |         secondary

+ pg-slave | 5432 | 0 | 2 | secondary | secondary -

2019-09-19 07:22:45 [administrator@pg-slave ~]$

- -

-正常的話,應該就是會出現兩個 node ,一個 pg-primary 一個pg-slave
+2019-09-19 07:22:45 [administrator@pg-slave ~]$ 
+
-**generate URI for applications** +

正常的話,應該就是會出現兩個 node ,一個 pg-primary 一個pg-slave

-這個步驟是用來產生給client/applications 連線用的連線字串(connection string) +

generate URI for applications

-
+

這個步驟是用來產生給client/applications 連線用的連線字串(connection string)

-

2019-09-19 07:29:56 [administrator@pg-slave ~]$ sudo runuser -l postgres -c “pg_autoctl show uri –formation default –pgdata /var/lib/postgresql/11/main” +

2019-09-19 07:29:56 [administrator@pg-slave ~]$ sudo runuser -l postgres -c "pg_autoctl show uri --formation default --pgdata /var/lib/postgresql/11/main"
 postgres://pg-slave:5432,pg-primary:5432/postgres?target_session_attrs=read-write
-2019-09-19 07:30:06 [administrator@pg-slave ~]$

- -
以後就都通過這個URI來存取這個cluster
+2019-09-19 07:30:06 [administrator@pg-slave ~]$
+
-切記,**不可以用系統內建的postgresql service,一定要用 pg_auto_failover 來啟動DB** +

以後就都通過這個URI來存取這個cluster

-以上 slave/secondary node 設定完成 +

切記,不可以用系統內建的postgresql service,一定要用 pg_auto_failover 來啟動DB

-*** +

以上 slave/secondary node 設定完成

-設定上並不複雜,比起keepalived 要簡單太多了.. +
-那就要評估看看在異常狀況發生,切換資料庫時的表現了 +

設定上並不複雜,比起keepalived 要簡單太多了..

-接下來就繼續測試auto failover 的功能! +

那就要評估看看在異常狀況發生,切換資料庫時的表現了

-*** +

接下來就繼續測試auto failover 的功能!

-#### 測試 auto failover +
-首先,在 pg-prmiary node 上,透過上面產生的URI來進入psql +

測試 auto failover

-然後建立一個測試db、建立一個測試table,插入幾筆資料 +

首先,在 pg-prmiary node 上,透過上面產生的URI來進入psql

-
+

然後建立一個測試db、建立一個測試table,插入幾筆資料

-

2019-09-20 10:58:21 [administrator@pg-primary ~]$ sudo runuser -l postgres -c “psql postgres://pg-slave:5432,pg-primary:5432/postgres?target_session_attrs=read-write” +

2019-09-20 10:58:21 [administrator@pg-primary ~]$ sudo runuser -l postgres -c "psql postgres://pg-slave:5432,pg-primary:5432/postgres?target_session_attrs=read-write"
 psql (11.5 (Ubuntu 11.5-1.pgdg18.04+1))
-Type “help” for help.

+Type "help" for help. -

postgres=# create database testdb_1058; +postgres=# create database testdb_1058; CREATE DATABASE postgres=# create table testtbl (serial int); CREATE TABLE @@ -445,161 +425,142 @@ postgres=# insert into testtbl values (3); INSERT 0 1 postgres=# insert into testtbl values (444); INSERT 0 1 -postgres=# select * from testtbl;

+postgres=# select * from testtbl; + serial +-------- + 111 + 222 + 3 + 444 +(4 rows) -

serial

- -
111
-222
-  3
-444
+postgres=# \q
+2019-09-20 10:59:30 [administrator@pg-primary ~]$
 
-

(4 rows)

- -

postgres=# \q -2019-09-20 10:59:30 [administrator@pg-primary ~]$

- -

-然後切換到 pg-slave ,一樣透過URI進入psql 撈看看資料
+

然後切換到 pg-slave ,一樣透過URI進入psql 撈看看資料

-
- -

2019-09-20 03:02:12 [administrator@pg-slave ~]$ sudo runuser -l postgres -c “psql postgres://pg-slave:5432,pg-primary:5432/postgres?target_session_attrs=read-write” +

2019-09-20 03:02:12 [administrator@pg-slave ~]$ sudo runuser -l postgres -c "psql postgres://pg-slave:5432,pg-primary:5432/postgres?target_session_attrs=read-write"
 psql (11.5 (Ubuntu 11.5-1.pgdg18.04+1))
-Type “help” for help.

- -

postgres=# select * from testtbl;

- -

serial

- -
111
-222
-  3
-444
+Type "help" for help.
+
+postgres=# select * from testtbl;
+ serial 
+--------
+    111
+    222
+      3
+    444
+(4 rows)
+
+postgres=# 
 
-

(4 rows)

- -

postgres=#

+

確認透過URI的確是可以正常的存取資料

-

-確認透過URI的確是可以正常的存取資料
+

在兩台單機上(master/slave)上也都可以看到同樣的資料,代表資料有正確的被同步到兩台node上

-在兩台單機上(master/slave)上也都可以看到同樣的資料,代表資料有正確的被同步到兩台node上 +

pg-slave

-**pg-slave** - -
- -

2019-09-20 03:03:06 [administrator@pg-slave ~]$ sudo runuser -l postgres -c psql +

2019-09-20 03:03:06 [administrator@pg-slave ~]$ sudo runuser -l postgres -c psql
 psql (11.5 (Ubuntu 11.5-1.pgdg18.04+1))
-Type “help” for help.

+Type "help" for help. -

postgres=# \l +postgres=# \l List of databases - Name | Owner | Encoding | Collate | Ctype | Access privileges
-————-+———-+———–+———+——-+———————– - nexus | postgres | SQL_ASCII | C | C | - postgres | postgres | SQL_ASCII | C | C | + Name | Owner | Encoding | Collate | Ctype | Access privileges +-------------+----------+-----------+---------+-------+----------------------- + nexus | postgres | SQL_ASCII | C | C | + postgres | postgres | SQL_ASCII | C | C | template0 | postgres | SQL_ASCII | C | C | =c/postgres + | | | | | postgres=CTc/postgres template1 | postgres | SQL_ASCII | C | C | =c/postgres + | | | | | postgres=CTc/postgres - testdb_1058 | postgres | SQL_ASCII | C | C | -(5 rows)

- -

postgres=# select * from testtbl;

- -

serial

- -
111
-222
-  3
-444
+ testdb_1058 | postgres | SQL_ASCII | C       | C     | 
+(5 rows)
+
+postgres=# select * from testtbl;
+ serial 
+--------
+    111
+    222
+      3
+    444
+(4 rows)
+
+postgres=#
 
-

(4 rows)

- -

postgres=#

- -

-**pg-master**
-
+

pg-master

-

2019-09-20 10:59:30 [administrator@pg-primary ~]$ sudo runuser -l postgres -c psql +

2019-09-20 10:59:30 [administrator@pg-primary ~]$ sudo runuser -l postgres -c psql
 psql (11.5 (Ubuntu 11.5-1.pgdg18.04+1))
-Type “help” for help.

+Type "help" for help. -

postgres=# \l +postgres=# \l List of databases - Name | Owner | Encoding | Collate | Ctype | Access privileges
-————-+———-+———–+———+——-+———————– - nexus | postgres | SQL_ASCII | C | C | - postgres | postgres | SQL_ASCII | C | C | + Name | Owner | Encoding | Collate | Ctype | Access privileges +-------------+----------+-----------+---------+-------+----------------------- + nexus | postgres | SQL_ASCII | C | C | + postgres | postgres | SQL_ASCII | C | C | template0 | postgres | SQL_ASCII | C | C | =c/postgres + | | | | | postgres=CTc/postgres template1 | postgres | SQL_ASCII | C | C | =c/postgres + | | | | | postgres=CTc/postgres - testdb_1058 | postgres | SQL_ASCII | C | C | -(5 rows)

- -

postgres=# select * from testtbl;

- -

serial

- -
111
-222
-  3
-444
+ testdb_1058 | postgres | SQL_ASCII | C       | C     | 
+(5 rows)
+
+postgres=# select * from testtbl;
+ serial 
+--------
+    111
+    222
+      3
+    444
+(4 rows)
+
+postgres=# 
 
-

(4 rows)

- -

postgres=#

+

接著來把本來是master的機器關掉,看看會有什麼變化

-

-接著來把本來是master的機器關掉,看看會有什麼變化
+

首先,在pg-slave 這台機器上,就會看到cluster 有狀況,primary 不見了

-首先,在pg-slave 這台機器上,就會看到cluster 有狀況,primary 不見了 -
- -

03:06:02 INFO Calling node_active for node default/4/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -03:06:08 INFO Calling node_active for node default/4/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -03:06:13 INFO Calling node_active for node default/4/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. +

03:06:02 INFO  Calling node_active for node default/4/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+03:06:08 INFO  Calling node_active for node default/4/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+03:06:13 INFO  Calling node_active for node default/4/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
 03:06:18 ERROR PostgreSQL cannot reach the primary server: the system view pg_stat_wal_receiver has no rows.
-03:06:18 INFO  Calling node_active for node default/4/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is -1.
+03:06:18 INFO  Calling node_active for node default/4/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is -1.
 03:06:23 ERROR PostgreSQL cannot reach the primary server: the system view pg_stat_wal_receiver has no rows.
-03:06:23 INFO  Calling node_active for node default/4/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is -1.
+03:06:23 INFO  Calling node_active for node default/4/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is -1.
 03:06:28 ERROR PostgreSQL cannot reach the primary server: the system view pg_stat_wal_receiver has no rows.
-03:06:28 INFO  Calling node_active for node default/4/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is -1.
+03:06:28 INFO  Calling node_active for node default/4/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is -1.
 03:06:33 ERROR PostgreSQL cannot reach the primary server: the system view pg_stat_wal_receiver has no rows.
-03:06:33 INFO  Calling node_active for node default/4/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is -1.
+03:06:33 INFO  Calling node_active for node default/4/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is -1.
 03:06:38 ERROR PostgreSQL cannot reach the primary server: the system view pg_stat_wal_receiver has no rows.
-03:06:38 INFO  Calling node_active for node default/4/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is -1.

- -

-這時候在pg-slave 還能不能透過URI 連線進去psql操作?
+03:06:38 INFO  Calling node_active for node default/4/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is -1.
 
-

2019-09-20 03:08:06 [administrator@pg-slave ~]$ sudo runuser -l postgres -c “psql postgres://pg-slave:5432,pg-primary:5432/postgres?target_session_attrs=read-write” +

這時候在pg-slave 還能不能透過URI 連線進去psql操作?

+ +
2019-09-20 03:08:06 [administrator@pg-slave ~]$ sudo runuser -l postgres -c "psql postgres://pg-slave:5432,pg-primary:5432/postgres?target_session_attrs=read-write"
 psql (11.5 (Ubuntu 11.5-1.pgdg18.04+1))
-Type “help” for help.

+Type "help" for help. -

postgres=# \l +postgres=# \l List of databases - Name | Owner | Encoding | Collate | Ctype | Access privileges
-————-+———-+———–+———+——-+———————– - nexus | postgres | SQL_ASCII | C | C | - postgres | postgres | SQL_ASCII | C | C | + Name | Owner | Encoding | Collate | Ctype | Access privileges +-------------+----------+-----------+---------+-------+----------------------- + nexus | postgres | SQL_ASCII | C | C | + postgres | postgres | SQL_ASCII | C | C | template0 | postgres | SQL_ASCII | C | C | =c/postgres + | | | | | postgres=CTc/postgres template1 | postgres | SQL_ASCII | C | C | =c/postgres + | | | | | postgres=CTc/postgres - testdb_1058 | postgres | SQL_ASCII | C | C | -(5 rows)

+ testdb_1058 | postgres | SQL_ASCII | C | C | +(5 rows) -

postgres=# create database testdb_1108; +postgres=# create database testdb_1108; CREATE DATABASE postgres=# create table testtbl_1108 (serial int); CREATE TABLE @@ -614,422 +575,400 @@ INSERT 0 1 postgres=# insert into testtbl_1108 values (55); INSERT 0 1 postgres=# \q -2019-09-20 03:09:34 [administrator@pg-slave ~]$

- -

-很好,看起來沒有問題,這時候把 pg-primary 打開,執行啟動 pg_auto_failover node 的指令,看看會發生什麼事
-
+2019-09-20 03:09:34 [administrator@pg-slave ~]$
 
-

2019-09-20 11:12:13 [administrator@pg-primary ~]$ sudo runuser -l postgres -c “pg_autoctl run” -11:12:29 INFO Managing PostgreSQL installation at “/var/lib/postgresql/11/main” -11:12:29 INFO The version of extenstion “pgautofailover” is “1.0” on the monitor +

很好,看起來沒有問題,這時候把 pg-primary 打開,執行啟動 pg_auto_failover node 的指令,看看會發生什麼事

+ +
2019-09-20 11:12:13 [administrator@pg-primary ~]$ sudo runuser -l postgres -c "pg_autoctl run"
+11:12:29 INFO  Managing PostgreSQL installation at "/var/lib/postgresql/11/main"
+11:12:29 INFO  The version of extenstion "pgautofailover" is "1.0" on the monitor
 11:12:29 INFO  pg_autoctl service is starting
-11:12:29 INFO  Calling node_active for node default/1/0 with current state: primary, PostgreSQL is not running, sync_state is “”, WAL delta is -1.
+11:12:29 INFO  Calling node_active for node default/1/0 with current state: primary, PostgreSQL is not running, sync_state is "", WAL delta is -1.
 11:12:29 INFO  Postgres is not running, starting postgres
-11:12:29 INFO   /usr/lib/postgresql/11/bin/pg_ctl –pgdata /var/lib/postgresql/11/main –options “-p 5432” –options “-h *” –wait start
+11:12:29 INFO   /usr/lib/postgresql/11/bin/pg_ctl --pgdata /var/lib/postgresql/11/main --options "-p 5432" --options "-h *" --wait start
 11:12:30 WARN  PostgreSQL was not running, restarted with pid 1407
-11:12:30 INFO  FSM transition from “primary” to “demoted”: A failover occurred, no longer primary
-11:12:30 INFO  Transition complete: current state is now “demoted”
-11:12:30 INFO  Calling node_active for node default/1/0 with current state: demoted, PostgreSQL is not running, sync_state is “”, WAL delta is -1.
-11:12:30 INFO  FSM transition from “demoted” to “catchingup”: A new primary is available. First, try to rewind. If that fails, do a pg_basebackup.
+11:12:30 INFO  FSM transition from "primary" to "demoted": A failover occurred, no longer primary
+11:12:30 INFO  Transition complete: current state is now "demoted"
+11:12:30 INFO  Calling node_active for node default/1/0 with current state: demoted, PostgreSQL is not running, sync_state is "", WAL delta is -1.
+11:12:30 INFO  FSM transition from "demoted" to "catchingup": A new primary is available. First, try to rewind. If that fails, do a pg_basebackup.
 11:12:30 INFO  The primary node returned by the monitor is pg-slave:5432
 11:12:30 INFO  Rewinding PostgreSQL to follow new primary pg-slave:5432
 11:12:30 ERROR Connection to database failed: could not connect to server: No such file or directory
-    Is the server running locally and accepting
-    connections on Unix domain socket “/var/run/postgresql/.s.PGSQL.5432”?

+ Is the server running locally and accepting + connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"? -

11:12:30 ERROR Failed to get the postgresql.conf path from the local postgres server, see above for details +11:12:30 ERROR Failed to get the postgresql.conf path from the local postgres server, see above for details 11:12:30 WARN Failed to rewind demoted primary to standby, trying pg_basebackup instead 11:12:30 INFO Initialising PostgreSQL as a hot standby -11:12:30 INFO Target directory exists: “/var/lib/postgresql/11/main”, stopping PostgreSQL -11:12:30 INFO pg_ctl: no server running

+11:12:30 INFO Target directory exists: "/var/lib/postgresql/11/main", stopping PostgreSQL +11:12:30 INFO pg_ctl: no server running -

11:12:30 INFO pg_ctl stop failed, but PostgreSQL is not running anyway -11:12:30 INFO Running /usr/lib/postgresql/11/bin/pg_basebackup -w -h pg-slave -p 5432 –pgdata /var/lib/postgresql/11/backup -U pgautofailover_replicator –write-recovery-conf –max-rate 100M –wal-method=stream –slot pgautofailover_standby … +11:12:30 INFO pg_ctl stop failed, but PostgreSQL is not running anyway +11:12:30 INFO Running /usr/lib/postgresql/11/bin/pg_basebackup -w -h pg-slave -p 5432 --pgdata /var/lib/postgresql/11/backup -U pgautofailover_replicator --write-recovery-conf --max-rate 100M --wal-method=stream --slot pgautofailover_standby ... 11:12:33 INFO pg_basebackup: initiating base backup, waiting for checkpoint to complete pg_basebackup: checkpoint completed pg_basebackup: write-ahead log start point: 0/12000028 on timeline 6 pg_basebackup: starting background WAL receiver - 0/46937 kB (0%), 0/1 tablespace (…ostgresql/11/backup/backup_label) -4694746947 kB (100%), 0/1 tablespace (…esql/11/backup/global/pg_control) -4694746947 kB (100%), 11 tablespace
+ 0/46937 kB (0%), 0/1 tablespace (...ostgresql/11/backup/backup_label) +46947/46947 kB (100%), 0/1 tablespace (...esql/11/backup/global/pg_control) +46947/46947 kB (100%), 1/1 tablespace pg_basebackup: write-ahead log end point: 0/120000F8 -pg_basebackup: waiting for background process to finish streaming … -pg_basebackup: base backup completed

+pg_basebackup: waiting for background process to finish streaming ... +pg_basebackup: base backup completed -

11:12:33 INFO Postgres is not running, starting postgres -11:12:33 INFO /usr/lib/postgresql/11/bin/pg_ctl –pgdata /var/lib/postgresql/11/main –options “-p 5432” –options “-h *” –wait start +11:12:33 INFO Postgres is not running, starting postgres +11:12:33 INFO /usr/lib/postgresql/11/bin/pg_ctl --pgdata /var/lib/postgresql/11/main --options "-p 5432" --options "-h *" --wait start 11:12:33 INFO PostgreSQL started on port 5432 -11:12:33 INFO Drop replication slot “pgautofailover_standby” -11:12:33 INFO Transition complete: current state is now “catchingup” -11:12:33 INFO Calling node_active for node default/1/0 with current state: catchingup, PostgreSQL is running, sync_state is “”, WAL delta is -1. -11:12:38 INFO Calling node_active for node default/1/0 with current state: catchingup, PostgreSQL is running, sync_state is “”, WAL delta is -1. -11:12:43 INFO Calling node_active for node default/1/0 with current state: catchingup, PostgreSQL is running, sync_state is “”, WAL delta is -1. -11:12:48 INFO Calling node_active for node default/1/0 with current state: catchingup, PostgreSQL is running, sync_state is “”, WAL delta is -1. -11:12:48 INFO FSM transition from “catchingup” to “secondary”: Convinced the monitor that I’m up and running, and eligible for promotion again -11:12:48 INFO Transition complete: current state is now “secondary” -11:12:48 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -11:12:53 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -11:12:58 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -11:13:03 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.

- -

-可以看到 pg_auto_failover 偵測到了cluster node 有意異動,開始同步資料,然後切換角色,從primary變成了 secondary
-
-
-同一時間,在pg-slave 的訊息也有變動
+11:12:33 INFO  Drop replication slot "pgautofailover_standby"
+11:12:33 INFO  Transition complete: current state is now "catchingup"
+11:12:33 INFO  Calling node_active for node default/1/0 with current state: catchingup, PostgreSQL is running, sync_state is "", WAL delta is -1.
+11:12:38 INFO  Calling node_active for node default/1/0 with current state: catchingup, PostgreSQL is running, sync_state is "", WAL delta is -1.
+11:12:43 INFO  Calling node_active for node default/1/0 with current state: catchingup, PostgreSQL is running, sync_state is "", WAL delta is -1.
+11:12:48 INFO  Calling node_active for node default/1/0 with current state: catchingup, PostgreSQL is running, sync_state is "", WAL delta is -1.
+11:12:48 INFO  FSM transition from "catchingup" to "secondary": Convinced the monitor that I'm up and running, and eligible for promotion again
+11:12:48 INFO  Transition complete: current state is now "secondary"
+11:12:48 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+11:12:53 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+11:12:58 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+11:13:03 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
 
-

03:12:34 INFO Calling node_active for node default/4/0 with current state: wait_primary, PostgreSQL is running, sync_state is “async”, WAL delta is 0. -03:12:39 INFO Calling node_active for node default/4/0 with current state: wait_primary, PostgreSQL is running, sync_state is “async”, WAL delta is 0. -03:12:45 INFO Calling node_active for node default/4/0 with current state: wait_primary, PostgreSQL is running, sync_state is “async”, WAL delta is 0. -03:12:50 INFO Calling node_active for node default/4/0 with current state: wait_primary, PostgreSQL is running, sync_state is “async”, WAL delta is 0. -03:12:50 INFO FSM transition from “wait_primary” to “primary”: A healthy secondary appeared -03:12:50 INFO Enabling synchronous replication -03:12:50 INFO Transition complete: current state is now “primary” -03:12:50 INFO Calling node_active for node default/4/0 with current state: primary, PostgreSQL is running, sync_state is “sync”, WAL delta is 0. -03:12:55 INFO Calling node_active for node default/4/0 with current state: primary, PostgreSQL is running, sync_state is “sync”, WAL delta is 0.

+

可以看到 pg_auto_failover 偵測到了cluster node 有意異動,開始同步資料,然後切換角色,從primary變成了 secondary

-

-pg-slave 的狀態從原本的secondary 變成了 primary
+

同一時間,在pg-slave 的訊息也有變動

-來檢查看看 +
03:12:34 INFO  Calling node_active for node default/4/0 with current state: wait_primary, PostgreSQL is running, sync_state is "async", WAL delta is 0.
+03:12:39 INFO  Calling node_active for node default/4/0 with current state: wait_primary, PostgreSQL is running, sync_state is "async", WAL delta is 0.
+03:12:45 INFO  Calling node_active for node default/4/0 with current state: wait_primary, PostgreSQL is running, sync_state is "async", WAL delta is 0.
+03:12:50 INFO  Calling node_active for node default/4/0 with current state: wait_primary, PostgreSQL is running, sync_state is "async", WAL delta is 0.
+03:12:50 INFO  FSM transition from "wait_primary" to "primary": A healthy secondary appeared
+03:12:50 INFO  Enabling synchronous replication
+03:12:50 INFO  Transition complete: current state is now "primary"
+03:12:50 INFO  Calling node_active for node default/4/0 with current state: primary, PostgreSQL is running, sync_state is "sync", WAL delta is 0.
+03:12:55 INFO  Calling node_active for node default/4/0 with current state: primary, PostgreSQL is running, sync_state is "sync", WAL delta is 0.
 
-

2019-09-20 03:19:04 [administrator@pg-slave ~]$ sudo runuser -l postgres -c “pg_autoctl show state” +

pg-slave 的狀態從原本的secondary 變成了 primary

+ +

來檢查看看

+ +
2019-09-20 03:19:04 [administrator@pg-slave ~]$ sudo runuser -l postgres -c "pg_autoctl show state"
       Name |   Port | Group |  Node |     Current State |    Assigned State
-———–+——–+——-+——-+——————-+——————
+-----------+--------+-------+-------+-------------------+------------------
 pg-primary |   5432 |     0 |     1 |         secondary |         secondary
-  pg-slave |   5432 |     0 |     4 |           primary |           primary

- -

2019-09-20 03:19:07 [administrator@pg-slave ~]$

+ pg-slave | 5432 | 0 | 4 | primary | primary -

-跟上面第一次執行的結果不同了,兩台角色互換了,這也是預期中的結果
+2019-09-20 03:19:07 [administrator@pg-slave ~]$ 
+
-那在pg-master 離線期間的異動資料呢?在 pg-master上查得到嗎? +

跟上面第一次執行的結果不同了,兩台角色互換了,這也是預期中的結果

-先透過 URI存取DB 檢查看看 +

那在pg-master 離線期間的異動資料呢?在 pg-master上查得到嗎?

-
+

先透過 URI存取DB 檢查看看

-

2019-09-20 11:12:16 [administrator@pg-primary ~]$ sudo runuser -l postgres -c “psql postgres://pg-slave:5432,pg-primary:5432/postgres?target_session_attrs=read-write” +

2019-09-20 11:12:16 [administrator@pg-primary ~]$ sudo runuser -l postgres -c "psql postgres://pg-slave:5432,pg-primary:5432/postgres?target_session_attrs=read-write"
 psql (11.5 (Ubuntu 11.5-1.pgdg18.04+1))
-Type “help” for help.

+Type "help" for help. -

postgres=# \l +postgres=# \l List of databases - Name | Owner | Encoding | Collate | Ctype | Access privileges
-————-+———-+———–+———+——-+———————– - nexus | postgres | SQL_ASCII | C | C | - postgres | postgres | SQL_ASCII | C | C | + Name | Owner | Encoding | Collate | Ctype | Access privileges +-------------+----------+-----------+---------+-------+----------------------- + nexus | postgres | SQL_ASCII | C | C | + postgres | postgres | SQL_ASCII | C | C | template0 | postgres | SQL_ASCII | C | C | =c/postgres + | | | | | postgres=CTc/postgres template1 | postgres | SQL_ASCII | C | C | =c/postgres + | | | | | postgres=CTc/postgres - testdb_1058 | postgres | SQL_ASCII | C | C | - testdb_1108 | postgres | SQL_ASCII | C | C | -(6 rows)

- -

postgres=# select * from testtbl_1108;

- -

serial

- -
999
-888
- 77
- 66
- 55
+ testdb_1058 | postgres | SQL_ASCII | C       | C     | 
+ testdb_1108 | postgres | SQL_ASCII | C       | C     | 
+(6 rows)
+
+postgres=# select * from testtbl_1108;
+ serial 
+--------
+    999
+    888
+     77
+     66
+     55
+(5 rows)
+
+postgres=#
 
-

(5 rows)

- -

postgres=#

+

GOOD!看來資料也同步過來了

-

-GOOD!看來資料也同步過來了
-
-再來看看單機的狀態
-
+

再來看看單機的狀態

-

2019-09-20 11:22:07 [administrator@pg-primary ~]$ sudo runuser -l postgres -c “psql” +

2019-09-20 11:22:07 [administrator@pg-primary ~]$ sudo runuser -l postgres -c "psql"
 psql (11.5 (Ubuntu 11.5-1.pgdg18.04+1))
-Type “help” for help.

+Type "help" for help. -

postgres=# \l +postgres=# \l List of databases - Name | Owner | Encoding | Collate | Ctype | Access privileges
-————-+———-+———–+———+——-+———————– - nexus | postgres | SQL_ASCII | C | C | - postgres | postgres | SQL_ASCII | C | C | + Name | Owner | Encoding | Collate | Ctype | Access privileges +-------------+----------+-----------+---------+-------+----------------------- + nexus | postgres | SQL_ASCII | C | C | + postgres | postgres | SQL_ASCII | C | C | template0 | postgres | SQL_ASCII | C | C | =c/postgres + | | | | | postgres=CTc/postgres template1 | postgres | SQL_ASCII | C | C | =c/postgres + | | | | | postgres=CTc/postgres - testdb_1058 | postgres | SQL_ASCII | C | C | - testdb_1108 | postgres | SQL_ASCII | C | C | -(6 rows)

- -

postgres=# select * from testtbl_1108;

- -

serial

- -
999
-888
- 77
- 66
- 55
+ testdb_1058 | postgres | SQL_ASCII | C       | C     | 
+ testdb_1108 | postgres | SQL_ASCII | C       | C     | 
+(6 rows)
+
+postgres=# select * from testtbl_1108;
+ serial 
+--------
+    999
+    888
+     77
+     66
+     55
+(5 rows)
+
+postgres=# 
 
-

(5 rows)

- -

postgres=#

+

很好,確認資料有正確的抄寫過來!

-

-很好,確認資料有正確的抄寫過來!
-
-***
+
-以上測試可以發現 pg_auto_failover 這個postgresql的extension 還真的很好用! +

以上測試可以發現 pg_auto_failover 這個postgresql的extension 還真的很好用!

-設定簡單、快速、方便,設定完成後也不用傷腦筋,反正有異常,會自動幫你搞定master-slave之間的主從關係 +

設定簡單、快速、方便,設定完成後也不用傷腦筋,反正有異常,會自動幫你搞定master-slave之間的主從關係

-資料也都會自動同步好,真的是非常推薦啊! +

資料也都會自動同步好,真的是非常推薦啊!

-UPDATE +

UPDATE

-#### 將 pg_auto_failover 用 systemd 管理 +

將 pg_auto_failover 用 systemd 管理

-每次都要手動執行 pg_autoctl run 太麻煩了,應該用systemd 或者是 supervisor 來管理 +

每次都要手動執行 pg_autoctl run 太麻煩了,應該用systemd 或者是 supervisor 來管理

-而pg_auto_failover 有指令可以做成 systemd 的service 檔案 +

而pg_auto_failover 有指令可以做成 systemd 的service 檔案

-既然人家都做好了,當然就直接用 systemd 來做 +

既然人家都做好了,當然就直接用 systemd 來做

-執行以下指令,產生 systemd configuration +

執行以下指令,產生 systemd configuration

+
sudo runuser -l postgres -c "pg_autoctl -q show systemd" | sudo tee /etc/systemd/system/pgautofailover.service
 
-

sudo runuser -l postgres -c “pg_autoctl -q show systemd” | sudo tee /etc/systemd/system/pgautofailover.service

+

在/etc/systemd/system底下會多出一個 pgautofailover.service檔案

-
在/etc/systemd/system底下會多出一個 pgautofailover.service檔案
+

內容就是 pg_auto_failover 所提供的 systemd config

-內容就是 pg_auto_failover 所提供的 systemd config +

接下來就可以直接用 sudo service pgautofailover start 來啟動

-接下來就可以直接用 sudo service pgautofailover start 來啟動 +

沒有什麼問題,不過呢,這樣的作法,會在 /var/log/syslog 塞滿了 pg_auto_failover 的紀錄

-沒有什麼問題,不過呢,這樣的作法,會在 /var/log/syslog 塞滿了 pg_auto_failover 的紀錄 -
- -

Sep 20 14:59:34 pg-primary systemd[1]: Started pg_auto_failover. -Sep 20 14:59:34 pg-primary pg_autoctl[7817]: 14:59:34 INFO Managing PostgreSQL installation at “/var/lib/postgresql/11/main” -Sep 20 14:59:34 pg-primary pg_autoctl[7817]: 14:59:34 INFO Found a stale pidfile at “/tmp/pg_autoctl/var/lib/postgresql/11/main/pg_autoctl.pid” -Sep 20 14:59:34 pg-primary pg_autoctl[7817]: 14:59:34 WARN Removing the stale pid file “/tmp/pg_autoctl/var/lib/postgresql/11/main/pg_autoctl.pid” -Sep 20 14:59:34 pg-primary pg_autoctl[7817]: 14:59:34 INFO The version of extenstion “pgautofailover” is “1.0” on the monitor +

Sep 20 14:59:34 pg-primary systemd[1]: Started pg_auto_failover.
+Sep 20 14:59:34 pg-primary pg_autoctl[7817]: 14:59:34 INFO  Managing PostgreSQL installation at "/var/lib/postgresql/11/main"
+Sep 20 14:59:34 pg-primary pg_autoctl[7817]: 14:59:34 INFO  Found a stale pidfile at "/tmp/pg_autoctl/var/lib/postgresql/11/main/pg_autoctl.pid"
+Sep 20 14:59:34 pg-primary pg_autoctl[7817]: 14:59:34 WARN  Removing the stale pid file "/tmp/pg_autoctl/var/lib/postgresql/11/main/pg_autoctl.pid"
+Sep 20 14:59:34 pg-primary pg_autoctl[7817]: 14:59:34 INFO  The version of extenstion "pgautofailover" is "1.0" on the monitor
 Sep 20 14:59:34 pg-primary pg_autoctl[7817]: 14:59:34 INFO  pg_autoctl service is starting
-Sep 20 14:59:34 pg-primary pg_autoctl[7817]: 14:59:34 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 14:59:39 pg-primary pg_autoctl[7817]: 14:59:39 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 14:59:44 pg-primary pg_autoctl[7817]: 14:59:44 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 14:59:49 pg-primary pg_autoctl[7817]: 14:59:49 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 14:59:54 pg-primary pg_autoctl[7817]: 14:59:54 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 14:59:59 pg-primary pg_autoctl[7817]: 14:59:59 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:00:04 pg-primary pg_autoctl[7817]: 15:00:04 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:00:09 pg-primary pg_autoctl[7817]: 15:00:09 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:00:15 pg-primary pg_autoctl[7817]: 15:00:14 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:00:20 pg-primary pg_autoctl[7817]: 15:00:20 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:00:25 pg-primary pg_autoctl[7817]: 15:00:25 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:00:30 pg-primary pg_autoctl[7817]: 15:00:30 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:00:35 pg-primary pg_autoctl[7817]: 15:00:35 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:00:40 pg-primary pg_autoctl[7817]: 15:00:40 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:00:45 pg-primary pg_autoctl[7817]: 15:00:45 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:00:50 pg-primary pg_autoctl[7817]: 15:00:50 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:00:55 pg-primary pg_autoctl[7817]: 15:00:55 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:01:00 pg-primary pg_autoctl[7817]: 15:01:00 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:01:05 pg-primary pg_autoctl[7817]: 15:01:05 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:01:10 pg-primary pg_autoctl[7817]: 15:01:10 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:01:15 pg-primary pg_autoctl[7817]: 15:01:15 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:01:20 pg-primary pg_autoctl[7817]: 15:01:20 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:01:25 pg-primary pg_autoctl[7817]: 15:01:25 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:01:30 pg-primary pg_autoctl[7817]: 15:01:30 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:01:35 pg-primary pg_autoctl[7817]: 15:01:35 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:01:40 pg-primary pg_autoctl[7817]: 15:01:40 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:01:45 pg-primary pg_autoctl[7817]: 15:01:45 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:01:50 pg-primary pg_autoctl[7817]: 15:01:50 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:01:55 pg-primary pg_autoctl[7817]: 15:01:55 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:02:00 pg-primary pg_autoctl[7817]: 15:02:00 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:02:05 pg-primary pg_autoctl[7817]: 15:02:05 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:02:10 pg-primary pg_autoctl[7817]: 15:02:10 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:02:15 pg-primary pg_autoctl[7817]: 15:02:15 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:02:20 pg-primary pg_autoctl[7817]: 15:02:20 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:02:25 pg-primary pg_autoctl[7817]: 15:02:25 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:02:30 pg-primary pg_autoctl[7817]: 15:02:30 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:02:35 pg-primary pg_autoctl[7817]: 15:02:35 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:02:40 pg-primary pg_autoctl[7817]: 15:02:40 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:02:46 pg-primary pg_autoctl[7817]: 15:02:46 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:02:51 pg-primary pg_autoctl[7817]: 15:02:51 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:02:56 pg-primary pg_autoctl[7817]: 15:02:56 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:03:01 pg-primary pg_autoctl[7817]: 15:03:01 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:03:06 pg-primary pg_autoctl[7817]: 15:03:06 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:03:11 pg-primary pg_autoctl[7817]: 15:03:11 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:03:16 pg-primary pg_autoctl[7817]: 15:03:16 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:03:21 pg-primary pg_autoctl[7817]: 15:03:21 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:03:26 pg-primary pg_autoctl[7817]: 15:03:26 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:03:31 pg-primary pg_autoctl[7817]: 15:03:31 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:03:36 pg-primary pg_autoctl[7817]: 15:03:36 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:03:41 pg-primary pg_autoctl[7817]: 15:03:41 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:03:46 pg-primary pg_autoctl[7817]: 15:03:46 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:03:51 pg-primary pg_autoctl[7817]: 15:03:51 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:03:57 pg-primary pg_autoctl[7817]: 15:03:57 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:04:02 pg-primary pg_autoctl[7817]: 15:04:02 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:04:07 pg-primary pg_autoctl[7817]: 15:04:07 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:04:12 pg-primary pg_autoctl[7817]: 15:04:12 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:04:17 pg-primary pg_autoctl[7817]: 15:04:17 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:04:22 pg-primary pg_autoctl[7817]: 15:04:22 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:04:27 pg-primary pg_autoctl[7817]: 15:04:27 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:04:32 pg-primary pg_autoctl[7817]: 15:04:32 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:04:37 pg-primary pg_autoctl[7817]: 15:04:37 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:04:42 pg-primary pg_autoctl[7817]: 15:04:42 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:04:47 pg-primary pg_autoctl[7817]: 15:04:47 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.
-Sep 20 15:04:52 pg-primary pg_autoctl[7817]: 15:04:52 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0.

- -

-這樣下去不用多久,syslog 一定爆,而且嚴重干擾到其他的系統訊息
-
-所以要修改一下 config ,把log 寫到另外的檔案去
-
-在/etc/systemd/system/pgautofailover.service 加入底下兩行
+Sep 20 14:59:34 pg-primary pg_autoctl[7817]: 14:59:34 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 14:59:39 pg-primary pg_autoctl[7817]: 14:59:39 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 14:59:44 pg-primary pg_autoctl[7817]: 14:59:44 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 14:59:49 pg-primary pg_autoctl[7817]: 14:59:49 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 14:59:54 pg-primary pg_autoctl[7817]: 14:59:54 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 14:59:59 pg-primary pg_autoctl[7817]: 14:59:59 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:00:04 pg-primary pg_autoctl[7817]: 15:00:04 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:00:09 pg-primary pg_autoctl[7817]: 15:00:09 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:00:15 pg-primary pg_autoctl[7817]: 15:00:14 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:00:20 pg-primary pg_autoctl[7817]: 15:00:20 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:00:25 pg-primary pg_autoctl[7817]: 15:00:25 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:00:30 pg-primary pg_autoctl[7817]: 15:00:30 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:00:35 pg-primary pg_autoctl[7817]: 15:00:35 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:00:40 pg-primary pg_autoctl[7817]: 15:00:40 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:00:45 pg-primary pg_autoctl[7817]: 15:00:45 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:00:50 pg-primary pg_autoctl[7817]: 15:00:50 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:00:55 pg-primary pg_autoctl[7817]: 15:00:55 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:01:00 pg-primary pg_autoctl[7817]: 15:01:00 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:01:05 pg-primary pg_autoctl[7817]: 15:01:05 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:01:10 pg-primary pg_autoctl[7817]: 15:01:10 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:01:15 pg-primary pg_autoctl[7817]: 15:01:15 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:01:20 pg-primary pg_autoctl[7817]: 15:01:20 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:01:25 pg-primary pg_autoctl[7817]: 15:01:25 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:01:30 pg-primary pg_autoctl[7817]: 15:01:30 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:01:35 pg-primary pg_autoctl[7817]: 15:01:35 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:01:40 pg-primary pg_autoctl[7817]: 15:01:40 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:01:45 pg-primary pg_autoctl[7817]: 15:01:45 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:01:50 pg-primary pg_autoctl[7817]: 15:01:50 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:01:55 pg-primary pg_autoctl[7817]: 15:01:55 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:02:00 pg-primary pg_autoctl[7817]: 15:02:00 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:02:05 pg-primary pg_autoctl[7817]: 15:02:05 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:02:10 pg-primary pg_autoctl[7817]: 15:02:10 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:02:15 pg-primary pg_autoctl[7817]: 15:02:15 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:02:20 pg-primary pg_autoctl[7817]: 15:02:20 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:02:25 pg-primary pg_autoctl[7817]: 15:02:25 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:02:30 pg-primary pg_autoctl[7817]: 15:02:30 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:02:35 pg-primary pg_autoctl[7817]: 15:02:35 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:02:40 pg-primary pg_autoctl[7817]: 15:02:40 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:02:46 pg-primary pg_autoctl[7817]: 15:02:46 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:02:51 pg-primary pg_autoctl[7817]: 15:02:51 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:02:56 pg-primary pg_autoctl[7817]: 15:02:56 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:03:01 pg-primary pg_autoctl[7817]: 15:03:01 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:03:06 pg-primary pg_autoctl[7817]: 15:03:06 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:03:11 pg-primary pg_autoctl[7817]: 15:03:11 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:03:16 pg-primary pg_autoctl[7817]: 15:03:16 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:03:21 pg-primary pg_autoctl[7817]: 15:03:21 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:03:26 pg-primary pg_autoctl[7817]: 15:03:26 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:03:31 pg-primary pg_autoctl[7817]: 15:03:31 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:03:36 pg-primary pg_autoctl[7817]: 15:03:36 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:03:41 pg-primary pg_autoctl[7817]: 15:03:41 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:03:46 pg-primary pg_autoctl[7817]: 15:03:46 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:03:51 pg-primary pg_autoctl[7817]: 15:03:51 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:03:57 pg-primary pg_autoctl[7817]: 15:03:57 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:04:02 pg-primary pg_autoctl[7817]: 15:04:02 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:04:07 pg-primary pg_autoctl[7817]: 15:04:07 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:04:12 pg-primary pg_autoctl[7817]: 15:04:12 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:04:17 pg-primary pg_autoctl[7817]: 15:04:17 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:04:22 pg-primary pg_autoctl[7817]: 15:04:22 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:04:27 pg-primary pg_autoctl[7817]: 15:04:27 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:04:32 pg-primary pg_autoctl[7817]: 15:04:32 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:04:37 pg-primary pg_autoctl[7817]: 15:04:37 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:04:42 pg-primary pg_autoctl[7817]: 15:04:42 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:04:47 pg-primary pg_autoctl[7817]: 15:04:47 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+Sep 20 15:04:52 pg-primary pg_autoctl[7817]: 15:04:52 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
 
-

StandardOutput=file:/var/log/pgautofailover.log -StandardError=file:/var/log/pgautofailover-error.log

+

這樣下去不用多久,syslog 一定爆,而且嚴重干擾到其他的系統訊息

-
然後 sudo systemctl daemon-reload 接著重起 sudo service pgautofailover restart
+

所以要修改一下 config ,把log 寫到另外的檔案去

-就會把訊息都寫到 /var/log/pgautofailover.log , /var/log/pgautofailover-error.log +

在/etc/systemd/system/pgautofailover.service 加入底下兩行

-再用 logrotate 來管理就可以了 +
StandardOutput=file:/var/log/pgautofailover.log
+StandardError=file:/var/log/pgautofailover-error.log
+
-這個動作需要在三台node都執行 +

然後 sudo systemctl daemon-reload 接著重起 sudo service pgautofailover restart

-不過我在想應該是可以不要產生那麼多log ,只要紀錄 critical 就好了?來試試看加入 LogLevelMax=3 的效果 +

就會把訊息都寫到 /var/log/pgautofailover.log , /var/log/pgautofailover-error.log

-在 /etc/systemd/system/pgautofailover.service 檔案中的service 區段 加入 -
+

再用 logrotate 來管理就可以了

-

LogLevelMax=3

+

這個動作需要在三台node都執行

-
loglevel 的等級定義看這邊
+

不過我在想應該是可以不要產生那麼多log ,只要紀錄 critical 就好了?來試試看加入 LogLevelMax=3 的效果

-https://www.ctrl.blog/entry/systemd-log-levels.html +

在 /etc/systemd/system/pgautofailover.service 檔案中的service 區段 加入

+
LogLevelMax=3
+
-* emergency (0) -* alert (1) -* critical (2) -* error (3) -* warning (4) -* notice (5) -* info (6) -* debug (7) +

loglevel 的等級定義看這邊

-完成後,一樣執行 +

https://www.ctrl.blog/entry/systemd-log-levels.html

-
+
    +
  • emergency (0)
  • +
  • alert (1)
  • +
  • critical (2)
  • +
  • error (3)
  • +
  • warning (4)
  • +
  • notice (5)
  • +
  • info (6)
  • +
  • debug (7)
  • +
-

sudo systemctl daemon-reload -sudo service pgautofailover restart

+

完成後,一樣執行

-
看一下檔案內容,怎麼就沒有訊息了!
+
sudo systemctl daemon-reload
+sudo service pgautofailover restart
 
-

2019-09-20 15:23:15 [administrator@pg-primary ~]$ sudo tail -30 /var/log/pgautofailover.log -15:14:22 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:14:27 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:14:32 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:14:37 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:14:42 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:14:47 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:14:52 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:14:57 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:15:02 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:15:07 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:15:12 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:15:17 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:15:22 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:15:27 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:15:32 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:15:37 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:15:42 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:15:47 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:15:52 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:15:58 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:16:03 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:16:08 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:16:13 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:16:18 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:16:23 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:16:28 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:16:34 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. -15:16:39 INFO Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 0. +

看一下檔案內容,怎麼就沒有訊息了!

+ +
2019-09-20 15:23:15 [administrator@pg-primary ~]$ sudo tail -30 /var/log/pgautofailover.log 
+15:14:22 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:14:27 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:14:32 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:14:37 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:14:42 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:14:47 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:14:52 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:14:57 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:15:02 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:15:07 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:15:12 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:15:17 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:15:22 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:15:27 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:15:32 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:15:37 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:15:42 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:15:47 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:15:52 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:15:58 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:16:03 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:16:08 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:16:13 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:16:18 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:16:23 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:16:28 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:16:34 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
+15:16:39 INFO  Calling node_active for node default/1/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 0.
 15:16:39 WARN  Smart shutdown: received signal Terminated
 15:16:39 INFO  pg_autoctl service stopping
-2019-09-20 15:23:29 [administrator@pg-primary ~]$

- -

-可是我的服務活著啊!
-
+2019-09-20 15:23:29 [administrator@pg-primary ~]$ 
 
-

2019-09-20 15:23:29 [administrator@pg-primary ~]$ sudo service pgautofailover status +

可是我的服務活著啊!

+ +
2019-09-20 15:23:29 [administrator@pg-primary ~]$ sudo service pgautofailover status
 ● pgautofailover.service - pg_auto_failover
    Loaded: loaded (/etc/systemd/system/pgautofailover.service; disabled; vendor preset: enabled)
    Active: active (running) since Fri 2019-09-20 15:16:39 CST; 7min ago
  Main PID: 8611 (pg_autoctl)
     Tasks: 1 (limit: 2321)
    CGroup: /system.slice/pgautofailover.service
-           └─8611 /usr/bin/pg_autoctl run

- -

Sep 20 15:16:39 pg-primary systemd[1]: Started pg_auto_failover. -2019-09-20 15:23:56 [administrator@pg-primary ~]$

- -

-拿另一台來做實驗,先不要加入 loglevelmax=3 ,看看log到底長怎樣!
+           └─8611 /usr/bin/pg_autoctl run
 
+Sep 20 15:16:39 pg-primary systemd[1]: Started pg_auto_failover.
+2019-09-20 15:23:56 [administrator@pg-primary ~]$
 
-

2019-09-20 07:25:34 [administrator@pg-slave ~]$ sudo service pgautofailover stop +

拿另一台來做實驗,先不要加入 loglevelmax=3 ,看看log到底長怎樣!

+ +
2019-09-20 07:25:34 [administrator@pg-slave ~]$ sudo service pgautofailover stop
 2019-09-20 07:25:58 [administrator@pg-slave ~]$ sudo tail -10 /var/log/syslog
-Sep 20 07:25:32 pg-slave pg_autoctl[5840]: 07:25:32 INFO  Calling node_active for node default/4/0 with current state: primary, PostgreSQL is running, sync_state is “sync”, WAL delta is 0.
-Sep 20 07:25:37 pg-slave pg_autoctl[5840]: 07:25:37 INFO  Calling node_active for node default/4/0 with current state: primary, PostgreSQL is running, sync_state is “sync”, WAL delta is 0.
-Sep 20 07:25:42 pg-slave pg_autoctl[5840]: 07:25:42 INFO  Calling node_active for node default/4/0 with current state: primary, PostgreSQL is running, sync_state is “sync”, WAL delta is 0.
-Sep 20 07:25:47 pg-slave pg_autoctl[5840]: 07:25:47 INFO  Calling node_active for node default/4/0 with current state: primary, PostgreSQL is running, sync_state is “sync”, WAL delta is 0.
-Sep 20 07:25:52 pg-slave pg_autoctl[5840]: 07:25:52 INFO  Calling node_active for node default/4/0 with current state: primary, PostgreSQL is running, sync_state is “sync”, WAL delta is 0.
-Sep 20 07:25:57 pg-slave pg_autoctl[5840]: 07:25:57 INFO  Calling node_active for node default/4/0 with current state: primary, PostgreSQL is running, sync_state is “sync”, WAL delta is 0.
-Sep 20 07:25:58 pg-slave systemd[1]: Stopping pg_auto_failover…
+Sep 20 07:25:32 pg-slave pg_autoctl[5840]: 07:25:32 INFO  Calling node_active for node default/4/0 with current state: primary, PostgreSQL is running, sync_state is "sync", WAL delta is 0.
+Sep 20 07:25:37 pg-slave pg_autoctl[5840]: 07:25:37 INFO  Calling node_active for node default/4/0 with current state: primary, PostgreSQL is running, sync_state is "sync", WAL delta is 0.
+Sep 20 07:25:42 pg-slave pg_autoctl[5840]: 07:25:42 INFO  Calling node_active for node default/4/0 with current state: primary, PostgreSQL is running, sync_state is "sync", WAL delta is 0.
+Sep 20 07:25:47 pg-slave pg_autoctl[5840]: 07:25:47 INFO  Calling node_active for node default/4/0 with current state: primary, PostgreSQL is running, sync_state is "sync", WAL delta is 0.
+Sep 20 07:25:52 pg-slave pg_autoctl[5840]: 07:25:52 INFO  Calling node_active for node default/4/0 with current state: primary, PostgreSQL is running, sync_state is "sync", WAL delta is 0.
+Sep 20 07:25:57 pg-slave pg_autoctl[5840]: 07:25:57 INFO  Calling node_active for node default/4/0 with current state: primary, PostgreSQL is running, sync_state is "sync", WAL delta is 0.
+Sep 20 07:25:58 pg-slave systemd[1]: Stopping pg_auto_failover...
 Sep 20 07:25:58 pg-slave pg_autoctl[5840]: 07:25:58 WARN  Smart shutdown: received signal Terminated
 Sep 20 07:25:58 pg-slave pg_autoctl[5840]: 07:25:58 INFO  pg_autoctl service stopping
-Sep 20 07:25:58 pg-slave systemd[1]: Stopped pg_auto_failover.

+Sep 20 07:25:58 pg-slave systemd[1]: Stopped pg_auto_failover. +
-

-好吧,看不出有什麼差別! 直接把primary 關機測試! 結果還是看不出來 0rz
+

好吧,看不出有什麼差別! 直接把primary 關機測試! 結果還是看不出來 0rz

-放棄加入loglevel 的想法,就用logrotate 來管理吧! +

放棄加入loglevel 的想法,就用logrotate 來管理吧!

-UPDATE +

UPDATE

-用systemd 來管理的作法失敗了,重起之後, pgautofailover 不會自動啟動... +

用systemd 來管理的作法失敗了,重起之後, pgautofailover 不會自動啟動…

-找了很久找不出原因,改用 supervisor來做好了 +

找了很久找不出原因,改用 supervisor來做好了

-*** +
-#### 設定 supervisor +

設定 supervisor

-**安裝supervisor** -
+

安裝supervisor

-

sudo apt install supervisor +

sudo apt install supervisor
 sudo vim /etc/supervisor/conf.d/pgautofailover.conf
 加入以下內容
 [program:pgautofailover]
@@ -1038,47 +977,45 @@ user = postgres
 autostart=true
 autorestart=true
 redirect_stderr = true
-environment=PGDATA=/var/lib/postgresql/11/main,HOME=/var/lib/postgresql

- -
重新啟動 supervisor & 檢查狀態
+environment=PGDATA=/var/lib/postgresql/11/main,HOME=/var/lib/postgresql
 
-

2019-09-20 17:26:22 [administrator@pg-primary postgresql]$ sudo service supervisor restart +

重新啟動 supervisor & 檢查狀態

+ +
2019-09-20 17:26:22 [administrator@pg-primary postgresql]$ sudo service supervisor restart
 2019-09-20 17:26:57 [administrator@pg-primary postgresql]$ sudo supervisorctl status
 pgautofailover                   RUNNING   pid 14554, uptime 0:00:05
-2019-09-20 17:27:03 [administrator@pg-primary postgresql]$

- -
檢查 supervisor log
+2019-09-20 17:27:03 [administrator@pg-primary postgresql]$ 
 
-

2019-09-20 17:27:03 [administrator@pg-primary postgresql]$ sudo tail -30 /var/log/supervisor/pgautofailover-stdout—supervisor-5V8qET.log -17:26:58 INFO Managing PostgreSQL installation at “/var/lib/postgresql/11/main” -17:26:58 INFO Found a stale pidfile at “/tmp/pg_autoctl/var/lib/postgresql/11/main/pg_autoctl.pid” -17:26:58 WARN Removing the stale pid file “/tmp/pg_autoctl/var/lib/postgresql/11/main/pg_autoctl.pid” -17:26:58 INFO The version of extenstion “pgautofailover” is “1.0” on the monitor -17:26:58 INFO pg_autoctl service is starting -17:26:58 INFO Calling node_active for node default/1/0 with current state: primary, PostgreSQL is running, sync_state is “sync”, WAL delta is 0. -17:27:03 INFO Calling node_active for node default/1/0 with current state: primary, PostgreSQL is running, sync_state is “sync”, WAL delta is 0. -17:27:08 INFO Calling node_active for node default/1/0 with current state: primary, PostgreSQL is running, sync_state is “sync”, WAL delta is 0. -17:27:13 INFO Calling node_active for node default/1/0 with current state: primary, PostgreSQL is running, sync_state is “sync”, WAL delta is 0. -17:27:18 INFO Calling node_active for node default/1/0 with current state: primary, PostgreSQL is running, sync_state is “sync”, WAL delta is 0. -2019-09-20 17:27:21 [administrator@pg-primary postgresql]$

+

檢查 supervisor log

-

-簡單多了 ... 在剩下的monitor/slave 也一樣操作安裝、設定supervisor 就可以了
+
2019-09-20 17:27:03 [administrator@pg-primary postgresql]$ sudo tail -30 /var/log/supervisor/pgautofailover-stdout---supervisor-5V8qET.log 
+17:26:58 INFO  Managing PostgreSQL installation at "/var/lib/postgresql/11/main"
+17:26:58 INFO  Found a stale pidfile at "/tmp/pg_autoctl/var/lib/postgresql/11/main/pg_autoctl.pid"
+17:26:58 WARN  Removing the stale pid file "/tmp/pg_autoctl/var/lib/postgresql/11/main/pg_autoctl.pid"
+17:26:58 INFO  The version of extenstion "pgautofailover" is "1.0" on the monitor
+17:26:58 INFO  pg_autoctl service is starting
+17:26:58 INFO  Calling node_active for node default/1/0 with current state: primary, PostgreSQL is running, sync_state is "sync", WAL delta is 0.
+17:27:03 INFO  Calling node_active for node default/1/0 with current state: primary, PostgreSQL is running, sync_state is "sync", WAL delta is 0.
+17:27:08 INFO  Calling node_active for node default/1/0 with current state: primary, PostgreSQL is running, sync_state is "sync", WAL delta is 0.
+17:27:13 INFO  Calling node_active for node default/1/0 with current state: primary, PostgreSQL is running, sync_state is "sync", WAL delta is 0.
+17:27:18 INFO  Calling node_active for node default/1/0 with current state: primary, PostgreSQL is running, sync_state is "sync", WAL delta is 0.
+2019-09-20 17:27:21 [administrator@pg-primary postgresql]$ 
+
-喔,順便把剛剛新增的 systemd config 給砍了.. +

簡單多了 … 在剩下的monitor/slave 也一樣操作安裝、設定supervisor 就可以了

-不過不砍也沒差,反正開機也不會自己啟動 = =+ +

喔,順便把剛剛新增的 systemd config 給砍了..

-
+

不過不砍也沒差,反正開機也不會自己啟動 = =+

-

2019-09-20 17:24:17 [administrator@pg-slave ~]$ sudo rm -rf /etc/systemd/system/pgautofailover.service +

2019-09-20 17:24:17 [administrator@pg-slave ~]$ sudo rm -rf /etc/systemd/system/pgautofailover.service
 #安裝supervisor
 2019-09-20 17:23:08 [administrator@pg-slave ~]$ sudo apt install supervisor
-Reading package lists… Done
-Building dependency tree
-Reading state information… Done +Reading package lists... Done +Building dependency tree +Reading state information... Done The following additional packages will be installed: python-meld3 python-pkg-resources Suggested packages: @@ -1089,46 +1026,66 @@ The following NEW packages will be installed: Need to get 415 kB of archives. After this operation, 2,138 kB of additional disk space will be used. Do you want to continue? [Y/n] y -Get:1 http://archive.ubuntu.com/ubuntu bionic/main amd64 python-pkg-resources all 39.0.1-2 [128 kB] -Get:2 http://archive.ubuntu.com/ubuntu bionic/universe amd64 python-meld3 all 1.0.2-2 [30.9 kB] -Get:3 http://archive.ubuntu.com/ubuntu bionic/universe amd64 supervisor all 3.3.1-1.1 [256 kB] -Fetched 415 kB in 0s (6,752 kB/s)
+Get:1 http://archive.ubuntu.com/ubuntu bionic/main amd64 python-pkg-resources all 39.0.1-2 [128 kB] +Get:2 http://archive.ubuntu.com/ubuntu bionic/universe amd64 python-meld3 all 1.0.2-2 [30.9 kB] +Get:3 http://archive.ubuntu.com/ubuntu bionic/universe amd64 supervisor all 3.3.1-1.1 [256 kB] +Fetched 415 kB in 0s (6,752 kB/s) Selecting previously unselected package python-pkg-resources. -(Reading database … 103586 files and directories currently installed.) -Preparing to unpack …/python-pkg-resources_39.0.1-2_all.deb … -Unpacking python-pkg-resources (39.0.1-2) … +(Reading database ... 103586 files and directories currently installed.) +Preparing to unpack .../python-pkg-resources_39.0.1-2_all.deb ... +Unpacking python-pkg-resources (39.0.1-2) ... Selecting previously unselected package python-meld3. -Preparing to unpack …/python-meld3_1.0.2-2_all.deb … -Unpacking python-meld3 (1.0.2-2) … +Preparing to unpack .../python-meld3_1.0.2-2_all.deb ... +Unpacking python-meld3 (1.0.2-2) ... Selecting previously unselected package supervisor. -Preparing to unpack …/supervisor_3.3.1-1.1_all.deb … -Unpacking supervisor (3.3.1-1.1) … -Processing triggers for ureadahead (0.100.0-20) … -Setting up python-meld3 (1.0.2-2) … -Setting up python-pkg-resources (39.0.1-2) … -Setting up supervisor (3.3.1-1.1) … +Preparing to unpack .../supervisor_3.3.1-1.1_all.deb ... +Unpacking supervisor (3.3.1-1.1) ... +Processing triggers for ureadahead (0.100.0-20) ... +Setting up python-meld3 (1.0.2-2) ... +Setting up python-pkg-resources (39.0.1-2) ... +Setting up supervisor (3.3.1-1.1) ... Created symlink /etc/systemd/system/multi-user.target.wants/supervisor.service → /lib/systemd/system/supervisor.service. -Processing triggers for systemd (237-3ubuntu10.29) … -Processing triggers for man-db (2.8.3-2) … -Processing triggers for ureadahead (0.100.0-20) … +Processing triggers for systemd (237-3ubuntu10.29) ... +Processing triggers for man-db (2.8.3-2) ... +Processing triggers for ureadahead (0.100.0-20) ... 2019-09-20 17:23:37 [administrator@pg-slave ~]$ sudo vim /etc/supervisor/conf.d/pgautofailover.conf #貼上上面的 conf 內容 2019-09-20 17:30:12 [administrator@pg-slave ~]$ sudo service supervisor restart 2019-09-20 17:30:58 [administrator@pg-slave ~]$ sudo supervisorctl status pgautofailover RUNNING pid 2232, uptime 0:00:04 -2019-09-20 17:31:03 [administrator@pg-slave ~]$ sudo tail -30 /var/log/supervisor/pgautofailover-stdout—supervisor-LC_FG8.log -17:30:59 INFO Managing PostgreSQL installation at “/var/lib/postgresql/11/main” -17:30:59 INFO Found a stale pidfile at “/tmp/pg_autoctl/var/lib/postgresql/11/main/pg_autoctl.pid” -17:30:59 WARN Removing the stale pid file “/tmp/pg_autoctl/var/lib/postgresql/11/main/pg_autoctl.pid” -17:30:59 INFO The version of extenstion “pgautofailover” is “1.0” on the monitor +2019-09-20 17:31:03 [administrator@pg-slave ~]$ sudo tail -30 /var/log/supervisor/pgautofailover-stdout---supervisor-LC_FG8.log +17:30:59 INFO Managing PostgreSQL installation at "/var/lib/postgresql/11/main" +17:30:59 INFO Found a stale pidfile at "/tmp/pg_autoctl/var/lib/postgresql/11/main/pg_autoctl.pid" +17:30:59 WARN Removing the stale pid file "/tmp/pg_autoctl/var/lib/postgresql/11/main/pg_autoctl.pid" +17:30:59 INFO The version of extenstion "pgautofailover" is "1.0" on the monitor 17:30:59 INFO pg_autoctl service is starting -17:30:59 INFO Calling node_active for node default/4/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 320. -17:31:04 INFO Calling node_active for node default/4/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 320. -17:31:09 INFO Calling node_active for node default/4/0 with current state: secondary, PostgreSQL is running, sync_state is “”, WAL delta is 320. -2019-09-20 17:31:13 [administrator@pg-slave ~]$ -```

+17:30:59 INFO Calling node_active for node default/4/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 320. +17:31:04 INFO Calling node_active for node default/4/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 320. +17:31:09 INFO Calling node_active for node default/4/0 with current state: secondary, PostgreSQL is running, sync_state is "", WAL delta is 320. +2019-09-20 17:31:13 [administrator@pg-slave ~]$ +

搞定,收工!準備寫 ansible playbook!

+ +
+ +

UPDATE

+ +

新增用logrotate 來管理 pgautofailover logfile

+ +
sudo vim /etc/logrotate.d/pgautofailover
+
+/var/log/supervisor/pgautofail*.log {
+    daily
+    rotate 7                                                         
+    copytruncate
+    delaycompress    # today and yesterday will not compress
+    compress
+    missingok
+    notifempty
+}    
+
+