What happened?
We want to bind a standby VIP to the sync standby with the highest sync_priority, according to the sync_standby list in DCS. However, the sync_standby list order is inconsistent with synchronous_standby_names: it is sorted by application name, not by sync_priority. Besides, synchronous_standby_names is not stable when changing synchronous_node_count.
To make the sync_standby and synchronous_standby_names ordering more stable, I suggest considering sync_priority as the third sort condition, after sync_state and lsn, when collecting ReplicaList from pg_stat_replication. Please help evaluate this proposal. Thanks.
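As a rough sketch of the proposal (names and shapes are illustrative, not Patroni's actual internals), the tie-break would look like this:

```python
from typing import List, NamedTuple

class Replica(NamedTuple):
    application_name: str
    sync_state: str     # 'sync', 'potential', or 'async' from pg_stat_replication
    lsn: int            # flushed/replayed LSN as an integer
    sync_priority: int  # position in synchronous_standby_names (1 = highest)

# Higher rank means a more preferred sync_state.
STATE_RANK = {'sync': 2, 'potential': 1, 'async': 0}

def order_replicas(rows: List[Replica]) -> List[Replica]:
    # Best-first: preferred sync_state, then highest LSN, then (the proposed
    # third condition) the lowest sync_priority number.
    return sorted(rows, key=lambda r: (-STATE_RANK[r.sync_state],
                                       -r.lsn,
                                       r.sync_priority))

rows = [Replica('ee_02', 'sync', 100, 2),
        Replica('ee_03', 'sync', 100, 1)]
# With equal sync_state and lsn, ee_03 (priority 1) now sorts first.
assert [r.application_name for r in order_replicas(rows)] == ['ee_03', 'ee_02']
```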
How can we reproduce it (as minimally and precisely as possible)?
1. Set up a cluster with 3 nodes.
2. Set synchronous_mode to true via patronictl edit-config; ee_03 was selected as the sync standby.
3. Set synchronous_node_count to 2 via patronictl edit-config; ee_03 and ee_02 were selected as sync standbys.
4. Check the sync standby list:
   postgresql.conf:
   synchronous_standby_names = '2 (ee_03,ee_02)'
   /sync key in etcd:
   {"leader":"ee_01","sync_standby":"ee_02,ee_03"}
5. Set synchronous_node_count to 1; sometimes ee_02 was selected as the sync standby, sometimes ee_03.
What did you expect to happen?
The sync_standby list order should be consistent with synchronous_standby_names.
synchronous_standby_names should change consistently when synchronous_node_count changes. For example, when decreasing synchronous_node_count, the sync standby with the higher sync_priority is expected to remain in the list.
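The expected shrink behavior can be illustrated with a small sketch (the helper name is hypothetical, not Patroni code):

```python
# Hypothetical helper: when synchronous_node_count decreases, keep the
# standbys with the best (numerically lowest) sync_priority.
def shrink_sync_list(standbys, node_count):
    # standbys: list of (application_name, sync_priority) pairs
    best = sorted(standbys, key=lambda s: s[1])[:node_count]
    return [name for name, _ in best]

# ee_03 has priority 1 in '2 (ee_03,ee_02)', so it should be the one kept.
assert shrink_sync_list([('ee_03', 1), ('ee_02', 2)], 1) == ['ee_03']
```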
Patroni/PostgreSQL/DCS version
Patroni version: 3.1.0
PostgreSQL version: 13.0
DCS (and its version): etcd3.5.9
Patroni configuration file
scope: postgres-cluster
namespace: /service/
name: ee_01
restapi:
  listen: 192.168.61.105:8008
  connect_address: 192.168.61.105:8008
etcd:
  hosts: 192.168.61.105:2379,192.168.61.106:2379,192.168.61.107:2379
log:
  level: INFO
  traceback_level: INFO
  dir: /home/postgres/patroni
  file_num: 10
  file_size: 104857600
bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    master_start_timeout: 300
    synchronous_mode: false
    postgresql:
      use_pg_rewind: true
      parameters:
        wal_level: replica
        hot_standby: "on"
        wal_keep_size: 100
        max_wal_senders: 10
        max_replication_slots: 10
        wal_log_hints: "on"
        archive_mode: "off"
        archive_timeout: 1800s
        logging_collector: "on"
postgresql:
  database: postgres
  listen: 0.0.0.0:5432
  connect_address: 192.168.61.105:5432
  bin_dir: /usr/local/pgsql/bin
  data_dir: /usr/local/pgsql/data
  pgpass: /home/postgres/tmp/.pgpass
  authentication:
    replication:
      username: postgres
      password: postgres
    superuser:
      username: postgres
      password: postgres
    rewind:
      username: postgres
      password: postgres
  pg_hba:
    - local all all trust
    - host all all 0.0.0.0/0 trust
    - host all all ::1/128 trust
    - local replication all trust
    - host replication all 0.0.0.0/0 trust
    - host replication all ::1/128 trust
tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false
Patroni log files
From the Patroni log, the list order changed when changing synchronous_node_count from 1 to 2, which caused some confusion. At the beginning it showed ['ee_03', 'ee_02'], because list(picked) was ['ee_03', 'ee_02'] as candidates. Then it showed ['ee_02', 'ee_03'], because list(allow_promote) had changed to ['ee_02', 'ee_03'] as sync_nodes. When sync_state and lsn are equal, using sync_priority as the third sort condition of ReplicaList would make list(allow_promote) equal to list(picked), i.e. ['ee_03', 'ee_02'].
2024-05-09 16:12:23,737 INFO: Assigning synchronous standby status to ['ee_03']
2024-05-09 16:12:26,024 INFO: Synchronous standby status assigned to ['ee_03']
2024-05-09 16:12:26,080 INFO: no action. I am (ee_01), the leader with the lock
2024-05-09 16:12:33,723 INFO: no action. I am (ee_01), the leader with the lock
2024-05-09 16:12:43,765 INFO: no action. I am (ee_01), the leader with the lock
2024-05-09 16:12:48,987 INFO: no action. I am (ee_01), the leader with the lock
2024-05-09 16:12:59,085 INFO: no action. I am (ee_01), the leader with the lock
2024-05-09 16:13:08,985 INFO: no action. I am (ee_01), the leader with the lock
2024-05-09 16:13:19,042 INFO: no action. I am (ee_01), the leader with the lock
2024-05-09 16:13:28,938 INFO: Lock owner: ee_01; I am ee_01
**2024-05-09 16:13:28,992 INFO: Assigning synchronous standby status to ['ee_03', 'ee_02']
2024-05-09 16:13:31,303 INFO: Synchronous standby status assigned to ['ee_02', 'ee_03']**
2024-05-09 16:13:31,356 INFO: no action. I am (ee_01), the leader with the lock
2024-05-09 16:13:39,031 INFO: no action. I am (ee_01), the leader with the lock
2024-05-09 16:13:49,091 INFO: no action. I am (ee_01), the leader with the lock
2024-05-09 16:13:59,009 INFO: no action. I am (ee_01), the leader with the lock
2024-05-09 16:13:59,010 INFO: Lock owner: ee_01; I am ee_01
2024-05-09 16:13:59,059 INFO: Updating synchronous privilege temporarily from ['ee_02', 'ee_03'] to ['ee_02']
2024-05-09 16:13:59,106 INFO: Assigning synchronous standby status to ['ee_02']
2024-05-09 16:13:59,431 INFO: no action. I am (ee_01), the leader with the lock
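The flip highlighted in the log above comes down to the final tie-breaker; a minimal illustration, with tuples of (name, sync_state, lsn, sync_priority) as an assumed shape rather than Patroni's real data structures:

```python
replicas = [('ee_02', 'sync', 100, 2), ('ee_03', 'sync', 100, 1)]

# Current behavior: with sync_state and lsn equal, the order falls back to
# application name, producing the ['ee_02', 'ee_03'] seen at 16:13:31.
by_name = sorted(replicas, key=lambda r: (r[1], -r[2], r[0]))
assert [r[0] for r in by_name] == ['ee_02', 'ee_03']

# Proposed: break the tie on sync_priority instead, matching list(picked).
by_priority = sorted(replicas, key=lambda r: (r[1], -r[2], r[3]))
assert [r[0] for r in by_priority] == ['ee_03', 'ee_02']
```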
PostgreSQL log files
From the PostgreSQL log, ee_02 with the lower priority remained when changing synchronous_node_count from 2 to 1. With priority-based synchronous replication, the standbys with the highest priorities are considered sync and the remaining standbys are considered potential. Although we set an exact node count in synchronous_standby_names, it would be more reasonable to keep the original higher-priority node in the list.
2024-05-09 16:12:23.865 CST [6977] LOG: parameter "synchronous_standby_names" changed to "ee_03"
**2024-05-09 16:12:23.975 CST [7013] LOG: standby "ee_03" is now a synchronous standby with priority 1**
2024-05-09 16:12:23.975 CST [7013] STATEMENT: START_REPLICATION SLOT "ee_03" 0/ED000000 TIMELINE 50
2024-05-09 16:12:49.111 CST [6977] LOG: received SIGHUP, reloading configuration files
2024-05-09 16:13:29.116 CST [6977] LOG: parameter "synchronous_standby_names" changed to "2 (ee_03,ee_02)"
**2024-05-09 16:13:29.235 CST [17473] LOG: standby "ee_02" is now a synchronous standby with priority 2**
2024-05-09 16:13:29.235 CST [17473] STATEMENT: START_REPLICATION SLOT "ee_02" 0/F3000000 TIMELINE 50
2024-05-09 16:13:59.235 CST [6977] LOG: received SIGHUP, reloading configuration files
2024-05-09 16:13:59.236 CST [6977] LOG: parameter "synchronous_standby_names" changed to "ee_02"
2024-05-09 16:13:59.550 CST [6977] LOG: received SIGHUP, reloading configuration files
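For reference, in the priority-based form of synchronous_standby_names the priority is simply the 1-based position in the list, which is why ee_03 got priority 1 and ee_02 priority 2 above. A simplified parser to illustrate (it ignores the FIRST/ANY keywords and name quoting):

```python
import re

def parse_ssn(value):
    # Matches the 'N (name, name, ...)' priority form; a bare name list
    # is treated as num_sync = 1.
    m = re.match(r'\s*(\d+)\s*\(([^)]*)\)\s*$', value)
    if m:
        num_sync = int(m.group(1))
        names = [n.strip() for n in m.group(2).split(',')]
    else:
        num_sync = 1
        names = [n.strip() for n in value.split(',')]
    # sync_priority is the 1-based position in the list.
    return num_sync, {name: i + 1 for i, name in enumerate(names)}

assert parse_ssn('2 (ee_03,ee_02)') == (2, {'ee_03': 1, 'ee_02': 2})
assert parse_ssn('ee_02') == (1, {'ee_02': 1})
```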
Have you tried to use GitHub issue search?
Yes
Anything else we need to know?
No response