Pgpool-II 이란?
- Pgpool-II 는 PostgreSQL DB서버와 클라이언트 사이에 위치하는 proxy 개념의 미들웨어로 아래의 세가지 기능을 제공함
- connection pooling : DB서버와의 연결을 유지하고 동일한 속성(user,db,protocol) 을 가진 새로운 연결 요청이 오면 이를 재사용함
- load balancing : replication 을 사용하는 서비스에서 DB서버의 편중된 load를 줄이기 위해 SELECT 쿼리를 처리 가능한 서버 간 배포하는 기능
- automated failover : DB 서버 중 하나가 down되면 이를 차단하고 나머지 DB에서 서비스를 지속함, master down 될 시엔 다른 standby 서버를 승격시킴
PGPool 구성방법
pgpool 설치
https://pgpool.net/mediawiki/index.php/Main_Page
2020.04 기준 최신 stable 버전인 4.0 사용yum
yum install http://www.pgpool.net/yum/rpms/4.0/redhat/rhel-6-x86_64/pgpool-II-release-4.0-1.noarch.rpm
- yum 안될 때
yum install -y memcached
rpm -ivh pgpool-II-pg11-4.0.8-1pgdg.rhel6.x86_64.rpm
rpm -ivh pgpool-II-pg11-debuginfo-4.0.8-1pgdg.rhel6.x86_64.rpm
rpm -ivh pgpool-II-pg11-devel-4.0.8-1pgdg.rhel6.x86_64.rpm
rpm -ivh pgpool-II-pg11-extensions-4.0.8-1pgdg.rhel6.x86_64.rpm
pgpool.conf 설정
- vi /etc/pgpool-II/pgpool.conf
$ sudo cp /etc/pgpool-II/pgpool.conf.sample-stream pgpool.conf
=> pgpool.conf sample 파일을 이용하고 필요한 파라미터만 수정
- conf 파일 수정 내용
...
#------------------------------------------------------------------------------
# CONNECTIONS
#------------------------------------------------------------------------------
# - pgpool Connection Settings -
listen_addresses = '*'
port = 4000
socket_dir = '/tmp/.s.PGSQL.3000'
...
# - Backend Connection Settings -
backend_hostname0 = 'PRIMARY IP address'
backend_port0 = 3000
backend_weight0 = 1
backend_data_directory0 = '/home1/testuser/psql/engn/PGSQL'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = 'STANDBY IP address'
backend_port1 = 3000
backend_weight1 = 1
backend_data_directory1 = '/home1/testuser/psql/engn/PGSQL'
backend_flag1 = 'ALLOW_TO_FAILOVER'
...
#------------------------------------------------------------------------------
# FILE LOCATIONS
#------------------------------------------------------------------------------
pid_file_name = '/var/run/pgpool.pid'
logdir = '/home1/testuser/psql/logs/$시스템명/'
...
#------------------------------------------------------------------------------
# LOAD BALANCING MODE
#------------------------------------------------------------------------------
load_balance_mode = on
...
#------------------------------------------------------------------------------
# MASTER/SLAVE MODE
#------------------------------------------------------------------------------
master_slave_mode = on
master_slave_sub_mode = 'stream'
...
# - Streaming -
sr_check_period = 5
sr_check_user = 'repl'
sr_check_password = 'qhdks123'
#------------------------------------------------------------------------------
# FAILOVER AND FAILBACK
#------------------------------------------------------------------------------
failover_command = '/usr/pgsql-11/share/failover.sh %d %H /home1/testuser/psql/engn/PGSQL/failover_trigger'
...
- pgpool Connection Settings : 외부에서 pgpool로 접속할 때 사용할 접속 정보
- Backend Connection Setting : Cluster에 속한 DB 정보 명시하는 부분, backend_weight : load-balancing 분산 비율 설정, backend_flag : failover 기능 사용 ‘ALLOW_TO_FAILOVER’로 설정
- LOAD BALANCING MODE : load balance 기능 사용 설정
- MASTER/SLAVE MODE : 현재 Streaming Replication을 사용하고 있기 때문에 master_slave_sub_mode는 ‘stream’으로 설정
- Streaming : Streaming Replication 상태 체크를 위한 설정으로 5초마다 repl 계정으로 상태 Check 함
- failover_command : pgpool 에서 Cluster 내 DB 서버에 이상을 감지하면 실행하는 커맨드
'/usr/pgsql-9.6/share/failover.sh %d %H /var/lib/pgsql/9.6/data/failover_trigger'
=> %d : 이상현상이 발생한 node id , %H : new master가 될 node id standby 서버에 failover_trigger 파일이 생기면 primary server 로 승격하겠다는 의미
failover.sh 스크립트 작성
- vi /usr/pgsql-11/share/failover.sh
if [ $# -ne 3 ]
then
echo "failover failed_node new_master trigger_file"
exit 1
fi
FAILED_NODE=$1
NEW_MASTER=$2
TRIGGER_FILE=$3
# Do nothing if standby server goes down
if [$FAILED_NODE = 1]; then
echo "Standby Server is downed\n" >> /etc/pgpool-II/failover.log
exit 0
fi
echo "failover.sh FAILED_NODE:${FAILED_NODE}; NEW_MASTER:${NEW_MASTER}; at $(date)\n" >> /etc/pgpool-II/failover.log
sudo -u testuser ssh -T testuser@$NEW_MASTER touch $TRIGGER_FILE
exit 0
=> Primay 서버에 이상이 생기면 Standby 서버에 failover trigger 파일을 생성하여 standby 서버가 승격될 수 있도록 함
ssh key 설정
ssh-keygen -t rsa 수행 후
서로의 id_rsa.pub >> authorized_keys 작업
=> failover.sh 스크립트에서 ssh 를 통해 trigger file을 생성하기 때문에 pgpool 서버 <-> DB 서버 간 ssh 인증 설정필요
pgpool 기동
- pgpool 기동
[testusersu@psql-server-1 15:38:26 /etc/pgpool-II ]$ sudo pgpool
[testuser@psql-server-1 15:40:42 ~/psql/logs/testdb ]$ ps -ef | grep pgpool
root 42462 1 0 15:38 ? 00:00:00 pgpool
root 42463 42462 0 15:38 ? 00:00:00 pgpool: wait for connection request
root 42464 42462 0 15:38 ? 00:00:00 pgpool: wait for connection request
.
.
.
root 42496 42462 0 15:38 ? 00:00:00 pgpool: PCP: wait for connection request
root 42497 42462 0 15:38 ? 00:00:00 pgpool: worker process
root 42498 42462 0 15:38 ? 00:00:00 pgpool: health check process(0)
root 42499 42462 0 15:38 ? 00:00:00 pgpool: health check process(1)
- 접속 테스트
[testuser@psql-server-2 15:38:32 ~ ]$ psql -h <pgpool IP> -p 4000 -d postgres
psql (11.7)
Type "help" for help.
postgres=# \x
Expanded display is on.
postgres=# show pool_nodes;
-[ RECORD 1 ]------+--------------------
node_id | 0
hostname | psql-server-1
port | 3000
status | up
lb_weight | 0.500000
role | primary
select_cnt | 16
load_balance_node | false
replication_delay | 0
last_status_change | 2020-04-09 18:56:42
-[ RECORD 2 ]------+--------------------
node_id | 1
hostname | psql-server-2
port | 3000
status | up
lb_weight | 0.500000
role | standby
select_cnt | 19
load_balance_node | true
replication_delay | 0
last_status_change | 2020-04-09 18:56:42
=> client 는 pgpool의 ip/port 로 접속하며 load_balancing 설정에 의해 1:1 로 쿼리 분산됨
failover test
현재 상태 체크
postgres=# show pool_nodes;
-[ RECORD 1 ]------+--------------------
node_id | 0
hostname | psql-server-1
port | 3000
status | up
lb_weight | 0.500000
role | primary
select_cnt | 0
load_balance_node | true
replication_delay | 0
last_status_change | 2020-04-10 10:11:15
-[ RECORD 2 ]------+--------------------
node_id | 1
hostname | psql-server-2
port | 3000
status | up
lb_weight | 0.500000
role | standby
select_cnt | 0
load_balance_node | false
replication_delay | 0
last_status_change | 2020-04-10 10:11:15
Primary stop & 로그 확인
- primary stop
[testuser@psql-server-1 10:10:48 ~ ]$ stop
waiting for server to shut down.... done
server stopped
- standby log
### error_log
2020-04-10 01:11:49 GMT LOG: replication terminated by primary server
2020-04-10 01:11:49 GMT DETAIL: End of WAL reached on timeline 1 at 0/7C000098.
2020-04-10 01:11:49 GMT FATAL: could not send end-of-streaming message to primary: no COPY in progress
2020-04-10 01:11:49 GMT LOG: invalid record length at 0/7C000098: wanted 24, got 0
2020-04-10 01:11:49 GMT FATAL: could not connect to the primary server: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
2020-04-10 01:11:54 GMT LOG: trigger file found: /home1/testuser/psql/engn/PGSQL/failover_trigger
2020-04-10 01:11:54 GMT LOG: redo done at 0/7C000028
2020-04-10 01:11:54 GMT LOG: last completed transaction was at log time 2020-04-10 10:11:49.045489+09
2020-04-10 01:11:54 GMT LOG: selected new timeline ID: 2
2020-04-10 01:11:54 GMT LOG: archive recovery complete
2020-04-10 01:11:54 GMT LOG: database system is ready to accept connections
### read_only ON=>OFF
postgres=# select name,setting from pg_settings where name like '%read%';
name | setting
-------------------------------+---------
default_transaction_read_only | off
transaction_read_only | off
(2 rows)
=> primary 와 복제 끊긴 것 확인 pgpool failover command 에 의해 생성된 failover trigerr file 확인 recovery.conf failover_trigger 옵션에 의해 primary 로 승격시킴
failover 정상 체크
postgres=# show pool_nodes;
-[ RECORD 1 ]------+--------------------
node_id | 0
hostname | psql-server-1
port | 3000
status | down
lb_weight | 0.500000
role | standby
select_cnt | 0
load_balance_node | false
replication_delay | 0
last_status_change | 2020-04-10 10:11:54
-[ RECORD 2 ]------+--------------------
node_id | 1
hostname | psql-server-2
port | 3000
status | up
lb_weight | 0.500000
role | primary
select_cnt | 2
load_balance_node | true
replication_delay | 0
last_status_change | 2020-04-10 10:11:54
failover 후 추가작업
failover 후 old master를 올려도 자동으로 slave 로 추가되는 기능은 제공되지 않기 때문에 아래와 같은 과정을 수동으로 진행한다
- 현재 상태
postgres=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | last_status_change
---------+-------------------+------+--------+-----------+---------+------------+-------------------+-------------------+---------------------
0 | psql-server-1 | 3000 | down | 0.500000 | standby | 0 | false | 0 | 2020-04-22 14:28:21
1 | psql-server-2 | 3000 | up | 0.500000 | primary | 0 | true | 0 | 2020-04-22 14:28:21
(2 rows)
=> old master t1801 서버가 failover 되어 t1802 가 new master 가 된 상황 t1801 서버를 올려도 자동으로 slave로 추가되지 않으며 pgpool 은 이 서버를 인식하지 못하는 상태
- old master slave로 재구성
pg_basebackup -h new_master -p3000 -D /home1/testuser/psql/engn/PGSQL -U repl -P -v -X stream
$ vi recovery.conf
standby_mode = on
primary_conninfo='host=new_master port=3000 user=repl password=qhdks123'
primary_slot_name='repl_slot_01'
trigger_file='/home1/testuser/psql/engn/PGSQL/failover_trigger'
- old master up
pg_ctl -D /home1/testuser/psql/engn/PGSQL start
- pgpool node로 추가
[testuser@psql-server-1 14:53:40 ~/psql/logs/testdb/error_log ]$ pcp_attach_node -n 0
Password:
pcp_attach_node -- Command Successful
postgres=# show pool_nodes;
node_id | hostname | port | status | lb_weight | role | select_cnt | load_balance_node | replication_delay | last_status_change
---------+-------------------+------+--------+-----------+---------+------------+-------------------+-------------------+---------------------
0 | psql-server-1 | 3000 | up | 0.500000 | standby | 0 | true | 0 | 2020-04-22 14:55:26
1 | psql-server-2 | 3000 | up | 0.500000 | primary | 6 | false | 0 | 2020-04-22 14:28:21
(2 rows)
=> old master 가 new master의 slave node 로 추가됨