pg_dump

PostgreSQL에서 기본 제공하는 logical backup 방식으로 mysqldump 와 같은 방법
백업 수행 시점의 스냅샷만 가능하며 시점복구는 불가함
pg_dump는 단일 DB backup, pg_dumpall은 전체 DB backup할 때 사용함

backup script

#!/bin/bash
DEL_FILE=$(date -d '2 day ago' +'%Y-%m-%d_')"*.dump"
FILENAME=$(date +"%Y-%m-%d_%H%M").dump
BACKUP_DIR=/home1/irteam/psql/backup/testdb
PG_USER="irteam"
PASSWORD=""

cd $BACKUP_DIR
echo "DB backup start time : " $(date +"%Y-%m-%d %H:%M:%S")
pg_dumpall -h localhost -p 3000 -U $PG_USER -v -j 3 -f "${BACKUP_DIR}/${FILENAME}" >& "${BACKUP_DIR}/${FILENAME}".log?
echo "Successful db backup ( ${BACKUP_DIR}/DBNAME_${FILENAME} )"
echo "Delete old file DBNAME_${DEL_FILE}"
rm "${BACKUP_DIR}/DBNAME_${DEL_FILE}"
echo "BACKUP - End time : " $(date +"%Y-%m-%d %H:%M:%S")
  • -d, –dbname : Backup할 Database 명
  • -h, –host : Database 주소
  • -U, –username : Database 접속 시 User ID
  • -f, –file : Backup File Name
  • -t, –table : 특정 Table만 Backup하려할 때 대상이 되는 Table 명
  • -j, –jobs : Backup 시 병렬 처리
  • -v, –verbose : 진행 과정 표시

restore test

psql -p 3000 -d postgres -f 2020-04-10_1431.dump
  • pg_dump 를 통한 백업은 pg_restore를 통해 복구함

pg_start_backup

PostgreSQL 에서 제공하는 hot backup 기능, 백업 본 + WAL 로그 파일 replay 로 원하는 시점 복구 가능

hot_backup.sh

#!/bin/bash

### [set ENV.] ###########################################################
export PG_PORT=3000
export PG_DATA=/home1/irteam/psql/engn/PGSQL
export PG_ARCH=/home1/irteam/psql/arch/testdb
export PG_USER=irteam
export PG_DB=testdb

export BACKUP_DIR=/home1/irteam/psql/backup/testdb
##########################################################################

# Default ENV
export LANG=C
export CURRENT_DATE=`date "+%Y-%m-%d-%H:%M:%S"`
export BULIST=${BACKUP_DIR}/LOG/.HOT-${PG_DB}-BULIST.dat
export TMPFILE=${BACKUP_DIR}/LOG/HOT-${PG_DB}.log
export LOGFILE=${BACKUP_DIR}/LOG/HOT-${PG_DB}-${CURRENT_DATE}.log

# SQL LIST
BACKUP_START="SELECT pg_start_backup('full_backup');"
BACKUP_STOP="SELECT pg_stop_backup();"

echo "########## [Step 1] `date "+%Y-%m-%d-%H:%M:%S"` BEGIN BACKUP ##########" >> ${TMPFILE}

psql -p ${PG_PORT} -d ${PG_DB} -U ${PG_USER} -c "${BACKUP_START}" >> ${TMPFILE}  2>&1

if [ $? -eq "0" ]
then
  echo ">>>>> `date "+%Y-%m-%d-%H:%M:%S"` Begin backup completed" >> ${TMPFILE}
else
  echo ">>>>> `date "+%Y-%m-%d-%H:%M:%S"` Begin backup failed" >> ${TMPFILE}
  exit 1
fi

echo " " >> ${TMPFILE}
echo "########## [Step 2] `date "+%Y-%m-%d-%H:%M:%S"` DATA COPY ##########" >> ${TMPFILE}

export TMP=${BACKUP_DIR}/LOG/.HOT-${PG_DB}.tmp

psql -p ${PG_PORT} -d ${PG_DB} -U ${PG_USER} <<! > ${TMP}
select distinct 'value', case when length(pg_tablespace_location(oid))=0 then (select setting from pg_settings where name='data_directory')
else pg_tablespace_location(oid) end
from pg_tablespace
where spcname not in ('ts_temp'); \q
!

/bin/cat ${TMP} | /bin/awk '$1 == "value" { print $3 }' > ${BULIST}

echo ${PG_ARCH} >> ${BULIST}
IFS=$'\n' read -d '' -r -a lines < ${BULIST}


echo ">>>>> Backup List" >> ${TMPFILE}
cat ${BULIST} >> ${TMPFILE}
echo " " >> ${TMPFILE}

echo ${lines[*]}
#CHKPOINT=$(find /home1/irteam/psql/arch/testdb -type f -mtime +1 -name '*backup' -printf '%f\n' | sort -r | head -1)
tar -cv  --exclude="${PG_DATA}/pg_wal/*" \-f $BACKUP_DIR/backup.tar ${lines[*]}

echo ">>>>> `date "+%Y-%m-%d-%H:%M:%S"` data copy completed" >> ${TMPFILE}

echo " " >> ${TMPFILE}
echo "########## [Step 3] `date "+%Y-%m-%d-%H:%M:%S"` END BACKUP ##########" >> ${TMPFILE}
psql -p ${PG_PORT} -d ${PG_DB} -U ${PG_USER} -c "${BACKUP_STOP}" >> ${TMPFILE}  2>&1

if [ $? -eq "0" ]
then
  echo ">>>>> `date "+%Y-%m-%d-%H:%M:%S"`  Backup completed" >> ${TMPFILE}
  echo " " >> ${TMPFILE}
#  psql -p ${PG_PORT} -d ${PG_DB} -U ${PG_USER} -c "select pg_switch_xlog();" >> ${TMPFILE}  2>&1
#  psql -p ${PG_PORT} -d ${PG_DB} -U ${PG_USER} -c "select pg_switch_xlog();" >> ${TMPFILE}  2>&1
  echo ">>>>> `date "+%Y-%m-%d-%H:%M:%S"` HOT Backup completed successfully!!!" >> ${TMPFILE}
  mv ${TMPFILE} ${LOGFILE}
else
  echo ">>>>> `date "+%Y-%m-%d-%H:%M:%S"` HOT Backup failed!!!" >> ${TMPFILE}
  exit 1
fi

rm -rf ${BACKUP_DIR}/LOG/.*
find ${BACKUP_DIR}/LOG -name "HOT-*.log" -mtime +2 -exec rm {} \;

restore test

  • 데이터 삭제 전
postgres=# select now(), count(*) from tb_test;
             now              | count
------------------------------+--------
 2020-04-13 00:08:07.88325+09 | 100000
(1 row)
  • backup 수행
### hotbackup.sh 수행

### backup info 확인 
[irteam@testserver1 00:21:24 ~/psql/arch/testdb ]$ cat 000000020000000100000030.00000028.backup

START WAL LOCATION: 1/30000028 (file 000000020000000100000030)
STOP WAL LOCATION: 1/30006338 (file 000000020000000100000030)
CHECKPOINT LOCATION: 1/30006268
BACKUP METHOD: pg_start_backup
BACKUP FROM: master
START TIME: 2020-04-12 15:10:52 GMT
LABEL: full_backup
START TIMELINE: 2
STOP TIME: 2020-04-12 15:10:53 GMT
STOP TIMELINE: 2
  • START WAL LOCATION : 복구를 위한 시작점
  • CHECKPOINT LOCATION : pg_start_backup를 실행했을 때 발생한 체크포인트가 기록된 LSN 위치
postgres=# select pg_walfile_name ('1/30006268');
     pg_walfile_name
--------------------------
 000000020000000100000031
(1 row)
  • BACKUP METHOD : 백업 방식에 대한 설명. pg_start_backup 또는 pg_basebackup 값을 갖는다
  • BACKUP FROM : 백업이 primary와 standby 중에 어디에서 발생했는지 기록한다
  • START TIME : pg_start_backup가 실행된 시간
  • LABEL : pg_start_backup 실행시 입력한 라벨 값

백업 수행 시 백업 시작과 완료 시점의 WAL LOCATION 정보가 남기 때문에 복구 시 모든 WAL archive 파일을 읽을 필요없음
START WAL LOCATION 부터 recover_target_time 에 해당하는 WAL logfile 까지 필요함

  • data 삭제
postgres=# call insert_test_data(10);
CALL

postgres=# select now(), count(*) from tb_test;
             now              | count
------------------------------+--------
 2020-04-13 00:12:54.00624+09 | 200000
(1 row)


postgres=# truncate public.tb_test;

=>truncate 직전 20만건으로 복구시도

  • backup 본으로 복구 수행
### DB STOP

### backup 본 $PGDATA 로 copy
[irteam@testserver1 23:30:19 ~/psql/backup/testdb/home1/irteam/psql ]$ ls
arch  data  engn  indx  testdb


### $/PGDATA/recovery.conf 설정

cat recovery.conf

restore_command = 'cp /home1/irteam/psql/arch/testdb/%f %p'
recovery_target_time = '2020-04-13 00:12:54'

==> wal archive logfile을 pg_wal 디렉토리로 가져오고 원하는 시간대로 복구하도록 설정  
DB 기동 시 recovery.conf 를 읽고 필요한 아카이브 파일을 적용하게됨  

### DB start


### recovery 완료
postgres=# select now(), count(*) from public.tb_test;
              now              | count
-------------------------------+--------
 2020-04-13 00:16:55.094237+09 | 200000
(1 row)
  • recovery log
2020-04-12 15:16:23 GMT LOG:  restored log file "00000002.history" from archive
2020-04-12 15:16:23 GMT LOG:  restored log file "000000020000000100000030" from archive

2020-04-12 15:16:23 GMT LOG:  redo starts at 1/30000028
2020-04-12 15:16:23 GMT LOG:  consistent recovery state reached at 1/30006338
=> 백업 진행 중 변경사항 적용

2020-04-12 15:16:23 GMT LOG:  database system is ready to accept read only connections
2020-04-12 15:16:24 GMT LOG:  restored log file "000000020000000100000031" from archive
2020-04-12 15:16:24 GMT LOG:  restored log file "000000020000000100000032" from archive
2020-04-12 15:16:24 GMT LOG:  recovery stopping before commit of transaction 1966, time 2020-04-13 00:13:19.690669+09
2020-04-12 15:16:24 GMT LOG:  recovery has paused
2020-04-12 15:16:24 GMT HINT:  Execute pg_wal_replay_resume() to continue.

=> 백업 이후 ~ recovery_target_time  사이의 데이터 반영  
archive 반영 후 데이터 복구는 완료 되어도 DB는 계속 recovery 상태 (read_only )이기 때문에 정상화 해야함
  • recovery 상태 해제
postgres=# call insert_test_data(1);
ERROR:  cannot execute INSERT in a read-only transaction


postgres=# SELECT pg_wal_replay_resume();
 pg_wal_replay_resume
----------------------

(1 row)

postgres=# call insert_test_data(1);
CALL