MySQL online DDL을 위한 TOOL pt-osc & gh-ost
ONLINE DDL
MySQL은 DDL에 대해서 한정적으로 online 작업을 지원합니다.
online ddl 방식으로는 INPLACE, INSTANT (mysql 8.0부터) 가 있고
주로 쓰이는 algorithm=INPLACE 방식은 아래와 같은 절차로 ONLINE DDL을 지원합니다.
1. INPLACE 스키마 변경이 지원되는 커맨드인지, 스토리지 엔진(InnoDB) 인지 확인
2. online 스키마 변경 작업동안 변경되는 데이터를 저장할 준비 (innodb_online_alter_log_max_size)
3. 테이블 스키마 변경 및 DML 로깅
4. 변경 완료된 테이블에 DML 로깅 적용
5. INPLACE 작업 완료
online 작업이니만큼 아래와 같은 주의사항이 있습니다.
2번과 4번 단계에서는 아주 짧게나마 metadata lock이 필요함, metadata lock을 못잡으면 작업 실패 (lock_wait_timeout 설정 확인 필요)
- mysql 8.0의 algorithm=INSTANT는 metadata lock을 잡지않음
3번 단계 DML 로깅을 하는 online alter log memory가 부족하면 작업이 실패함 (innodb_online_alter_log_max_size)
ERROR 1799 (HY000): Creating index ‘idx2’ required more than ‘innodb_online_alter_log_max_size’ bytes of modification log. Please try again.
ONLINE DDL Tool
online DDL이 한정적이다 보니 지원하지 않는 작업을 online 으로 처리하기 위한 도구가 나왔는데 이번 글에서는 간단한 사용법과 차이점을 살펴보도록 하겠습니다.
pt-online-schema-change
- INSTALL - rpm
yum install -y perl-IO* perl-DBD-MySQL perl-TermReadKey perl-devel
https://www.percona.com/downloads/percona-toolkit/3.1.0/binary/redhat/6/x86_64/percona-toolkit-3.1.0-2.el6.x86_64.rpm
rpm -ivh percona-toolkit-3.1.0-2.el6.x86_64.rpm
- INSTALL tar.gz
wget percona.com/get/percona-toolkit.tar.gz
tar zxvpf percona-toolkit.tar.gz
perl ./Makefile.PL
make
sudo make install
동작방식
(https://gywn.net/2017/08/small-talk-pt-osc/)
- 수행하려는 DDL 이 반영된 new table을 생성
- old-table로 발생한 DML을 new table로 가져오기 위한 update/insert/delete trigger 생성
- chunk-size 만큼 old table로 부터 new table로 데이터 복사 (old table에 pk 필수)
- 복사가 끝나면 table rename 후 작업 완료
COMMAND
pt-online-schema-change --alter "add column ptosc int not null default 0" D=test,t=tb_test \
--no-drop-old-table \
--no-drop-new-table \
--chunk-size=1000 \
--recursion-method=dsn=t=mysql.dsns \
--max-lag=5 \
--defaults-file=/home/kimdubi/db/mysql/my.cnf \
--host=127.0.0.1 \
--port=13306 \
--user=root \
--password=password \
--progress=time,30 \
--max-load="Threads_running=100" \
--critical-load="Threads_running=200" \
--chunk-index=PRIMARY \
--charset=utf8mb4 \
--set-vars="tx_isolation='repeatable-read',innodb_lock_wait_timeout=1,lock_wait_timeout=1" \
--no-check-alter \
--execute
- alter : 수행할 DDL 커맨드
- D : 사용할 데이터베이스명
- t : 테이블명
- no-drop-old-table : 원본 테이블 rename 후 drop 하지 않음
- no-drop-new-table : 작업 실패했을 때 생성한 new 테이블 drop 하지 않음
- chunk-size : 복사할 데이터 크기 단위
- max-lag : pt-osc는 master에서만 작업할 수 있는데 slave에서 복제지연이 발생하면 pt-osc 작업을 중단하고 기다림
- progress :
- max-load : 작업 도중 서버에 과한 load 를 막기위해 수행하는 세션이 일정 수치 이상 되면 pt-osc 작업 일시 중지함
- critical-load : max-load가 일시정지라면 이 옵션은 pt-osc 작업 중단시킴
- set-vars: pt-osc 작업 시작할 때 설정할 session variables => 위 예시에서는 pt-osc 작업이 서비스에 영향 없도록 lock 1초안에 획득 못하면 작업중지하도록 설정함
- no-check-alter : alter DDL 구문 체크
- recursion-method : db port가 기본 port (3306)이 아닐 때 pt-osc는 slave를 찾지 못합니다.
dsn방식으로 접속할 slave 정보를 저장해놓고 이 테이블을 조회하라고 명시해야합니다.
CREATE TABLE `dsn` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`parent_id` int(11) NOT NULL,
`dsn` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
### slave IP:port insert
INSERT INTO dsn(parent_id,dsn) VALUES(1,'h=11.111.11.11,P=13306');
TEST LOG
1. 정상 시나리오
[kimdubi@test-dbatst-t1801 01:34:57 ~ ]$ pt-online-schema-change --alter "add column ptosc int not null default 0" D=test,t=tb_test \
> --no-drop-old-table \
> --no-drop-new-table \
> --chunk-size=1000 \
> --recursion-method=dsn=t=mysql.dsn \
> --max-lag=5 \
> --defaults-file=/home/kimdubi/db/mysql/my.cnf \
> --port=13306 \
> --user=ptosc \
> --password=qhdks123 \
> --progress=time,30 \
> --max-load="Threads_running=100" \
> --critical-load="Threads_running=200" \
> --chunk-index=PRIMARY \
> --charset=utf8mb4 \
> --set-vars="innodb_lock_wait_timeout=1,lock_wait_timeout=1" \
> --no-check-alter \
> --execute
Found 1 slaves:
test-dbatst-t1802 -> 11.111.11.11:13306
Will check slave lag on:
test-dbatst-t1802 -> 11.111.11.11:13306
Operation, tries, wait:
analyze_table, 10, 1
copy_rows, 10, 0.25
create_triggers, 10, 1
drop_triggers, 10, 1
swap_tables, 10, 1
update_foreign_keys, 10, 1
Altering `test`.`tb_test`...
Creating new table...
Created new table test.___tb_test_new OK.
Altering new table...
Altered `test`.`___tb_test_new` OK.
2020-03-19T01:34:58 Creating triggers...
2020-03-19T01:34:58 Created triggers OK.
2020-03-19T01:34:58 Copying approximately 10079347 rows...
Copying `test`.`tb_test`: 25% 01:27 remain
Copying `test`.`tb_test`: 50% 00:57 remain
Copying `test`.`tb_test`: 76% 00:27 remain
2020-03-19T01:37:06 Copied rows OK.
2020-03-19T01:37:06 Analyzing new table...
2020-03-19T01:37:06 Swapping tables...
2020-03-19T01:37:06 Swapped original and new tables OK.
Not dropping old table because --no-drop-old-table was specified.
2020-03-19T01:37:06 Dropping triggers...
2020-03-19T01:37:06 Dropped triggers OK.
Successfully altered `test`.`tb_test`.
- DB 내에선 아래와 같은 작업들이 수행 됨
- 원본 table에서 new table 로 data copy 하는 세션 ( pk 기준 chunk 크기만큼씩)
mysql> select * from information_schema.processlist where user='ptosc'\G;
*************************** 1. row ***************************
ID: 751321
USER: ptosc
HOST: localhost
DB: test
COMMAND: Query
TIME: 0
STATE: Sending data
INFO: INSERT LOW_PRIORITY IGNORE INTO `test`.`____tb_test_new` (`id`, `a`, `b`, `c`, `d`, `ptosc`) SELECT `id`, `a`, `b`, `c`, `d`, `ptosc` FROM `test`.`tb_test` FORCE INDEX(`PRIMARY`) WHERE ((`id` >= '11089001')) AND ((`id` <= '11090000')) LOCK IN SHARE MODE /*pt-online-schema-change 47357 copy nibble*/
=> LOW_PRIORITY : 다른 세션들이 해당 테이블을 사용하는 경우 그 작업이 모두 완료되면 insert 하겠다는 의미 IGNORE : error 발생하는 경우 무시하고 pass, 어차피 trigger로 최신 DML이 반영될 것이기 때문 LOCK IN SHARE MODE : 가져올 데이터 select 후 트랜잭션이 완료될 때 까지 해당 row 변경되지 않도록 보장
- 원본 table의 DML을 new table로 반영하기 위한 trigger 세션 (insert /update /delete)
*************************** 3. row ***************************
ID: 748143
USER: ptosc
HOST: %
DB: test
COMMAND: Query
TIME: 0
STATE: update
INFO: REPLACE INTO `test`.`____tb_test_new` (`id`, `a`, `b`, `c`, `d`, `ptosc`) VALUES (NEW.`id`, NEW.`a`, NEW.`b`, NEW.`c`, NEW.`d`, NEW.`ptosc`)
- 작업 완료 후
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| _tb_test_old |
| tb_test |
+----------------+
2 rows in set (0.00 sec)
mysql> show triggers;
Empty set (0.00 sec)
mysql> desc tb_test;
+-------+---------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| a | varchar(100) | YES | | NULL | |
| b | varchar(1001) | YES | | NULL | |
| c | varchar(100) | YES | | NULL | |
| d | varchar(100) | YES | | NULL | |
| ptosc | int(11) | NO | | 0 | |
+-------+---------------+------+-----+---------+----------------+
6 rows in set (0.00 sec)
2. lock 획득 실패로 인한 작업 실패 LOG
2020-03-19T14:51:59 Dropping triggers...
2020-03-19T14:52:18 Error dropping trigger: 2020-03-19T14:52:18 DBD::mysql::db do failed: Lock wait timeout exceeded; try restarting transaction [for Statement "DROP TRIGGER IF EXISTS `test`.`pt_osc_test_tb_test_del`"] at /home/kimdubi/db/mysql/util/ptkit/bin/pt-online-schema-change line 11297.
2020-03-19T14:52:37 Error dropping trigger: 2020-03-19T14:52:37 DBD::mysql::db do failed: Lock wait timeout exceeded; try restarting transaction [for Statement "DROP TRIGGER IF EXISTS `test`.`pt_osc_test_tb_test_upd`"] at /home/kimdubi/db/mysql/util/ptkit/bin/pt-online-schema-change line 11297.
2020-03-19T14:52:56 Error dropping trigger: 2020-03-19T14:52:56 DBD::mysql::db do failed: Lock wait timeout exceeded; try restarting transaction [for Statement "DROP TRIGGER IF EXISTS `test`.`pt_osc_test_tb_test_ins`"] at /home/kimdubi/db/mysql/util/ptkit/bin/pt-online-schema-change line 11297.
2020-03-19T14:52:56 To try dropping the triggers again, execute:
DROP TRIGGER IF EXISTS `test`.`pt_osc_test_tb_test_del`
DROP TRIGGER IF EXISTS `test`.`pt_osc_test_tb_test_upd`
DROP TRIGGER IF EXISTS `test`.`pt_osc_test_tb_test_ins`
Not dropping the new table `test`.`___tb_test_new` because --no-drop-new-table was specified. To drop the new table, execute:
DROP TABLE IF EXISTS `test`.`___tb_test_new`;
=> 재수행하려면 생성된 trigger 모두 삭제 필요
3. max_load=? thread running 대기 LOG
.
.
.
Altering new table...
Altered `test`.`___tb_test_new` OK.
2020-03-19T23:54:41 Creating triggers...
2020-03-19T23:54:41 Created triggers OK.
2020-03-19T23:54:41 Copying approximately 10686036 rows...
Copying `test`.`tb_test`: 23% 01:35 remain
Copying `test`.`tb_test`: 47% 01:05 remain
Copying `test`.`tb_test`: 71% 00:35 remain
Pausing because Threads_running=100.
Pausing because Threads_running=100.
Copying `test`.`tb_test`: 74% 03:56 remain
Copying `test`.`tb_test`: 96% 00:27 remain
2020-03-20T00:07:00 Copied rows OK.
2020-03-20T00:07:00 Adding original triggers to new table.
=> DB 내 session이 max_load 수 만큼 존재하면 pt-osc작업을 일시정지함 session 빠지면 자동 재수행
4. critical-load=? 설정한 최대 thread 초과로 작업 실패 LOG
Altering new table...
Altered `test`.`___tb_test_new` OK.
2020-03-20T00:26:13 Creating triggers...
2020-03-20T00:26:13 Created triggers OK.
2020-03-20T00:26:13 Copying approximately 10771088 rows...
Copying `test`.`tb_test`: 23% 01:39 remain
2020-03-20T00:26:47 Dropping triggers...
2020-03-20T00:26:47 Dropped triggers OK.
Not dropping the new table `test`.`___tb_test_new` because --no-drop-new-table was specified. To drop the new table, execute:
DROP TABLE IF EXISTS `test`.`___tb_test_new`;
`test`.`tb_test` was not altered.
2020-03-20T00:26:47 Error copying rows from `test`.`tb_test` to `test`.`___tb_test_new`: Threads_running=4 exceeds its critical threshold 200
6. replication lag 대기 LOG
Altering new table...
Altered `test`.`_______tb_test_new` OK.
2020-03-20T01:08:52 Creating triggers...
2020-03-20T01:08:52 Created triggers OK.
2020-03-20T01:08:52 Copying approximately 11423307 rows...
Replica lag is 1083 seconds on test-dbatst-t1802. Waiting.
Replica lag is 1113 seconds on test-dbatst-t1802. Waiting.
Replica lag is 1262 seconds on test-dbatst-t1802. Waiting.
Copying `test`.`tb_test`: 0% 14+19:19:46 remain
6. 작업 시 charater set error LOG
- ERROR LOG
Character set 'utf8mb4' is not a compiled character set and is not specified in the '/usr/share/mysql/charsets/Index.xml' file
Character set 'utf8mb4' is not a compiled character set and is not specified in the '/usr/share/mysql/charsets/Index.xml' file
Cannot connect to MySQL: DBI connect('test;mysql_read_default_file=/home/kimdubi/db/mysql/my.cnf;host=127.0.0.1;port=13306;charset=utf8mb4;mysql_read_default_group=client','admin',...) failed: Can't initialize character set utf8mb4 (path: /usr/share/mysql/charsets/) at /home/kimdubi/db/mysql/util/ptkit/bin/pt-online-schema-change line 2345
- utf8 => utf8mb4 로 수정
vi /usr/share/mysql/charsets/Index.xml
<charset name="utf8mb4">
<family>Unicode</family>
<description>UTF-8 Unicode</description>
<alias>utf-8</alias>
<collation name="utf8_general_ci" id="33">
<flag>primary</flag>
<flag>compiled</flag>
</collation>
<collation name="utf8_bin" id="83">
<flag>binary</flag>
<flag>compiled</flag>
</collation>
</charset>
chunk copy 중간 sleep 넣기
- vi /home/kimdubi/db/mysql/util/ptkit/bin/pt-online-schema-change
8250 package pt_online_schema_change;
8251
8252 use strict;
8253 use warnings FATAL => 'all';
8254 use English qw(-no_match_vars);
8255
8256 use Percona::Toolkit;
8257 use constant PTDEBUG => $ENV{PTDEBUG} || 0;
8258
8259 use List::Util qw(max);
8260 use Time::HiRes qw(time sleep usleep);
.
.
11405 sub exec_nibble {
11406
11407 usleep(10000);
11408 my (%args) = @_;
11409 my @required_args = qw(Cxn tbl stats tries Retry NibbleIterator Quoter);
11410 foreach my $arg ( @required_args ) {
11411 die "I need a $arg argument" unless $args{$arg};
=> pt-osc 작업이 서비스 성능에 영향을 준다면 위 처럼 코드를 수정하여 chunk copy 건건 완료 마다 sleep() 하도록 수정
gh-ost
INSTALL
wget https://github.com/github/gh-ost/releases/download/v1.0.49/gh-ost-1.0.49-1.x86_64.rpm
rpm -ivh gh-ost-1.0.49-1.x86_64.rpm
동작방식
- alter 문이 반영 된 ghost table 생성
- PK로 chunk-size 만큼 data copy
- master 서버에서 수행된 DML에 대한 binary log를 ghost table에 반영함
- binary log를 가져올 서버는 log-bin, binlog-format=ROW 설정 필요
- SLAVE에서 가져오는 경우엔 log_slave_updates 옵션도 추가로 필요함
- triggerless 방식
- trigger로 인한 부하가 없음
- 작업 일시정지가 자유로움
- org 테이블과 ghost 테이블 swapping
COMMAND
gh-ost \
--max-load=Threads_running=100 \
--critical-load=Threads_running=200 \
--chunk-size=1000 \
--throttle-control-replicas="11.111.11.11:13306" \
--max-lag-millis=500 \
--user="ghost" \
--password='qhdks123' \
--host="11.111.11.11" \
--port=13306 \
--database="test" \
--table="tb_test" \
--verbose \
--alter="add column ghost5 int not null default 0" \
--switch-to-rbr \
--allow-master-master \
--allow-on-master \
--assume-master-host="22.222.22.22:13306" \
--cut-over=default \
--default-retries=120 \
--panic-flag-file=/tmp/ghost.panic.flag \
--postpone-cut-over-flag-file=/tmp/ghost.postpone.flag \
--throttle-flag-file=/tmp/throttle.flag \
--execute
- host : binary log 를 가져올 서버 설정
- throttle-control-replicas : –max-lag-millis 를 체크할 리플리카 서버 지정하는 옵션
- switch-to-rbr : gh-ost 는 binlog_format=ROW 일 때만 동작하기 때문에 작업 대상 서버의 binlog_format을 ROW로 변경하는 옵션. 주로 Slave의 binlog_format 변경과 log_slave_updates 옵션 설정 후 재기동 수행 (원복 X)
- allow-master-master : MMM 사용할 때 처럼 master-master (active-passive) 구성일 때 설정
- allow-on-master : master에서 작업할 때 설정하는 옵션
- migrate-on-replica : slave 에서 작업할 때 설정하는 옵션 ( 양방향 복제가 아닌 이상 slave에만 작업 적용되므로 참고! )
- assume-master-host : allow-master-master 설정 시 작업할 master 서버 지정하는 옵션
- cut-over : org-table과 gh-ost table 간 swapping하는 migration의 마지막 단계로 atomic 하기 때문에 swapping 도중 실패하는 쿼리는 없음
- panic-flag-file : flag-file 생성 시 작업 취소
- postpone-cut-over-flag-file : flag-file 존재하면 cut-over 단계를 수행하지 않고 대기함 이 옵션 설정 시 작업 시작 할때 이 파일을 생성하기 때문에 마지막 rename 단계 전에 수동으로 삭제해줘야함, 대기하더라도 DML은 계속 copy 진행됨
- throttle-flag-file : throttle-flag-file 존재하면 작업 일시정지
TEST LOG
1. 정상 시나리오
2020-03-24 11:49:13 INFO starting gh-ost 1.0.49
2020-03-24 11:49:13 INFO Migrating `test`.`tb_test`
2020-03-24 11:49:13 INFO connection validated on 11.111.11.11:13306
2020-03-24 11:49:13 INFO User has SUPER, REPLICATION SLAVE privileges, and has ALL privileges on `test`.*
2020-03-24 11:49:13 INFO binary logs validated on 11.111.11.11:13306
2020-03-24 11:49:13 INFO Restarting replication on 11.111.11.11:13306 to make sure binlog settings apply to replication thread
2020-03-24 11:49:14 INFO Inspector initiated on test-dbatst-t1802:13306, version 8.0.16
2020-03-24 11:49:14 INFO Table found. Engine=InnoDB
2020-03-24 11:49:14 INFO Estimated number of rows via EXPLAIN: 11474891
2020-03-24 11:49:14 INFO Master forced to be 22.222.22.22:13306
2020-03-24 11:49:14 INFO log_slave_updates validated on 11.111.11.11:13306
2020-03-24 11:49:14 INFO connection validated on 11.111.11.11:13306
2020-03-24 11:49:14 INFO Connecting binlog streamer at mysql-bin.000012:638658499
[2020/03/24 11:49:14] [info] binlogsyncer.go:133 create BinlogSyncer with config {99999 mysql 11.111.11.11 13306 ghost false false <nil> false UTC true 0 0s 0s 0 false}
[2020/03/24 11:49:14] [info] binlogsyncer.go:354 begin to sync binlog from position (mysql-bin.000012, 638658499)
[2020/03/24 11:49:14] [info] binlogsyncer.go:203 register slave for master server 11.111.11.11:13306
2020-03-24 11:49:14 INFO rotate to next log from mysql-bin.000012:0 to mysql-bin.000012
[2020/03/24 11:49:14] [info] binlogsyncer.go:723 rotate to (mysql-bin.000012, 638658499)
2020-03-24 11:49:14 INFO connection validated on 22.222.22.22:13306
2020-03-24 11:49:14 INFO connection validated on 22.222.22.22:13306
2020-03-24 11:49:14 INFO will use time_zone='SYSTEM' on applier
2020-03-24 11:49:14 INFO Examining table structure on applier
2020-03-24 11:49:14 INFO Applier initiated on test-dbatst-t1801:13306, version 8.0.16
=> master / slave 파악 및 binary log 가져올 서버(여기선 slave) 와 binlog position 확인
2020-03-24 11:49:14 INFO Dropping table `test`.`_tb_test_ghc`
2020-03-24 11:49:14 INFO Table dropped
2020-03-24 11:49:14 INFO Creating changelog table `test`.`_tb_test_ghc`
2020-03-24 11:49:14 INFO Changelog table created
2020-03-24 11:49:14 INFO Creating ghost table `test`.`_tb_test_gho`
2020-03-24 11:49:14 INFO Ghost table created
2020-03-24 11:49:14 INFO Altering ghost table `test`.`_tb_test_gho`
2020-03-24 11:49:14 INFO Ghost table altered
=> migration 과정을 기록할 _tb_test_ghc 테이블 , 데이터 copy 및 rename 할 _tb_test_gho 임시테이블 생성
2020-03-24 11:49:14 INFO Created postpone-cut-over-flag-file: /tmp/ghost.postpone.flag
2020-03-24 11:49:14 INFO Waiting for ghost table to be migrated. Current lag is 0s
2020-03-24 11:49:14 INFO Intercepted changelog state GhostTableMigrated
2020-03-24 11:49:14 INFO Handled changelog state GhostTableMigrated
2020-03-24 11:49:14 INFO Chosen shared unique key is PRIMARY
2020-03-24 11:49:14 INFO Shared columns are id,a,b,c,d,ptosc,ghost,ghost2
2020-03-24 11:49:14 INFO Listening on unix socket file: /tmp/gh-ost.test.tb_test.sock
2020-03-24 11:49:14 INFO Migration min values: [1]
2020-03-24 11:49:14 INFO Migration max values: [12404000]
2020-03-24 11:49:14 INFO Waiting for first throttle metrics to be collected
2020-03-24 11:49:14 INFO First throttle metrics collected
# Migrating `test`.`tb_test`; Ghost table is `test`.`_tb_test_gho`
# Migrating test-dbatst-t1801:13306; inspecting test-dbatst-t1802:13306; executing on test-dbatst-t1801
# Migration started at Tue Mar 24 11:49:13 +0900 2020
# chunk-size: 1000; max-lag-millis: 500ms; dml-batch-size: 10; max-load: Threads_running=100; critical-load: Threads_running=200; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# throttle-control-replicas count: 1
# postpone-cut-over-flag-file: /tmp/ghost.postpone.flag [set]
# panic-flag-file: /tmp/ghost.panic.flag
# Serving on unix socket: /tmp/gh-ost.test.tb_test.sock
=> 설정 재확인 및 postpone 파일 생성 (마지막 rename 단계에서 수동으로 지워줘야함)
Copy: 0/11474891 0.0%; Applied: 0; Backlog: 0/1000; Time: 1s(total), 0s(copy); streamer: mysql-bin.000012:638660757; Lag: 0.01s, State: migrating; ETA: N/A
Copy: 0/11474891 0.0%; Applied: 0; Backlog: 0/1000; Time: 2s(total), 1s(copy); streamer: mysql-bin.000012:638665367; Lag: 0.01s, State: migrating; ETA: N/A
Copy: 87000/11474891 0.8%; Applied: 0; Backlog: 0/1000; Time: 3s(total), 2s(copy); streamer: mysql-bin.000012:645087979; Lag: 0.01s, State: migrating; ETA: 4m21s
Copy: 177000/11474891 1.5%; Applied: 0; Backlog: 0/1000; Time: 4s(total), 3s(copy); streamer: mysql-bin.000012:651809118; Lag: 0.01s, State: migrating; ETA: 3m11s
Copy: 267000/11474891 2.3%; Applied: 0; Backlog: 0/1000; Time: 5s(total), 4s(copy); streamer: mysql-bin.000012:658530259; Lag: 0.01s, State: migrating; ETA: 2m47s
Copy: 357000/11474891 3.1%; Applied: 0; Backlog: 0/1000; Time: 6s(total), 5s(copy); streamer: mysql-bin.000012:665251399; Lag: 0.01s, State: migrating; ETA: 2m35s
.
.
.
Copy: 12404000/12404000 100.0%; Applied: 0; Backlog: 0/1000; Time: 3m30s(total), 2m21s(copy); streamer: mysql-bin.000013:494768124; Lag: 0.01s, State: postponing cut-over; ETA: due
Copy: 12404000/12404000 100.0%; Applied: 0; Backlog: 0/1000; Time: 4m0s(total), 2m21s(copy); streamer: mysql-bin.000013:494892172; Lag: 0.01s, State: postponing cut-over; ETA: due
=> org_table에서 tmp table로 data copy
2020-03-24 11:53:22 INFO Grabbing voluntary lock: gh-ost.160965.lock
2020-03-24 11:53:22 INFO Setting LOCK timeout as 6 seconds
2020-03-24 11:53:22 INFO Looking for magic cut-over table
2020-03-24 11:53:22 INFO Creating magic cut-over table `test`.`_tb_test_del`
2020-03-24 11:53:22 INFO Magic cut-over table created
2020-03-24 11:53:22 INFO Locking `test`.`tb_test`, `test`.`_tb_test_del`
2020-03-24 11:53:22 INFO Tables locked
2020-03-24 11:53:22 INFO Session locking original & magic tables is 160965
2020-03-24 11:53:22 INFO Writing changelog state: AllEventsUpToLockProcessed:1585018402096959497
2020-03-24 11:53:22 INFO Waiting for events up to lock
2020-03-24 11:53:22 INFO Intercepted changelog state AllEventsUpToLockProcessed
2020-03-24 11:53:22 INFO Handled changelog state AllEventsUpToLockProcessed
2020-03-24 11:53:23 INFO Waiting for events up to lock: got AllEventsUpToLockProcessed:1585018402096959497
2020-03-24 11:53:23 INFO Done waiting for events up to lock; duration=983.332503ms
# Migrating `test`.`tb_test`; Ghost table is `test`.`_tb_test_gho`
# Migrating test-dbatst-t1801:13306; inspecting test-dbatst-t1802:13306; executing on test-dbatst-t1801
# Migration started at Tue Mar 24 11:49:13 +0900 2020
# chunk-size: 1000; max-lag-millis: 500ms; dml-batch-size: 10; max-load: Threads_running=100; critical-load: Threads_running=200; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# throttle-control-replicas count: 1
# postpone-cut-over-flag-file: /tmp/ghost.postpone.flag
# panic-flag-file: /tmp/ghost.panic.flag
# Serving on unix socket: /tmp/gh-ost.test.tb_test.sock
Copy: 12404000/12404000 100.0%; Applied: 0; Backlog: 0/1000; Time: 4m9s(total), 2m21s(copy); streamer: mysql-bin.000013:494931666; Lag: 0.01s, State: migrating; ETA: due
2020-03-24 11:53:23 INFO Setting RENAME timeout as 3 seconds
2020-03-24 11:53:23 INFO Session renaming tables is 160954
2020-03-24 11:53:23 INFO Issuing and expecting this to block: rename /* gh-ost */ table `test`.`tb_test` to `test`.`_tb_test_del`, `test`.`_tb_test_gho` to `test`.`tb_test`
2020-03-24 11:53:23 INFO Found atomic RENAME to be blocking, as expected. Double checking the lock is still in place (though I don't strictly have to)
2020-03-24 11:53:23 INFO Checking session lock: gh-ost.160965.lock
2020-03-24 11:53:23 INFO Connection holding lock on original table still exists
2020-03-24 11:53:23 INFO Will now proceed to drop magic table and unlock tables
2020-03-24 11:53:23 INFO Dropping magic cut-over table
2020-03-24 11:53:23 INFO Releasing lock from `test`.`tb_test`, `test`.`_tb_test_del`
2020-03-24 11:53:23 INFO Tables unlocked
2020-03-24 11:53:23 INFO Tables renamed
2020-03-24 11:53:23 INFO Lock & rename duration: 1.005631871s. During this time, queries on `tb_test` were blocked
2020-03-24 11:53:23 INFO Looking for magic cut-over table
[2020/03/24 11:53:23] [info] binlogsyncer.go:164 syncer is closing...
2020-03-24 11:53:23 INFO Closed streamer connection. err=<nil>
2020-03-24 11:53:23 INFO Dropping table `test`.`_tb_test_ghc`
[2020/03/24 11:53:23] [error] binlogsyncer.go:631 connection was bad
[2020/03/24 11:53:23] [error] binlogstreamer.go:77 close sync with err: Sync was closed
[2020/03/24 11:53:23] [info] binlogsyncer.go:179 syncer is closed
2020-03-24 11:53:23 INFO Table dropped
2020-03-24 11:53:23 INFO Am not dropping old table because I want this operation to be as live as possible. If you insist I should do it, please add `--ok-to-drop-table` next time. But I prefer you do not. To drop the old table, issue:
2020-03-24 11:53:23 INFO -- drop table `test`.`_tb_test_del`
2020-03-24 11:53:23 INFO Done migrating `test`.`tb_test`
2020-03-24 11:53:23 INFO Removing socket file: /tmp/gh-ost.test.tb_test.sock
2020-03-24 11:53:23 INFO Tearing down inspector
2020-03-24 11:53:23 INFO Tearing down applier
2020-03-24 11:53:23 INFO Tearing down streamer
2020-03-24 11:53:23 INFO Tearing down throttler
# Done
=> copy 완료 후 rename 및 정리
2. binlog_format ROW 가 아닐 때
gh-ost \
--max-load=Threads_running=100 \
--critical-load=Threads_running=200 \
--chunk-size=1000 \
.
.
.
2020-03-24 14:04:13 FATAL 11.111.11.11:13306 has MIXED binlog_format, but I'm too scared to change it to ROW because it has replicas. Bailing out
=> binary log 를 가져오려는 Slave 서버가 MIXED format 을 사용하는 경우 –switch-to-rbr 옵션을 통해 Slave의 replication 재기동 및 binlog_format=MIXED 로 변경함 그러나 양방향 복제를 하는 환경에서는 위의 에러로 작업이 실패하기 때문에 수동으로 Slave binlog_format 변경해야함 (dynamic parameter, replication 재기동 필요)
3. log_slave_updates 안 켰을 때
2020-03-24 16:30:27 FATAL 11.111.11.11:13306 must have log_slave_updates enabled for executing migration
=> slave에서 binary log 를 가져오는 경우 log_slave_updates=ON 이어야함 OFF 설정시 작업실패하며 DB 재기동이 필요한 옵션임
4. throttle 로 일시정지
### throttle file 생성
[kimdubi@test-dbatst-t1801 14:23:32 /tmp ]$ touch /tmp/throttle.flag
### 작업 일시정지
Copy: 1558000/11087382 14.1%; Applied: 0; Backlog: 0/1000; Time: 20s(total), 19s(copy); streamer: mysql-bin.000014:469195727; Lag: 0.01s, State: migrating; ETA: 1m56s
Copy: 1647000/11087382 14.9%; Applied: 0; Backlog: 0/1000; Time: 21s(total), 20s(copy); streamer: mysql-bin.000014:476020512; Lag: 0.01s, State: migrating; ETA: 1m54s
Copy: 1657000/11087382 14.9%; Applied: 0; Backlog: 0/1000; Time: 22s(total), 21s(copy); streamer: mysql-bin.000014:476868746; Lag: 0.01s, State: throttled, flag-file; ETA: 1m59s
Copy: 1657000/11087382 14.9%; Applied: 0; Backlog: 0/1000; Time: 23s(total), 22s(copy); streamer: mysql-bin.000014:476873383; Lag: 0.01s, State: throttled, flag-file; ETA: 2m5s
Copy: 1657000/11087382 14.9%; Applied: 0; Backlog: 0/1000; Time: 24s(total), 23s(copy); streamer: mysql-bin.000014:476878018; Lag: 0.01s, State: throttled, flag-file; ETA: 2m10s
.
.
### throttle 파일 삭제
[kimdubi@test-dbatst-t1801 14:23:43 /tmp ]$ rm /tmp/throttle.flag
rm: remove regular empty file `/tmp/throttle.flag'? y
.
.
### 작업 재개
Copy: 1657000/11087382 14.9%; Applied: 0; Backlog: 0/1000; Time: 1m5s(total), 1m4s(copy); streamer: mysql-bin.000014:477065980; Lag: 0.01s, State: throttled, flag-file; ETA: 6m4s
Copy: 2090000/11087382 18.9%; Applied: 0; Backlog: 0/1000; Time: 1m10s(total), 1m9s(copy); streamer: mysql-bin.000014:510249819; Lag: 0.01s, State: migrating; ETA: 4m57s
Copy: 2535000/11087382 22.9%; Applied: 0; Backlog: 0/1000; Time: 1m15s(total), 1m14s(copy); streamer: mysql-bin.000014:544314297; Lag: 0.01s, State: migrating; ETA: 4m9s
5. panic 으로 작업 중단
### panic 파일 생성
[kimdubi@test-dbatst-t1801 14:25:36 /tmp ]$ touch /tmp/ghost.panic.flag
### 작업 정지
Copy: 7839000/11087382 70.7%; Applied: 0; Backlog: 0/1000; Time: 2m16s(total), 2m15s(copy); streamer: mysql-bin.000014:951100060; Lag: 0.01s, State: migrating; ETA: 55s
2020-03-24 14:25:38 FATAL Found panic-file /tmp/ghost.panic.flag. Aborting without cleanup
### 작업 테이블은 남아있으므로 수동 삭제 필요
mysql> show tables like '%tb_test%';
+----------------------------+
| Tables_in_test (%tb_test%) |
+----------------------------+
| _tb_test_ghc |
| _tb_test_gho |
공통점
- Table Copy 방식을 사용
- 기존 테이블의 row를 chunk 단위로 임시 테이블에 복사
- 복사 완료 후 rename table로 임시 테이블을 기존 테이블처럼 사용
- 툴을 원격 서버에서 사용 가능
- PK 혹은 Unique key 반드시 필요 (중복없이 chunk 를 나누는 단위)
차이점
pt-osc | gh-ost | |
---|---|---|
작업 중 DML 반영 방식 | trigger | binlog |
binlog_format | 상관없음 | ROW(Mixed안됨) |
log_slave_updates | 상관없음 | master에서 binlog 가져오면 상관없음, slave에서 가져오면 필수 |
FK | 지원 | 미지원 |
trigger | 지원 | 미지원 |
임의 중단 기능 | 미지원 | throttle, panic 옵션으로 지원 |
작업 도중 설정 변경 | 미지원 | 지원 (https://github.com/github/gh-ost/blob/d2726c77f86cb65e25bce5c0ac5fe3fa0c997488/doc/interactive-commands.md) |
작업 수행 서버 | Master | Master / Slave (–migrate-on-replica) |