pgloader란?
csv 같은 File이나 실제 DB로부터 데이터를 읽어와 target postgresql 로 데이터 migration을 지원해주는 툴
아래와 같은 단계로 source DB로 부터 데이터를 읽어와 target DB로 copy 수행
- Fetch meta data and catalogs
### table, column metadata
select c.table_name, t.table_comment,
c.column_name, c.column_comment,
c.data_type, c.column_type, c.column_default,
c.is_nullable, c.extra
from information_schema.columns c
join information_schema.tables t using(table_schema, table_name)
where c.table_schema = 'source DB' and t.table_type = 'BASE TABLE'
### FK 제약조건
SELECT s.table_name, s.constraint_name, s.ft, s.cols, s.fcols,
rc.update_rule, rc.delete_rule
FROM
(
SELECT tc.table_schema, tc.table_name,
tc.constraint_name, k.referenced_table_name ft,
group_concat( k.column_name
order by k.ordinal_position) as cols,
group_concat( k.referenced_column_name
order by k.position_in_unique_constraint) as fcols
FROM information_schema.table_constraints tc
LEFT JOIN information_schema.key_column_usage k
ON k.table_schema = tc.table_schema
AND k.table_name = tc.table_name
AND k.constraint_name = tc.constraint_name
WHERE tc.table_schema = 'source DB'
AND k.referenced_table_schema = 'source DB'
AND tc.constraint_type = 'FOREIGN KEY'
GROUP BY tc.table_schema, tc.table_name, tc.constraint_name, ft
) s
JOIN information_schema.referential_constraints rc
ON rc.constraint_schema = s.table_schema
AND rc.constraint_name = s.constraint_name
AND rc.table_name = s.table_name
### index 구성
SELECT table_name, index_name, index_type,
sum(non_unique),
cast(GROUP_CONCAT(column_name order by seq_in_index) as char)
FROM information_schema.statistics
WHERE table_schema = 'source DB'
GROUP BY table_name, index_name, index_type
=> source DB(MySQL) 의 catalog를 쿼리하여 가져오는 단계
- Prepare the target PostgreSQL database
source DB의 catalog를 target postgresql의 catalog로 변환 후
schema를 생성하고
casting rule 이 있다면 정의된 casting rule을 적용함
(column type 이나 default value 같은 설정)
- data copy
source db에서 data를 select하는 reader Threads 와
target db로 data를 write하는 writer threads 로 나뉘어 migration 작업을 수행함
parallel로 수행 가능
- Create the indexes in parallel mode
object copy 후 index 생성하는 단계, parallel로 수행됨
- 제약조건 적용 및 pgloader 종료
PK, FK 제약조건 적용 후 migration 종료
pgloader 설치
yum install -y sbcl git curl patch unzip devscripts pandoc libsqlite3-dev freetds-dev
wget https://github.com/dimitri/pgloader/archive/v3.6.1.tar.gz
tar xvf v3.6.1.tar.gz
cd pgloader-3.6.1/
make pgloader
sudo mv ./build/bin/pgloader /usr/local/bin/
pgloader --version
pgloader version "3.6.2"
compiled with SBCL 1.3.6
- sbcl은 pgloader 3.6.1, centos6 기준 sbcl 1.3.6 이상 설치해야함!!
- internet 연결안된 private 환경에서는 bundle 파일로 설치 https://github.com/dimitri/pgloader/releases
source MySQL table schema
CREATE TABLE `place` (
`place_seq` bigint(20) NOT NULL AUTO_INCREMENT,
`place_type_code` varchar(20) NOT NULL,
`nation_code` char(2) NOT NULL DEFAULT 'KR',
`name` varchar(200) NOT NULL,
`description` varchar(1000) DEFAULT NULL,
`latitude` decimal(11,8) NOT NULL,
`longitude` decimal(11,8) NOT NULL,
`area_info_seq` bigint(20) NOT NULL,
`region_sido_area_info_seq` int(11) NOT NULL DEFAULT '0',
`region_sigungu_area_info_seq` int(11) NOT NULL DEFAULT '0',
`region_upmyundong_area_info_seq` int(11) NOT NULL DEFAULT '0',
`address_detail` varchar(400) NOT NULL,
`postcode` char(5) NOT NULL,
`homepage_url` varchar(1000) DEFAULT NULL,
`sns_urls` json DEFAULT NULL,
`telephone` varchar(30) DEFAULT NULL,
`telephone_display` varchar(30) DEFAULT NULL,
`main_image_url` varchar(1000) DEFAULT NULL,
`additional_info` varchar(1000) DEFAULT NULL,
`min_price` int(11) NOT NULL DEFAULT '0',
`max_price` int(11) NOT NULL DEFAULT '0',
`ad_yn` char(1) NOT NULL DEFAULT 'N',
`partner_yn` char(1) NOT NULL DEFAULT 'N',
`search_yn` char(1) NOT NULL DEFAULT 'Y',
`delete_yn` char(1) NOT NULL DEFAULT 'N',
`source_info` varchar(100) DEFAULT NULL,
`load_ymdt` datetime DEFAULT NULL,
`search_indexing_ymdt` datetime DEFAULT NULL,
`register_ymdt` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
`modify_ymdt` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`place_seq`)
) ENGINE=InnoDB AUTO_INCREMENT=293057 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
migtraion test
- prepare
### postgre schema 생성
CREATE SCHEMA postgre authorization testuser;
=>MySQL source DB가 postgresql 의 schema로 migration 되기 때문에 사전에 생성해둠
### search_path 설정
* DB level
ALTER DATABASE "testdb" SET search_path TO testuser, postgre;
* session level
set search_path=testuser,postgre;
- pgloader 수행
$ pgloader mysql://\[user\[:password\]@\]\[netloc\]\[:port\]\[/dbname\]\[?option=value&…\] postgresql://\[user\[:password\]@\]\[netloc\]\[:port\]\[/dbname\]\[?option=value&…\]
[irteam@testserver2 15:11:26 ~/psql/pgloader-bundle-3.6.2/bin ]$ pgloader mysql://mig_test:'qhdks123'@11.111.11.11:13306/postgre postgresql://testuser:''@22.222.22.22:3000/testdb
2020-05-30T15:16:58.031000+09:00 LOG pgloader version "3.6.2"
2020-05-30T15:16:58.119000+09:00 LOG Migrating from #<MYSQL-CONNECTION mysql://mig_test@11.111.11.11:13306/postgre {100892E793}>
2020-05-30T15:16:58.119000+09:00 LOG Migrating into #<PGSQL-CONNECTION pgsql://testuser@22.222.22.22:3000/testdb {1008B80E03}>
2020-05-30T15:17:04.811000+09:00 ERROR PostgreSQL Database error 42501: must be owner of database testdb
QUERY: ALTER DATABASE "testdb" SET search_path TO testuser, postgre;
2020-05-30T15:17:04.815000+09:00 LOG report summary reset
table name errors rows bytes total time
------------------------------- --------- --------- --------- --------------
fetch meta data 0 7 0.087s
Create Schemas 0 0 0.000s
Create SQL Types 0 0 0.007s
Create tables 0 6 0.031s
Set Table OIDs 0 3 0.005s
------------------------------- --------- --------- --------- --------------
postgre.mobile_verification_log 0 0 0.335s
postgre.place 0 225127 69.7 MB 6.255s
postgre.place_category_mappings 0 412454 19.7 MB 2.232s
------------------------------- --------- --------- --------- --------------
COPY Threads Completion 0 4 6.254s
Create Indexes 0 4 0.589s
Index Build Completion 0 4 0.119s
Reset Sequences 0 2 0.041s
Primary Keys 0 3 0.003s
Create Foreign Keys 0 0 0.000s
Create Triggers 0 0 0.000s
Set Search Path 1 0 0.001s
Install Comments 0 0 0.000s
------------------------------- --------- --------- --------- --------------
Total import time ? 637581 89.4 MB 7.007s
- pgloader 결과
testdb=> \dt
List of relations
Schema | Name | Type | Owner
---------+-------------------------+-------+----------
postgre | place | table | testuser
postgre | place_category_mappings | table | testuser
(2 rows)
testdb=> select count(*) from place;
count
--------
225127
(1 row)
testdb=> select count(*) from place_category_mappings;
count
--------
412454
(1 row)
testdb=> \d place
Table "postgre.place"
Column | Type | Collation | Nullable | Default
---------------------------------+--------------------------+-----------+----------+------------------------------------------
place_seq | bigint | | not null | nextval('place_place_seq_seq'::regclass)
place_type_code | character varying(20) | | not null |
nation_code | character(2) | | not null | 'KR'::bpchar
name | character varying(200) | | not null |
description | character varying(1000) | | |
latitude | numeric(11,8) | | not null |
longitude | numeric(11,8) | | not null |
area_info_seq | bigint | | not null |
region_sido_area_info_seq | bigint | | not null | '0'::bigint
region_sigungu_area_info_seq | bigint | | not null | '0'::bigint
region_upmyundong_area_info_seq | bigint | | not null | '0'::bigint
address_detail | character varying(400) | | not null |
postcode | character(5) | | not null |
homepage_url | character varying(1000) | | |
sns_urls | json | | |
telephone | character varying(30) | | |
telephone_display | character varying(30) | | |
main_image_url | character varying(1000) | | |
additional_info | character varying(1000) | | |
min_price | bigint | | not null | '0'::bigint
max_price | bigint | | not null | '0'::bigint
ad_yn | character(1) | | not null | 'N'::bpchar
partner_yn | character(1) | | not null | 'N'::bpchar
search_yn | character(1) | | not null | 'Y'::bpchar
delete_yn | character(1) | | not null | 'N'::bpchar
source_info | character varying(100) | | |
load_ymdt | timestamp with time zone | | |
search_indexing_ymdt | timestamp with time zone | | |
register_ymdt | timestamp with time zone | | not null | CURRENT_TIMESTAMP
modify_ymdt | timestamp with time zone | | not null | CURRENT_TIMESTAMP
Indexes:
"idx_49946_primary" PRIMARY KEY, btree (place_seq)
=> 데이터 건 수 , schema 모두 정상적으로 이관 됨
pgloader 옵션 주면서 수행하기
- ORACLE parfile 처럼 export, import 설정을 파일에 저장해놓고 사용할 수 있음
- Source / Target DB 에 대한 설정, pgloader 작업 전 / 후 의 작업도 설정가능함
$ vi test.load
LOAD DATABASE
FROM mysql://mig_test:'qhdks123'@11.111.11.11:13306/postgre
INTO postgresql://testuser:''@22.222.22.22:3000/testdb
WITH on error stop, include drop, batch rows=10000, workers = 4, concurrency = 1,
multiple readers per thread,
max parallel create index =4
SET MySQL PARAMETERS
net_read_timeout = '120',
net_write_timeout = '120'
BEFORE LOAD DO
$$ create schema if not exists testuser; $$
AFTER LOAD DO
$$ set search_path=testuser,postgre; $$
;
### pgloader load 수행
$ pgloader test.load
2020-05-31T00:53:42.044000+09:00 LOG pgloader version "3.6.2"
2020-05-31T00:53:42.178000+09:00 LOG Migrating from #<MYSQL-CONNECTION mysql://mig_test@11.111.11.11:13306/postgre {1009750FE3}>
2020-05-31T00:53:42.178000+09:00 LOG Migrating into #<PGSQL-CONNECTION pgsql://testuser@22.222.22.22:3000/testdb {10097523F3}>
2020-05-31T00:53:48.688000+09:00 LOG report summary reset
table name errors rows bytes total time
------------------------------- --------- --------- --------- --------------
before load 0 1 0.017s
fetch meta data 0 7 0.086s
Create Schemas 0 0 0.001s
Create SQL Types 0 0 0.006s
Create tables 0 6 0.035s
Set Table OIDs 0 3 0.005s
------------------------------- --------- --------- --------- --------------
postgre.mobile_verification_log 0 0 0.118s
postgre.place 0 225127 69.7 MB 6.089s
postgre.place_category_mappings 0 412454 19.7 MB 2.234s
------------------------------- --------- --------- --------- --------------
COPY Threads Completion 0 4 6.084s
Create Indexes 0 4 0.485s
Index Build Completion 0 4 0.108s
Reset Sequences 0 2 0.023s
Primary Keys 0 3 0.003s
Create Foreign Keys 0 0 0.000s
Create Triggers 0 0 0.000s
Install Comments 0 0 0.000s
after load 0 1 0.014s
------------------------------- --------- --------- --------- --------------
Total import time ? 637581 89.4 MB 6.717s