이번 글에서는 PostgreSQL에서 Lock holder와 blocked session을 확인하는 방법에 대해 알아보겠습니다.
더불어 MySQL의 innodb_lock_wait_timeout, lock_wait_timeout 설정과 같은 설정도 함께 알아보겠습니다.

lock monitoring query

with recursive activity as (
  select
    pg_blocking_pids(pid) blocked_by,
    *,
    age(clock_timestamp(), xact_start)::interval(0) as tx_age,
    age(clock_timestamp(), state_change)::interval(0) as state_age
  from pg_stat_activity
  where state is distinct from 'idle'
), blockers as (
  select
    array_agg(distinct c order by c) as pids
  from (
    select unnest(blocked_by)
    from activity
  ) as dt(c)
), tree as (
  select
    activity.*,
    1 as level,
    activity.pid as top_blocker_pid,
    array[activity.pid] as path,
    array_agg(activity.pid) over () as all_blockers_above
  from activity, blockers
  where
    array[pid] <@ blockers.pids
    and blocked_by = '{}'::int[]
  union all
  select
    activity.*,
    tree.level + 1 as level,
    tree.top_blocker_pid,
    path || array[activity.pid] as path,
    tree.all_blockers_above || array_agg(activity.pid) over () as all_blockers_above
  from activity, tree
  where
    not array[activity.pid] <@ tree.all_blockers_above
    and activity.blocked_by <> '{}'::int[]
    and activity.blocked_by <@ tree.all_blockers_above
)
select
  pid,
  blocked_by,
  tx_age,
  state_age,
  replace(state, 'idle in transaction', 'idletx') state,
  datname,
  usename,
  format('%s:%s', wait_event_type, wait_event) as wait,
  (select count(distinct t1.pid) from tree t1 where array[tree.pid] <@ t1.path and t1.pid <> tree.pid) as blkd,
  format(
    '%s %s%s',
    lpad('[' || pid::text || ']', 7, ' '),
    repeat('.', level - 1) || case when level > 1 then ' ' end,
    query
  )
from tree
order by top_blocker_pid, level, pid;

lock test

  • session 1
testdb=# begin;
BEGIN
testdb=*# select count(*) from tb_test;
 count
--------
 110000
(1 row)

  • session 2
testdb=# alter table tb_test add column f varchar(10);



  • session 3
testdb=# select * from tb_test limit 1;


  • lock 조회
 pid  | blocked_by |  tx_age  | state_age | state  | datname | usename  |       wait        | blkd |                         format
------+------------+----------+-----------+--------+---------+----------+-------------------+------+---------------------------------------------------------
 2099 | {}         | 00:00:36 | 00:00:32  | idletx | testdb  | postgres | Client:ClientRead |    2 |  [2099] select count(*) from tb_test;
 2061 | {2099}     | 00:00:25 | 00:00:25  | active | testdb  | postgres | Lock:relation     |    1 |  [2061] . alter table tb_test add column f varchar(10);
 2100 | {2061}     | 00:00:14 | 00:00:14  | active | testdb  | postgres | Lock:relation     |    0 |  [2100] .. select * from tb_test limit 1;

=> transaction이 끝나지 않은 2099 select 세션이 lock hold 하고 있음
2061 alter 문은 2099 세션을 기다리고, 2100 select 세션은 2061 alter에 의해 대기하게되는 것을 확인할 수 있음
DDL 대기로 인해 뒤 이어지는 세션들이 계속 대기하게 되는 현상

  • lock type 조회
testdb=# select locktype, relation, virtualxid, transactionid, virtualtransaction, pid, mode, granted from pg_locks where pid = 2099;
  locktype  | relation | virtualxid | transactionid | virtualtransaction | pid   |      mode       | granted
------------+----------+------------+---------------+--------------------+-------+-----------------+---------
 virtualxid |          | 3/11       |               | 3/11               |  2099 | ExclusiveLock   | t
 relation   |    24869 |            |               | 3/11               |  2099 | AccessShareLock | t
(2 rows)

=> table 24869 (tb_test) 에 대해 AccessShareLock을 잡고 있는 상황으로,
DDL은 이 lock에 의해 대기하게 됨

해결 방법

transaction에 의해 Lock 대기 하는 DDL세션은 그 하나만 놓고보면 별 이슈가 되지 않습니다.
선행 트랜잭션이 종료된 후에 DDL구문을 다시 해본다던가 하면 되니까요
그러나 Lock 대기하는 DDL세션에 의해 또 뒤이어 들어오는 세션들이 Lock 대기하게 되면 이것은 큰 문제가 되기때문에
DDL 구문을 바로 취소시켜버려야 합니다.

testdb=# select * from pg_settings where name in ('statement_timeout','lock_timeout');
-[ RECORD 1 ]---+----------------------------------------------------------
name            | lock_timeout
setting         | 0
unit            | ms
category        | Client Connection Defaults / Statement Behavior
short_desc      | Sets the maximum allowed duration of any wait for a lock.
extra_desc      | A value of 0 turns off the timeout.
context         | user
vartype         | integer
source          | default
min_val         | 0
max_val         | 2147483647
enumvals        |
boot_val        | 0
reset_val       | 0
sourcefile      |
sourceline      |
pending_restart | f
-[ RECORD 2 ]---+----------------------------------------------------------
name            | statement_timeout
setting         | 0
unit            | ms
category        | Client Connection Defaults / Statement Behavior
short_desc      | Sets the maximum allowed duration of any statement.
extra_desc      | A value of 0 turns off the timeout.
context         | user
vartype         | integer
source          | default
min_val         | 0
max_val         | 2147483647
enumvals        |
boot_val        | 0
reset_val       | 0
sourcefile      |
sourceline      |
pending_restart | f
  • 위의 lock_timeout, statement_timeout 의 값을 설정하면 됨

    • statement_timeout 는 커맨드 자체가 대기하게 되는 경우
    • lock_timeout은 잠금 대기하게 되는 경우 적용, table,index, row 모두 통합
    • statement_timeout은 운영 위험이 있어 session 단위로 적용하는 등 postgresql.conf에는 적용하지 않는 편임
    • default 0 으로 무한대기
  • MySQL의 lock_wait_timeout(metadata lock), innodb_lock_wait_timeout (row lock) 과 같은 역할

  • 적용 테스트

testdb=# set lock_timeout='100ms';
SET
testdb=# alter table tb_test add column f varchar(10);
ERROR:  canceling statement due to lock timeout


testdb=# set lock_timeout='0';
SET
testdb=# alter table tb_test add column f varchar(10);

^CCancel request sent
ERROR:  canceling statement due to user request

testdb=# set statement_timeout='100ms';
SET
testdb=# alter table tb_test add column f varchar(10);
ERROR:  canceling statement due to statement timeout