digoal
2014-03-28
PostgreSQL , pg_bulkload , 绕过xlog , 重新基础备份
我以前写过一篇BLOG关于如何提高PostgreSQL的导入速度, 如
1. 提高写入并行度, 同时使用数据库分组提交, 设置postgresql.conf
commit_delay = 10 # range 0-100000, in microseconds
commit_siblings = 5 # range 1-1000
2. 加大数据库block size,需要在编译PostgreSQL是指定,软件的BLOCK_SIZE必须与数据库的BLOCK_SIZE一致.
3. 导入数据时, 关闭autovacuum,
4. 使用unlogged table不写wal, 不建议使用,
5. 导入数据时,开启异步wal提交手段, 设置postgresql.conf : synchronous_commit = off wal_writer_delay=10ms.
6. 设置较大的wal buffer ,设置postgresql.conf : wal_buffers=128MB
7. 导入数据时关闭full page write, 设置postgresql.conf full_page_writes=off
其他详见
http://blog.163.com/digoal@126/blog/static/163877040201392641033482
其实还有一种手段, 就是绕过shared buffer, 直接写文件, 这种模式同unlogged table一样, 不需要写wal, 所以也是比较危险的操作, 对于使用了流复制备库的场景,需要重建备库。同时基础备份也需要重新生成。
pg_bulkload的direct模式就是这种思路来实现的, 它还包含了数据恢复功能, 即导入失败的话, 需要恢复.
本文将对比一下传统的数据导入和使用pg_bulkload数据导入的速度差别.
1. 使用普通的copy模式导入unlogged table.
2. 使用普通的copy模式导入logged table.
3. 使用pg_bulkload导入unlogged table.
4. 使用pg_bulkload导入logged table.
测试环境如下 :
数据库编译参数 :
pg93@db-172-16-3-150-> pg_config |grep CONFIG
CONFIGURE = '--prefix=/home/pg93/pgsql9.3.3' '--with-pgport=1922' '--with-perl' '--with-tcl' '--with-python' '--with-openssl' '--with-pam' '--without-ldap' '--with-libxml' '--with-libxslt' '--enable-thread-safety' '--with-wal-blocksize=64' '--with-blocksize=32' '--enable-dtrace' '--enable-debug' '--enable-cassert'
数据库配置 :
listen_addresses = '0.0.0.0' # what IP address(es) to listen on;
port = 1921 # (change requires restart)
max_connections = 100 # (change requires restart)
superuser_reserved_connections = 3 # (change requires restart)
unix_socket_directories = '.' # comma-separated list of directories
unix_socket_permissions = 0700 # begin with 0 to use octal notation
tcp_keepalives_idle = 60 # TCP_KEEPIDLE, in seconds;
tcp_keepalives_interval = 10 # TCP_KEEPINTVL, in seconds;
tcp_keepalives_count = 10 # TCP_KEEPCNT;
shared_buffers = 4096MB # min 128kB
maintenance_work_mem = 512MB # min 1MB
shared_preload_libraries = 'pg_stat_statements' # (change requires restart)
vacuum_cost_delay = 10 # 0-100 milliseconds
vacuum_cost_limit = 10000 # 1-10000 credits
bgwriter_delay = 10ms # 10-10000ms between rounds
wal_level = hot_standby # minimal, archive, or hot_standby
synchronous_commit = off # synchronization level;
wal_buffers = 16MB # min 32kB, -1 sets based on shared_buffers
wal_writer_delay = 10ms # 1-10000 milliseconds
checkpoint_segments = 512 # in logfile segments, min 1, 16MB each
archive_mode = on # allows archiving to be done
archive_command = '/bin/date' # command to use to archive a logfile segment
max_wal_senders = 32 # max number of walsender processes
hot_standby = on # "on" allows queries during recovery
wal_receiver_status_interval = 1s # send replies at least this often
hot_standby_feedback = on # send info from standby to prevent
random_page_cost = 2.0 # same scale as above
effective_cache_size = 96GB
log_destination = 'csvlog' # Valid values are combinations of
logging_collector = on # Enable capturing of stderr and csvlog
log_directory = 'pg_log' # directory where log files are written,
log_truncate_on_rotation = on # If on, an existing log file with the
log_min_messages = log # values in order of decreasing detail:
log_checkpoints = on
log_connections = on
log_disconnections = on
log_error_verbosity = verbose # terse, default, or verbose messages
log_lock_waits = on # log lock waits >= deadlock_timeout
log_statement = 'ddl' # none, ddl, mod, all
log_timezone = 'PRC'
track_activities = on
track_counts = on
track_functions = all # none, pl, all
track_activity_query_size = 1024 # (change requires restart)
autovacuum = off # Enable autovacuum subprocess? 'on'
log_autovacuum_min_duration = 0 # -1 disables, 0 logs all actions and
autovacuum_naptime = 3s # time between autovacuum runs
autovacuum_vacuum_scale_factor = 0.0002 # fraction of table size before vacuum
autovacuum_analyze_scale_factor = 0.0001 # fraction of table size before analyze
datestyle = 'iso, mdy'
timezone = 'PRC'
lc_messages = 'C' # locale for system error message
lc_monetary = 'C' # locale for monetary formatting
lc_numeric = 'C' # locale for number formatting
lc_time = 'C' # locale for time formatting
default_text_search_config = 'pg_catalog.english'
pg_stat_statements.max = 1000
pg_stat_statements.track = all
数据库列表以及collection :
pg93@db-172-16-3-150-> psql
psql (9.3.3)
Type "help" for help.
digoal=# \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+----------+----------+---------+-------+-----------------------
digoal | postgres | UTF8 | C | C |
postgres | postgres | UTF8 | C | C |
stats | postgres | UTF8 | C | C | =Tc/postgres +
| | | | | postgres=CTc/postgres+
| | | | | stats=CTc/postgres
statsrepo | postgres | UTF8 | C | C |
template0 | postgres | UTF8 | C | C | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | C | C | =c/postgres +
| | | | | postgres=CTc/postgres
(6 rows)
生成5000万条测试数据.
digoal=# create table test(id int primary key, info text, crt_time timestamp);
CREATE TABLE
digoal=# insert into test select generate_series(1,50000000),md5(random()::text),clock_timestamp();
INSERT 0 50000000
digoal=# \dt+ test
List of relations
Schema | Name | Type | Owner | Size | Description
--------+------+-------+----------+---------+-------------
public | test | table | postgres | 3634 MB |
(1 row)
digoal=# \di+
List of relations
Schema | Name | Type | Owner | Table | Size | Description
--------+-----------+-------+----------+-------+---------+-------------
public | test_pkey | index | postgres | test | 1063 MB |
(1 row)
导出到csv文件
digoal=# copy test to '/ssd3/pg93/test.dmp' with (format csv, delimiter ',', null '\N', quote '"', force_quote *) ;
使用copy导入数据的测试 :
digoal=# truncate test;
TRUNCATE TABLE
digoal=# \timing
使用copy导入数据
digoal=# copy test from '/ssd3/pg93/test.dmp' with (format csv, delimiter ',', null '\N', quote '"');
COPY 50000000
Time: 411245.879 ms
改为unlogged table重新测试 :
digoal=# update pg_class set relpersistence='u' where relname='test';
UPDATE 1
digoal=# update pg_class set relpersistence='u' where relname='test_pkey';
UPDATE 1
使用copy导入数据
digoal=# copy test from '/ssd3/pg93/test.dmp' with (format csv, delimiter ',', null '\N', quote '"');
COPY 50000000
Time: 363699.466 ms
接下来要试一下使用pg_bulkload绕过shared buffer导入数据.
首先要按照pg_bulkload.
wget http://pgfoundry.org/frs/download.php/3566/pg_bulkload-3.1.5.tar.gz
[root@db-172-16-3-150 ~]# export PATH=/home/pg93/pgsql9.3.3/bin:$PATH
[root@db-172-16-3-150 ~]# cd /opt/soft_bak/pg_bulkload-3.1.5
[root@db-172-16-3-150 pg_bulkload-3.1.5]# which pg_config
/home/pg93/pgsql9.3.3/bin/pg_config
[root@db-172-16-3-150 pg_bulkload-3.1.5]# make
[root@db-172-16-3-150 pg_bulkload-3.1.5]# make install
清除test表的数据, 创建pg_bulkload extension.
pg93@db-172-16-3-150-> psql
psql (9.3.3)
Type "help" for help.
digoal=# truncate test;
TRUNCATE TABLE
digoal=# create extension pg_bulkload;
digoal=# update pg_class set relpersistence ='p' where relname='test_pkey';
UPDATE 1
digoal=# update pg_class set relpersistence ='p' where relname='test';
UPDATE 1
使用postgresql启动数据库 :
pg93@db-172-16-3-150-> postgresql stop -m fast
waiting for server to shut down..... done
server stopped
pg93@db-172-16-3-150-> postgresql start
server starting
注意, pg_bulkload默认连接/tmp socket, 如果配置了其他sock, 必须改为/tmp或者添加/tmp的unix sock 监听.
pg93@db-172-16-3-150-> pg_bulkload -i /ssd3/pg93/test.dmp -O test -l /ssd3/test.log -o "TYPE=CSV" -o "WRITER=PARALLEL" -h $PGDATA -p $PGPORT -d $PGDATABASE
NOTICE: BULK LOAD START
ERROR: query failed: ERROR: could not establish connection to parallel writer
DETAIL: could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/tmp/.s.PGSQL.1921"?
HINT: Refer to the following if it is an authentication error. Specifies the authentication method to without the need for a password in pg_hba.conf (ex. trust or ident), or specify the password to the password file of the operating system user who ran PostgreSQL server. If cannot use these solution, specify WRITER=DIRECT.
DETAIL: query was: SELECT * FROM pg_bulkload($1)
修改unix socket目录 , 增加 /tmp. 使用pg_bulkload 提供的postgresql脚本重启数据库.
vi $PGDATA/postgresql.conf
unix_socket_directories = '.,/tmp'
pg93@db-172-16-3-150-> postgresql restart -m fast
waiting for server to shut down.... done
server stopped
server starting
重新执行pg_bulkload.
pg_bulkload -i /ssd3/pg93/test.dmp -O test -l /ssd3/pg93/test.log -o "TYPE=CSV" -o "WRITER=PARALLEL" -h $PGDATA -p $PGPORT -d $PGDATABASE
在执行过程中我们看到$PGDATA多了一个目录pg_bulkload, 存储加载数据的状态信息, 如果导入过程中发生异常, 使用postgresql脚本重启数据库时将自动修复. 或者在使用pg_ctl启动数据库前先使用pg_bulkload修复.
pg93@db-172-16-3-150-> cd $PGDATA
pg93@db-172-16-3-150-> ll pg_bulkload/
total 4.0K
-rw------- 1 pg93 pg93 512 Mar 28 09:36 16384.34315.loadstatus
日志 :
[root@db-172-16-3-150 pg93]# cat test.log
pg_bulkload 3.1.5 on 2014-03-28 13:32:31.32559+08
INPUT = /ssd3/pg93/test.dmp
PARSE_BADFILE = /ssd4/pg93/pg_root/pg_bulkload/20140328133231_digoal_public_test.prs.dmp
LOGFILE = /ssd3/pg93/test.log
LIMIT = INFINITE
PARSE_ERRORS = 0
CHECK_CONSTRAINTS = NO
TYPE = CSV
SKIP = 0
DELIMITER = ,
QUOTE = "\""
ESCAPE = "\""
NULL =
OUTPUT = public.test
MULTI_PROCESS = YES
VERBOSE = NO
WRITER = DIRECT
DUPLICATE_BADFILE = /ssd4/pg93/pg_root/pg_bulkload/20140328133231_digoal_public_test.dup.csv
DUPLICATE_ERRORS = 0
ON_DUPLICATE_KEEP = NEW
TRUNCATE = NO
0 Rows skipped.
50000000 Rows successfully loaded.
0 Rows not loaded due to parse errors.
0 Rows not loaded due to duplicate errors.
0 Rows replaced with new rows.
Run began on 2014-03-28 13:32:31.32559+08
Run ended on 2014-03-28 13:35:13.019018+08
CPU 1.55s/128.55u sec elapsed 161.69 sec
使用pg_bulkload的direct 和 multi process模式(即parallel)导入数据总耗时161秒.
相比普通的copy logged table 411秒快了一倍多.
改为unlogged table, 使用pg_bulkload重新测试 :
digoal=# update pg_class set relpersistence ='u' where relname='test';
UPDATE 1
digoal=# update pg_class set relpersistence ='u' where relname='test_pkey';
UPDATE 1
digoal=# truncate test;
TRUNCATE TABLE
digoal=# checkpoint;
CHECKPOINT
$ pg_bulkload -i /ssd3/pg93/test.dmp -O test -l /ssd3/pg93/test.log -o "TYPE=CSV" -o "WRITER=PARALLEL" -h $PGDATA -p $PGPORT -d $PGDATABASE
pg_bulkload 3.1.5 on 2014-03-28 13:36:15.602787+08
INPUT = /ssd3/pg93/test.dmp
PARSE_BADFILE = /ssd4/pg93/pg_root/pg_bulkload/20140328133615_digoal_public_test.prs.dmp
LOGFILE = /ssd3/pg93/test.log
LIMIT = INFINITE
PARSE_ERRORS = 0
CHECK_CONSTRAINTS = NO
TYPE = CSV
SKIP = 0
DELIMITER = ,
QUOTE = "\""
ESCAPE = "\""
NULL =
OUTPUT = public.test
MULTI_PROCESS = YES
VERBOSE = NO
WRITER = DIRECT
DUPLICATE_BADFILE = /ssd4/pg93/pg_root/pg_bulkload/20140328133615_digoal_public_test.dup.csv
DUPLICATE_ERRORS = 0
ON_DUPLICATE_KEEP = NEW
TRUNCATE = NO
0 Rows skipped.
50000000 Rows successfully loaded.
0 Rows not loaded due to parse errors.
0 Rows not loaded due to duplicate errors.
0 Rows replaced with new rows.
Run began on 2014-03-28 13:36:15.602787+08
Run ended on 2014-03-28 13:38:57.506558+08
CPU 2.26s/129.23u sec elapsed 161.90 sec
导入数据总耗时161秒. 相比普通的copy unlogged table 363秒快了一倍多.
因为已经绕过了shared buffer, 所以使用pg_bulkload导入目标unlogged和logged表的结果一样.
最后附direct模式的测试结果, (不开multi process). 256秒, 还是比363快.
pg93@db-172-16-3-150-> pg_bulkload -i /ssd3/pg93/test.dmp -O test -l /ssd3/pg93/test.log -o "TYPE=CSV" -o "WRITER=DIRECT" -h $PGDATA -p $PGPORT -d $PGDATABASE
pg_bulkload 3.1.5 on 2014-03-28 13:41:10.934578+08
INPUT = /ssd3/pg93/test.dmp
PARSE_BADFILE = /ssd4/pg93/pg_root/pg_bulkload/20140328134110_digoal_public_test.prs.dmp
LOGFILE = /ssd3/pg93/test.log
LIMIT = INFINITE
PARSE_ERRORS = 0
CHECK_CONSTRAINTS = NO
TYPE = CSV
SKIP = 0
DELIMITER = ,
QUOTE = "\""
ESCAPE = "\""
NULL =
OUTPUT = public.test
MULTI_PROCESS = NO
VERBOSE = NO
WRITER = DIRECT
DUPLICATE_BADFILE = /ssd4/pg93/pg_root/pg_bulkload/20140328134110_digoal_public_test.dup.csv
DUPLICATE_ERRORS = 0
ON_DUPLICATE_KEEP = NEW
TRUNCATE = NO
0 Rows skipped.
49999998 Rows successfully loaded.
0 Rows not loaded due to parse errors.
0 Rows not loaded due to duplicate errors.
0 Rows replaced with new rows.
Run began on 2014-03-28 13:41:10.934578+08
Run ended on 2014-03-28 13:45:27.007941+08
CPU 10.68s/243.64u sec elapsed 256.07 sec
1. http://blog.163.com/digoal@126/blog/static/163877040201392641033482
2. http://pgbulkload.projects.pgfoundry.org/pg_bulkload.html