Report period: 2016-11-24 10:51:14.744512 ~ 2016-11-24 10:57:04.081798

I. Database performance analysis

1. Current database TOP 10 SQL by total_cpu_time

calls total_ms min_ms max_ms mean_ms stddev_ms rows shared_blks_hit shared_blks_read shared_blks_dirtied shared_blks_written local_blks_hit local_blks_read local_blks_dirtied local_blks_written temp_blks_read temp_blks_written blk_read_time blk_write_time query
2039698 605331.386000053 0.015 848.727 0.296775005907738 2.85413485352702 2039698 99684751 2100 9136 0 0 0 0 0 0 0 0 0 UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2;
2039698 268140.604000044 0.015 836.367 0.131460933922569 1.37900138114079 2039698 93195557 905 19774 0 0 0 0 0 0 0 0 0 UPDATE pgbench_tellers SET tbalance = tbalance + $1 WHERE tid = $2;
2039698 239609.501000186 0.023 836.052 0.117473028360085 4.50711811585977 2039698 11277885 2681 1571654 0 0 0 0 0 0 0 0 0 UPDATE pgbench_accounts SET abalance = abalance + $1 WHERE aid = $2;
2039698 49333.0009998656 0.006 464.647 0.0241864241667166 1.93941085447089 2039698 2109626 13067 14524 0 0 0 0 0 0 0 0 0 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES ($1, $2, $3, $4, CURRENT_TIMESTAMP);
2039698 34167.6909994584 0.009 16.797 0.0167513479936731 0.0366687321580847 2039698 8334249 0 0 0 0 0 0 0 0 0 0 0 SELECT abalance FROM pgbench_accounts WHERE aid = $1;
1 7511.361 7511.361 7511.361 7511.361 0 0 816723 3 6 0 0 0 0 0 225 211 0 0 create table IF NOT EXISTS snap_pg_rel_space_bucket as select 1::int8 snap_id, now() snap_ts, current_database(), buk this_buk_no, cnt rels_in_this_buk, pg_size_pretty(min) buk_min, pg_size_pretty(max) buk_max from ( select row_number() over (partition by buk order by tsize), tsize, buk, min(tsize) over (partition by buk),max(tsize) over (partition by buk), count(*) over (partition by buk) cnt from ( select pg_relation_size(a.oid) tsize, width_bucket(pg_relation_size(a.oid),tmin-1,tmax+1,10) buk from ( select min(pg_relation_size(a.oid)) tmin, max(pg_relation_size(a.oid)) tmax from pg_class a, pg_namespace c where a.relnamespace=c.oid and nspname !~ $$^pg_$$ and nspname<>$$information_schema$$ ) t, pg_class a, pg_namespace c where a.relnamespace=c.oid and nspname !~ $$^pg_$$ and nspname<>$$information_schema$$ ) t ) t where row_number=1;
1 3931.596 3931.596 3931.596 3931.596 0 0 909503 1 1 0 0 0 0 0 0 0 0 0 create table IF NOT EXISTS snap_pg_hash_idx as select 1::int8 snap_id, now() snap_ts, current_database(),pg_get_indexdef(oid) from pg_class where relkind=$$i$$ and pg_get_indexdef(oid) ~ $$USING hash$$;
1 2158.239 2158.239 2158.239 2158.239 0 0 223555 3 4 0 0 0 0 0 4682 4666 0 0 create table IF NOT EXISTS snap_pg_table_bloat as select 1::int8 snap_id, now() snap_ts, current_database() AS db, schemaname, tablename, reltuples::bigint AS tups, relpages::bigint AS pages, otta, ROUND(CASE WHEN otta=0 OR sml.relpages=0 OR sml.relpages=otta THEN 0.0 ELSE sml.relpages/otta::numeric END,1) AS tbloat, CASE WHEN relpages < otta THEN 0 ELSE relpages::bigint - otta END AS wastedpages, CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END AS wastedbytes, CASE WHEN relpages < otta THEN $$0 bytes$$::text ELSE (bs*(relpages-otta))::bigint &#124;&#124; $$ bytes$$ END AS wastedsize, iname, ituples::bigint AS itups, ipages::bigint AS ipages, iotta, ROUND(CASE WHEN iotta=0 OR ipages=0 OR ipages=iotta THEN 0.0 ELSE ipages/iotta::numeric END,1) AS ibloat, CASE WHEN ipages < iotta THEN 0 ELSE ipages::bigint - iotta END AS wastedipages, CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes, CASE WHEN ipages < iotta THEN $$0 bytes$$ ELSE (bs*(ipages-iotta))::bigint &#124;&#124; $$ bytes$$ END AS wastedisize, CASE WHEN relpages < otta THEN CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta::bigint) END ELSE CASE WHEN ipages < iotta THEN bs*(relpages-otta::bigint) ELSE bs*(relpages-otta::bigint + ipages-iotta::bigint) END END AS totalwastedbytes FROM ( SELECT nn.nspname AS schemaname, cc.relname AS tablename, COALESCE(cc.reltuples,0) AS reltuples, COALESCE(cc.relpages,0) AS relpages, COALESCE(bs,0) AS bs, COALESCE(CEIL((cc.reltuples*((datahdr+ma- (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)),0) AS otta, COALESCE(c2.relname,$$?$$) AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages, COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta FROM pg_class cc JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname <> $$information_schema$$ LEFT JOIN ( SELECT ma,bs,foo.nspname,foo.relname, (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr, (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2 FROM ( SELECT ns.nspname, tbl.relname, hdr, ma, bs, SUM((1-coalesce(null_frac,0))*coalesce(avg_width, 2048)) AS datawidth, MAX(coalesce(null_frac,0)) AS maxfracsum, hdr+( SELECT 1+count(*)/8 FROM pg_stats s2 WHERE null_frac<>0 AND s2.schemaname = ns.nspname AND s2.tablename = tbl.relname ) AS nullhdr FROM pg_attribute att JOIN pg_class tbl ON att.attrelid = tbl.oid JOIN pg_namespace ns ON ns.oid = tbl.relnamespace LEFT JOIN pg_stats s ON s.schemaname=ns.nspname AND s.tablename = tbl.relname AND s.inherited=false AND s.attname=att.attname, ( SELECT (SELECT current_setting($$block_size$$)::numeric) AS bs, CASE WHEN SUBSTRING(SPLIT_PART(v, $$ $$, 2) FROM $$#"[0-9]+.[0-9]+#"%$$ for $$#$$) IN ($$8.0$$,$$8.1$$,$$8.2$$) THEN 27 ELSE 23 END AS hdr, CASE WHEN v ~ $$mingw32$$ OR v ~ $$64-bit$$ THEN 8 ELSE 4 END AS ma FROM (SELECT version() AS v) AS foo ) AS constants WHERE att.attnum > 0 AND tbl.relkind=$$r$$ GROUP BY 1,2,3,4,5 ) AS foo ) AS rs ON cc.relname = rs.relname AND nn.nspname = rs.nspname LEFT JOIN pg_index i ON indrelid = cc.oid LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid ) AS sml order by wastedbytes desc limit 10;
2039698 2024.84599996043 0 2.968 0.000992718529900034 0.00373028886295509 0 0 0 0 0 0 0 0 0 0 0 0 0 END;
1 1978.128 1978.128 1978.128 1978.128 0 0 223270 1 2 0 0 0 0 0 4682 4666 0 0 create table IF NOT EXISTS snap_pg_index_bloat as select 1::int8 snap_id, now() snap_ts, current_database() AS db, schemaname, tablename, reltuples::bigint AS tups, relpages::bigint AS pages, otta, ROUND(CASE WHEN otta=0 OR sml.relpages=0 OR sml.relpages=otta THEN 0.0 ELSE sml.relpages/otta::numeric END,1) AS tbloat, CASE WHEN relpages < otta THEN 0 ELSE relpages::bigint - otta END AS wastedpages, CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END AS wastedbytes, CASE WHEN relpages < otta THEN $$0 bytes$$::text ELSE (bs*(relpages-otta))::bigint &#124;&#124; $$ bytes$$ END AS wastedsize, iname, ituples::bigint AS itups, ipages::bigint AS ipages, iotta, ROUND(CASE WHEN iotta=0 OR ipages=0 OR ipages=iotta THEN 0.0 ELSE ipages/iotta::numeric END,1) AS ibloat, CASE WHEN ipages < iotta THEN 0 ELSE ipages::bigint - iotta END AS wastedipages, CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes, CASE WHEN ipages < iotta THEN $$0 bytes$$ ELSE (bs*(ipages-iotta))::bigint &#124;&#124; $$ bytes$$ END AS wastedisize, CASE WHEN relpages < otta THEN CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta::bigint) END ELSE CASE WHEN ipages < iotta THEN bs*(relpages-otta::bigint) ELSE bs*(relpages-otta::bigint + ipages-iotta::bigint) END END AS totalwastedbytes FROM ( SELECT nn.nspname AS schemaname, cc.relname AS tablename, COALESCE(cc.reltuples,0) AS reltuples, COALESCE(cc.relpages,0) AS relpages, COALESCE(bs,0) AS bs, COALESCE(CEIL((cc.reltuples*((datahdr+ma- (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)),0) AS otta, COALESCE(c2.relname,$$?$$) AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages, COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta FROM pg_class cc JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname <> $$information_schema$$ LEFT JOIN ( SELECT ma,bs,foo.nspname,foo.relname, (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr, (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2 FROM ( SELECT ns.nspname, tbl.relname, hdr, ma, bs, SUM((1-coalesce(null_frac,0))*coalesce(avg_width, 2048)) AS datawidth, MAX(coalesce(null_frac,0)) AS maxfracsum, hdr+( SELECT 1+count(*)/8 FROM pg_stats s2 WHERE null_frac<>0 AND s2.schemaname = ns.nspname AND s2.tablename = tbl.relname ) AS nullhdr FROM pg_attribute att JOIN pg_class tbl ON att.attrelid = tbl.oid JOIN pg_namespace ns ON ns.oid = tbl.relnamespace LEFT JOIN pg_stats s ON s.schemaname=ns.nspname AND s.tablename = tbl.relname AND s.inherited=false AND s.attname=att.attname, ( SELECT (SELECT current_setting($$block_size$$)::numeric) AS bs, CASE WHEN SUBSTRING(SPLIT_PART(v, $$ $$, 2) FROM $$#"[0-9]+.[0-9]+#"%$$ for $$#$$) IN ($$8.0$$,$$8.1$$,$$8.2$$) THEN 27 ELSE 23 END AS hdr, CASE WHEN v ~ $$mingw32$$ OR v ~ $$64-bit$$ THEN 8 ELSE 4 END AS ma FROM (SELECT version() AS v) AS foo ) AS constants WHERE att.attnum > 0 AND tbl.relkind=$$r$$ GROUP BY 1,2,3,4,5 ) AS foo ) AS rs ON cc.relname = rs.relname AND nn.nspname = rs.nspname LEFT JOIN pg_index i ON indrelid = cc.oid LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid ) AS sml order by wastedibytes desc limit 10;

Recommendation

Check whether these SQL statements can be optimized; use the auto_explain extension and inspect the csvlog to verify that the execution plans of long-running SQL are correct.
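
A minimal sketch of turning on auto_explain for the current session so that plans of slow statements end up in the server log / csvlog; the 1-second threshold is illustrative, and for instance-wide capture the module would instead be preloaded via shared_preload_libraries or session_preload_libraries.

```sql
-- Load auto_explain in this session and log plans of statements
-- slower than 1 second (threshold chosen for illustration only).
LOAD 'auto_explain';
SET auto_explain.log_min_duration = '1s';
SET auto_explain.log_analyze = on;   -- include actual row counts and timings
SET auto_explain.log_buffers = on;   -- include buffer usage in the logged plan
```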

2. TOP 10 tables by size, with statistics

current_database | nspname | relname | relkind | pg_relation_size | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
postgres | public | rum_test | r | 15 GB | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00000000000000000000 | 0.00000000000000000000
postgres | public | gin_test | r | 15 GB | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00000000000000000000 | 0.00000000000000000000
postgres | public | arr_test | r | 4340 MB | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00000000000000000000 | 0.00000000000000000000
postgres | public | pgbench_accounts | r | 1395 MB | 0 | 0 | 4079396 | 4079396 | 0 | 2039698 | 0 | 1877887 | 10454088.000000000000 | 1254538.000000000000
postgres | public | gist_test | r | 498 MB | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00000000000000000000 | 0.00000000000000000000
postgres | public | btree_test | r | 422 MB | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00000000000000000000 | 0.00000000000000000000
postgres | public | p2 | r | 346 MB | 0 | 0 | -1 | -1 | 0 | 0 | 0 | 0 | 0.00000000000000000000 | 0.00000000000000000000
postgres | public | p1 | r | 346 MB | 0 | 0 | -1 | -1 | 0 | 0 | 0 | 0 | 0.00000000000000000000 | 0.00000000000000000000
postgres | public | pgbench_history | r | 281 MB | 0 | 0 | -1 | -1 | 2039698 | 0 | 0 | 0 | 5291572.000000000000 | 0.00000000000000000000
postgres | public | test_pg_part_single | r | 66 MB | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00000000000000000000 | 0.00000000000000000000

Notes

seq_scan: number of sequential (full-table) scans.

seq_tup_read: total number of rows read by sequential scans; if the average per scan is low, the scans are probably capped by LIMIT clauses.

idx_scan: number of index scans.

idx_tup_fetch: number of rows actually fetched by index scans; if the average per scan is high, the database tends to favor index scans, so check random I/O performance and tune accordingly.

n_tup_ins: number of rows inserted during the reporting period.

n_tup_upd: number of rows updated during the reporting period.

n_tup_del: number of rows deleted during the reporting period.

n_tup_hot_upd: number of rows HOT-updated during the reporting period (the updated row version stays in the same page).

n_live_tup: number of live rows in the table.

n_dead_tup: number of dead (garbage) rows in the table.

Recommendation

Rule of thumb: if a single table exceeds 10 GB and is frequently updated, or deleted from and inserted into, split it sensibly according to the business logic; this gives better performance and makes maintenance of bloated indexes easier. For read-only tables, focus on optimizing the SQL instead.

3. Sequential scan statistics: TOP 10 tables by average rows read per scan

current_database | nspname | relname | relkind | pg_relation_size | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
postgres | pg_catalog | pg_constraint | r | 38 MB | 3 | 299868 | 11727 | 210786 | 2 | 0 | 2 | 0 | 1.00000000000000000000 | 2.0000000000000000
postgres | __pg_stats__ | snap_pg_statio_all_indexes | r | 16 MB | 2 | 100228 | -1 | -1 | 50114 | 0 | 0 | 0 | 50114.000000000000 | 0.00000000000000000000
postgres | pg_catalog | pg_class | r | 17 MB | 16919 | 647986307 | 540644 | 388186 | 132 | 32 | 132 | 31 | 66.0000000000000000 | 139.0000000000000000
postgres | pg_catalog | pg_index | r | 9032 kB | 45 | 1202791 | 108324 | 256166 | 34 | 0 | 34 | 0 | 17.0000000000000000 | 34.0000000000000000
postgres | pg_catalog | pg_proc | r | 680 kB | 2 | 5936 | 1294 | 1900 | 16 | 7 | 16 | 7 | 10.0000000000000000 | 19.0000000000000000
postgres | __pg_stats__ | snap_pg_statio_all_tables | r | 232 kB | 2 | 1606 | -1 | -1 | 803 | 0 | 0 | 0 | 803.0000000000000000 | 0.00000000000000000000
postgres | __pg_stats__ | snap_pg_db_rel_size | r | 248 kB | 2 | 1466 | -1 | -1 | 733 | 0 | 0 | 0 | 733.0000000000000000 | 0.00000000000000000000
postgres | pg_catalog | pg_statistic | r | 392 kB | 6 | 3110 | 337280 | 5644 | 125 | 145 | 154 | 136 | 542.0000000000000000 | 38.0000000000000000
postgres | __pg_stats__ | snap_pg_rel_age | r | 32 kB | 1 | 100 | -1 | -1 | 100 | 0 | 0 | 0 | 100.0000000000000000 | 0.00000000000000000000
postgres | __pg_stats__ | snap_pg_stat_statements | r | 64 kB | 1 | 42 | -1 | -1 | 42 | 0 | 0 | 0 | 42.0000000000000000 | 0.00000000000000000000

Notes

seq_scan: number of sequential (full-table) scans.

seq_tup_read: total number of rows read by sequential scans; if the average per scan is low, the scans are probably capped by LIMIT clauses.

idx_scan: number of index scans.

idx_tup_fetch: number of rows actually fetched by index scans; if the average per scan is high, the database tends to favor index scans, so check random I/O performance and tune accordingly.

n_tup_ins: number of rows inserted during the reporting period.

n_tup_upd: number of rows updated during the reporting period.

n_tup_del: number of rows deleted during the reporting period.

n_tup_hot_upd: number of rows HOT-updated during the reporting period (the updated row version stays in the same page).

n_live_tup: number of live rows in the table.

n_dead_tup: number of dead (garbage) rows in the table.

Recommendation

If the average number of rows read per sequential scan is high, locate the corresponding SQL and create targeted indexes (except for statistical/analytical workloads).
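
A hedged sketch of how such tables can be located from pg_stat_all_tables; the CREATE INDEX line at the end uses placeholder table and column names.

```sql
-- Tables whose sequential scans read the most rows on average.
SELECT schemaname, relname, seq_scan, seq_tup_read,
       seq_tup_read / seq_scan AS avg_tup_per_seq_scan
FROM pg_stat_all_tables
WHERE seq_scan > 0
ORDER BY avg_tup_per_seq_scan DESC
LIMIT 10;

-- Then index the filter column used by the offending SQL
-- (placeholder table/column names):
-- CREATE INDEX CONCURRENTLY idx_some_table_some_col ON some_table (some_col);
```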

4. Hot tables by buffer misses

current_database | schemaname | relname | heap_blks_read | heap_blks_hit | idx_blks_read | idx_blks_hit | toast_blks_read | toast_blks_hit | tidx_blks_read | tidx_blks_hit
---|---|---|---|---|---|---|---|---|---|---
postgres | pg_catalog | pg_proc | 1 | 1754 | 0 | 2833 | 10 | 262 | 0 | 312
postgres | pg_catalog | pg_statistic | 6 | 6448 | 0 | 675060 | 0 | 23 | 0 | 14
postgres | pg_catalog | pg_db_role_setting | 1 | 420 | 5 | 1036802 | 0 | 0 | 0 | 0
postgres | pg_catalog | pg_shdescription | 2 | 25 | 3 | 10 | 0 | 0 | 0 | 0
postgres | __pg_stats__ | snap_list | 1 | 5 | 2 | 8 | 0 | 0 | 0 | 0
postgres | pg_catalog | pg_shseclabel | 0 | 0 | 2 | 4 | 0 | 0 | 0 | 0
postgres | pg_catalog | pg_trigger | 0 | 49 | 0 | 96 | 0 | 0 | 0 | 0
postgres | public | test_pg_part_orig_10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
postgres | public | test_pg_part_orig_2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
postgres | public | test_pg_part_pathman_3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0

Recommendation

If the hit ratio of hot tables is low, increase shared_buffers or add memory.
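
A small sketch for computing the per-table buffer hit ratio from pg_statio_user_tables (heap blocks only); whether a low ratio really calls for a larger shared_buffers still depends on the workload.

```sql
-- Buffer hit ratio of the busiest user tables.
SELECT schemaname, relname, heap_blks_read, heap_blks_hit,
       round(heap_blks_hit::numeric
             / nullif(heap_blks_hit + heap_blks_read, 0), 4) AS hit_ratio
FROM pg_statio_user_tables
ORDER BY heap_blks_hit + heap_blks_read DESC
LIMIT 10;
```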

5. Hot tables by buffer misses &amp; hits

current_database | schemaname | relname | heap_blks_read | heap_blks_hit | idx_blks_read | idx_blks_hit | toast_blks_read | toast_blks_hit | tidx_blks_read | tidx_blks_hit
---|---|---|---|---|---|---|---|---|---|---
postgres | pg_catalog | pg_db_role_setting | 1 | 420 | 5 | 1036802 | 0 | 0 | 0 | 0
postgres | pg_catalog | pg_statistic | 6 | 6448 | 0 | 675060 | 0 | 23 | 0 | 14
postgres | pg_catalog | pg_constraint | 0 | 41901 | 0 | 24072 | 0 | 0 | 0 | 0
postgres | pg_catalog | pg_proc | 1 | 1754 | 0 | 2833 | 10 | 262 | 0 | 312
postgres | pg_catalog | pg_rewrite | 0 | 261 | 0 | 262 | 0 | 190 | 0 | 274
postgres | pg_catalog | pg_description | 0 | 0 | 0 | 634 | 0 | 0 | 0 | 0
postgres | public | pathman_config | 0 | 318 | 0 | 134 | 0 | 0 | 0 | 0
postgres | pg_catalog | pg_seclabel | 0 | 0 | 0 | 316 | 0 | 0 | 0 | 0
postgres | pg_catalog | pg_attrdef | 0 | 77 | 0 | 130 | 0 | 0 | 0 | 0
postgres | pg_catalog | pg_trigger | 0 | 49 | 0 | 96 | 0 | 0 | 0 | 0

Recommendation

If the hit ratio of hot tables is low, increase shared_buffers or add memory.

6. Hot indexes by buffer misses

current_database | schemaname | relname | indexrelname | idx_blks_read | idx_blks_hit
---|---|---|---|---|---
postgres | pg_catalog | pg_type | pg_type_typname_nsp_index | 57 | 1121
postgres | public | pgbench_tellers | pgbench_tellers_pkey | 31 | 8252683
postgres | pg_catalog | pg_type | pg_type_oid_index | 23 | 2786
postgres | public | pgbench_branches | pgbench_branches_pkey | 22 | 5768469
postgres | public | pgbench_accounts | pgbench_accounts_pkey | 9 | 12858010
postgres | pg_catalog | pg_database | pg_database_datname_index | 8 | 519874
postgres | pg_catalog | pg_authid | pg_authid_rolname_index | 8 | 4893
postgres | pg_catalog | pg_tablespace | pg_tablespace_oid_index | 8 | 2852
postgres | pg_catalog | pg_database | pg_database_oid_index | 8 | 1031167
postgres | pg_catalog | pg_authid | pg_authid_oid_index | 8 | 5963

Recommendation

If the hit ratio of hot indexes is low, increase shared_buffers or add memory.
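
The same check for indexes, sketched against pg_statio_user_indexes.

```sql
-- Buffer hit ratio of the busiest user indexes.
SELECT schemaname, relname, indexrelname, idx_blks_read, idx_blks_hit,
       round(idx_blks_hit::numeric
             / nullif(idx_blks_hit + idx_blks_read, 0), 4) AS hit_ratio
FROM pg_statio_user_indexes
ORDER BY idx_blks_hit + idx_blks_read DESC
LIMIT 10;
```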

7. Hot indexes by buffer misses &amp; hits

current_database | schemaname | relname | indexrelname | idx_blks_read | idx_blks_hit
---|---|---|---|---|---
postgres | public | pgbench_accounts | pgbench_accounts_pkey | 9 | 12858010
postgres | public | pgbench_tellers | pgbench_tellers_pkey | 31 | 8252683
postgres | public | pgbench_branches | pgbench_branches_pkey | 22 | 5768469
postgres | pg_catalog | pg_class | pg_class_oid_index | 0 | 1605552
postgres | pg_catalog | pg_db_role_setting | pg_db_role_setting_databaseid_rol_index | 5 | 1036802
postgres | pg_catalog | pg_database | pg_database_oid_index | 8 | 1031167
postgres | pg_catalog | pg_attribute | pg_attribute_relid_attnum_index | 4 | 698289
postgres | pg_catalog | pg_statistic | pg_statistic_relid_att_inh_index | 0 | 675060
postgres | pg_catalog | pg_database | pg_database_datname_index | 8 | 519874
postgres | pg_catalog | pg_index | pg_index_indexrelid_index | 0 | 214951

Recommendation

If the hit ratio of hot indexes is low, increase shared_buffers or add memory.

8. Indexes unused, or rarely used, since the last inspection

current_database | schemaname | relname | indexrelname | idx_scan | idx_tup_read | idx_tup_fetch | pg_size_pretty
---|---|---|---|---|---|---|---
postgres | public | arr_test | idx_arr_test | 0 | 0 | 0 | 3910 MB
postgres | public | btree_test | idx_btree | 0 | 0 | 0 | 214 MB
postgres | public | btree_test | idx_btree_1 | 0 | 0 | 0 | 214 MB
postgres | public | btree_test | idx_btree_2 | 0 | 0 | 0 | 214 MB
postgres | public | gin_test | idx_gin_test | 0 | 0 | 0 | 3910 MB
postgres | public | gist_test | idx_gist | 0 | 0 | 0 | 601 MB
postgres | public | rum_test | rumidx | 0 | 0 | 0 | 7036 MB

Recommendation

After confirming with the application developers, drop the indexes that are not needed.
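
A sketch of the confirmation-then-drop workflow; idx_arr_test is taken from the table above purely as an example.

```sql
-- Double-check that the index has not been scanned since the stats were reset.
SELECT schemaname, relname, indexrelname, idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;

-- Drop it without blocking concurrent writes.
DROP INDEX CONCURRENTLY idx_arr_test;
```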

9. Tables with more than 4 indexes and a size above 10 MB

current_database schemaname relname pg_size_pretty idx_cnt

Recommendation

Too many indexes slow down INSERT/UPDATE/DELETE on the table; check whether some of them are unnecessary.

Check pg_stat_all_tables (n_tup_ins, n_tup_upd, n_tup_del, n_tup_hot_upd); if write activity really is very frequent, identify which indexes are not needed.
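
A hedged sketch approximating the check behind this section: index count and table size from the catalogs, write activity from pg_stat_all_tables; the 4-index / 10 MB thresholds mirror the section title.

```sql
-- Tables with more than 4 indexes and more than 10 MB of heap,
-- together with their write activity.
SELECT n.nspname, c.relname,
       pg_size_pretty(pg_relation_size(c.oid)) AS table_size,
       count(i.indexrelid)                     AS idx_cnt,
       s.n_tup_ins, s.n_tup_upd, s.n_tup_del, s.n_tup_hot_upd
FROM pg_class c
JOIN pg_namespace n       ON n.oid = c.relnamespace
JOIN pg_index i           ON i.indrelid = c.oid
JOIN pg_stat_all_tables s ON s.relid = c.oid
WHERE c.relkind = 'r'
GROUP BY n.nspname, c.relname, c.oid,
         s.n_tup_ins, s.n_tup_upd, s.n_tup_del, s.n_tup_hot_upd
HAVING count(i.indexrelid) > 4
   AND pg_relation_size(c.oid) > 10 * 1024 * 1024
ORDER BY pg_relation_size(c.oid) DESC;
```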

II. Database space usage analysis

1. Size histogram of user objects

snap_ts | current_database | this_buk_no | rels_in_this_buk | buk_min | buk_max
---|---|---|---|---|---
2016-11-24 10:57:04.081798+08 | postgres | 1 | 50631 | 0 bytes | 1395 MB
2016-11-24 10:57:04.081798+08 | postgres | 3 | 3 | 3910 MB | 4340 MB
2016-11-24 10:57:04.081798+08 | postgres | 5 | 1 | 7036 MB | 7036 MB
2016-11-24 10:57:04.081798+08 | postgres | 10 | 2 | 15 GB | 15 GB

Recommendation

Looking at the size histogram of user objects: any single object larger than 10 GB (that is, still above 10 GB with TOAST excluded) should be partitioned; the pg_pathman extension is currently the recommended way to do this.
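
A hedged sketch of range-partitioning a large table with pg_pathman; the extension must be installed, the function name follows the pg_pathman documentation of that era, and the table name, partition key, start value, interval and partition count are all placeholders, so adjust them to the pg_pathman version and schema actually in use.

```sql
CREATE EXTENSION IF NOT EXISTS pg_pathman;

-- Split big_table by its id column into 16 range partitions of
-- 10,000,000 ids each, starting from 1 (all values are placeholders).
SELECT create_range_partitions('big_table'::regclass, 'id', 1, 10000000, 16);
```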

III. Database garbage (bloat) analysis

1. Table bloat analysis

snap_ts | db | schemaname | tablename | tups | pages | otta | tbloat | wastedpages | wastedbytes | wastedsize | iname | itups | ipages | iotta | ibloat | wastedipages | wastedibytes | wastedisize | totalwastedbytes
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
2016-11-24 10:57:04.081798+08 | postgres | public | gin_test | 10000000 | 2000001 | 1737641 | 1.2 | 262360 | 2149253120 | 2149253120 bytes | idx_gin_test | 999505408 | 500492 | 171109651 | 0.0 | 0 | 0 | 0 bytes | 2149253120
2016-11-24 10:57:04.081798+08 | postgres | public | rum_test | 9800000 | 1960028 | 1702888 | 1.2 | 257140 | 2106490880 | 2106490880 bytes | rumidx | 979514496 | 823549 | 167687321 | 0.0 | 0 | 0 | 0 bytes | 2106490880
2016-11-24 10:57:04.081798+08 | postgres | public | pgbench_accounts | 10454088 | 176172 | 168862 | 1.0 | 7310 | 59883520 | 59883520 bytes | pgbench_accounts_pkey | 10454088 | 27431 | 139440 | 0.2 | 0 | 0 | 0 bytes | 59883520
2016-11-24 10:57:04.081798+08 | postgres | public | pgbench_tellers | 1000 | 3664 | 9 | 407.1 | 3655 | 29941760 | 29941760 bytes | pgbench_tellers_pkey | 1000 | 605 | 3 | 201.7 | 602 | 4931584 | 4931584 bytes | 34873344
2016-11-24 10:57:04.081798+08 | postgres | public | arr_test | 10000000 | 555557 | 553109 | 1.0 | 2448 | 20054016 | 20054016 bytes | idx_arr_test | 999504960 | 500492 | 53326501 | 0.0 | 0 | 0 | 0 bytes | 20054016
2016-11-24 10:57:04.081798+08 | postgres | pg_catalog | pg_class | 51056 | 2191 | 1175 | 1.9 | 1016 | 8323072 | 8323072 bytes | pg_class_oid_index | 51056 | 443 | 907 | 0.5 | 0 | 0 | 0 bytes | 8323072
2016-11-24 10:57:04.081798+08 | postgres | pg_catalog | pg_class | 51056 | 2191 | 1175 | 1.9 | 1016 | 8323072 | 8323072 bytes | pg_class_tblspc_relfilenode_index | 51056 | 315 | 907 | 0.3 | 0 | 0 | 0 bytes | 8323072
2016-11-24 10:57:04.081798+08 | postgres | pg_catalog | pg_class | 51056 | 2191 | 1175 | 1.9 | 1016 | 8323072 | 8323072 bytes | pg_class_relname_nsp_index | 51056 | 589 | 907 | 0.6 | 0 | 0 | 0 bytes | 8323072
2016-11-24 10:57:04.081798+08 | postgres | pg_catalog | pg_attribute | 107789 | 2926 | 2164 | 1.4 | 762 | 6242304 | 6242304 bytes | pg_attribute_relid_attnam_index | 107789 | 749 | 1570 | 0.5 | 0 | 0 | 0 bytes | 6242304
2016-11-24 10:57:04.081798+08 | postgres | pg_catalog | pg_attribute | 107789 | 2926 | 2164 | 1.4 | 762 | 6242304 | 6242304 bytes | pg_attribute_relid_attnum_index | 107789 | 662 | 1570 | 0.4 | 0 | 0 | 0 bytes | 6242304

Recommendation

Based on the number of wasted bytes, set a suitable autovacuum_vacuum_scale_factor; for large tables with frequent updates or delete+insert activity, a smaller autovacuum_vacuum_scale_factor reduces the wasted space.

Also make sure autovacuum is enabled, and, based on the server's memory size and CPU core count, set autovacuum_work_mem (or maintenance_work_mem) and autovacuum_max_workers large enough, and autovacuum_naptime small enough.

Also check whether logical backups (pg_dump) are taken of large databases, and whether long-running SQL or long transactions are common in the system; all of these can cause bloat.

pg_reorg or VACUUM FULL can be used to reclaim the bloated space.

Reference: http://blog.163.com/digoal@126/blog/static/1638770402015329115636287/
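
A short sketch of the per-table autovacuum setting and the reclaim step; the table names come from the bloat list above and the scale-factor value is illustrative.

```sql
-- Make autovacuum trigger earlier on a heavily updated table.
ALTER TABLE pgbench_accounts SET (autovacuum_vacuum_scale_factor = 0.02);

-- Reclaim space that is already bloated. VACUUM FULL takes an
-- ACCESS EXCLUSIVE lock, so run it in a maintenance window;
-- pg_reorg / pg_repack can rebuild the table with far less locking.
VACUUM FULL VERBOSE pgbench_tellers;
```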

otta: estimated number of pages the table actually needs; iotta: estimated number of pages the index actually needs.

bs: block size of the database.

tbloat: table bloat factor; ibloat: index bloat factor; wastedpages: number of blocks wasted by the table; wastedipages: number of blocks wasted by the index.

wastedbytes: number of bytes wasted by the table; wastedibytes: number of bytes wasted by the index.

2. Index bloat analysis

snap_ts | db | schemaname | tablename | tups | pages | otta | tbloat | wastedpages | wastedbytes | wastedsize | iname | itups | ipages | iotta | ibloat | wastedipages | wastedibytes | wastedisize | totalwastedbytes
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
2016-11-24 10:57:04.081798+08 | postgres | public | gist_test | 9999939 | 63695 | 63632 | 1.0 | 63 | 516096 | 516096 bytes | idx_gist | 9999939 | 76922 | 36711 | 2.1 | 40211 | 329408512 | 329408512 bytes | 329924608
2016-11-24 10:57:04.081798+08 | postgres | public | test_pg_part_single | 999999 | 8389 | 8322 | 1.0 | 67 | 548864 | 548864 bytes | test_pg_part_single_pkey | 999999 | 5486 | 2570 | 2.1 | 2916 | 23887872 | 23887872 bytes | 24436736
2016-11-24 10:57:04.081798+08 | postgres | pg_catalog | pg_depend | 209624 | 2280 | 1540 | 1.5 | 740 | 6062080 | 6062080 bytes | pg_depend_depender_index | 209624 | 2462 | 950 | 2.6 | 1512 | 12386304 | 12386304 bytes | 18448384
2016-11-24 10:57:04.081798+08 | postgres | pg_catalog | pg_depend | 209624 | 2280 | 1540 | 1.5 | 740 | 6062080 | 6062080 bytes | pg_depend_reference_index | 209624 | 1875 | 950 | 2.0 | 925 | 7577600 | 7577600 bytes | 13639680
2016-11-24 10:57:04.081798+08 | postgres | public | pgbench_tellers | 1000 | 3664 | 9 | 407.1 | 3655 | 29941760 | 29941760 bytes | pgbench_tellers_pkey | 1000 | 605 | 3 | 201.7 | 602 | 4931584 | 4931584 bytes | 34873344
2016-11-24 10:57:04.081798+08 | postgres | public | test_pg_part_orig_3 | 100014 | 737 | 833 | 0.9 | 0 | 0 | 0 bytes | test_pg_part_orig_3_pkey | 100014 | 551 | 257 | 2.1 | 294 | 2408448 | 2408448 bytes | 2408448
2016-11-24 10:57:04.081798+08 | postgres | public | test_pg_part_pathman_4 | 100000 | 761 | 833 | 0.9 | 0 | 0 | 0 bytes | test_pg_part_pathman_4_pkey | 100000 | 551 | 257 | 2.1 | 294 | 2408448 | 2408448 bytes | 2408448
2016-11-24 10:57:04.081798+08 | postgres | public | test_pg_part_pathman_8 | 100000 | 738 | 833 | 0.9 | 0 | 0 | 0 bytes | test_pg_part_pathman_8_pkey | 100000 | 551 | 257 | 2.1 | 294 | 2408448 | 2408448 bytes | 2408448
2016-11-24 10:57:04.081798+08 | postgres | public | test_pg_part_pathman_3 | 100000 | 732 | 833 | 0.9 | 0 | 0 | 0 bytes | test_pg_part_pathman_3_pkey | 100000 | 551 | 257 | 2.1 | 294 | 2408448 | 2408448 bytes | 2408448
2016-11-24 10:57:04.081798+08 | postgres | public | test_pg_part_orig_9 | 100001 | 730 | 833 | 0.9 | 0 | 0 | 0 bytes | test_pg_part_orig_9_pkey | 100001 | 551 | 257 | 2.1 | 294 | 2408448 | 2408448 bytes | 2408448

Recommendation

Severely bloated indexes hurt performance; rebuild them with CREATE INDEX CONCURRENTLY ... .
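
A sketch of the non-blocking rebuild pattern on releases without REINDEX CONCURRENTLY; idx_gist and gist_test come from the table above, while the new index name and the indexed column are placeholders.

```sql
-- Build a fresh replacement index without blocking writes, then swap it in.
CREATE INDEX CONCURRENTLY idx_gist_new ON gist_test USING gist (some_col);  -- some_col is a placeholder
DROP INDEX CONCURRENTLY idx_gist;
ALTER INDEX idx_gist_new RENAME TO idx_gist;
-- Note: indexes backing PRIMARY KEY / UNIQUE constraints need
-- ALTER TABLE ... ADD CONSTRAINT ... USING INDEX instead of a plain swap.
```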

3. TOP 10 tables by dead tuples

snap_ts | database | schemaname | tablename | n_dead_tup
---|---|---|---|---
2016-11-24 10:57:04.081798+08 | postgres | public | pgbench_branches | 672799

Recommendation

An excessive amount of garbage usually means the dead tuples cannot be reclaimed, the vacuum workers are busy or not woken up in time, autovacuum is not enabled, or a large amount of garbage was generated within a short period.

You can wait for autovacuum to deal with it, or run VACUUM on the table manually.
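
A sketch using the table reported above; it first checks whether autovacuum has visited the table recently, then vacuums it by hand.

```sql
-- When was the table last vacuumed, and how many dead tuples does it carry?
SELECT relname, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'pgbench_branches';

-- Manual cleanup if autovacuum is not keeping up.
VACUUM (VERBOSE, ANALYZE) pgbench_branches;
```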

4. Unreferenced large objects

snap_ts | database | pg_size_pretty
---|---|---

Recommendation

Large objects that are no longer referenced should be deleted; otherwise they behave like a memory leak. vacuumlo can remove unreferenced large objects, e.g. vacuumlo -l 1000 $db -w, or use the vacuumlo() function I wrote.

During application development, take care to delete large objects promptly once they are no longer needed, using lo_unlink or the corresponding driver API.

Reference: http://www.postgresql.org/docs/9.4/static/largeobjects.html

IV. Database security and potential risk analysis

1. TOP 100 tables by age

snap_ts database rolname nspname relkind relname age age_remain

Recommendation

A table's age should normally stay below vacuum_freeze_table_age; if the remaining age drops below 200 million transactions, intervene manually: kill long-running SQL or transactions, then run VACUUM FREEZE.
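
A sketch of the underlying age check; the 2-billion figure is only a rough stand-in for the transaction-ID wraparound limit, and the table name in the VACUUM FREEZE line is a placeholder.

```sql
-- Tables closest to transaction-ID wraparound.
SELECT n.nspname, c.relname,
       age(c.relfrozenxid)              AS xid_age,
       2000000000 - age(c.relfrozenxid) AS approx_age_remain
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
ORDER BY age(c.relfrozenxid) DESC
LIMIT 100;

-- After killing long-running SQL / transactions (placeholder table name):
-- VACUUM FREEZE VERBOSE some_old_table;
```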

2. Unlogged tables and hash indexes

snap_ts database rolname nspname relname
snap_ts database idx

Recommendation

Unlogged tables and hash indexes are not WAL-logged, so they cannot be replicated to standby nodes via streaming replication or log shipping; running certain SQL on a standby may therefore raise errors or return no data.

After a database crash, unlogged tables and hash indexes cannot be recovered; their use is not recommended.

PITR does not protect unlogged tables or hash indexes either.
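
A sketch for finding both object types; the hash-index check uses the same pg_get_indexdef pattern as the snap_pg_hash_idx snapshot captured in the TOP SQL list above.

```sql
-- Unlogged tables (relpersistence = 'u').
SELECT n.nspname, c.relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r' AND c.relpersistence = 'u';

-- Hash indexes, detected from their definition.
SELECT oid::regclass AS idx
FROM pg_class
WHERE relkind = 'i' AND pg_get_indexdef(oid) ~ 'USING hash';
```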

3. Sequences with fewer than 10 million remaining values

snap_ts database rolname nspname relname times_remain

Recommendation

Once a sequence has used up its remaining values it can no longer be used and will raise an error; developers should keep an eye on this.
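
A sketch for a single sequence; my_seq is a placeholder, the expression assumes a positive increment, and on PostgreSQL 9.x the sequence relation itself exposes these columns (PostgreSQL 10+ would use the pg_sequences view instead).

```sql
-- Remaining values of one sequence (placeholder name my_seq).
SELECT last_value, max_value,
       (max_value - last_value) / increment_by AS times_remain
FROM my_seq;
```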

4. Lock wait analysis

snap_ts locktype r_mode r_user r_db relation r_pid r_page r_tuple r_xact_start r_query_start r_locktime r_query w_mode w_pid w_page w_tuple w_xact_start w_query_start w_locktime w_query

Recommendation

Lock waits point to problems in the business logic or in SQL performance; investigate the lock-holding SQL in depth.
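
A hedged sketch for spotting blocked sessions and who blocks them, assuming PostgreSQL 9.6 or later (which provides pg_blocking_pids()).

```sql
-- Sessions that are currently blocked, with the PIDs blocking them.
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       now() - xact_start    AS xact_duration,
       query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;
```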