Report period: 2016-11-24 10:51:14.744512 ~ 2016-11-24 10:57:04.081798

I. Database performance analysis

1. Current database TOP 10 SQL by total_cpu_time

calls total_ms min_ms max_ms mean_ms stddev_ms rows shared_blks_hit shared_blks_read shared_blks_dirtied shared_blks_written local_blks_hit local_blks_read local_blks_dirtied local_blks_written temp_blks_read temp_blks_written blk_read_time blk_write_time query
2039698 605331.386000053 0.015 848.727 0.296775005907738 2.85413485352702 2039698 99684751 2100 9136 0 0 0 0 0 0 0 0 0 UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2;
2039698 268140.604000044 0.015 836.367 0.131460933922569 1.37900138114079 2039698 93195557 905 19774 0 0 0 0 0 0 0 0 0 UPDATE pgbench_tellers SET tbalance = tbalance + $1 WHERE tid = $2;
2039698 239609.501000186 0.023 836.052 0.117473028360085 4.50711811585977 2039698 11277885 2681 1571654 0 0 0 0 0 0 0 0 0 UPDATE pgbench_accounts SET abalance = abalance + $1 WHERE aid = $2;
2039698 49333.0009998656 0.006 464.647 0.0241864241667166 1.93941085447089 2039698 2109626 13067 14524 0 0 0 0 0 0 0 0 0 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES ($1, $2, $3, $4, CURRENT_TIMESTAMP);
2039698 34167.6909994584 0.009 16.797 0.0167513479936731 0.0366687321580847 2039698 8334249 0 0 0 0 0 0 0 0 0 0 0 SELECT abalance FROM pgbench_accounts WHERE aid = $1;
1 7511.361 7511.361 7511.361 7511.361 0 0 816723 3 6 0 0 0 0 0 225 211 0 0 create table IF NOT EXISTS snap_pg_rel_space_bucket as select 1::int8 snap_id, now() snap_ts, current_database(), buk this_buk_no, cnt rels_in_this_buk, pg_size_pretty(min) buk_min, pg_size_pretty(max) buk_max from ( select row_number() over (partition by buk order by tsize), tsize, buk, min(tsize) over (partition by buk),max(tsize) over (partition by buk), count(*) over (partition by buk) cnt from ( select pg_relation_size(a.oid) tsize, width_bucket(pg_relation_size(a.oid),tmin-1,tmax+1,10) buk from ( select min(pg_relation_size(a.oid)) tmin, max(pg_relation_size(a.oid)) tmax from pg_class a, pg_namespace c where a.relnamespace=c.oid and nspname !~ $$^pg_$$ and nspname<>$$information_schema$$ ) t, pg_class a, pg_namespace c where a.relnamespace=c.oid and nspname !~ $$^pg_$$ and nspname<>$$information_schema$$ ) t ) t where row_number=1;
1 3931.596 3931.596 3931.596 3931.596 0 0 909503 1 1 0 0 0 0 0 0 0 0 0 create table IF NOT EXISTS snap_pg_hash_idx as select 1::int8 snap_id, now() snap_ts, current_database(),pg_get_indexdef(oid) from pg_class where relkind=$$i$$ and pg_get_indexdef(oid) ~ $$USING hash$$;
1 2158.239 2158.239 2158.239 2158.239 0 0 223555 3 4 0 0 0 0 0 4682 4666 0 0 create table IF NOT EXISTS snap_pg_table_bloat as select 1::int8 snap_id, now() snap_ts, current_database() AS db, schemaname, tablename, reltuples::bigint AS tups, relpages::bigint AS pages, otta, ROUND(CASE WHEN otta=0 OR sml.relpages=0 OR sml.relpages=otta THEN 0.0 ELSE sml.relpages/otta::numeric END,1) AS tbloat, CASE WHEN relpages < otta THEN 0 ELSE relpages::bigint - otta END AS wastedpages, CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END AS wastedbytes, CASE WHEN relpages < otta THEN $$0 bytes$$::text ELSE (bs*(relpages-otta))::bigint &#124;&#124; $$ bytes$$ END AS wastedsize, iname, ituples::bigint AS itups, ipages::bigint AS ipages, iotta, ROUND(CASE WHEN iotta=0 OR ipages=0 OR ipages=iotta THEN 0.0 ELSE ipages/iotta::numeric END,1) AS ibloat, CASE WHEN ipages < iotta THEN 0 ELSE ipages::bigint - iotta END AS wastedipages, CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes, CASE WHEN ipages < iotta THEN $$0 bytes$$ ELSE (bs*(ipages-iotta))::bigint &#124;&#124; $$ bytes$$ END AS wastedisize, CASE WHEN relpages < otta THEN CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta::bigint) END ELSE CASE WHEN ipages < iotta THEN bs*(relpages-otta::bigint) ELSE bs*(relpages-otta::bigint + ipages-iotta::bigint) END END AS totalwastedbytes FROM ( SELECT nn.nspname AS schemaname, cc.relname AS tablename, COALESCE(cc.reltuples,0) AS reltuples, COALESCE(cc.relpages,0) AS relpages, COALESCE(bs,0) AS bs, COALESCE(CEIL((cc.reltuples*((datahdr+ma- (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)),0) AS otta, COALESCE(c2.relname,$$?$$) AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages, COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta FROM pg_class cc JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname <> $$information_schema$$ LEFT JOIN ( SELECT ma,bs,foo.nspname,foo.relname, (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr, (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2 FROM ( SELECT ns.nspname, tbl.relname, hdr, ma, bs, SUM((1-coalesce(null_frac,0))*coalesce(avg_width, 2048)) AS datawidth, MAX(coalesce(null_frac,0)) AS maxfracsum, hdr+( SELECT 1+count(*)/8 FROM pg_stats s2 WHERE null_frac<>0 AND s2.schemaname = ns.nspname AND s2.tablename = tbl.relname ) AS nullhdr FROM pg_attribute att JOIN pg_class tbl ON att.attrelid = tbl.oid JOIN pg_namespace ns ON ns.oid = tbl.relnamespace LEFT JOIN pg_stats s ON s.schemaname=ns.nspname AND s.tablename = tbl.relname AND s.inherited=false AND s.attname=att.attname, ( SELECT (SELECT current_setting($$block_size$$)::numeric) AS bs, CASE WHEN SUBSTRING(SPLIT_PART(v, $$ $$, 2) FROM $$#"[0-9]+.[0-9]+#"%$$ for $$#$$) IN ($$8.0$$,$$8.1$$,$$8.2$$) THEN 27 ELSE 23 END AS hdr, CASE WHEN v ~ $$mingw32$$ OR v ~ $$64-bit$$ THEN 8 ELSE 4 END AS ma FROM (SELECT version() AS v) AS foo ) AS constants WHERE att.attnum > 0 AND tbl.relkind=$$r$$ GROUP BY 1,2,3,4,5 ) AS foo ) AS rs ON cc.relname = rs.relname AND nn.nspname = rs.nspname LEFT JOIN pg_index i ON indrelid = cc.oid LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid ) AS sml order by wastedbytes desc limit 10;
2039698 2024.84599996043 0 2.968 0.000992718529900034 0.00373028886295509 0 0 0 0 0 0 0 0 0 0 0 0 0 END;
1 1978.128 1978.128 1978.128 1978.128 0 0 223270 1 2 0 0 0 0 0 4682 4666 0 0 create table IF NOT EXISTS snap_pg_index_bloat as select 1::int8 snap_id, now() snap_ts, current_database() AS db, schemaname, tablename, reltuples::bigint AS tups, relpages::bigint AS pages, otta, ROUND(CASE WHEN otta=0 OR sml.relpages=0 OR sml.relpages=otta THEN 0.0 ELSE sml.relpages/otta::numeric END,1) AS tbloat, CASE WHEN relpages < otta THEN 0 ELSE relpages::bigint - otta END AS wastedpages, CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END AS wastedbytes, CASE WHEN relpages < otta THEN $$0 bytes$$::text ELSE (bs*(relpages-otta))::bigint &#124;&#124; $$ bytes$$ END AS wastedsize, iname, ituples::bigint AS itups, ipages::bigint AS ipages, iotta, ROUND(CASE WHEN iotta=0 OR ipages=0 OR ipages=iotta THEN 0.0 ELSE ipages/iotta::numeric END,1) AS ibloat, CASE WHEN ipages < iotta THEN 0 ELSE ipages::bigint - iotta END AS wastedipages, CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes, CASE WHEN ipages < iotta THEN $$0 bytes$$ ELSE (bs*(ipages-iotta))::bigint &#124;&#124; $$ bytes$$ END AS wastedisize, CASE WHEN relpages < otta THEN CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta::bigint) END ELSE CASE WHEN ipages < iotta THEN bs*(relpages-otta::bigint) ELSE bs*(relpages-otta::bigint + ipages-iotta::bigint) END END AS totalwastedbytes FROM ( SELECT nn.nspname AS schemaname, cc.relname AS tablename, COALESCE(cc.reltuples,0) AS reltuples, COALESCE(cc.relpages,0) AS relpages, COALESCE(bs,0) AS bs, COALESCE(CEIL((cc.reltuples*((datahdr+ma- (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)),0) AS otta, COALESCE(c2.relname,$$?$$) AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages, COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta FROM pg_class cc JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname <> $$information_schema$$ LEFT JOIN ( SELECT ma,bs,foo.nspname,foo.relname, (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr, (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2 FROM ( SELECT ns.nspname, tbl.relname, hdr, ma, bs, SUM((1-coalesce(null_frac,0))*coalesce(avg_width, 2048)) AS datawidth, MAX(coalesce(null_frac,0)) AS maxfracsum, hdr+( SELECT 1+count(*)/8 FROM pg_stats s2 WHERE null_frac<>0 AND s2.schemaname = ns.nspname AND s2.tablename = tbl.relname ) AS nullhdr FROM pg_attribute att JOIN pg_class tbl ON att.attrelid = tbl.oid JOIN pg_namespace ns ON ns.oid = tbl.relnamespace LEFT JOIN pg_stats s ON s.schemaname=ns.nspname AND s.tablename = tbl.relname AND s.inherited=false AND s.attname=att.attname, ( SELECT (SELECT current_setting($$block_size$$)::numeric) AS bs, CASE WHEN SUBSTRING(SPLIT_PART(v, $$ $$, 2) FROM $$#"[0-9]+.[0-9]+#"%$$ for $$#$$) IN ($$8.0$$,$$8.1$$,$$8.2$$) THEN 27 ELSE 23 END AS hdr, CASE WHEN v ~ $$mingw32$$ OR v ~ $$64-bit$$ THEN 8 ELSE 4 END AS ma FROM (SELECT version() AS v) AS foo ) AS constants WHERE att.attnum > 0 AND tbl.relkind=$$r$$ GROUP BY 1,2,3,4,5 ) AS foo ) AS rs ON cc.relname = rs.relname AND nn.nspname = rs.nspname LEFT JOIN pg_index i ON indrelid = cc.oid LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid ) AS sml order by wastedibytes desc limit 10;

Recommendation

Check whether these SQL statements can be optimized; use the auto_explain extension and inspect the csvlog to verify that the execution plans of long-running SQL are correct.
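
A minimal sketch of turning on auto_explain for the current session so that plans of slow statements end up in the server log / csvlog; the 1-second threshold is illustrative, and for instance-wide capture the module would instead be preloaded via shared_preload_libraries or session_preload_libraries.

```sql
-- Load auto_explain in this session and log plans of statements
-- slower than 1 second (threshold chosen for illustration only).
LOAD 'auto_explain';
SET auto_explain.log_min_duration = '1s';
SET auto_explain.log_analyze = on;   -- include actual row counts and timings
SET auto_explain.log_buffers = on;   -- include buffer usage in the logged plan
```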

2. TOP 10 tables by size, with statistics

current_database | nspname | relname | relkind | pg_relation_size | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
postgres | public | rum_test | r | 15 GB | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00000000000000000000 | 0.00000000000000000000
postgres | public | gin_test | r | 15 GB | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00000000000000000000 | 0.00000000000000000000
postgres | public | arr_test | r | 4340 MB | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00000000000000000000 | 0.00000000000000000000
postgres | public | pgbench_accounts | r | 1395 MB | 0 | 0 | 4079396 | 4079396 | 0 | 2039698 | 0 | 1877887 | 10454088.000000000000 | 1254538.000000000000
postgres | public | gist_test | r | 498 MB | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00000000000000000000 | 0.00000000000000000000
postgres | public | btree_test | r | 422 MB | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00000000000000000000 | 0.00000000000000000000
postgres | public | p2 | r | 346 MB | 0 | 0 | -1 | -1 | 0 | 0 | 0 | 0 | 0.00000000000000000000 | 0.00000000000000000000
postgres | public | p1 | r | 346 MB | 0 | 0 | -1 | -1 | 0 | 0 | 0 | 0 | 0.00000000000000000000 | 0.00000000000000000000
postgres | public | pgbench_history | r | 281 MB | 0 | 0 | -1 | -1 | 2039698 | 0 | 0 | 0 | 5291572.000000000000 | 0.00000000000000000000
postgres | public | test_pg_part_single | r | 66 MB | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00000000000000000000 | 0.00000000000000000000

Notes

seq_scan: number of sequential (full-table) scans.

seq_tup_read: total number of rows read by sequential scans; if the average per scan is low, the scans are probably capped by LIMIT clauses.

idx_scan: number of index scans.

idx_tup_fetch: number of rows actually fetched by index scans; if the average per scan is high, the database tends to favor index scans, so check random I/O performance and tune accordingly.

n_tup_ins: number of rows inserted during the reporting period.

n_tup_upd: number of rows updated during the reporting period.

n_tup_del: number of rows deleted during the reporting period.

n_tup_hot_upd: number of rows HOT-updated during the reporting period (the updated row version stays in the same page).

n_live_tup: number of live rows in the table.

n_dead_tup: number of dead (garbage) rows in the table.

Recommendation

Rule of thumb: if a single table exceeds 10 GB and is frequently updated, or deleted from and inserted into, split it sensibly according to the business logic; this gives better performance and makes maintenance of bloated indexes easier. For read-only tables, focus on optimizing the SQL instead.

3. Sequential scan statistics: TOP 10 tables by average rows read per scan

current_database | nspname | relname | relkind | pg_relation_size | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
postgres | pg_catalog | pg_constraint | r | 38 MB | 3 | 299868 | 11727 | 210786 | 2 | 0 | 2 | 0 | 1.00000000000000000000 | 2.0000000000000000
postgres | __pg_stats__ | snap_pg_statio_all_indexes | r | 16 MB | 2 | 100228 | -1 | -1 | 50114 | 0 | 0 | 0 | 50114.000000000000 | 0.00000000000000000000
postgres | pg_catalog | pg_class | r | 17 MB | 16919 | 647986307 | 540644 | 388186 | 132 | 32 | 132 | 31 | 66.0000000000000000 | 139.0000000000000000
postgres | pg_catalog | pg_index | r | 9032 kB | 45 | 1202791 | 108324 | 256166 | 34 | 0 | 34 | 0 | 17.0000000000000000 | 34.0000000000000000
postgres | pg_catalog | pg_proc | r | 680 kB | 2 | 5936 | 1294 | 1900 | 16 | 7 | 16 | 7 | 10.0000000000000000 | 19.0000000000000000
postgres | __pg_stats__ | snap_pg_statio_all_tables | r | 232 kB | 2 | 1606 | -1 | -1 | 803 | 0 | 0 | 0 | 803.0000000000000000 | 0.00000000000000000000
postgres | __pg_stats__ | snap_pg_db_rel_size | r | 248 kB | 2 | 1466 | -1 | -1 | 733 | 0 | 0 | 0 | 733.0000000000000000 | 0.00000000000000000000
postgres | pg_catalog | pg_statistic | r | 392 kB | 6 | 3110 | 337280 | 5644 | 125 | 145 | 154 | 136 | 542.0000000000000000 | 38.0000000000000000
postgres | __pg_stats__ | snap_pg_rel_age | r | 32 kB | 1 | 100 | -1 | -1 | 100 | 0 | 0 | 0 | 100.0000000000000000 | 0.00000000000000000000
postgres | __pg_stats__ | snap_pg_stat_statements | r | 64 kB | 1 | 42 | -1 | -1 | 42 | 0 | 0 | 0 | 42.0000000000000000 | 0.00000000000000000000

Notes

seq_scan: number of sequential (full-table) scans.

seq_tup_read: total number of rows read by sequential scans; if the average per scan is low, the scans are probably capped by LIMIT clauses.

idx_scan: number of index scans.

idx_tup_fetch: number of rows actually fetched by index scans; if the average per scan is high, the database tends to favor index scans, so check random I/O performance and tune accordingly.

n_tup_ins: number of rows inserted during the reporting period.

n_tup_upd: number of rows updated during the reporting period.

n_tup_del: number of rows deleted during the reporting period.

n_tup_hot_upd: number of rows HOT-updated during the reporting period (the updated row version stays in the same page).

n_live_tup: number of live rows in the table.

n_dead_tup: number of dead (garbage) rows in the table.

Recommendation

If the average number of rows read per sequential scan is high, locate the corresponding SQL and create targeted indexes (except for statistical/analytical workloads).
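
A hedged sketch of how such tables can be located from pg_stat_all_tables; the CREATE INDEX line at the end uses placeholder table and column names.

```sql
-- Tables whose sequential scans read the most rows on average.
SELECT schemaname, relname, seq_scan, seq_tup_read,
       seq_tup_read / seq_scan AS avg_tup_per_seq_scan
FROM pg_stat_all_tables
WHERE seq_scan > 0
ORDER BY avg_tup_per_seq_scan DESC
LIMIT 10;

-- Then index the filter column used by the offending SQL
-- (placeholder table/column names):
-- CREATE INDEX CONCURRENTLY idx_some_table_some_col ON some_table (some_col);
```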

4. Hot tables by buffer misses

current_database | schemaname | relname | heap_blks_read | heap_blks_hit | idx_blks_read | idx_blks_hit | toast_blks_read | toast_blks_hit | tidx_blks_read | tidx_blks_hit
---|---|---|---|---|---|---|---|---|---|---
postgres | pg_catalog | pg_proc | 1 | 1754 | 0 | 2833 | 10 | 262 | 0 | 312
postgres | pg_catalog | pg_statistic | 6 | 6448 | 0 | 675060 | 0 | 23 | 0 | 14
postgres | pg_catalog | pg_db_role_setting | 1 | 420 | 5 | 1036802 | 0 | 0 | 0 | 0
postgres | pg_catalog | pg_shdescription | 2 | 25 | 3 | 10 | 0 | 0 | 0 | 0
postgres | __pg_stats__ | snap_list | 1 | 5 | 2 | 8 | 0 | 0 | 0 | 0
postgres | pg_catalog | pg_shseclabel | 0 | 0 | 2 | 4 | 0 | 0 | 0 | 0
postgres | pg_catalog | pg_trigger | 0 | 49 | 0 | 96 | 0 | 0 | 0 | 0
postgres | public | test_pg_part_orig_10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
postgres | public | test_pg_part_orig_2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
postgres | public | test_pg_part_pathman_3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0

Recommendation

If the hit ratio of hot tables is low, increase shared_buffers or add memory.
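
A small sketch for computing the per-table buffer hit ratio from pg_statio_user_tables (heap blocks only); whether a low ratio really calls for a larger shared_buffers still depends on the workload.

```sql
-- Buffer hit ratio of the busiest user tables.
SELECT schemaname, relname, heap_blks_read, heap_blks_hit,
       round(heap_blks_hit::numeric
             / nullif(heap_blks_hit + heap_blks_read, 0), 4) AS hit_ratio
FROM pg_statio_user_tables
ORDER BY heap_blks_hit + heap_blks_read DESC
LIMIT 10;
```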

5. Hot tables by buffer misses &amp; hits

current_database | schemaname | relname | heap_blks_read | heap_blks_hit | idx_blks_read | idx_blks_hit | toast_blks_read | toast_blks_hit | tidx_blks_read | tidx_blks_hit
---|---|---|---|---|---|---|---|---|---|---
postgres | pg_catalog | pg_db_role_setting | 1 | 420 | 5 | 1036802 | 0 | 0 | 0 | 0
postgres | pg_catalog | pg_statistic | 6 | 6448 | 0 | 675060 | 0 | 23 | 0 | 14
postgres | pg_catalog | pg_constraint | 0 | 41901 | 0 | 24072 | 0 | 0 | 0 | 0
postgres | pg_catalog | pg_proc | 1 | 1754 | 0 | 2833 | 10 | 262 | 0 | 312
postgres | pg_catalog | pg_rewrite | 0 | 261 | 0 | 262 | 0 | 190 | 0 | 274
postgres | pg_catalog | pg_description | 0 | 0 | 0 | 634 | 0 | 0 | 0 | 0
postgres | public | pathman_config | 0 | 318 | 0 | 134 | 0 | 0 | 0 | 0
postgres | pg_catalog | pg_seclabel | 0 | 0 | 0 | 316 | 0 | 0 | 0 | 0
postgres | pg_catalog | pg_attrdef | 0 | 77 | 0 | 130 | 0 | 0 | 0 | 0
postgres | pg_catalog | pg_trigger | 0 | 49 | 0 | 96 | 0 | 0 | 0 | 0

Recommendation

If the hit ratio of hot tables is low, increase shared_buffers or add memory.

6. Hot indexes by buffer misses

current_database | schemaname | relname | indexrelname | idx_blks_read | idx_blks_hit
---|---|---|---|---|---
postgres | pg_catalog | pg_type | pg_type_typname_nsp_index | 57 | 1121
postgres | public | pgbench_tellers | pgbench_tellers_pkey | 31 | 8252683
postgres | pg_catalog | pg_type | pg_type_oid_index | 23 | 2786
postgres | public | pgbench_branches | pgbench_branches_pkey | 22 | 5768469
postgres | public | pgbench_accounts | pgbench_accounts_pkey | 9 | 12858010
postgres | pg_catalog | pg_database | pg_database_datname_index | 8 | 519874
postgres | pg_catalog | pg_authid | pg_authid_rolname_index | 8 | 4893
postgres | pg_catalog | pg_tablespace | pg_tablespace_oid_index | 8 | 2852
postgres | pg_catalog | pg_database | pg_database_oid_index | 8 | 1031167
postgres | pg_catalog | pg_authid | pg_authid_oid_index | 8 | 5963

Recommendation

If the hit ratio of hot indexes is low, increase shared_buffers or add memory.
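
The same check for indexes, sketched against pg_statio_user_indexes.

```sql
-- Buffer hit ratio of the busiest user indexes.
SELECT schemaname, relname, indexrelname, idx_blks_read, idx_blks_hit,
       round(idx_blks_hit::numeric
             / nullif(idx_blks_hit + idx_blks_read, 0), 4) AS hit_ratio
FROM pg_statio_user_indexes
ORDER BY idx_blks_hit + idx_blks_read DESC
LIMIT 10;
```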

7. Hot indexes by buffer misses &amp; hits

current_database | schemaname | relname | indexrelname | idx_blks_read | idx_blks_hit
---|---|---|---|---|---
postgres | public | pgbench_accounts | pgbench_accounts_pkey | 9 | 12858010
postgres | public | pgbench_tellers | pgbench_tellers_pkey | 31 | 8252683
postgres | public | pgbench_branches | pgbench_branches_pkey | 22 | 5768469
postgres | pg_catalog | pg_class | pg_class_oid_index | 0 | 1605552
postgres | pg_catalog | pg_db_role_setting | pg_db_role_setting_databaseid_rol_index | 5 | 1036802
postgres | pg_catalog | pg_database | pg_database_oid_index | 8 | 1031167
postgres | pg_catalog | pg_attribute | pg_attribute_relid_attnum_index | 4 | 698289
postgres | pg_catalog | pg_statistic | pg_statistic_relid_att_inh_index | 0 | 675060
postgres | pg_catalog | pg_database | pg_database_datname_index | 8 | 519874
postgres | pg_catalog | pg_index | pg_index_indexrelid_index | 0 | 214951

Recommendation

If the hit ratio of hot indexes is low, increase shared_buffers or add memory.

8. Indexes unused, or rarely used, since the last inspection

current_database | schemaname | relname | indexrelname | idx_scan | idx_tup_read | idx_tup_fetch | pg_size_pretty
---|---|---|---|---|---|---|---
postgres | public | arr_test | idx_arr_test | 0 | 0 | 0 | 3910 MB
postgres | public | btree_test | idx_btree | 0 | 0 | 0 | 214 MB
postgres | public | btree_test | idx_btree_1 | 0 | 0 | 0 | 214 MB
postgres | public | btree_test | idx_btree_2 | 0 | 0 | 0 | 214 MB
postgres | public | gin_test | idx_gin_test | 0 | 0 | 0 | 3910 MB
postgres | public | gist_test | idx_gist | 0 | 0 | 0 | 601 MB
postgres | public | rum_test | rumidx | 0 | 0 | 0 | 7036 MB

Recommendation

After confirming with the application developers, drop the indexes that are not needed.
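
A sketch of the confirmation-then-drop workflow; idx_arr_test is taken from the table above purely as an example.

```sql
-- Double-check that the index has not been scanned since the stats were reset.
SELECT schemaname, relname, indexrelname, idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;

-- Drop it without blocking concurrent writes.
DROP INDEX CONCURRENTLY idx_arr_test;
```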

9. Tables with more than 4 indexes and a size above 10 MB

current_database schemaname relname pg_size_pretty idx_cnt

Recommendation

Too many indexes slow down INSERT/UPDATE/DELETE on the table; check whether some of them are unnecessary.

Check pg_stat_all_tables (n_tup_ins, n_tup_upd, n_tup_del, n_tup_hot_upd); if write activity really is very frequent, identify which indexes are not needed.
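
A hedged sketch approximating the check behind this section: index count and table size from the catalogs, write activity from pg_stat_all_tables; the 4-index / 10 MB thresholds mirror the section title.

```sql
-- Tables with more than 4 indexes and more than 10 MB of heap,
-- together with their write activity.
SELECT n.nspname, c.relname,
       pg_size_pretty(pg_relation_size(c.oid)) AS table_size,
       count(i.indexrelid)                     AS idx_cnt,
       s.n_tup_ins, s.n_tup_upd, s.n_tup_del, s.n_tup_hot_upd
FROM pg_class c
JOIN pg_namespace n       ON n.oid = c.relnamespace
JOIN pg_index i           ON i.indrelid = c.oid
JOIN pg_stat_all_tables s ON s.relid = c.oid
WHERE c.relkind = 'r'
GROUP BY n.nspname, c.relname, c.oid,
         s.n_tup_ins, s.n_tup_upd, s.n_tup_del, s.n_tup_hot_upd
HAVING count(i.indexrelid) > 4
   AND pg_relation_size(c.oid) > 10 * 1024 * 1024
ORDER BY pg_relation_size(c.oid) DESC;
```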

II. Database space usage analysis

1. Size histogram of user objects

snap_ts | current_database | this_buk_no | rels_in_this_buk | buk_min | buk_max
---|---|---|---|---|---
2016-11-24 10:57:04.081798+08 | postgres | 1 | 50631 | 0 bytes | 1395 MB
2016-11-24 10:57:04.081798+08 | postgres | 3 | 3 | 3910 MB | 4340 MB
2016-11-24 10:57:04.081798+08 | postgres | 5 | 1 | 7036 MB | 7036 MB
2016-11-24 10:57:04.081798+08 | postgres | 10 | 2 | 15 GB | 15 GB

Recommendation

Looking at the size histogram of user objects: any single object larger than 10 GB (that is, still above 10 GB with TOAST excluded) should be partitioned; the pg_pathman extension is currently the recommended way to do this.
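
A hedged sketch of range-partitioning a large table with pg_pathman; the extension must be installed, the function name follows the pg_pathman documentation of that era, and the table name, partition key, start value, interval and partition count are all placeholders, so adjust them to the pg_pathman version and schema actually in use.

```sql
CREATE EXTENSION IF NOT EXISTS pg_pathman;

-- Split big_table by its id column into 16 range partitions of
-- 10,000,000 ids each, starting from 1 (all values are placeholders).
SELECT create_range_partitions('big_table'::regclass, 'id', 1, 10000000, 16);
```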

III. Database garbage (bloat) analysis

1. Table bloat analysis

snap_ts | db | schemaname | tablename | tups | pages | otta | tbloat | wastedpages | wastedbytes | wastedsize | iname | itups | ipages | iotta | ibloat | wastedipages | wastedibytes | wastedisize | totalwastedbytes
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
2016-11-24 10:57:04.081798+08 | postgres | public | gin_test | 10000000 | 2000001 | 1737641 | 1.2 | 262360 | 2149253120 | 2149253120 bytes | idx_gin_test | 999505408 | 500492 | 171109651 | 0.0 | 0 | 0 | 0 bytes | 2149253120
2016-11-24 10:57:04.081798+08 | postgres | public | rum_test | 9800000 | 1960028 | 1702888 | 1.2 | 257140 | 2106490880 | 2106490880 bytes | rumidx | 979514496 | 823549 | 167687321 | 0.0 | 0 | 0 | 0 bytes | 2106490880
2016-11-24 10:57:04.081798+08 | postgres | public | pgbench_accounts | 10454088 | 176172 | 168862 | 1.0 | 7310 | 59883520 | 59883520 bytes | pgbench_accounts_pkey | 10454088 | 27431 | 139440 | 0.2 | 0 | 0 | 0 bytes | 59883520
2016-11-24 10:57:04.081798+08 | postgres | public | pgbench_tellers | 1000 | 3664 | 9 | 407.1 | 3655 | 29941760 | 29941760 bytes | pgbench_tellers_pkey | 1000 | 605 | 3 | 201.7 | 602 | 4931584 | 4931584 bytes | 34873344
2016-11-24 10:57:04.081798+08 | postgres | public | arr_test | 10000000 | 555557 | 553109 | 1.0 | 2448 | 20054016 | 20054016 bytes | idx_arr_test | 999504960 | 500492 | 53326501 | 0.0 | 0 | 0 | 0 bytes | 20054016
2016-11-24 10:57:04.081798+08 | postgres | pg_catalog | pg_class | 51056 | 2191 | 1175 | 1.9 | 1016 | 8323072 | 8323072 bytes | pg_class_oid_index | 51056 | 443 | 907 | 0.5 | 0 | 0 | 0 bytes | 8323072
2016-11-24 10:57:04.081798+08 | postgres | pg_catalog | pg_class | 51056 | 2191 | 1175 | 1.9 | 1016 | 8323072 | 8323072 bytes | pg_class_tblspc_relfilenode_index | 51056 | 315 | 907 | 0.3 | 0 | 0 | 0 bytes | 8323072
2016-11-24 10:57:04.081798+08 | postgres | pg_catalog | pg_class | 51056 | 2191 | 1175 | 1.9 | 1016 | 8323072 | 8323072 bytes | pg_class_relname_nsp_index | 51056 | 589 | 907 | 0.6 | 0 | 0 | 0 bytes | 8323072
2016-11-24 10:57:04.081798+08 | postgres | pg_catalog | pg_attribute | 107789 | 2926 | 2164 | 1.4 | 762 | 6242304 | 6242304 bytes | pg_attribute_relid_attnam_index | 107789 | 749 | 1570 | 0.5 | 0 | 0 | 0 bytes | 6242304
2016-11-24 10:57:04.081798+08 | postgres | pg_catalog | pg_attribute | 107789 | 2926 | 2164 | 1.4 | 762 | 6242304 | 6242304 bytes | pg_attribute_relid_attnum_index | 107789 | 662 | 1570 | 0.4 | 0 | 0 | 0 bytes | 6242304

Recommendation

Based on the number of wasted bytes, set a suitable autovacuum_vacuum_scale_factor; for large tables with frequent updates or delete+insert activity, a smaller autovacuum_vacuum_scale_factor reduces the wasted space.

Also make sure autovacuum is enabled, and, based on the server's memory size and CPU core count, set autovacuum_work_mem (or maintenance_work_mem) and autovacuum_max_workers large enough, and autovacuum_naptime small enough.

Also check whether logical backups (pg_dump) are taken of large databases, and whether long-running SQL or long transactions are common in the system; all of these can cause bloat.

pg_reorg or VACUUM FULL can be used to reclaim the bloated space.

Reference: http://blog.163.com/digoal@126/blog/static/1638770402015329115636287/
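
A short sketch of the per-table autovacuum setting and the reclaim step; the table names come from the bloat list above and the scale-factor value is illustrative.

```sql
-- Make autovacuum trigger earlier on a heavily updated table.
ALTER TABLE pgbench_accounts SET (autovacuum_vacuum_scale_factor = 0.02);

-- Reclaim space that is already bloated. VACUUM FULL takes an
-- ACCESS EXCLUSIVE lock, so run it in a maintenance window;
-- pg_reorg / pg_repack can rebuild the table with far less locking.
VACUUM FULL VERBOSE pgbench_tellers;
```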

otta: estimated number of pages the table actually needs; iotta: estimated number of pages the index actually needs.

bs: block size of the database.

tbloat: table bloat factor; ibloat: index bloat factor; wastedpages: number of blocks wasted by the table; wastedipages: number of blocks wasted by the index.

wastedbytes: number of bytes wasted by the table; wastedibytes: number of bytes wasted by the index.

2. Index bloat analysis

snap_ts | db | schemaname | tablename | tups | pages | otta | tbloat | wastedpages | wastedbytes | wastedsize | iname | itups | ipages | iotta | ibloat | wastedipages | wastedibytes | wastedisize | totalwastedbytes
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
2016-11-24 10:57:04.081798+08 | postgres | public | gist_test | 9999939 | 63695 | 63632 | 1.0 | 63 | 516096 | 516096 bytes | idx_gist | 9999939 | 76922 | 36711 | 2.1 | 40211 | 329408512 | 329408512 bytes | 329924608
2016-11-24 10:57:04.081798+08 | postgres | public | test_pg_part_single | 999999 | 8389 | 8322 | 1.0 | 67 | 548864 | 548864 bytes | test_pg_part_single_pkey | 999999 | 5486 | 2570 | 2.1 | 2916 | 23887872 | 23887872 bytes | 24436736
2016-11-24 10:57:04.081798+08 | postgres | pg_catalog | pg_depend | 209624 | 2280 | 1540 | 1.5 | 740 | 6062080 | 6062080 bytes | pg_depend_depender_index | 209624 | 2462 | 950 | 2.6 | 1512 | 12386304 | 12386304 bytes | 18448384
2016-11-24 10:57:04.081798+08 | postgres | pg_catalog | pg_depend | 209624 | 2280 | 1540 | 1.5 | 740 | 6062080 | 6062080 bytes | pg_depend_reference_index | 209624 | 1875 | 950 | 2.0 | 925 | 7577600 | 7577600 bytes | 13639680
2016-11-24 10:57:04.081798+08 | postgres | public | pgbench_tellers | 1000 | 3664 | 9 | 407.1 | 3655 | 29941760 | 29941760 bytes | pgbench_tellers_pkey | 1000 | 605 | 3 | 201.7 | 602 | 4931584 | 4931584 bytes | 34873344
2016-11-24 10:57:04.081798+08 | postgres | public | test_pg_part_orig_3 | 100014 | 737 | 833 | 0.9 | 0 | 0 | 0 bytes | test_pg_part_orig_3_pkey | 100014 | 551 | 257 | 2.1 | 294 | 2408448 | 2408448 bytes | 2408448
2016-11-24 10:57:04.081798+08 | postgres | public | test_pg_part_pathman_4 | 100000 | 761 | 833 | 0.9 | 0 | 0 | 0 bytes | test_pg_part_pathman_4_pkey | 100000 | 551 | 257 | 2.1 | 294 | 2408448 | 2408448 bytes | 2408448
2016-11-24 10:57:04.081798+08 | postgres | public | test_pg_part_pathman_8 | 100000 | 738 | 833 | 0.9 | 0 | 0 | 0 bytes | test_pg_part_pathman_8_pkey | 100000 | 551 | 257 | 2.1 | 294 | 2408448 | 2408448 bytes | 2408448
2016-11-24 10:57:04.081798+08 | postgres | public | test_pg_part_pathman_3 | 100000 | 732 | 833 | 0.9 | 0 | 0 | 0 bytes | test_pg_part_pathman_3_pkey | 100000 | 551 | 257 | 2.1 | 294 | 2408448 | 2408448 bytes | 2408448
2016-11-24 10:57:04.081798+08 | postgres | public | test_pg_part_orig_9 | 100001 | 730 | 833 | 0.9 | 0 | 0 | 0 bytes | test_pg_part_orig_9_pkey | 100001 | 551 | 257 | 2.1 | 294 | 2408448 | 2408448 bytes | 2408448

Recommendation

Severely bloated indexes hurt performance; rebuild them with CREATE INDEX CONCURRENTLY ... .
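
A sketch of the non-blocking rebuild pattern on releases without REINDEX CONCURRENTLY; idx_gist and gist_test come from the table above, while the new index name and the indexed column are placeholders.

```sql
-- Build a fresh replacement index without blocking writes, then swap it in.
CREATE INDEX CONCURRENTLY idx_gist_new ON gist_test USING gist (some_col);  -- some_col is a placeholder
DROP INDEX CONCURRENTLY idx_gist;
ALTER INDEX idx_gist_new RENAME TO idx_gist;
-- Note: indexes backing PRIMARY KEY / UNIQUE constraints need
-- ALTER TABLE ... ADD CONSTRAINT ... USING INDEX instead of a plain swap.
```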

3. TOP 10 tables by dead tuples

snap_ts | database | schemaname | tablename | n_dead_tup
---|---|---|---|---
2016-11-24 10:57:04.081798+08 | postgres | public | pgbench_branches | 672799

Recommendation

An excessive amount of garbage usually means the dead tuples cannot be reclaimed, the vacuum workers are busy or not woken up in time, autovacuum is not enabled, or a large amount of garbage was generated within a short period.

You can wait for autovacuum to deal with it, or run VACUUM on the table manually.
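
A sketch using the table reported above; it first checks whether autovacuum has visited the table recently, then vacuums it by hand.

```sql
-- When was the table last vacuumed, and how many dead tuples does it carry?
SELECT relname, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'pgbench_branches';

-- Manual cleanup if autovacuum is not keeping up.
VACUUM (VERBOSE, ANALYZE) pgbench_branches;
```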

4. Unreferenced large objects

snap_ts | database | pg_size_pretty
---|---|---

Recommendation

Large objects that are no longer referenced should be deleted; otherwise they behave like a memory leak. vacuumlo can remove unreferenced large objects, e.g. vacuumlo -l 1000 $db -w, or use the vacuumlo() function I wrote.

During application development, take care to delete large objects promptly once they are no longer needed, using lo_unlink or the corresponding driver API.

Reference: http://www.postgresql.org/docs/9.4/static/largeobjects.html

IV. Database security and potential risk analysis

1. TOP 100 tables by age

snap_ts database rolname nspname relkind relname age age_remain

Recommendation

A table's age should normally stay below vacuum_freeze_table_age; if the remaining age drops below 200 million transactions, intervene manually: kill long-running SQL or transactions, then run VACUUM FREEZE.
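
A sketch of the underlying age check; the 2-billion figure is only a rough stand-in for the transaction-ID wraparound limit, and the table name in the VACUUM FREEZE line is a placeholder.

```sql
-- Tables closest to transaction-ID wraparound.
SELECT n.nspname, c.relname,
       age(c.relfrozenxid)              AS xid_age,
       2000000000 - age(c.relfrozenxid) AS approx_age_remain
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
ORDER BY age(c.relfrozenxid) DESC
LIMIT 100;

-- After killing long-running SQL / transactions (placeholder table name):
-- VACUUM FREEZE VERBOSE some_old_table;
```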

2. Unlogged tables and hash indexes

snap_ts database rolname nspname relname
snap_ts database idx

Recommendation

Unlogged tables and hash indexes are not WAL-logged, so they cannot be replicated to standby nodes via streaming replication or log shipping; running certain SQL on a standby may therefore raise errors or return no data.

After a database crash, unlogged tables and hash indexes cannot be recovered; their use is not recommended.

PITR does not protect unlogged tables or hash indexes either.
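
A sketch for finding both object types; the hash-index check uses the same pg_get_indexdef pattern as the snap_pg_hash_idx snapshot captured in the TOP SQL list above.

```sql
-- Unlogged tables (relpersistence = 'u').
SELECT n.nspname, c.relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r' AND c.relpersistence = 'u';

-- Hash indexes, detected from their definition.
SELECT oid::regclass AS idx
FROM pg_class
WHERE relkind = 'i' AND pg_get_indexdef(oid) ~ 'USING hash';
```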

3. Sequences with fewer than 10 million remaining values

snap_ts database rolname nspname relname times_remain

Recommendation

Once a sequence has used up its remaining values it can no longer be used and will raise an error; developers should keep an eye on this.
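
A sketch for a single sequence; my_seq is a placeholder, the expression assumes a positive increment, and on PostgreSQL 9.x the sequence relation itself exposes these columns (PostgreSQL 10+ would use the pg_sequences view instead).

```sql
-- Remaining values of one sequence (placeholder name my_seq).
SELECT last_value, max_value,
       (max_value - last_value) / increment_by AS times_remain
FROM my_seq;
```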

4. Lock wait analysis

snap_ts locktype r_mode r_user r_db relation r_pid r_page r_tuple r_xact_start r_query_start r_locktime r_query w_mode w_pid w_page w_tuple w_xact_start w_query_start w_locktime w_query

Recommendation

Lock waits point to problems in the business logic or in SQL performance; investigate the lock-holding SQL in depth.
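
A hedged sketch for spotting blocked sessions and who blocks them, assuming PostgreSQL 9.6 or later (which provides pg_blocking_pids()).

```sql
-- Sessions that are currently blocked, with the PIDs blocking them.
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       now() - xact_start    AS xact_duration,
       query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;
```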