Originally from: tweet, LinkedIn post.
I post a new PostgreSQL "howto" article every day. Join me in this journey – subscribe, provide feedback, share!
Modern Postgres provides more than 300 settings (a.k.a. GUC variables – "grand unified configuration"). Fine-tuning Postgres for a particular environment, database, and workload is a very complex task.
But in most cases, the Pareto principle (a.k.a. the 80/20 rule) works pretty well: you need to spend limited effort to address the basic areas of tuning, and then focus on query performance. The reasoning behind this approach is simple and solid: yes, you can spend a lot of effort and find a better value of `shared_buffers` than the traditional 25% (which, as many people think, is far from ideal: e.g., see Andres Freund's Tweet), and then find yourself in a position where a few queries with suboptimal performance – e.g., lacking proper indexes – ruin all the positive effect from that fine-tuning.
Therefore, I recommend this approach:
- Basic "rough" tuning
- Log-related settings
- Autovacuum tuning
- Checkpointer tuning
- Then focus on query optimization (reactive or proactive), and fine-tune specific areas only when there is a strong reason to do so
For initial rough tuning, empirical tools are "good enough" in most cases, following the 80/20 principle (actually, perhaps even 95/5 in this case):
- PGTune (source code)
- PostgreSQL Configurator
- for TimescaleDB users: timescaledb-tune
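After applying values produced by one of these tools, it may help to review what actually differs from the defaults. A minimal query for that (assuming you have access to `pg_settings`):

```sql
-- settings whose values come from a config file, ALTER SYSTEM, or the command line
select name, setting, unit, source
from pg_settings
where source not in ('default', 'override')
order by name;
```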
In addition to the official docs, this resource is good to use as a reference (it integrates information from various sources, not only the official docs) – for example, check the page for `random_page_cost`, a parameter that is quite often forgotten.

If you use a managed Postgres service such as RDS, quite likely this level of tuning is already performed when you provision a server. But it's still worth double-checking – for example, some providers provision a server with an SSD disk but leave `random_page_cost` at its default of 4, which is an outdated value suitable for magnetic disks. Just set it to 1 if you have an SSD.
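A minimal sketch of such a check-and-fix, assuming you can use `ALTER SYSTEM` directly (on managed services you may need the provider's parameter-group UI instead):

```sql
-- inspect the current value and where it comes from
select name, setting, source
from pg_settings
where name = 'random_page_cost';

-- for SSD/NVMe storage, bring it close to seq_page_cost
-- (1 here, per the text above; values around 1.0–1.1 are commonly used)
alter system set random_page_cost = 1;

-- this parameter doesn't require a restart; reloading the config is enough
select pg_reload_conf();
```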
It is important to perform this level of tuning before query optimization efforts; otherwise, you might need to re-do query optimization once you've adjusted the basic configuration.
Next, log-related settings. A general rule here: the more logging, the better – assuming, of course, that you avoid saturation of two kinds:
- disk space (logs filled the disk)
- disk IO (too many writes per second caused by logging)
In short, my recommendations are (this is worth a separate detailed post; a config sketch follows the list):
- turn on checkpoint logging: `log_checkpoints = 'on'` (fortunately, it's on by default in PG15+)
- turn on all autovacuum logging: `log_autovacuum_min_duration = 0` (or a very low value)
- log temporary files except tiny ones (e.g., `log_temp_files = 100`)
- log all DDL statements: `log_statement = 'ddl'`
- adjust `log_line_prefix`
- set a low value for `log_min_duration_statement` (e.g., `500ms`) and/or use `auto_explain` to log slow queries with plans
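A hedged sketch of these logging settings as `ALTER SYSTEM` commands (the thresholds are examples, not universal recommendations):

```sql
alter system set log_checkpoints = on;                  -- on by default in PG15+
alter system set log_autovacuum_min_duration = 0;       -- log every autovacuum run
alter system set log_temp_files = 100;                  -- in kB: log temp files >= 100 kB
alter system set log_statement = 'ddl';                 -- log all DDL
alter system set log_min_duration_statement = '500ms';  -- log statements slower than 500 ms

select pg_reload_conf();  -- all of the above can be applied with a reload

-- auto_explain additionally needs shared_preload_libraries (restart required)
-- and its own settings, e.g. auto_explain.log_min_duration – see the auto_explain docs.
```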
Autovacuum tuning is a big topic worth a separate post. In short, the key idea is that the default settings don't suit any modern OLTP case (web/mobile apps), so autovacuum always has to be tuned. If we don't do it, autovacuum becomes a "converter" of large portions of dead tuples to bloat, and this eventually negatively affects performance.
Two areas of tuning need to be addressed (see the sketch after this list):
- Increase the frequency of processing: by lowering the `**_scale_factor` / `**_threshold` settings, we make autovacuum workers process tables when a rather low number of dead tuples has accumulated.
- Allocate more resources for processing: more autovacuum workers (`autovacuum_max_workers`), more memory (`autovacuum_work_mem`), and higher "quotas" for work (controlled via `**_cost_limit` / `**_cost_delay`).
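As an illustration only – the exact numbers depend heavily on table sizes and write rates, so treat these values as assumptions, not recommendations:

```sql
-- process tables more often: react to a smaller share of dead tuples
alter system set autovacuum_vacuum_scale_factor = 0.01;   -- default 0.2
alter system set autovacuum_analyze_scale_factor = 0.01;  -- default 0.1

-- give autovacuum more resources
alter system set autovacuum_max_workers = 8;              -- default 3; requires restart
alter system set autovacuum_work_mem = '1GB';             -- default -1 (uses maintenance_work_mem)
alter system set autovacuum_vacuum_cost_limit = 2000;     -- default -1 (uses vacuum_cost_limit = 200)
alter system set autovacuum_vacuum_cost_delay = '2ms';    -- default 2ms since PG12

select pg_reload_conf();  -- note: autovacuum_max_workers takes effect only after a restart
```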
Checkpointer tuning: again, it's worth a separate post. But in short, you need to consider raising `checkpoint_timeout` and – most importantly – `max_wal_size` (whose default is very small for modern machines and data volumes, just `1GB`), so checkpoints occur less frequently, especially when a lot of writes happen. However, shifting the settings in this direction means longer recovery time in case of a crash or recovery from backups – this is a trade-off that needs to be analyzed for each particular case.
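A hedged example, assuming a write-heavy server with enough disk space for the extra WAL (the values are illustrative):

```sql
alter system set checkpoint_timeout = '15min';        -- default 5min
alter system set max_wal_size = '16GB';               -- default 1GB
alter system set checkpoint_completion_target = 0.9;  -- default 0.9 since PG14

select pg_reload_conf();  -- these settings are reloadable; no restart needed
```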
That's it. Generally, this initial/rough tuning of the Postgres config shouldn't take long. For a particular cluster or type of clusters, it's 1-2 days of work for an engineer. You don't actually need AI for this; empirical tools work well – unless you aim to squeeze out an extra 5-10% (you might want that, though, e.g., if you have thousands of servers).