We see quite a few messages in our log files with "automatic aggressive vacuum" What is this?
Regular vacuum is intended to clean up old/dead rows which are no longer needed by any active transactions. Therefore it will skip pages with no dead tuples.
For example,
Lets keep it simple for this explanation. We only two users on the database right now. Tom and Jerry.
Tom is running a long query in transaction 100 while Jerry is in transaction 105 updating the table Tom is using in his query. That being the case, Tom must have visibility into the rows with transaction id of 100 or less. Otherwise, he would get bogus data.
Now Tom finishes his query and starts another query in a new transaction. This time Tom's transaction is 110. This means that Tom no longer needs to see rows that have transaction id of 105 or less. They are no longer needed, are marked as dead ☠️ and are ready to be vacuumed.
Now, a frozen tuple is a row marked as frozen to prevent wrap around txid. This means the row is visible to ALL transactions regardless of transaction id. These are typically set during vacuum/autovaccuum times. If vacuum identifies the tuple as being old enough it will be frozen. The next time it gets updated, that frozen status will be reomoved.
Regular vacuuming will typically skip pages that have no dead rows. This means that some old rows may not be frozen because the page they are in has been skipped by the vacuum process. When this occurs often enough, vacuum will perform an aggressive vacuum. This will find and freeze the unfrozen rows by vacuuming all pages not just the ones where dead tuple are.
When will aggressive vacuum happen ?
To answer this, the vacuum_freeze_table_age parameter along with relfrozenxid in pg_class for a table will determine when the aggressive vacuum takes place. If the relfrozenxid in pg_class reaches the value for vacuum_freeze_table_age, the aggressive vacuum is started.
The default value for vacuum_freeze_table_age is set to 150,000,000.
To illustrate this, lets look at this server.
postgressolutions=# show vacuum_freeze_table_age;
vacuum_freeze_table_age
-------------------------
150000000
(1 row)
On this server the table user_activity has the following details for it in pg_class.
postgressolutions=# select * from pg_class where relname = 'user_activity';
-[ RECORD 1 ]-------+--------------
oid | 180328
relname | user_activity
relnamespace | 2200
reltype | 180330
reloftype | 0
relowner | 17722
relam | 2
relfilenode | 1412009
reltablespace | 0
relpages | 5589
reltuples | 523545
relallvisible | 5589
reltoastrelid | 180331
relhasindex | t
relisshared | f
relpersistence | p
relkind | r
relnatts | 5
relchecks | 0
relhasrules | f
relhastriggers | f
relhassubclass | f
relrowsecurity | f
relforcerowsecurity | f
relispopulated | t
relreplident | d
relispartition | f
relrewrite | 0
relfrozenxid | 89713979
relminmxid | 1
relacl |
reloptions |
relpartbound |
Notice the value for relfrozenxid which is 89,713,979.
When that value reaches 150,000,000 an aggressive vacuum will take place. This means that the vacuum will hit every page that may contain unfrozen tuples and not just the ones containing dead rows.
Increasing this will reduce the number of aggressive vacuums taking place.
Wow. This is awesome! Thanks. Pretty much what I was looking for.
Well, I just so happen there is one here that adresses your query :)
Have a look at
https://github.com/jtorral/DockerPgHa
It spins up postgres, patroni and pgbackrest.
The best feature is the genCompose script which creates a docker-compose file for you based on your supplied input.
This repo needs some enhancements but it is a good start.