Show which tables are closest to transaction id wraparound

29 Sep 2015

   select scma.nspname as scma,
          tbl.relname as tbl,
          ((select setting from pg_settings where name = 'autovacuum_freeze_max_age')::bigint - age(tbl.relfrozenxid)) as tx_until_forced_autovacuum
     from pg_class as tbl
left join pg_namespace scma on scma.oid = tbl.relnamespace
    where scma.nspname not in ('pg_catalog', 'information_schema')
      and scma.nspname not like 'pg_temp_%'
      and tbl.relkind = 'r'
      and ((select setting from pg_settings where name = 'autovacuum_freeze_max_age')::bigint - age(tbl.relfrozenxid)) < 500000000
 order by tx_until_forced_autovacuum asc;

Details

https://www.postgresql.org/docs/current/runtime-config-autovacuum.html:
"autovacuum_freeze_max_age (integer)
"Specifies the maximum age (in transactions) that a table's pg_class.relfrozenxid field can attain before a VACUUM operation is forced to prevent transaction ID wraparound within the table. Note that the system will launch autovacuum processes to prevent wraparound even when autovacuum is otherwise disabled.
"Vacuum also allows removal of old files from the pg_xact subdirectory, which is why the default is a relatively low 200 million transactions. This parameter can only be set at server start, but the setting can be reduced for individual tables by changing table storage parameters. For more information see Section 24.1.5."
https://www.postgresql.org/docs/current/catalog-pg-class.html:
"relfrozenxid xid
"All transaction IDs before this one have been replaced with a permanent (“frozen”) transaction ID in this table. This is used to track whether the table needs to be vacuumed in order to prevent transaction ID wraparound or to allow pg_xact to be shrunk. Zero (InvalidTransactionId) if the relation is not a table."

Let's create a brand new table:

create table foo();
commit;

If you run this repeatedly, you will see that the number of transactions until transaction id wraparound starts to shrink, one transaction at a time:

   select scma.nspname as scma,
          tbl.relname as tbl,
          (select setting from pg_settings where name = 'autovacuum_freeze_max_age')::bigint as autovacuum_freeze_max_age,
          txid_current() as txid_current,
          tbl.relfrozenxid as tbl_relfrozenxid,
          age(tbl.relfrozenxid) as age_tbl_relfrozenxid,
          ((select setting from pg_settings where name = 'autovacuum_freeze_max_age')::bigint - age(tbl.relfrozenxid)) as tx_until_forced_autovacuum
     from pg_class as tbl
     join pg_namespace scma on scma.oid = tbl.relnamespace
    where scma.nspname not in ('pg_catalog', 'information_schema')
      and scma.nspname not like 'pg_temp_%'
      and tbl.relkind = 'r'
 order by tx_until_forced_autovacuum asc;
rollback;

Run above query:

┌────────┬─────┬───────────────────────────┬──────────────┬──────────────────┬──────────────────────┬────────────────────────────┐
│  scma  │ tbl │ autovacuum_freeze_max_age │ txid_current │ tbl_relfrozenxid │ age_tbl_relfrozenxid │ tx_until_forced_autovacuum │
├────────┼─────┼───────────────────────────┼──────────────┼──────────────────┼──────────────────────┼────────────────────────────┤
│ public │ foo │                 200000000 │          509 │              508 │                    1 │                  199999999 │
└────────┴─────┴───────────────────────────┴──────────────┴──────────────────┴──────────────────────┴────────────────────────────┘
(1 row)

Run above query again:

┌────────┬─────┬───────────────────────────┬──────────────┬──────────────────┬──────────────────────┬────────────────────────────┐
│  scma  │ tbl │ autovacuum_freeze_max_age │ txid_current │ tbl_relfrozenxid │ age_tbl_relfrozenxid │ tx_until_forced_autovacuum │
├────────┼─────┼───────────────────────────┼──────────────┼──────────────────┼──────────────────────┼────────────────────────────┤
│ public │ foo │                 200000000 │          510 │              508 │                    2 │                  199999998 │
└────────┴─────┴───────────────────────────┴──────────────┴──────────────────┴──────────────────────┴────────────────────────────┘
(1 row)

Run above query a third time:

┌────────┬─────┬───────────────────────────┬──────────────┬──────────────────┬──────────────────────┬────────────────────────────┐
│  scma  │ tbl │ autovacuum_freeze_max_age │ txid_current │ tbl_relfrozenxid │ age_tbl_relfrozenxid │ tx_until_forced_autovacuum │
├────────┼─────┼───────────────────────────┼──────────────┼──────────────────┼──────────────────────┼────────────────────────────┤
│ public │ foo │                 200000000 │          511 │              508 │                    3 │                  199999997 │
└────────┴─────┴───────────────────────────┴──────────────┴──────────────────┴──────────────────────┴────────────────────────────┘
(1 row)

BEFORE running above query, run this:

vacuum freeze foo;

Run above query a fourth time:

┌────────┬─────┬───────────────────────────┬──────────────┬──────────────────┬──────────────────────┬────────────────────────────┐
│  scma  │ tbl │ autovacuum_freeze_max_age │ txid_current │ tbl_relfrozenxid │ age_tbl_relfrozenxid │ tx_until_forced_autovacuum │
├────────┼─────┼───────────────────────────┼──────────────┼──────────────────┼──────────────────────┼────────────────────────────┤
│ public │ foo │                 200000000 │          512 │              512 │                    0 │                  200000000 │
└────────┴─────┴───────────────────────────┴──────────────┴──────────────────┴──────────────────────┴────────────────────────────┘
(1 row)

As you can see, the age() function, when fed an xid, is just giving you the number of transactions between the xid and the current transaction id.

(See also https://github.com/postgres/postgres/blob/REL_11_2/src/backend/utils/adt/xid.c#L99 )

So basically, this query can be used to look at which tables are getting closest to forced autovacuum:

   select scma.nspname as scma,
          tbl.relname as tbl,
          ((select setting from pg_settings where name = 'autovacuum_freeze_max_age')::bigint - age(tbl.relfrozenxid)) as tx_until_forced_autovacuum
     from pg_class as tbl
     join pg_namespace scma on scma.oid = tbl.relnamespace
    where scma.nspname not in ('pg_catalog', 'information_schema')
      and scma.nspname not like 'pg_temp_%'
      and tbl.relkind = 'r'
 order by tx_until_forced_autovacuum asc;
rollback;

┌────────┬─────┬────────────────────────────┐
│  scma  │ tbl │ tx_until_forced_autovacuum │
├────────┼─────┼────────────────────────────┤
│ public │ foo │                  199999996 │
└────────┴─────┴────────────────────────────┘

and we can just set a threshold (using a where clause) where only tables whose tx_until_forced_autovacuum is less than a threshold are shown.

How should we determine the threshold itself? Perhaps we should sample txid_current() at regular intervals to determine how many transactions per hour are being used for a particular pg cluster, so that we know how quickly we need to worry about approaching transaction id wraparound.