PostgreSQL Stuff

Recommended .psqlrc

-- Print date on startup
\echo `date  +"%Y-%m-%d %H:%M:%S"`

-- Set client encoding to SQL_ASCII (to match what is on the server)
\encoding SQL_ASCII

-- Do NOT automatically commit after every statement!
\set AUTOCOMMIT off

-- Be verbose about feedback
\set VERBOSITY verbose

-- [user]@[host]:[port]/[db]['*' if we are in a transaction]['#' if we are root-like; '>' otherwise]
\set PROMPT1 '%n@%m:%>/%/%x%# '

-- Prevent Ctrl-D from exiting psql.
\set IGNOREEOF 5

-- Make history ignore all lines entered that were preceded by spaces, and ignore any entries that matched the previous line entered.
\set HISTCONTROL ignoreboth

-- Keep a different history file for each database name you log on to.
\set HISTFILE ~/.psql_history- :DBNAME

-- Keep a history of the last 2000 commands.
\set HISTSIZE 2000

-- Instead of displaying nulls as blank space, which look the same as empty strings (but are not the same!), show nulls as [NULL].
\pset null '[NULL]'

-- Show pretty unicode lines between rows and columns in select results.
\pset linestyle unicode

-- Show pretty lines around the outside of select results.
\pset border 2

-- Turn off the pager so that results just keep scrolling by, rather than stopping.
\pset pager off

-- Within columns, wrap long lines so that select results still fit on the display.
\pset format wrapped

-- Show how long it takes to run each query.
\timing

-- Show the application_name in pg_stat_activity.
-- Good database citizens set this field so we know who to blame when a query hogs resources,
-- or somebody stays idle in transaction for too long.
set application_name to manni_desktop; commit;

-- Set bytea output to show as many ASCII letters as possible.
-- (Handy if you are storing text whose encoding you do not know in bytea columns.)
set bytea_output to escape; commit;
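
One caveat: everything in this .psqlrc also applies to scripted psql runs, where the startup banner and pretty borders can garble machine-read output. psql's -X (--no-psqlrc) flag skips the .psqlrc for those cases; for example:

psql -X -A -t -c 'select count(*) from my_table' mydatabase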

Set up Linux kernel parameters for PostgreSQL

Run this bash script:

#!/bin/bash

set -e
set -u

SYSCTL=/sbin/sysctl

echo "# add the output of this script to /etc/sysctl.conf,"
echo "# and then, as root, run"
echo
echo "# sysctl -p /etc/sysctl.conf"
echo
echo "# to load change the kernel settings for these parameters."
echo

PAGE_SIZE=$(getconf PAGE_SIZE)

echo "# page size is: $PAGE_SIZE"

NUM_PHYS_PAGES=$(getconf _PHYS_PAGES)

echo "# number of physical pages on this box: $NUM_PHYS_PAGES"

CURR_SHMALL=$($SYSCTL -n kernel.shmall)
PREF_SHMALL=$(expr $NUM_PHYS_PAGES / 2)

echo "# kernel.shmall should be half of the number of pages. Current kernel.shmall, in pages, is: $CURR_SHMALL"
echo "# kernel.shmall should be:"
echo
echo "kernel.shmall = $PREF_SHMALL"
echo

CURR_SHMMAX=$($SYSCTL -n kernel.shmmax)
PREF_SHMMAX=$(expr $PREF_SHMALL \* $PAGE_SIZE)

echo "# kernel.shmmax should be half of available RAM, in kB. Current kernel.shmmax, in kB, is: $CURR_SHMMAX"
echo "# kernel.shmmax should be:"
echo
echo "kernel.shmmax = $PREF_SHMMAX"
echo

# CURR_SHMMIN=$($SYSCTL -n kernel.shmmin)  # XXX: does not exist on linux
# CURR_SHMSEG=$($SYSCTL -n kernel.shmseg)  # XXX: does not exist on linux

CURR_SHMMNI=$($SYSCTL -n kernel.shmmni)

echo "# kernel.shmmni is usually set to a sane amount on Linux. Currently, it is: $CURR_SHMMNI"

# CURR_SEMMNI=$($SYSCTL -n kernel.semmni)  # XXX: does not exist on linux
# CURR_SEMMNS=$($SYSCTL -n kernel.semmns)  # XXX: does not exist on linux
# CURR_SEMMSL=$($SYSCTL -n kernel.semmsl)  # XXX: does not exist on linux
# CURR_SEMMAP=$($SYSCTL -n kernel.semmap)  # XXX: does not exist on linux
# CURR_SEMVMX=$($SYSCTL -n kernel.semvmx)  # XXX: does not exist on linux

CURR_SEM=$($SYSCTL -n kernel.sem)

echo "# kernel.sem usually has sane defauls. They are currently: $CURR_SEM"

Initialise a new database cluster

I create new db clusters just infrequently enough that I never remember the flags I use when creating the cluster. In this example, I create a database using the SQL_ASCII character encoding (which is sort of the non-character-encoding character encoding, as opposed to, let's say, UTF8 or LATIN1) and collating/sorting based on the C locale, which is the "no locale" (as opposed to, let's say, en_US or fr_CA):

initdb \
    --pgdata=/u1/pg/data \
    --encoding=SQL_ASCII \
    --locale=C
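
To double-check the result afterwards, the encoding and locale of every database can be read back out of pg_database:

select datname,
       pg_encoding_to_char(encoding) as encoding,
       datcollate,
       datctype
  from pg_database;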

Typical changes to postgresql.conf for large database installations

If you run a PostgreSQL database with few connections but large jobs, here are some recommended settings for postgresql.conf. Assume a server with 75GB of RAM.

listen_addresses='localhost'  ==>  listen_addresses='*'

# according to http://rhaas.blogspot.com/2012/03/tuning-sharedbuffers-and-walbuffers.html,
# shared_buffers should be 25% of RAM, but no higher than 8GB.

shared_buffers = 32MB  ==>  shared_buffers = 8GB

temp_buffers = 8MB  ==>  temp_buffers = 80MB

work_mem = 1MB  ==>  work_mem = 1GB  # set higher on a per-session basis

maintenance_work_mem = 16MB  ==>  maintenance_work_mem = 20GB  # set quite high; used to build indexes and do data loads

# according to http://rhaas.blogspot.com/2012/03/tuning-sharedbuffers-and-walbuffers.html,
# wal_buffers should be 3% of shared_buffers up to a maximum of 16MB, the size of a single WAL segment.

wal_buffers = 64kB  ==>  wal_buffers = 16MB

# according to https://www.packtpub.com/article/server-configuration-tuning-postgresql,
# checkpoint segments should be much higher than the default of 3:

checkpoint_segments = 3  ==>  checkpoint_segments = 64

# if you are on RAID 10, the cost of accessing a random page should be much closer to the cost
# of sequential IO. (In fact, if you are on SSDs, seq_page_cost and random_page_cost should equal each other.)

random_page_cost = 4.0  ==>  random_page_cost = 2.0

# according to http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server, 
# "Setting effective_cache_size to 1/2 of total memory would be a normal conservative setting, 
# and 3/4 of memory is a more aggressive but still reasonable amount."

effective_cache_size = 128MB  ==>  effective_cache_size = 37GB

# settings for logging:
log_destination = 'stderr'
logging_collector = on
log_directory = 'pg_log'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 0
log_min_messages = debug1  # obviously rather verbose; drop to warning if your logs fill up
log_line_prefix = '%t'
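
After a reload or restart, it is worth confirming which values actually took effect; pg_settings shows the live value and where each setting came from:

select name, setting, unit, source
  from pg_settings
 where name in ('shared_buffers', 'work_mem', 'maintenance_work_mem',
                'wal_buffers', 'effective_cache_size', 'random_page_cost');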

Create a user who can, in turn, create databases and other users

Instead of using the postgres user to create and destroy databases and users, you can create a user for that task instead:

$ sudo su - postgres
$ psql template1
template1=# create user myuser superuser createdb createrole password 'mypassword';
template1=# \q
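
You can confirm the new user's privileges from pg_roles:

select rolname, rolsuper, rolcreatedb, rolcreaterole
  from pg_roles
 where rolname = 'myuser';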

Create a new database

I create new dbs just infrequently enough that I never remember the flags I use when creating the db. In this example, I create a database using the SQL_ASCII character encoding (which is sort of the non-character-encoding character encoding, as opposed to, let's say, UTF8 or LATIN1) and collating/sorting based on the C locale, which is the "no locale" (as opposed to, let's say, en_US or fr_CA):

# --locale is only available in PostgreSQL 9.0 and above
createdb \
    -h myhost \
    -U myuser \
    --echo \
    --owner=myuser \
    --locale=C \
    --encoding=SQL_ASCII \
    my_new_database

Or, from the db:

$ sudo su - postgres
$ psql template1
template1=# create database my_new_database with owner = myuser template = template0 
            encoding = 'SQL_ASCII' lc_collate = 'C' lc_ctype = 'C';
template1=# \q

database startup script for /etc/init.d

#!/bin/bash

set -e
set -u

# chkconfig: 2345 98 02
# description: PostgreSQL RDBMS

# where to find commands like su, echo, etc...
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin

DB_ENCODING=SQL_ASCII

DB_LOCALE=C

PG_INSTALL_DIR=/u99/local/pgsql-9.0

PG_DATA_DIR="${PG_INSTALL_DIR}/data"

PG_SERVER_LOG="$PG_DATA_DIR/serverlog"

PG_UNIX_USER=postgres

POSTGRES="$PG_INSTALL_DIR/bin/postgres"

PG_CTL="$PG_INSTALL_DIR/bin/pg_ctl"

INITDB="$PG_INSTALL_DIR/bin/initdb"

if [ $# -ne 1 ]; then
    echo "please enter start/stop/restart etc..." 1>&2
    exit 1
fi

# Only start if we can find postgres and pg_ctl.
if [ ! -x "$PG_CTL" ]; then
    echo "$PG_CTL not found" 1>&2
    exit 1
fi

if [ ! -x "$POSTGRES" ]; then
    echo "$POSTGRES not found" 1>&2
    exit 1
fi

case $1 in
  init)
	su - $PG_UNIX_USER -c "$INITDB --pgdata='$PG_DATA_DIR' --encoding=$DB_ENCODING --locale=$DB_LOCALE"
	;;
  start)
	echo -n "Starting PostgreSQL: "
	su - $PG_UNIX_USER -c "$PG_CTL start -D '$PG_DATA_DIR' -l $PG_SERVER_LOG"
	echo "ok"
	;;
  stop)
	echo -n "Stopping PostgreSQL: "
	su - $PG_UNIX_USER -c "$PG_CTL stop -D '$PG_DATA_DIR' -s -m fast"
	echo "ok"
	;;
  restart)
	echo -n "Restarting PostgreSQL: "
	su - $PG_UNIX_USER -c "$PG_CTL stop -D '$PG_DATA_DIR' -s -m fast -w"
	su - $PG_UNIX_USER -c "$PG_CTL start -D '$PG_DATA_DIR' -l $PG_SERVER_LOG"
	echo "ok"
	;;
  reload)
        echo -n "Reload PostgreSQL: "
        su - $PG_UNIX_USER -c "$PG_CTL reload -D '$PG_DATA_DIR' -s"
        echo "ok"
        ;;
  status)
	su - $PG_UNIX_USER -c "$PG_CTL status -D '$PG_DATA_DIR'"
	;;
  *)
	# Print help
	echo "Usage: $0 {start|stop|restart|reload|status}" 1>&2
	exit 1
	;;
esac

exit 0

copy a database

When you don't specify --template for createdb, the template1 database gets used. However, a fun trick to copy a database is to just specify the database you want copied using the --template flag. (Note that the copy will fail if anyone else is connected to the source database, so make sure it is idle first.)

createdb \
    -h myhost \
    -U myuser \
    --echo \
    --owner=myuser \
    --template=my_old_database \
    my_new_database
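
The equivalent from within psql is just a create database with a template (note that create database cannot run inside a transaction block, and either way the copy will fail if anyone is still connected to my_old_database):

create database my_new_database with owner = myuser template = my_old_database;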

Get sizes of objects using built-in functions

Get the size of a database:

select pg_size_pretty(pg_database_size('my_db'));

Get the size of a table (but not its toast tables or indices):

select pg_size_pretty(pg_relation_size('my_table'));

Get the size of a table and its toast tables (but not its indices):

select pg_size_pretty(pg_table_size('my_table'));

Get the size of a table and its toast tables and its indices:

select pg_size_pretty(pg_total_relation_size('my_table'));
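
Combining these with the catalogue, a sketch to list the ten biggest tables in the current database:

  select n.nspname as "schema",
         c.relname as "table",
         pg_size_pretty(pg_total_relation_size(c.oid)) as total_size
    from pg_class as c
    join pg_namespace as n on c.relnamespace = n.oid
   where c.relkind = 'r'
     and n.nspname not in ('pg_catalog', 'information_schema')
order by pg_total_relation_size(c.oid) desc
   limit 10;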

Get table row counts quickly

30 Jul 2015

These row counts won't be as exact as

select count(*) from my_table;

but they will run a lot quicker.

select n_live_tup as rough_row_count 
  from pg_stat_user_tables 
 where schemaname = 'public' 
   and relname = 'my_table';

or

select t.reltuples as rough_row_count 
  from pg_class as t 
  join pg_namespace as s on t.relnamespace = s.oid 
 where s.nspname = 'public' 
   and t.relname = 'my_table';
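
Note that reltuples is only refreshed by vacuum and analyze, so if the estimate looks stale, analyze the table first:

analyze my_table;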

Force use of indexes by query planner

(Strictly speaking, this does not force index use; it makes sequential scans so expensive that the planner avoids them whenever it can. Remember to set it back to on when you are done.)

set enable_seqscan = off;
commit;

Set a sequence to the max value of a table column

select setval('my_sequence', max(my_column)) from my_table;
commit;
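
One caveat: if the table is empty, max(my_column) is null and the sequence will not be set to anything useful. A sketch that handles the empty-table case using setval's three-argument form (is_called = false makes the next nextval return the value itself):

select setval('my_sequence',
              coalesce(max(my_column), 1),
              max(my_column) is not null)
  from my_table;
commit;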

Set work_mem higher for this session

If you are going to build a large index or run some other large query that can benefit from access to more RAM, set work_mem larger for your session:

set work_mem = '56GB';
commit;
-- Do your work here.
-- Then be a good neighbour.
set work_mem = default;
commit;
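
If you are worried about forgetting to be a good neighbour, set local reverts the setting automatically at the end of the transaction:

begin;
set local work_mem = '56GB';
-- Do your work here.
commit;
-- work_mem is back to its previous value here.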

Get a partial database dump that can be restored to a different user

How to dump a PostgreSQL database in such a way that restoring to a slightly different version of PostgreSQL on a different machine using perhaps a different user will not be a problem:

pg_dump \
    --file=/path/to/dump/file.dump \
    --format=custom \
    --ignore-version \
    --schema=public \
    --schema=otherschema \
    --no-owner \
    --verbose \
    --no-privileges \
    -U user -h myhost mydatabase

How to dump only particular tables from particular schemas

pg_dump \
    --file=/path/to/dump/file.dump \
    --format=custom \
    --ignore-version \
    --table=myschema.mytable \
    --table=myschema.myothertable \
    --table=myotherschema.mythirdtable \
    --no-owner \
    --verbose \
    --no-privileges \
    -U user -h myhost mydatabase

How to restore a dump flexibly

Restore a dump, not caring whether the user matches or whether the exact point version of PostgreSQL matches.

pg_restore \
    --format=custom \
    --ignore-version \
    --no-owner \
    --no-privileges \
    --verbose \
    -U user -h myhost -d mydatabase \
    /path/to/dump/file.dump

How to restore only some items from a dump file

If you have a dump file that has more stuff in it than you want (for example, commands to build indexes that you don't want to build at restore time), you can do this:

First, get a list of all the items in the dump file, directing that list to a text file.

pg_restore \
    -l /path/to/dump/file.dump \
    > /path/to/dump.list

Now edit dump.list, removing anything you do not want to restore (particularly indexes).
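
Entries can also be disabled by prefixing them with a semicolon rather than deleting them. A fragment of an edited list might look like this (the numbers here are made up); the INDEX line is commented out, so it will not be restored:

282; 1259 16507 TABLE public mytable myuser
;283; 1259 16513 INDEX public mytable_mycol_idx myuser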

Now feed that edited list to pg_restore.

pg_restore \
    --format=custom \
    --ignore-version \
    --no-owner \
    --use-list /path/to/dump.list \
    --verbose \
    --no-privileges \
    -U user -h myhost -d mydatabase \
    /path/to/dump/file.dump

How to list all of a table's indices

Remember, the easy way, when you are in psql, is to just do

\d mytable

and look for the "Indexes" portion of the table description. Otherwise...

  select t.relname as table_name,
         i.relname as index_name
    from pg_class as i
    join pg_index as idxjoin
      on i.oid = idxjoin.indexrelid
    join pg_class as t
      on idxjoin.indrelid = t.oid
   where t.relname = 'my_table'
     and i.relkind = 'i'
order by t.relname,
         i.relname;

How to drop all of a table's indices

create or replace function drop_indexes_on_table(a_schema text, a_table text)
returns int
language plpgsql
as $$
declare
    sql_to_run text;
    idx_rec record;
    num_idxs_dropped int;
begin
    num_idxs_dropped := 0;

    raise notice 'looking for indexes for table %.%', a_schema, a_table;

    for idx_rec in
    select idx_info.relname as index_name
      from pg_index as idx
      join pg_class as tbl
        on tbl.oid = idx.indrelid
      join pg_namespace as schm
        on tbl.relnamespace = schm.oid
      join pg_class as idx_info
        on idx.indexrelid = idx_info.oid
     where schm.nspname = a_schema
       and tbl.relname = a_table
    loop
        num_idxs_dropped := num_idxs_dropped + 1;
        raise notice 'about to drop index % for table %.%', idx_rec.index_name, a_schema, a_table;
        sql_to_run := 'drop index if exists ' || quote_ident(a_schema) || '.' || quote_ident(idx_rec.index_name);
        execute sql_to_run;
    end loop;
    return num_idxs_dropped;
end;
$$;
commit;
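
For example, to drop every index on public.my_table (the function returns the number of indexes it dropped):

select drop_indexes_on_table('public', 'my_table');
commit;

Note that an index backing a primary key or unique constraint cannot be dropped with drop index; drop the constraint instead.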

Get the size of every relation in a database

(This uses relpages, which is only refreshed by vacuum and analyze, and assumes the default 8kB page size.)

  select relname as relname, 
         pg_size_pretty(sum(relpages) * 8192) as size
    from pg_class
group by relname
order by sum(relpages) desc;

Find all tables with a certain column name

    select n.nspname, 
           c.relname, 
           c.relnamespace, 
           c.relkind, 
           a.attname, 
           a.attrelid 
      from pg_attribute as a 
      join pg_catalog.pg_class as c 
        on a.attrelid = c.oid 
      join pg_catalog.pg_namespace as n 
        on c.relnamespace = n.oid 
     where a.attname = 'my_column' 
       and n.nspname = 'my_schema' 
       and c.relkind = 'r';

Get the columns that an index operates on

First get the oid of the table, the oid of the indexes, and the column numbers used by those indexes. The column numbers will be something like "1 3 4", which means the first, third, and fourth columns of the searched-for table are used by the index. (Usually, there's just one number because many indexes are made for one column.)

select tbl.oid as table_oid,
       idx.indexrelid as index_oid,
       idx.indkey as column_numbers,
       idx_info.relname as index_name
  from pg_index as idx
  join pg_class as tbl on tbl.oid = idx.indrelid
  join pg_namespace as schm on tbl.relnamespace = schm.oid 
  join pg_class as idx_info on idx.indexrelid = idx_info.oid
 where schm.nspname = 'my_schema'
   and tbl.relname = 'my_table';


 table_oid | index_oid | column_numbers |           index_name            
-----------+-----------+----------------+---------------------------------
  48645160 | 215366881 | 3              | my_table_my_col_idx
  48645160 |  48877631 | 1              | my_table_my_other_col_idx

Now you can get the names of the columns operated on by each index. For each row (representing one index) from the above query, split idx.indkey by space into column numbers. For each column number, run the following query to get the column name.

select c.attname as column_name
  from pg_attribute as c
 where c.attrelid = $table_oid  -- from previous query
   and c.attnum = $column_number;  -- from previous query
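
If you do not care about the column order within multi-column indexes, a sketch that collapses the two steps into a single query (attnum = any(indkey) matches columns in any order):

select idx_info.relname as index_name,
       col.attname as column_name
  from pg_index as idx
  join pg_class as tbl on tbl.oid = idx.indrelid
  join pg_namespace as schm on tbl.relnamespace = schm.oid
  join pg_class as idx_info on idx.indexrelid = idx_info.oid
  join pg_attribute as col on col.attrelid = tbl.oid
                          and col.attnum = any(idx.indkey)
 where schm.nspname = 'my_schema'
   and tbl.relname = 'my_table';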

Find schemas associated with a particular user

Select the schemas in the current db associated with user myuser (even if there are no tables in the schemas), plus the public schema:

select 'public' as nspname
 union all
select nspname
  from pg_catalog.pg_namespace as nsp
  join pg_catalog.pg_roles as rls
    on nsp.nspowner = rls.oid
 where rls.rolname = 'myuser';

Get IO stats for a particular table

select relname         as "table",
       heap_blks_read  as "heap from disc",
       heap_blks_hit   as "heap from cache",
       idx_blks_read   as "index from disc",
       idx_blks_hit    as "index from cache",
       toast_blks_read as "toast from disc",
       toast_blks_hit  as "toast from cache",
       tidx_blks_read  as "toast index disc",
       tidx_blks_hit   as "toast index cache"
  from pg_statio_user_tables
 where relname = 'my_table';

Read activity stats for a particular table

select relname         as "table",
       seq_scan        as "table scans",
       idx_scan        as "index lookups"
  from pg_stat_user_tables
 where relname = 'my_table';

Read enhanced activity stats for a particular table

select relname         as "table",
       seq_scan        as "table scans",
       seq_tup_read    as "tuples scanned",
       idx_scan        as "index lookups",
       idx_tup_fetch   as "tuples fetched via index"
  from pg_stat_user_tables
 where relname = 'my_table';

Find the size of a table

In bytes:

select relpages::bigint * 8192 as size_in_bytes
  from pg_class where relname = 'mytable';

In human-readable form:

select pg_size_pretty(relpages::bigint * 8192) as size
  from pg_class where relname = 'mytable';

Find out how many pages and tuples are used by a table

select relname         as "table",
       reltuples       as "number of tuples",
       relpages        as "number of 8kb pages"
  from pg_class
 where relname = 'my_table';

Find live vs. dead tuples of a table

select relname         as "table",
       n_live_tup      as "live tuples",
       n_dead_tup      as "dead tuples"
  from pg_stat_user_tables
 where relname = 'my_table';

Find out when a table was last analyzed or vacuumed

select relname         as "table",
       last_vacuum     as "last manual vacuum",
       last_autovacuum as "last auto vacuum",
       last_analyze    as "last manual analyze",
       last_autoanalyze as "last auto analyze"
  from pg_stat_user_tables
 where relname = 'my_table';

List the connections to your database

select datname,
       procpid,
       usename,
       client_addr,
       client_port
  from pg_catalog.pg_stat_activity;

Kill a user's session based on procpid from pg_stat_activity

select pg_terminate_backend(procpid);

Cancel a user's query based on procpid from pg_stat_activity

select pg_cancel_backend(procpid);
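
For example, to terminate every session that has sat idle in a transaction for more than an hour (using the pre-9.2 procpid and current_query column names, to match the queries above):

select pg_terminate_backend(procpid)
  from pg_catalog.pg_stat_activity
 where current_query = '<IDLE> in transaction'
   and now() - xact_start > interval '1 hour';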

Set up automatic updating of tables with updated columns

Finally: a legitimate use for triggers!

Many of your table definitions likely have these two columns:

create table users (
    name text not null,
-- ... other columns ...
    created timestamp with time zone not null default now(),
    updated timestamp with time zone not null default now());

create table customers (
    name text not null,
-- ... other columns ...
    created timestamp with time zone not null default now(),
    updated timestamp with time zone not null default now());

But how do you ensure that every update to a row in these tables actually updates the "updated" column? A trigger, that's how:

-- Define this stored procedure once

create or replace function set_updated_column_to_now()
returns trigger as
$body$
begin
    NEW.updated = now();
    return NEW;  -- allow the row through for the actual update
end;
$body$
language plpgsql
volatile;

-- Define a trigger for each table that has an updated column

-- (The PostgreSQL docs say that multiple triggers on the same
-- table are executed in alphabetical order, so these are named
-- with a leading aaa_ on the assumption that we want them to
-- set the updated column before other triggers operate on the
-- row.)

create trigger aaa_set_users_updated_column_to_now
before update
on users
for each row
when (NEW.updated = OLD.updated)
execute procedure set_updated_column_to_now();

create trigger aaa_set_customers_updated_column_to_now
before update
on customers
for each row
when (NEW.updated = OLD.updated)
execute procedure set_updated_column_to_now();
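
A quick way to convince yourself the trigger works, using the users table from above:

insert into users (name) values ('alice');
commit;

update users set name = 'alice' where name = 'alice';
commit;

select name, created, updated from users;  -- updated is now later than created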

A function to return the extension of a filename

create or replace function extract_file_extension(filename text) returns text as $$
declare
    extension text;
begin
    select regexp_replace(filename, E'^.*\\.', '') into extension;
    if extension = filename then
        return '';
    else
        return extension;
    end if;
end;
$$ language plpgsql;
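
For example:

select extract_file_extension('baz-1.2.0.jar');  -- jar
select extract_file_extension('foo');            -- (empty string)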

Something you might want to turn on before exporting floats with the copy command:

Why is this not the default? I do not know...

set extra_float_digits to 3; commit;

Regexp to get file extension

Surprising, the number of times I seem to do this:

# insert into t (filename) values ('blah.jar'), ('foo'), ('baz-1.2.0.jar'); commit;
INSERT 0 3
COMMIT
# select filename, case when regexp_replace(filename, E'^.*\\.', '') = filename then '' else regexp_replace(filename, E'^.*\\.', '') end as ext from t; rollback;
┌───────────────┬─────┐
│   filename    │ ext │
├───────────────┼─────┤
│ blah.jar      │ jar │
│ foo           │     │
│ baz-1.2.0.jar │ jar │
└───────────────┴─────┘

maintenance_work_mem to help vacuuming

Jeff Janes posted this to the PostgreSQL general mailing list, and I did not want to forget it:

"Vacuum can only memorize one dead tuple for every 6 bytes of maintenance_work_mem. If there are more dead tuples than that, it needs to make multiple passes over the indexes.

"Increase maintenance_work_mem to improve the efficiency."

fizzbuzz in PostgreSQL

A friend of mine at work said "...because you wouldn't exactly write fizzbuzz in SQL." So then, of course, we had to:

select generate_series as num, 
       case when (generate_series % 3 = 0 and generate_series % 5 = 0) 
                then 'fizzbuzz' 
            when generate_series % 3 = 0 
                then 'fizz' 
            when generate_series % 5 = 0 
                then 'buzz' 
            else '' end 
       as fizzbuzz 
  from generate_series(1, 100);

Hidden columns in PostgreSQL tables

See http://www.postgresql.org/docs/9.4/static/ddl-system-columns.html for more details, but instead of selecting * from your table, do this to get all of the hidden columns:

select tableoid, cmin, cmax, xmin, xmax, ctid, * from mytable;

Log long-running statements

See http://www.postgresql.org/docs/9.4/static/runtime-config-logging.html for more details, but the quick solution is to set this in your postgresql.conf:

log_min_duration_statement = 2000  # log statements running longer than 2000ms (2 seconds)
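
On 9.4 and above you can also change this without editing postgresql.conf by hand, using alter system:

alter system set log_min_duration_statement = '2s';
select pg_reload_conf();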

Show blocked locks

See https://wiki.postgresql.org/wiki/Lock_Monitoring for more details, but the quick solution is to create this view and select from it:

create or replace view blocked_locks as
select
distinct blocked_locks.pid                     as blocked_pid,
         blocked_activity.usename              as blocked_user,
         now() - blocked_activity.query_start  as blocked_duration,
         blocked_activity.query                as blocked_statement,
         blocking_locks.pid                    as blocking_pid,
         blocking_activity.usename             as blocking_user,
         blocking_activity.query               as blocking_statement,
         now() - blocking_activity.query_start as blocking_duration
    from pg_locks                as blocked_locks
    join pg_stat_activity        as blocked_activity  
      on blocked_activity.pid = blocked_locks.pid
    join pg_locks                as blocking_locks 
      on blocking_locks.locktype                         = blocked_locks.locktype
     and blocking_locks.database      is not distinct from blocked_locks.database
     and blocking_locks.relation      is not distinct from blocked_locks.relation
     and blocking_locks.page          is not distinct from blocked_locks.page
     and blocking_locks.tuple         is not distinct from blocked_locks.tuple
     and blocking_locks.virtualxid    is not distinct from blocked_locks.virtualxid
     and blocking_locks.transactionid is not distinct from blocked_locks.transactionid
     and blocking_locks.classid       is not distinct from blocked_locks.classid
     and blocking_locks.objid         is not distinct from blocked_locks.objid
     and blocking_locks.objsubid      is not distinct from blocked_locks.objsubid
     and blocking_locks.pid                             != blocked_locks.pid
    join pg_stat_activity        as blocking_activity 
      on blocking_activity.pid = blocking_locks.pid
   where blocked_locks.granted is false;

Show which tables are closest to transaction id wraparound

   select scma.nspname as scma,
          tbl.relname as tbl,
          ((select setting from pg_settings where name = 'autovacuum_freeze_max_age')::bigint - age(tbl.relfrozenxid)) as tx_until_forced_autovacuum
     from pg_class as tbl
left join pg_namespace scma on scma.oid = tbl.relnamespace
    where scma.nspname not in ('pg_catalog', 'information_schema')
      and scma.nspname not like 'pg_temp_%'
      and tbl.relkind = 'r'
      and ((select setting from pg_settings where name = 'autovacuum_freeze_max_age')::bigint - age(tbl.relfrozenxid)) < 500000000
 order by tx_until_forced_autovacuum asc;
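
To pull a table back from the brink before autovacuum is forced to do it for you, freeze it yourself:

vacuum freeze my_schema.my_table;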

Show page cache hits (and misses) in query explain plan

explain (analyze on, buffers on) select * from mytable;

How to clear your screen in a psql session:

\! clear

Sweet!

How to change the default unix socket location from /tmp for PostgreSQL

The short answer is: don't do it. But many distros do, so if you are trying to mimic a distro build, you may have to.

If custom compiling, you can change the default for every binary you build by editing DEFAULT_PGSOCKET_DIR in src/include/pg_config_manual.h.

To change just the server setting at run time, edit postgresql.conf like so:

unix_socket_directory = '/var/run/postgresql'

Calculate the database cache hit ratio:

select round(sum(blks_hit) * 100.0 / nullif(sum(blks_hit + blks_read), 0), 2) as hit_ratio from pg_stat_database;
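
A per-table variant, looking just at heap blocks (nullif avoids division by zero for tables with no recorded IO yet):

select relname,
       round(heap_blks_hit * 100.0 / nullif(heap_blks_hit + heap_blks_read, 0), 2) as heap_hit_ratio
  from pg_statio_user_tables
 order by heap_blks_read desc;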

Flush linux disk cache

For testing purposes, it is sometimes handy to tell linux to flush the disk cache. Here's how:

# as root
sync                               # flush dirty pages to disk first
echo 3 > /proc/sys/vm/drop_caches  # then drop the page, dentry, and inode caches

Note that drop_caches only drops clean (already-written) pages, which is why running sync first matters: it writes the dirty pages out to disk so that they can be dropped too.

Get/Set Write-Through/Write-Back Setting for Your HDD

Gregory Smith's awesome PostgreSQL 9.0 High Performance does a good job of describing how many drives are configured in write-back mode rather than write-through mode. Write-back mode allows the drive to tell the operating system that a write has succeeded once that write has reached the drive's cache, not after the write has been persisted to the actual disk. Write-through, on the other hand, while slower, gives this guarantee.

For most consumer drives, you will find write-back is the default. Here's how to check.

Get a list of your storage devices:

# lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0 223.6G  0 disk 
├─sda1   8:1    0   512M  0 part /boot/efi
├─sda2   8:2    0 207.2G  0 part /
└─sda3   8:3    0  15.9G  0 part [SWAP]

See if write-back is enabled (write-caching is turned on):

# hdparm -W /dev/sda

/dev/sda:
 write-caching =  1 (on)

Yup! Here's how to disable it for guaranteed durability of your writes:

# hdparm -W0 /dev/sda

/dev/sda:
 setting drive write-caching to 0 (off)
 write-caching =  0 (off)

Get row number in select result

select row_number() over (),
       some_col,
       some_other_col
  from my_table
 where col = 'bar';

Arrays and the PostgreSQL Query Planner

Let's create two tables and see how they behave when we join them.

postgres@[local]:5432/postgres# create table t1 (id int, name text);
CREATE TABLE
postgres@[local]:5432/postgres# create table t2 (id int, name text);
CREATE TABLE

postgres@[local]:5432/postgres# insert into t1 select s.i, s.i::text from generate_series(1, 1000) as s(i);
INSERT 0 1000
postgres@[local]:5432/postgres# insert into t2 select s.i, s.i::text from generate_series(1, 1000) as s(i);
INSERT 0 1000

What's the query plan when we join these two tables on their id column?

postgres@[local]:5432/postgres# explain select * from t1 join t2 on t1.id = t2.id; rollback;
┌──────────────────────────────────────────────────────────────────┐
│                            QUERY PLAN                            │
├──────────────────────────────────────────────────────────────────┤
│ Hash Join  (cost=27.50..56.25 rows=1000 width=14)                │
│   Hash Cond: (t1.id = t2.id)                                     │
│   ->  Seq Scan on t1  (cost=0.00..15.00 rows=1000 width=7)       │
│   ->  Hash  (cost=15.00..15.00 rows=1000 width=7)                │
│         ->  Seq Scan on t2  (cost=0.00..15.00 rows=1000 width=7) │
└──────────────────────────────────────────────────────────────────┘
(5 rows)

What's the query plan when we use an in clause instead?

postgres@[local]:5432/postgres# explain select * from t1 where id in (select id from t2); rollback;
┌──────────────────────────────────────────────────────────────────┐
│                            QUERY PLAN                            │
├──────────────────────────────────────────────────────────────────┤
│ Hash Semi Join  (cost=27.50..56.25 rows=1000 width=7)            │
│   Hash Cond: (t1.id = t2.id)                                     │
│   ->  Seq Scan on t1  (cost=0.00..15.00 rows=1000 width=7)       │
│   ->  Hash  (cost=15.00..15.00 rows=1000 width=4)                │
│         ->  Seq Scan on t2  (cost=0.00..15.00 rows=1000 width=4) │
└──────────────────────────────────────────────────────────────────┘
(5 rows)

Same query plan. Nice. So PostgreSQL presumably treats the select statement inside the in clause as something table-ish.

How about when we give a long list in the in clause?

postgres@[local]:5432/postgres# explain select * from t1 where id in
postgres-# (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,
postgres(# 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48,
postgres(# 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71,
postgres(# 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94,
postgres(# 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114,
postgres(# 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133,
postgres(# 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152,
postgres(# 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171,
postgres(# 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190,
postgres(# 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209);
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                QUERY PLAN                                                 │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Seq Scan on t1  (cost=0.00..276.25 rows=209 width=7)                                                      │
│   Filter: (id = ANY ('{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,3…│
│…1,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,6…│
│…6,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,…│
│…101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,1…│
│…27,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,15…│
│…3,154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179…│
│…,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,…│
│…206,207,208,209}'::integer[]))                                                                            │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(2 rows)

OK, so a long literal list just becomes an = ANY(...) filter on a sequential scan.

How about when we make the contents of the in clause table-ish by unnesting them?

postgres@[local]:5432/postgres# explain select * from t1 where id in
postgres-# (select unnest(array[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,
postgres(# 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48,
postgres(# 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71,
postgres(# 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94,
postgres(# 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114,
postgres(# 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133,
postgres(# 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152,
postgres(# 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171,
postgres(# 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190,
postgres(# 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209]));
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                QUERY PLAN                                                 │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Hash Semi Join  (cost=2.76..21.50 rows=100 width=7)                                                       │
│   Hash Cond: (t1.id = (unnest('{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,2…│
│…8,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,6…│
│…3,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,9…│
│…8,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,…│
│…125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,1…│
│…51,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,17…│
│…7,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203…│
│…,204,205,206,207,208,209}'::integer[])))                                                                  │
│   ->  Seq Scan on t1  (cost=0.00..15.00 rows=1000 width=7)                                                │
│   ->  Hash  (cost=1.51..1.51 rows=100 width=4)                                                            │
│         ->  Result  (cost=0.00..0.51 rows=100 width=0)                                                    │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(5 rows)

Ah, this looks more like the query plans we had for joins. Nice.