Postgresql

From NailingJello.com
Revision as of 10:45, 22 May 2013 by Grzegorz (Talk | contribs)

(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search

Postgresql

I have been a postgresql developer since version 7.2 / 7.3. I was involved in projects that used Postgresql in GIS projects, text mining and analysis and eventually running Postgresql in production environments. One of or sites has gotten over half a million page views and nearly 25 thousand unique visitors per day. This page is used to share my experiences as well as setup, scripts, documentation and everything you need to make your postgres deployment a success.

If you have any questions or comments, please give me a shout at greg [at] nailingjello.com. I am happy to help.

Production Setup

Since about 2010 I have been involved in running postgresql in production environment. My group of a very few application and database administrators (7 and growing) is responsible for a large number production services on virtually every OS / Database / Application servers and combination thereof. We run a fairly large shared Oracle environment (about 100 - 120 databases / clients and growing) and shared SQL Server (150 - 175 databases / clients and growing). We are currently (early 2013) setting up a similar scale shared postgresql server. In addition to shared services we run a lot (about a hundred) individual databases often in High Availability configuration.

Operating System Setup

Recommended reading: Postgresql Tunning Guide. Although we run postgresql in a number of operating systems, we've primarily run Postgres in Linux (Ubuntu and Red Hat) and Solaris (X86 and Sparc). The following instructions are for Ubuntu systems, but should work in Red Hat (not tested).

Legend

The following variables are referenced in other code, log files, and recovery procedures below. Original documentation was written for Postgres 9.0.

$PGVERSION = "9.2"
$PHOSTNAME = "primary-db01" # Hostname of the primary database

File Locations

Our standard file locations and what they are used for.

File Component Notes
/etc/postgresql/$PGVERSION/main/postgresql.conf Postgresql Main database configuration file
/etc/postgresql/$PGVERSION/main/pg_hba.conf Postgresql Host Based Access control file
/etc/sysctl.conf Kernel Kernel Parameters
/etc/rsyncd.conf rsyncd Rsync server for log shipping (configured on both, running on standby only)
/etc/init.d/iptables iptables Internal firewall
~postgres/scripts/hotbackup.sh omnipitr Hot backup script - executed nightly. Automatically sends data to standby db
~postgres/scripts/recovery*.conf Postgresql Example recovery files. Used to restore backup to a point in time or setup database standby
~postgres/scripts/monitor_old_transactions.sql Postgresql Executed every 15 minutes to warn when any transaction takes longer than expected.

Log Files

File Component Notes
/var/log/postgresql/postgresql-$PGVERSION-main.log Postgresql Database log file
/var/log/postgresql/omnipitr.log omnipitr Omnipitr log file
/var/log/postgresql/hotbackup omnipitr STDOUT of the hot backup command - normal logging inside omnipitr.log
/var/log/postgresql/postgresql_transactions.log Postgresql logs long running transactions
/var/log/rsync.log rsync DB WAL files are shipped through rsync. Problems may show up here

Main Commands

Command User Component Notes
pg_ctlcluster $PGVERSION main restart root,postgres Postgresql Start, stop, restart or reload database
/etc/init.d/rsync restart root rsync Restart rsync daemon
psql postgres Postgresql DB command line

Database Recovery Options

Recover database from backup - Point in time recovery

sudo su - postgres
pg_ctlcluster $PGVERSION main stop # make sure postgres is shutdown
cd /postgres/db/$PGVERSION
mv main main_old # or delete if not enough space. 
tar xvf /backups/$PHOSTNAME-data-<date>.tar
cp ~/scripts/recovery_pitr.conf main/recovery.conf 
vi main/recovery.conf # set the date you'd like to recover to
pg_ctlcluster $PGVERSION main start # check logs for errors

Recover database from backup - Re-establish standby

Make sure primary database is running and is shipping logs to standby. Make sure there is a backup available with unbroken chain of WAL files up to present time.

sudo su - postgres
pg_ctlcluster $PGVERSION main stop # make sure postgres is shutdown
cd /postgres/db/$PGVERSION
mv main main_old # or delete if not enough space. 
tar xvf /backups/$PHOSTNAME-data-<date>.tar
cp ~/scripts/recovery.conf main/recovery.conf # copy recovery.conf to main DB directory. 
pg_ctlcluster $PGVERSION main start # check logs for errors

Note: It is expected to see the following warning after starting postgresql. The warning is the result of the database starting in "recovery" mode and waiting for the next WAL file to apply from the primary database server.

WARNING: connection to the database failed, disabling startup checks:
psql: FATAL: the database system is starting up

Recover database from backup - Nightly backup

This should only be used if the WAL files are not available or if we do not care about synchronizing time.

sudo su - postgres
pg_ctlcluster $PGVERSION main stop # make sure postgresql is shutdown
cd /postgres/db/$PGVERSION
mv main main_old # or delete if not enough space. 
tar xvf /backups/$PHOSTNAME-data-<date>.tar
tar xvf /backups/$PHOSTNAME-xlog-<date>.tar # make sure the dates match!!!
pg_ctlcluster $PGVERSION main start # check logs for errors

Known Problems and Solutions

Disk is (is getting) full

/postgres/WAL - use find command to find old data and delete it if possible. Check SLA for minimum retention.

/usr/bin/find /postgres/WAL -mtime +3 -name "0*" -exec rm {} \; 

/postgres/db: Main database disk. Requires an outage. Procedure: mount a new larger volume, copy files and remount in the proper place. Possibility: Database growth due to application deadlock below.

/backups/ : Main backup files. Can be used to restore to that night's backup, but without WAL cannot be used for Point in Time recovery. Check SLA for minimum retention.

Database growth due to application deadlock

Content coming soon. Basically need to run a vacuum full.

Database high CPU usage

Content coming soon. Figuring out what is the activity and who is doing what. pg_stat_activity is your friend.