mighty pigz
I've just googled for "paralell gzip" and found that pigz are there.
Not exactly hot news, but these are really useful and mighty pigz.
Once you start using them, "pigz -p3" becomes as natural as "make -j3".
Living without them for so many years... a shame. I have only one excuse: multi-core machines were not so common when I started my adventure with computing.
Happy pigzing!
Becker & Posner
Na onecie piszą o ciekawym amerykańskim blogu politycznym.
Zajrzałem tam, jest to zbiór zwięzłych i rzeczowych analiz, pisane to jest głównie na gruncie USA ale wygląda interesująco.
Ciekawe czy dorobimy się kiedyś w PL takiej publicystyki.
pgbouncer mini HOWTO + benchmark
pgbouncer is a lightweight connection pooler for PostgreSQL.
I've decided to write this mini howto in order to prove that pgbouncer:
- is easy to install and configure
- is really useful, even in minimal setups (same machine as postgres, 10 clients).
As a prerequisite, we will need:
- postgres up and running (well, that's what pgbouncer was made for).
- standard set of development tools needed to compile C programs (gcc+make+binutils).
I assume you already have these.
My Linux distro is Ubuntu 9.10, kernel/software versions:
filip@srv:~$ uname -r 2.6.27.7-smp filip@srv:~$ pg_config --version PostgreSQL 8.4.3 filip@srv:~$ gcc --version gcc (Ubuntu 4.4.1-4ubuntu9) 4.4.1
Installing pgbouncer
First we need libevent (event notification library). Luckilly, in recent Ubuntu version it's packaged, so let's use it:
filip@srv:~$ apt-cache search libevent libevent-1.4-2 - An asynchronous event notification library libevent-core-1.4-2 - An asynchronous event notification library (core) libevent-dev - Development libraries, header files and docs for libevent (...)
We need libevent-dev package, to have header and include files needed for pgbouncer compilation.
filip@srv:~$ sudo apt-get install libevent-dev (...) done.
OK so now we have libevent installed. Next we go for pgbouncer itself.
It's not packaged for Ubuntu at the moment, so we have to compile it from sources. I chose to install to /usr/local - simply because I had no better idea. YMMV.
Download, unpack and configure:
filip@srv:~/src$ wget http://pgfoundry.org/frs/download.php/2608/pgbouncer-1.3.2.tgz (...) `pgbouncer-1.3.2.tgz' saved [166756/166756] filip@srv:~/src$ tar xzf pgbouncer-1.3.2.tgz filip@srv:~/src$ cd pgbouncer-1.3.2/ filip@srv:~/src/pgbouncer-1.3.2$ ./configure --prefix=/usr/local (...) configure: creating ./config.status config.status: creating config.mak config.status: creating include/config.h
OK it's configured, lets compile and install:
filip@srv:~/src/pgbouncer-1.3.2$ make (...) make[1]: Nothing to be done for `all'. make[1]: Leaving directory `/home/filip/src/pgbouncer-1.3.2/doc' filip@srv:~/src/pgbouncer-1.3.2$ sudo make install (...)
Pgbouncer is now installed.
All in one command, for your convenience:
wget http://pgfoundry.org/frs/download.php/2608/pgbouncer-1.3.2.tgz \ && tar xzf pgbouncer-1.3.2.tgz \ && cd pgbouncer-1.3.2 \ && ./configure --prefix=/usr/local \ && make && sudo make install
Configuring pgbouncer
Now let's create a config file. We do this by copying distributed ini file sample to /etc:
filip@srv:~$ sudo cp /usr/local/share/doc/pgbouncer/pgbouncer.ini /etc/pgbouncer.ini
Next we edit this file and configure needed options.
Most important is pool_mode. I choose transaction pooling mode because it gives most performance boost for normal postgresql usage.
As for connection settings, we have pgbouncer on port 6432, routing connections to port 5432 (my postgres is on same host and standard port).
File paths are adjusted for typical Ubuntu setup. pgbouncer will be run from postgres account, so we just point auth_file to pg_auth.
Here is my pgbouncer.ini file:
[databases] * = port=5432 [pgbouncer] logfile = /var/log/postgresql/pgbouncer.log pidfile = /var/log/postgresql/pgbouncer.pid listen_addr = * listen_port = 6432 unix_socket_dir = /var/run/postgresql auth_type = trust auth_file = /var/lib/postgresql/8.4/main/global/pg_auth admin_users = postgres stats_users = postgres pool_mode = transaction server_reset_query = DISCARD ALL; server_check_query = select 1 server_check_delay = 10 max_client_conn = 1000 default_pool_size = 20 log_connections = 1 log_disconnections = 1 log_pooler_errors = 1
Starting pgbouncer
Now let's start the beast:
filip@srv:~$ sudo su - postgres postgres@srv:~$ pgbouncer -d /etc/pgbouncer.ini 2010-04-23 18:37:05.969 20068 LOG File descriptor limit: 10000 (H:15000), max_client_conn: 1000, max fds possible: 1010
OK, now let's check if this really works. Connect to admin console (virtual db "pgbouncer"):
filip@srv:~$ psql -Upostgres -p6432 pgbouncer
psql (8.4.3, server 8.0/bouncer)
WARNING: psql version 8.4, server version 8.0.
Some psql features might not work.
Type "help" for help.
postgres@pgbouncer=#
It works!.
Benchmarking
Now let's do some testing. We use good'ol' pgbench.
Create and initialize test database with scale 10:
filip@srv:~$ createdb bench filip@srv:~$ /usr/lib/postgresql/8.4/bin/pgbench -i -s 10 bench (...) vacuum...done.
Test with 10 clients, separate connection for each transaction, direct connection to postgres, run test for one minute:
(mandela)filip@ratel:~$ /usr/lib/postgresql/8.4/bin/pgbench -c 10 -C -T 60 bench starting vacuum...end. transaction type: TPC-B (sort of) scaling factor: 10 query mode: simple number of clients: 10 duration: 60 s number of transactions actually processed: 1528 tps = 25.410170 (including connections establishing) tps = 53.186072 (excluding connections establishing)
Now the same test, but connecting via pgbouncer:
(mandela)filip@ratel:~$ /usr/lib/postgresql/8.4/bin/pgbench -c 10 -C -T 60 -p 6432 bench starting vacuum...end. transaction type: TPC-B (sort of) scaling factor: 10 query mode: simple number of clients: 10 duration: 60 s number of transactions actually processed: 2601 tps = 43.308068 (including connections establishing) tps = 55.619391 (excluding connections establishing)
Looks promising.
Now both tests repeated several times, just bare results for brevity:
(direct) tps = 25.583194 (including connections establishing) tps = 55.247968 (excluding connections establishing) (pgbouncer) tps = 51.769025 (including connections establishing) tps = 73.188059 (excluding connections establishing) (direct) tps = 25.857126 (including connections establishing) tps = 64.090508 (excluding connections establishing) (pgbouncer) tps = 61.633963 (including connections establishing) tps = 87.375610 (excluding connections establishing) (direct) tps = 21.134134 (including connections establishing) tps = 50.005559 (excluding connections establishing) (pgbouncer) tps = 50.122482 (including connections establishing) tps = 74.693641 (excluding connections establishing) (direct) tps = 18.925272 (including connections establishing) tps = 49.249117 (excluding connections establishing) (pgbouncer) tps = 63.616117 (including connections establishing) tps = 94.977040 (excluding connections establishing) (direct) tps = 22.444140 (including connections establishing) tps = 43.382705 (excluding connections establishing) (pgbouncer) tps = 68.886017 (including connections establishing) tps = 102.644402 (excluding connections establishing) (direct) tps = 19.979776 (including connections establishing) tps = 52.215144 (excluding connections establishing) (pgbouncer) tps = 57.047613 (including connections establishing) tps = 85.300031 (excluding connections establishing)
Lets make it more readable, and calculate performance gain (pgbench vs direct):
| Test # | direct incl connections | direct excl connections | pgbouncer incl connections | pgbouncer excl connections |
|---|---|---|---|---|
| 1 | 25,41 | 53,19 | 43,31 | 55,62 |
| 2 | 25,58 | 55,25 | 51,77 | 73,19 |
| 3 | 25,86 | 64,09 | 61,63 | 87,38 |
| 4 | 21,13 | 50,01 | 50,12 | 74,69 |
| 5 | 18,93 | 49,25 | 63,62 | 94,98 |
| 6 | 22,44 | 43,38 | 68,89 | 102,64 |
| 7 | 19,98 | 52,22 | 57,05 | 85,3 |
| AVG | 22,76 | 52,48 | 56,63 | 81,97 |
| PGBOUNCER GAIN PERCENT | 148,78% | 56,19% |
We can see that - including time consumed by connections handling - pgbouncer gives about 150% speedup compared to raw postgres.
56,19% is also a very interesting result - this difference probably comes from postgres session initialization, but maybe pgbouncer handles connections/disconnections faster than postgresql itself.
These results are very good but of course they are heavily influenced by pgbench "-C" switch (separate connection for each transaction). Let's see what are the results for pgbench without "-C":
| Test # | direct incl conn | direct excl conn | pgbouncer incl conn | pgbouncer excl conn |
|---|---|---|---|---|
| 1 | 77,64 | 77,78 | 55,67 | 55,7 |
| 2 | 73,16 | 73,38 | 79,59 | 79,6 |
| 3 | 80,91 | 81,03 | 67,45 | 67,46 |
| 4 | 61,8 | 61,94 | 78,97 | 79,02 |
| 5 | 79,45 | 79,57 | 80,66 | 80,7 |
| AVG | 74,59 | 74,74 | 72,47 | 72,5 |
| PGBOUNCER GAIN PERCENT | -2,85% | -3,00% |
As you see, for persistent connections there is no gain - even a small overhead.
BTW, both test were executed like this:
filip@srv:~$ for n in 1 2 3 4 5 ; do echo "$n (direct)"; /usr/lib/postgresql/8.4/bin/pgbench -c 10 -T 60 bench 2>&1 | grep tps; echo "$n (pgbouncer)"; /usr/lib/postgresql/8.4/bin/pgbench -c 10 -T 60 -p 6432 bench 2>&1 | grep tps; done
Note: this was on a very weak machine - desktop class PC from circa 2006.
Conclusion and disclamer
Using pgbench is definitely a good idea if you have clients connecting many times repeatedly.
These result would probably be very similar using any other decent connection pooler.
I am not an expert on TPC-B and I also do not take responsibility for any damage made to your system and/or database by the code written above.
I do not guarantee that you achieve same results - maybe the whole test was crippled and useless.
As always, please let me know if you see any errors / omissions in the article.
dynamic SQL parameters in PL/PgSQL functions
It pays to read documentation.
From PostgreSQL 8.3 to 8.4 there was one interesting addon for plpgsql: ability to pass parameter values directly into EXECUTE.
Of course depesz wrote about it. And I did read it; But still (by routine) I was using something like:
sql_query := 'SELECT foo FROM bar JOIN baz USING (barbaz) WHERE baz.id = ANY (__PARAM__::integer[])';
sql_query := replace(sql_query, '__PARAM__', quote_nullable(my_param);
EXECUTE sql_query INTO my_foos;
While it can be written in more elegant and error-prone way:
EXECUTE 'SELECT foo FROM bar JOIN baz USING (barbaz) WHERE baz.id = ANY ($1)'
INTO my_foos
USING my_param;
postgres schemagrep function
From time to time, I need to "grep" database schema to locate some database objects.
One possibility is to dump the whole schema to file (pg_dump -s DBNAME) and use your text editor to browse it. It is quite handy and I actually use it a lot.
But sometimes, it is more convenient to have this "grep-like" possibility directly from psql.
So... here it is, the "schemagrep" function.
Please note that argument for this function is a regex pattern, so you can look for almost everything.
The search is case-insensitive.
begin;
create or replace function schemagrep_relkind( "char" ) returns text as $$
select case $1
when 'r' then 'TABLE'
when 'i' then 'INDEX'
when 'S' then 'SEQUENCE'
when 'v' then 'VIEW'
when 'c' then 'COMPOSITE TYPE'
when 't' then 'TOAST TABLE'
end
$$ language sql immutable strict;
drop function if exists schemagrep( text );
create function schemagrep( text ) returns table(
schema name,
object_name name,
match_type text,
psql_hint text
) as $$
select x.*, case when x.match_type ~ 'FUNCTION' then E'\\df+ ' when x.match_type ~ 'COMMENT' then E'\\d+ ' else E'\\d ' end || nspname || '.' || relname
from (
select n.nspname, c.relname, schemagrep_relkind(c.relkind) || ' NAME' as match_type
from pg_class c, pg_namespace n where c.relname ~* $1 and c.relnamespace = n.oid
union all
select distinct n.nspname, c.relname, schemagrep_relkind(c.relkind) || ' ATTRIBUTE'
from pg_class c, pg_attribute a, pg_namespace n where a.attrelid = c.oid and c.relkind <> 'v' and a.attname ~* $1 and c.relnamespace = n.oid
union all
select n.nspname, proname, 'FUNCTION DEFINITION'
from pg_proc, pg_namespace n where prosrc ~* $1 and pronamespace = n.oid
union all
select schemaname, viewname, 'VIEW DEFINITION'
from pg_views where definition ~* $1
union all
select n.nspname, relname, 'CLASS CONSTRAINT'
from pg_class c, pg_constraint cx, pg_namespace n where cx.conrelid = c.oid and cx.consrc ~* $1 and relnamespace = n.oid
union all
select distinct n.nspname, c.relname, 'COMMENT'
from pg_class c, pg_description d, pg_namespace n where d.objoid = c.oid and d.description ~* $1 and c.relnamespace = n.oid
) x
$$ language sql stable strict;
grant execute on function schemagrep(text) to public;
commit;
Test it by running
SELECT * FROM schemagrep( 'anything' );
For me, it works like this:
testdb=# select * from schemagrep('account');
schema | object_name | match_type | psql_hint
--------+-----------------------+---------------------+----------------------------------
public | pgbench_accounts | TABLE NAME | \d public.pgbench_accounts
public | pgbench_accounts_pkey | INDEX NAME | \d public.pgbench_accounts_pkey
public | sp_acc_createaccount | FUNCTION DEFINITION | \df+ public.sp_acc_createaccount
(3 rows)
That's it! Of course the function can be extended or modified to your needs, because it's in pure SQL :-)