vim as a pager for psql
I asked this question on stackexchange, about a pager with "freeze first line" support.
There was no easy answer, but someone hinted me at Emacs. My Emacs skill is close to zero, so I wanted to check if it's possible in Vim.
And yes it is! Here you are:
First, we need a Vim macro, which will do most of the work, I save it in ~/.vim/plugin/less.vim:
" :Less " turn vim into a pager for psql aligned results fun! Less() set nocompatible set nowrap set scrollopt=hor set scrollbind set number execute 'above split' " resize upper window to one line; two lines are not needed because vim adds separating line execute 'resize 1' " switch to lower window and scroll 2 lines down wincmd j execute 'norm! 2^E' " hide statusline in lower window set laststatus=0 " hide contents of upper statusline. editor note: do not remove trailing spaces in next line! set statusline=\ " arrows do scrolling instead of moving nmap ^[OC zL nmap ^[OB ^E nmap ^[OD zH nmap ^[OA ^Y nmap <Space> <PageDown> " faster quit (I tend to forget about the upper panel) nmap q :qa^M nmap Q :qa^M endfun command! -nargs=0 Less call Less()
Second, to emulate a pager, I need to invoke vim so that it will:
- read standard input
- but if argument is given on command line, read whatever comes there
- work in read-only mode
- skip all init scripts, but instead execute Less macro defined above
I put this together as helper script in ~/bin/vimpager:
#!/bin/bash what=- test "$@" && what="$@" exec vim -u NONE -R -S ~/.vim/plugin/less.vim -c Less $what
Third, I need to override environment variable $PAGER, but only for psql (add this to my ~/.bash_aliases):
if which vimpager &>/dev/null; then alias psql='PAGER=vimpager psql'; fi
Fourth, I disabled "\pset pager always" in my ~/.psqlrc file, because I don't need to invoke vim on small listings.
That's all - and it works for me.
PS. There is "vimpager" here, but mine is much simpler and specific to psql query results.
PostgreSQL 9 Admin Cookbook – review in progress
I was asked to review PostgreSQL 9 Admin Cookbook (by Simon Riggs and Hannu Krosing), and I've just downloaded a copy - await a review in few weeks.
Django and PostgreSQL “Idle In Transaction” Connections
I'm reposting this post (by Christophe Pettus) as a hint for friends running apps on the Django+Pg combo.
Hi there, Sensisoft :-)
Import zbiorów TERYT do bazy danych SQL
Główny Urząd Statystyczny administruje bazą danych podziału administracyjnego Polski, z dokładnością do ulic i nazw miejscowych.
Baza ta nazywa się TERYT i jest udostępniana publicznie w formie plików XML, w serwisie stat.gov.pl.
W tym artykule pokażę jak zaimportować te dane do bazy relacyjnej (na przykładzie PostgreSQL oraz Microsoft SQL Server).
Pobieranie
Najpierw musimy pobrać aktualną wersję plików TERYT. Poniżej polecenia wget, oczywiście można też uzyć przeglądarki i zapisać pliki pod odpowiednimi nazwami.
wget "http://www.stat.gov.pl/broker/access/prefile/downloadPreFile.jspa?id=147" -O teryt_wmrodz.zip
wget "http://www.stat.gov.pl/broker/access/prefile/downloadPreFile.jspa?id=203" -O teryt_terc.zip
wget "http://www.stat.gov.pl/broker/access/prefile/downloadPreFile.jspa?id=205" -O teryt_simc.zip
wget "http://www.stat.gov.pl/broker/access/prefile/downloadPreFile.jspa?id=222" -O teryt_ulic.zip
Transformacja do CSV
Następnie należy przetłumaczyć te zbiory do formatu tabelarycznego.
Możemy użyć do tego celu skryptu w języku sed z tej strony.
Uruchomienie (bash):
for zb in wmrodz terc simc ulic; do
unzip -p teryt_$zb.zip | sed -n -f teryt.sed > teryt_$zb.csv
done
Powinno to stworzyć 4 pliki CSV.
Utworzenie tabel
Chcemy zaimportować dane do relacyjnej bazy danych, więc na początek musimy ją stworzyć :-)
Poniżej jest DDL tworzący odpowiednie tabele. Starałem się używać ANSI SQL żeby kod działał w różnych bazach. Starałem się też, aby wszystkie tabele miały odpowiednie klucze (główne i obce).
create table teryt_wmrodz(
rm varchar(2) not null primary key,
nazwa_rm varchar(30) not null unique,
stan_na varchar(10) not null
);
create table teryt_terc(
wojewodztwo varchar(2) not null,
powiat varchar(2),
gmina varchar(2),
rodz varchar(2),
nazwa varchar(100) not null,
nazdod varchar(100) not null,
stan_na varchar(10) not null,
constraint teryt_terc_key unique ( wojewodztwo, powiat, gmina, rodz ),
constraint teryt_terc_key2 unique ( wojewodztwo, powiat, gmina, nazdod )
);
create table teryt_simc(
wojewodztwo varchar(2) not null,
powiat varchar(2) not null,
gmina varchar(2) not null,
rodz_gmi varchar(2) not null,
RM varchar(2) not null,
MZ varchar(2) not null,
nazwa varchar(100) not null,
sym varchar(10) not null primary key,
sympod varchar(10) not null references teryt_simc,
stan_na varchar(10) not null
);
create table teryt_ulic(
wojewodztwo varchar(2) not null,
powiat varchar(2) not null,
gmina varchar(2) not null,
rodz_gmi varchar(2) not null,
symbol varchar(10) not null references teryt_simc (sym),
sym_ul varchar(10) not null,
cecha varchar(10) not null,
NAZWA_1 varchar(100) not null,
NAZWA_2 varchar(100),
stan_na varchar(10) not null,
constraint teryt_ulic_pkey primary key (symbol,sym_ul),
constraint teryt_ulic_fkey_teryt_terc foreign key (wojewodztwo,powiat,gmina,rodz_gmi) references teryt_terc (wojewodztwo, powiat, gmina, rodz)
);
Ładowanie danych do bazy
Można użyć dowolnej bazy relacyjnej i narzędzia do ładowania; ja testowałem to na dwóch:
PostgreSQL
Import przez zwykłe COPY, poniżej polecenia:
truncate teryt_terc, teryt_wmrodz, teryt_ulic, teryt_simc;
SET client_encoding TO 'UTF-8';
\copy teryt_wmrodz from 'teryt_wmrodz.csv' with csv delimiter '|'
\copy teryt_terc from 'teryt_terc.csv' with csv delimiter '|'
\copy teryt_simc from 'teryt_simc.csv' with csv delimiter '|'
\copy teryt_ulic from 'teryt_ulic.csv' with csv delimiter '|'
MSSQL
Import za pomocą DTSWizard, dostępny spod SSMS.
Nie będę tu wrzucał screenów z SSMS, import jest prosty, jednak po nim trzeba upewnić się, że kodowanie znaków jest poprawne.
Celebracja
Voilà! Możemy się "cieszyć" danymi TERYT w formie relacyjnej. Chcesz wiedzieć jakie są dzielnice Krakowa? Ile jest w Polsce ulic imienia Antoniego Malczewskiego? Ile miejscowości o nazwie na literę K jest w województwie zachodniopomorskim? Proszę bardzo! Toż to dziecinnie proste, wystarczy znać SQL.
Update (2010-09-11): link na int3.pl jest niedostępny. Załączam lokalne kopie skryptów: teryt.sed, teryt2.sed.
pgbouncer mini HOWTO + benchmark
pgbouncer is a lightweight connection pooler for PostgreSQL.
I've decided to write this mini howto in order to prove that pgbouncer:
- is easy to install and configure
- is really useful, even in minimal setups (same machine as postgres, 10 clients).
As a prerequisite, we will need:
- postgres up and running (well, that's what pgbouncer was made for).
- standard set of development tools needed to compile C programs (gcc+make+binutils).
I assume you already have these.
My Linux distro is Ubuntu 9.10, kernel/software versions:
filip@srv:~$ uname -r 2.6.27.7-smp filip@srv:~$ pg_config --version PostgreSQL 8.4.3 filip@srv:~$ gcc --version gcc (Ubuntu 4.4.1-4ubuntu9) 4.4.1
Installing pgbouncer
First we need libevent (event notification library). Luckilly, in recent Ubuntu version it's packaged, so let's use it:
filip@srv:~$ apt-cache search libevent libevent-1.4-2 - An asynchronous event notification library libevent-core-1.4-2 - An asynchronous event notification library (core) libevent-dev - Development libraries, header files and docs for libevent (...)
We need libevent-dev package, to have header and include files needed for pgbouncer compilation.
filip@srv:~$ sudo apt-get install libevent-dev (...) done.
OK so now we have libevent installed. Next we go for pgbouncer itself.
It's not packaged for Ubuntu at the moment, so we have to compile it from sources. I chose to install to /usr/local - simply because I had no better idea. YMMV.
Download, unpack and configure:
filip@srv:~/src$ wget http://pgfoundry.org/frs/download.php/2608/pgbouncer-1.3.2.tgz (...) `pgbouncer-1.3.2.tgz' saved [166756/166756] filip@srv:~/src$ tar xzf pgbouncer-1.3.2.tgz filip@srv:~/src$ cd pgbouncer-1.3.2/ filip@srv:~/src/pgbouncer-1.3.2$ ./configure --prefix=/usr/local (...) configure: creating ./config.status config.status: creating config.mak config.status: creating include/config.h
OK it's configured, lets compile and install:
filip@srv:~/src/pgbouncer-1.3.2$ make (...) make[1]: Nothing to be done for `all'. make[1]: Leaving directory `/home/filip/src/pgbouncer-1.3.2/doc' filip@srv:~/src/pgbouncer-1.3.2$ sudo make install (...)
Pgbouncer is now installed.
All in one command, for your convenience:
wget http://pgfoundry.org/frs/download.php/2608/pgbouncer-1.3.2.tgz \ && tar xzf pgbouncer-1.3.2.tgz \ && cd pgbouncer-1.3.2 \ && ./configure --prefix=/usr/local \ && make && sudo make install
Configuring pgbouncer
Now let's create a config file. We do this by copying distributed ini file sample to /etc:
filip@srv:~$ sudo cp /usr/local/share/doc/pgbouncer/pgbouncer.ini /etc/pgbouncer.ini
Next we edit this file and configure needed options.
Most important is pool_mode. I choose transaction pooling mode because it gives most performance boost for normal postgresql usage.
As for connection settings, we have pgbouncer on port 6432, routing connections to port 5432 (my postgres is on same host and standard port).
File paths are adjusted for typical Ubuntu setup. pgbouncer will be run from postgres account, so we just point auth_file to pg_auth.
Here is my pgbouncer.ini file:
[databases] * = port=5432 [pgbouncer] logfile = /var/log/postgresql/pgbouncer.log pidfile = /var/log/postgresql/pgbouncer.pid listen_addr = * listen_port = 6432 unix_socket_dir = /var/run/postgresql auth_type = trust auth_file = /var/lib/postgresql/8.4/main/global/pg_auth admin_users = postgres stats_users = postgres pool_mode = transaction server_reset_query = DISCARD ALL; server_check_query = select 1 server_check_delay = 10 max_client_conn = 1000 default_pool_size = 20 log_connections = 1 log_disconnections = 1 log_pooler_errors = 1
Starting pgbouncer
Now let's start the beast:
filip@srv:~$ sudo su - postgres postgres@srv:~$ pgbouncer -d /etc/pgbouncer.ini 2010-04-23 18:37:05.969 20068 LOG File descriptor limit: 10000 (H:15000), max_client_conn: 1000, max fds possible: 1010
OK, now let's check if this really works. Connect to admin console (virtual db "pgbouncer"):
filip@srv:~$ psql -Upostgres -p6432 pgbouncer
psql (8.4.3, server 8.0/bouncer)
WARNING: psql version 8.4, server version 8.0.
Some psql features might not work.
Type "help" for help.
postgres@pgbouncer=#
It works!.
Benchmarking
Now let's do some testing. We use good'ol' pgbench.
Create and initialize test database with scale 10:
filip@srv:~$ createdb bench filip@srv:~$ /usr/lib/postgresql/8.4/bin/pgbench -i -s 10 bench (...) vacuum...done.
Test with 10 clients, separate connection for each transaction, direct connection to postgres, run test for one minute:
(mandela)filip@ratel:~$ /usr/lib/postgresql/8.4/bin/pgbench -c 10 -C -T 60 bench starting vacuum...end. transaction type: TPC-B (sort of) scaling factor: 10 query mode: simple number of clients: 10 duration: 60 s number of transactions actually processed: 1528 tps = 25.410170 (including connections establishing) tps = 53.186072 (excluding connections establishing)
Now the same test, but connecting via pgbouncer:
(mandela)filip@ratel:~$ /usr/lib/postgresql/8.4/bin/pgbench -c 10 -C -T 60 -p 6432 bench starting vacuum...end. transaction type: TPC-B (sort of) scaling factor: 10 query mode: simple number of clients: 10 duration: 60 s number of transactions actually processed: 2601 tps = 43.308068 (including connections establishing) tps = 55.619391 (excluding connections establishing)
Looks promising.
Now both tests repeated several times, just bare results for brevity:
(direct) tps = 25.583194 (including connections establishing) tps = 55.247968 (excluding connections establishing) (pgbouncer) tps = 51.769025 (including connections establishing) tps = 73.188059 (excluding connections establishing) (direct) tps = 25.857126 (including connections establishing) tps = 64.090508 (excluding connections establishing) (pgbouncer) tps = 61.633963 (including connections establishing) tps = 87.375610 (excluding connections establishing) (direct) tps = 21.134134 (including connections establishing) tps = 50.005559 (excluding connections establishing) (pgbouncer) tps = 50.122482 (including connections establishing) tps = 74.693641 (excluding connections establishing) (direct) tps = 18.925272 (including connections establishing) tps = 49.249117 (excluding connections establishing) (pgbouncer) tps = 63.616117 (including connections establishing) tps = 94.977040 (excluding connections establishing) (direct) tps = 22.444140 (including connections establishing) tps = 43.382705 (excluding connections establishing) (pgbouncer) tps = 68.886017 (including connections establishing) tps = 102.644402 (excluding connections establishing) (direct) tps = 19.979776 (including connections establishing) tps = 52.215144 (excluding connections establishing) (pgbouncer) tps = 57.047613 (including connections establishing) tps = 85.300031 (excluding connections establishing)
Lets make it more readable, and calculate performance gain (pgbench vs direct):
| Test # | direct incl connections | direct excl connections | pgbouncer incl connections | pgbouncer excl connections |
|---|---|---|---|---|
| 1 | 25,41 | 53,19 | 43,31 | 55,62 |
| 2 | 25,58 | 55,25 | 51,77 | 73,19 |
| 3 | 25,86 | 64,09 | 61,63 | 87,38 |
| 4 | 21,13 | 50,01 | 50,12 | 74,69 |
| 5 | 18,93 | 49,25 | 63,62 | 94,98 |
| 6 | 22,44 | 43,38 | 68,89 | 102,64 |
| 7 | 19,98 | 52,22 | 57,05 | 85,3 |
| AVG | 22,76 | 52,48 | 56,63 | 81,97 |
| PGBOUNCER GAIN PERCENT | 148,78% | 56,19% |
We can see that - including time consumed by connections handling - pgbouncer gives about 150% speedup compared to raw postgres.
56,19% is also a very interesting result - this difference probably comes from postgres session initialization, but maybe pgbouncer handles connections/disconnections faster than postgresql itself.
These results are very good but of course they are heavily influenced by pgbench "-C" switch (separate connection for each transaction). Let's see what are the results for pgbench without "-C":
| Test # | direct incl conn | direct excl conn | pgbouncer incl conn | pgbouncer excl conn |
|---|---|---|---|---|
| 1 | 77,64 | 77,78 | 55,67 | 55,7 |
| 2 | 73,16 | 73,38 | 79,59 | 79,6 |
| 3 | 80,91 | 81,03 | 67,45 | 67,46 |
| 4 | 61,8 | 61,94 | 78,97 | 79,02 |
| 5 | 79,45 | 79,57 | 80,66 | 80,7 |
| AVG | 74,59 | 74,74 | 72,47 | 72,5 |
| PGBOUNCER GAIN PERCENT | -2,85% | -3,00% |
As you see, for persistent connections there is no gain - even a small overhead.
BTW, both test were executed like this:
filip@srv:~$ for n in 1 2 3 4 5 ; do echo "$n (direct)"; /usr/lib/postgresql/8.4/bin/pgbench -c 10 -T 60 bench 2>&1 | grep tps; echo "$n (pgbouncer)"; /usr/lib/postgresql/8.4/bin/pgbench -c 10 -T 60 -p 6432 bench 2>&1 | grep tps; done
Note: this was on a very weak machine - desktop class PC from circa 2006.
Conclusion and disclamer
Using pgbench is definitely a good idea if you have clients connecting many times repeatedly.
These result would probably be very similar using any other decent connection pooler.
I am not an expert on TPC-B and I also do not take responsibility for any damage made to your system and/or database by the code written above.
I do not guarantee that you achieve same results - maybe the whole test was crippled and useless.
As always, please let me know if you see any errors / omissions in the article.