imitatio creatio co we łbie piszczy

31May/10

mighty pigz

I've just googled for "parallel gzip" and found that pigz are there.

Not exactly hot news, but these are really useful and mighty pigz.

Once you start using them, "pigz -p3" becomes as natural as "make -j3".
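For anyone who has not tried it yet, here is a minimal sketch of a wrapper that uses pigz when it is installed and falls back to plain gzip otherwise. The function name compress_fast and the default of 3 threads are my own choices for illustration; -p (number of threads) and -k (keep the original file) are regular pigz options.

```shell
# compress_fast FILE [THREADS]
# Gzip-compress FILE into FILE.gz and keep the original.
# Uses pigz with THREADS compression threads (default 3) when available,
# otherwise falls back to single-threaded gzip.
compress_fast() {
    file=$1
    threads=${2:-3}
    if command -v pigz >/dev/null 2>&1; then
        pigz -k -p"$threads" "$file"
    else
        gzip -c "$file" > "$file.gz"
    fi
}
```

The output is ordinary gzip data either way, so gzip -d or pigz -d can unpack it. The same trick works in pipelines, for example tar cf - somedir | pigz -p3 > somedir.tar.gz (somedir being whatever directory you want to archive).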

Living without them for so many years... a shame. I have only one excuse: multi-core machines were not so common when I started my adventure with computing.

Happy pigzing!

15Dec/09

postgresql and some grsec kernels = FAIL

Short message: Do not run postgres on some grsec-patched kernels.

Disclaimer: I'm writing this only because I didn't find any clear explanation of the problem on the net, and I feel that such a note can be useful for others who have this problem. I'm not an expert on grsec.

The story

A friend of mine had a problem with his postgres server (PostgreSQL 8.3.8).
The application (Perl/Java) was quite simple but it generated heavy load on the machine.
While running some INSERT/UPDATE queries, random segmentation faults occurred, like this one:

Dec 7 07:24:45 nsXXXXXX kernel: postgres[22481]: segfault at 7fda5e1d5000 ip 00007fda604553c3 sp 00007fffe41faf28 error 4 in libc-2.9.so [7fda603d1000+168000]
Dec 7 07:24:45 nsXXXXXX kernel: grsec: From XX.YY.ZZ.51: Segmentation fault occurred at 00007fda5e1d5000 in /usr/lib/postgresql/8.3/bin/postgres[postgres:22481] uid/euid:103/103 gid/egid:114/114, parent /usr/lib/postgresql/8.3/bin/postgres[postgres:29857] uid/euid:103/103 gid/egid:114/114

As you can imagine, one of the backends went away. It happened a few times a day.
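To see how often this happens, the kernel log can be summarized per day. This is only a sketch: the function name count_pg_segfaults is made up, and it assumes log lines in the same syslog format as the excerpt above (month, day, time, host, then "kernel: postgres[PID]: segfault ...").

```shell
# count_pg_segfaults: read kernel log lines on stdin and print
# "MONTH DAY COUNT" for every day that has postgres segfault entries.
count_pg_segfaults() {
    # Keep only the kernel's own segfault lines for postgres backends,
    # then count them per "month day" pair (first two syslog fields).
    grep 'postgres\[[0-9]*]: segfault' |
        awk '{ n[$1 " " $2]++ } END { for (d in n) print d, n[d] }' |
        sort
}
```

Typical use would be count_pg_segfaults < /var/log/kern.log, where the log path is an assumption (it differs between distributions).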

After one of these segfaults, the server detected some corrupted pages ("ERROR: compressed data is corrupt") - effectively, the database was trashed, junk, byebye. Not possible to pg_dump, randomly damaged pages.

So you understand, this made the server completely useless.

After taking some advice from the community and chasing a few false leads (bad memory? OOM/overcommit?), we finally tracked this down to a nonstandard kernel version:

# uname -a
Linux nsXXXXXX.ovh.net 2.6.31.5-grsec-xxxx-grs-ipv4-64 #2 SMP Thu Nov 5 12:36:20 UTC 2009 x86_64 GNU/Linux
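A quick way to check a machine for this is to look for "grsec" in the kernel release string. The sketch below is a heuristic only: the function name is_grsec_release is made up, and it assumes the kernel advertises grsec in its release name, as the one above does. A grsec kernel built under a different name would go unnoticed.

```shell
# is_grsec_release RELEASE
# Succeeds (exit status 0) if the kernel release string mentions grsec.
is_grsec_release() {
    case $1 in
        *grsec*) return 0 ;;
        *)       return 1 ;;
    esac
}

# Warn if the currently running kernel looks grsec-patched.
if is_grsec_release "$(uname -r)"; then
    echo "WARNING: running a grsec-patched kernel: $(uname -r)"
fi
```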

I was a bit suspicious about this, and my friend found some grsec problems reported by users (but not postgres-related).

So we gave it a try. And voilà! After replacing the kernel with a non-grsec version, the problem went away.

Of course the database had to be rebuilt.

Lesson learned.

UPDATE:
This certainly does not apply to ALL grsec'ed kernels. But it definitely applies to the version we had problems with, which comes from the hosting provider ovh.pl