PostgreSQL and some grsec kernels = FAIL
Short message: Do not run postgres on some grsec-patched kernels.
Disclaimer: I am writing this only because I couldn't find any clear explanation of the problem on the net, and I think a note like this can be useful for others who hit the same problem. I'm not an expert on grsec.
The story
A friend of mine had a problem with his postgres server (PostgreSQL 8.3.8).
The application (Perl/Java) was quite simple but it generated heavy load on the machine.
While it was running INSERT/UPDATE queries, backends crashed with segmentation faults at random, like this one:
Dec 7 07:24:45 nsXXXXXX kernel: postgres[22481]: segfault at 7fda5e1d5000 ip 00007fda604553c3 sp 00007fffe41faf28 error 4 in libc-2.9.so [7fda603d1000+168000]
Dec 7 07:24:45 nsXXXXXX kernel: grsec: From XX.YY.ZZ.51: Segmentation fault occurred at 00007fda5e1d5000 in /usr/lib/postgresql/8.3/bin/postgres[postgres:22481] uid/euid:103/103 gid/egid:114/114, parent /usr/lib/postgresql/8.3/bin/postgres[postgres:29857] uid/euid:103/103 gid/egid:114/114
As you can imagine, one of the backends went away. This happened a few times a day.
After one of these segfaults, the server detected corrupted pages ("ERROR: compressed data is corrupt"). Effectively, the database was trashed, junk, byebye: pg_dump no longer worked, and pages were damaged at random.
As you can understand, this made the server completely useless.
After taking some advice from the community and chasing a few false leads (bad memory? OOM/overcommit?), we finally tracked this down to a nonstandard kernel version:
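When chasing crashes like this, it can help to pull the numbers out of the kernel's segfault line. Below is a minimal sketch in Python, assuming the line has the same shape as the one quoted above; the function names and the regular expression are my own illustration, not part of any tool. It computes the offset of the faulting instruction inside the mapped library (ip minus the mapping base) and decodes the low bits of the error code using the usual x86 page-fault convention.

```python
import re

# Matches lines shaped like the kernel message quoted in this post.
# The pattern is an assumption based on that single sample.
SEGFAULT_RE = re.compile(
    r"(?P<prog>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>\S+)\s*\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def parse_segfault(line):
    """Return a dict with the parsed fields, or None if the line does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    info = {
        "prog": m.group("prog"),
        "pid": int(m.group("pid")),
        "addr": int(m.group("addr"), 16),
        "ip": ip,
        "error": int(m.group("err"), 16),
        "object": m.group("obj"),
        "offset": None,
    }
    if m.group("base") is not None:
        # Offset of the faulting instruction inside the mapped object.
        info["offset"] = ip - int(m.group("base"), 16)
    return info


def describe_error(code):
    """Decode the three low bits of an x86 page-fault error code."""
    cause = "protection violation" if code & 1 else "page not present"
    access = "write" if code & 2 else "read"
    mode = "user" if code & 4 else "kernel"
    return f"{mode}-mode {access}, {cause}"


if __name__ == "__main__":
    sample = (
        "Dec 7 07:24:45 nsXXXXXX kernel: postgres[22481]: segfault at "
        "7fda5e1d5000 ip 00007fda604553c3 sp 00007fffe41faf28 error 4 "
        "in libc-2.9.so [7fda603d1000+168000]"
    )
    info = parse_segfault(sample)
    print(info["object"], hex(info["offset"]), describe_error(info["error"]))
```

The offset only tells you where inside libc the crash happened; it says nothing about why the memory being read was not there.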
# uname -a
Linux nsXXXXXX.ovh.net 2.6.31.5-grsec-xxxx-grs-ipv4-64 #2 SMP Thu Nov 5 12:36:20 UTC 2009 x86_64 GNU/Linux
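If you want to check a whole set of machines for this kind of kernel, here is a rough sketch in Python. It only looks for a "grsec" tag in the release string (the same string that `uname -r` prints). That naming is an assumption taken from the kernel above: a grsec-patched kernel is not required to advertise itself this way, so a negative result proves nothing. The function name is made up for this example.

```python
import platform


def looks_like_grsec(release):
    """Return True if the kernel release string carries a 'grsec' tag."""
    return "grsec" in release.lower()


if __name__ == "__main__":
    release = platform.release()  # same string as `uname -r`
    if looks_like_grsec(release):
        print(f"{release}: looks like a grsec-patched kernel")
    else:
        print(f"{release}: no grsec tag in the release string")
```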
I was a bit suspicious about this kernel, and my friend found reports from other users of problems with grsec (though none related to postgres).
So we gave it a try. And voilà! After replacing the kernel with a non-grsec version, the problem went away.
Of course the database had to be rebuilt.
Lesson learned.
UPDATE:
This certainly does not apply to ALL grsec'ed kernels. But it definitely applies to the version we had problems with, which comes from the hosting provider ovh.pl
December 15th, 2009 - 15:02
Ahem... it's impossible that grsec is the cause. Every server I currently have access to runs grsec, and each has one or more postgres instances on it (in chroots), with none of the problems described here.
December 15th, 2009 - 16:20
@Marcin: thanks, I have edited the post so that it doesn't smear grsec as such.