imitatio creatio co we łbie piszczy

8Mar/100

dynamic SQL parameters in PL/PgSQL functions

It pays to read documentation.

From PostgreSQL 8.3 to 8.4 there was one interesting addon for plpgsql: ability to pass parameter values directly into EXECUTE.

Of course depesz wrote about it. And I did read it; But still (by routine) I was using something like:

sql_query := 'SELECT foo FROM bar JOIN baz USING (barbaz) WHERE baz.id = ANY (__PARAM__::integer[])';
sql_query := replace(sql_query, '__PARAM__', quote_nullable(my_param);
EXECUTE sql_query INTO my_foos;

While it can be written in more elegant and error-prone way:

EXECUTE 'SELECT foo FROM bar JOIN baz USING (barbaz) WHERE baz.id = ANY ($1)'
INTO my_foos
USING my_param;

5Mar/100

postgres schemagrep function

From time to time, I need to "grep" database schema to locate some database objects.

One possibility is to dump the whole schema to file (pg_dump -s DBNAME) and use your text editor to browse it. It is quite handy and I actually use it a lot.

But sometimes, it is more convenient to have this "grep-like" possibility directly from psql.

So... here it is, the "schemagrep" function.
Please note that argument for this function is a regex pattern, so you can look for almost everything.
The search is case-insensitive.


begin;

create or replace function schemagrep_relkind( "char" ) returns text as $$
select case $1
    when 'r' then 'TABLE'
    when 'i' then 'INDEX'
    when 'S' then 'SEQUENCE'
    when 'v' then 'VIEW'
    when 'c' then 'COMPOSITE TYPE'
    when 't' then 'TOAST TABLE'
end
$$ language sql immutable strict;

drop function if exists schemagrep( text );
create function schemagrep( text ) returns table(
    schema name,
    object_name name,
    match_type text,
    psql_hint text
) as $$
select x.*, case when x.match_type ~ 'FUNCTION' then E'\\df+ ' when x.match_type ~ 'COMMENT' then E'\\d+ ' else E'\\d ' end || nspname || '.' || relname
from (
    select n.nspname, c.relname, schemagrep_relkind(c.relkind) || ' NAME' as match_type
    from pg_class c, pg_namespace n where c.relname ~* $1 and c.relnamespace = n.oid
  union all
    select distinct n.nspname, c.relname, schemagrep_relkind(c.relkind) || ' ATTRIBUTE'
    from pg_class c, pg_attribute a, pg_namespace n where a.attrelid = c.oid and c.relkind <> 'v' and a.attname ~* $1 and c.relnamespace = n.oid
  union all
    select n.nspname, proname, 'FUNCTION DEFINITION'
    from pg_proc, pg_namespace n where prosrc ~* $1 and pronamespace = n.oid
  union all
    select schemaname, viewname, 'VIEW DEFINITION'
    from pg_views where definition ~* $1
  union all
    select n.nspname, relname, 'CLASS CONSTRAINT'
    from pg_class c, pg_constraint cx, pg_namespace n where cx.conrelid = c.oid and cx.consrc ~* $1 and relnamespace = n.oid
  union all
    select distinct n.nspname, c.relname, 'COMMENT'
    from pg_class c, pg_description d, pg_namespace n where d.objoid = c.oid and d.description ~* $1 and c.relnamespace = n.oid
) x
$$ language sql stable strict;

grant execute on function schemagrep(text) to public;

commit;

Test it by running

SELECT * FROM schemagrep( 'anything' );

For me, it works like this:

testdb=# select * from schemagrep('account');
 schema |      object_name      |     match_type      |            psql_hint
--------+-----------------------+---------------------+----------------------------------
 public | pgbench_accounts      | TABLE NAME          | \d public.pgbench_accounts
 public | pgbench_accounts_pkey | INDEX NAME          | \d public.pgbench_accounts_pkey
 public | sp_acc_createaccount  | FUNCTION DEFINITION | \df+ public.sp_acc_createaccount
(3 rows)

That's it! Of course the function can be extended or modified to your needs, because it's in pure SQL :-)

8Feb/100

iterate over a table in PostgreSQL vs SAS

I have struggled how to achieve something like this in SAS.

(you can read it as pseudocode but it's real PL/PgSQL):

DO $this$
DECLARE my_table name;
BEGIN;
    -- iterate over table names
    FOR my_table IN SELECT nazwa FROM tabele
    LOOP
        sql := 'CREATE TABLE XXX_backup AS SELECT * FROM XXX';
        sql := replace(sql, 'XXX', my_table);
        execute sql;
    END LOOP;
END;
$this$ language plpgsql;

After consulting a collegue [thx Ludwik] and exploring this helpful doc:
Performing Multiple Statements for Each Record in a SAS® Data Set, tadam:

%MACRO COPYTABLES(SCANFILE,SCANFIELD);
*get number of tables;
	DATA _NULL_;
		IF 0 THEN SET &SCANFILE NOBS=X;
		CALL SYMPUT('RECCOUNT',X);
		STOP;
	RUN;
*iterate;
	%DO I=1 %TO &RECCOUNT;
*get table name;
	DATA _NULL_;
		SET &SCANFILE (FIRSTOBS=&I);
		CALL SYMPUT('TABLENAME',&SCANFIELD);
		STOP;
	RUN;
*copy;
	DATA &TABLENAME._backup;
		SET &TABLENAME;
	RUN;
	%END;
%MEND COPYSETS;

%COPYSETS(tabele,nazwa);

When you switch from procedural coding to macro coding, you have twist your brain upside down...
I had to read a white paper (SAS docs lacking examples) and scratch my head for a moment before writing this simple procedural loop in SAS 4GL.

Now tell me that SAS isn't weird ;-)

15Dec/092

postgresql and some grsec kernels = FAIL

Short message: Do not run postgres on some grsec - patched kernels.

Disclaimer: I write this just because I didn't find any clear explanations of the problem on the net, and I feel that such note can be useful for other who have this problem. I'm not an expert on grsec.

The story

A friend of mine had a problem with his postgres server (PostgreSQL 8.3.8).
The application (Perl/Java) was quite simple but it generated heavy load on the machine.
While running some INSERT/UPDATE queries, there happened random segmentation faults, like this one:

Dec 7 07:24:45 nsXXXXXX kernel: postgres[22481]: segfault at 7fda5e1d5000 ip 00007fda604553c3 sp 00007fffe41faf28 error 4 in libc-2.9.so [7fda603d1000+168000]
Dec 7 07:24:45 nsXXXXXX kernel: grsec: From XX.YY.ZZ.51: Segmentation fault occurred at 00007fda5e1d5000 in /usr/lib/postgresql/8.3/bin/postgres[postgres:22481] uid/euid:103/103 gid/egid:114/114, parent /usr/lib/postgresql/8.3/bin/postgres[postgres:29857] uid/euid:103/103 gid/egid:114/114

As you imagine, one of the backends went away. It happened few times a day.

After one of these segfaults, the server detected some corrupted pages ("ERROR: compressed data is corrupt") - effectively, the database was trashed, junk, byebye. Not possible to pg_dump, randomly damaged pages.

So you understand, this made the server completely useless.

After taking some advice from the community, and investigating some false traces (bad memory? OOM/overcommit?) we finally tracked this down to nonstandard kernel version:

# uname -a
Linux nsXXXXXX.ovh.net 2.6.31.5-grsec-xxxx-grs-ipv4-64 #2 SMP Thu Nov 5 12:36:20 UTC 2009 x86_64 GNU/Linux

I was a bit suspicious about this, and my friend found some grsec problems reported by users (but not postgres-related).

So we gave it a try. And Voila! After replacing the kernel with non-grsec version, the problem went away.

Of course the database had to be rebuilt.

Lesson learned.

UPDATE:
This for sure does not apply to ALL grsec'ed kernels. But it definitely applies to this version which we had problems with - it comes from the hosting provider ovh.pl

15Jul/09Off

PL/Perl regex vs builtin regex_replace in postgres

Maybe you (like me) wondered which one is faster... to use builtin regexp_replace function or to use Perl's regex engine via PL/Perl.

Here you are - this is a test I made (PostgreSQL 8.3.7, Perl 5.8.8, on CentOS 5,3)

First we have a plperl function:

CREATE OR REPLACE FUNCTION test_perl( TEXT ) RETURNS TEXT AS
$code$
my $str = shift;
$str =~ s{[[:space:]]+}{}g;
$str =~ s{^\+48}{};
$str =~ s{[^[:digit:]]+}{#}g;
$str =~ s{#$}{};
$str =~ s{^([[:digit:]]{3})#([[:digit:]]{3})#([[:digit:]]{3})$}{$1$2$3};
return $str if $str;
return undef;
$code$
LANGUAGE plperl IMMUTABLE STRICT;

Then we have it's regexp_replace equivalent (sorry - it's not pretty, but it does exactly the same thing, five replace operations ):


CREATE OR REPLACE FUNCTION test_rr( TEXT ) RETURNS TEXT AS
$code$
SELECT nullif( regexp_replace( regexp_replace( regexp_replace( regexp_replace( regexp_replace(
$1, '[[:space:]]+', '', 'g' ), '^\+48', '' ),
'[^[:digit:]]+', '#', 'g' ), '#$', '' ),
'^([[:digit:]]{3})#([[:digit:]]{3})#([[:digit:]]{3})$', '\1\2\3' ), '' )
$code$
LANGUAGE SQL IMMUTABLE STRICT;

Test run:

=> select count(*) from testtable where test_rr(telefon) is not null;
Time: 7051.000 ms
Time: 7629.000 ms
Time: 7534.000 ms
=> select count(*) from testtable where test_perl(telefon) is not null;
Time: 4087.000 ms
Time: 3931.000 ms
Time: 4056.000 ms

It appears that Perl regex engine is almost twice faster than the builtin one.

I will follow with more tests, because I have a very intensive usage of regular expressions here... Probably it will be even faster to move proessing out of the database :-(

The use of database as do-everything place is tempting, especially in postgres - but maybe it's not the right tool for the job.


Update: yes, we moved processing out of the database. Flat files processed with regular expressions... Works like a charm :)