my row in your table


Do you know GNU parallel?

From time to time I write about some sweet little Linux utilities.

This time let me introduce GNU parallel.

As the name suggests, it makes it possible to run almost any other command in parallel, on many CPU cores.

Let's take a simple example: searching for a string in many files. Here is a quick test on a 2-core box, first with plain bzgrep and then with the same command run through parallel:

$ time bzgrep -w word /u/logs/app*/2012/01/31_app.log.bz2 > word.log
real    1m33.492s
user    1m25.269s
sys     0m8.181s
$ time parallel bzgrep -w word ::: /u/logs/app*/2012/01/31_app.log.bz2 > word.log
real    0m34.267s
user    2m3.112s
sys     0m9.193s

For more, read the wonderful list of examples in the GNU parallel man page; they speak for themselves.

Of course, parallel execution is possible without GNU parallel (with a bit of scripting, or with xargs), but this tool is more powerful. See this summary for a list of features and a comparison with xargs, pexec and other alternatives.
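To give an idea of what the xargs route looks like, here is a minimal sketch. The helper name grep_in_parallel and its argument order are made up for illustration, and it assumes an xargs that supports the -0 and -P extensions (GNU and BSD xargs do; they are not in POSIX) and a grep that supports -H.

```shell
# Hypothetical helper, not part of any tool: search for a whole word in
# many files, running up to NJOBS grep processes at a time.
# Usage: grep_in_parallel NJOBS WORD FILE...
grep_in_parallel() {
    njobs=$1
    word=$2
    shift 2
    # NUL-separate the file names so paths containing spaces survive,
    # then hand xargs one file per grep invocation (-n 1),
    # with at most $njobs invocations running at once (-P).
    # -H makes grep print the file name even though it sees a single file.
    printf '%s\0' "$@" | xargs -0 -n 1 -P "$njobs" grep -H -w -- "$word"
}
```

It would be called along the lines of grep_in_parallel 2 word /some/dir/*.log > word.log. Two caveats with this sketch: lines coming from different jobs may end up interleaved in the output, whereas GNU parallel by default groups the output of each job; and xargs returns a non-zero exit status as soon as one grep finds nothing in its file.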

The great thing about GNU utilities is that they are still evolving and getting better. Every time I see progress there, it makes me optimistic about the future of open source.