The best-known method of clustering Linux computers is Beowulf. Many are familiar with it, but very few really seem to understand how it works.
There is no one software package called Beowulf. Beowulf is a collection of software tools that run on top of the Linux kernel. Messaging interfaces such as MPI (Message Passing Interface) or PVM (Parallel Virtual Machine) provide the most visible evidence to the user, but deeper "tweaks" allow the computers to run even faster.
The Linux kernel allows more than one Ethernet adapter to be "bonded" together, appearing as a single, fast interface to the computer. The virtual memory manager can be altered to increase performance. Global process identifiers allow any node to access any process using a service called Distributed Interprocess Communications.
Several books are available, web sites are devoted to the technology and a few commercial outfits sell
To really use a Beowulf cluster to its full potential, the software application must be written to the Beowulf specification. Beowulf doesn't take many computers and make them appear as one, allowing you to run any software in a distributed mode. It also can't be used as a load sharing system to, for instance, distribute http requests among several web servers.
Beowulf clusters were designed with processor intensive scientific applications in mind. Operations such as ray tracing, statistical analysis and complex mathematical functions work very well with this type of system.
A great reference on Beowulf, as well as several other clustering technologies, is available from
is the NOSPIN Group Beowulf: