
  Linux Letter 09

The Linux Letter for August 2, 1999

Hello again, and welcome to another Linux Letter. You may have missed last week's letter, and for that all I can do is offer my humble apologies. As a starving, non-traditional college student, I was caught up in the frenzy of summer term finals in Multivariable Calculus and just didn't have the time to write the column. So I guess that this week I'd better do something good!

Besides the Internet and the general hullabaloo over open source software, one of the promising directions for Linux is in supercomputing applications. While Linux easily scales to multiple processors on a single computer, many organizations are finding that tremendous performance can be had by combining many single-processor computers into clusters of systems that behave as a single computer. The common term for this is distributed computing.

Distributed computing breaks data up into chunks sized for individual computers, which are then processed by each computer in the cluster. Generally, a master computer controls the communications within the cluster, assigns the data to the individual systems and accepts the processed results back for presentation to the user.
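
To make the idea concrete, here is a minimal sketch in C of how a master might slice a data set into per-node chunks before handing them out. The node count, the item count, and the send_chunk() call are hypothetical placeholders for illustration, not part of any particular package:

    /* chunks.c - toy example: divide work among cluster nodes */
    #include <stdio.h>

    #define NODES 4          /* computers in the cluster (assumed)  */
    #define ITEMS 1000000L   /* total data items to process         */

    int main(void)
    {
        long chunk = ITEMS / NODES;   /* even share per node        */
        long extra = ITEMS % NODES;   /* leftovers go to last node  */
        int n;

        for (n = 0; n < NODES; n++) {
            long start = n * chunk;
            long count = chunk + (n == NODES - 1 ? extra : 0);
            printf("node %d gets items %ld through %ld\n",
                   n, start, start + count - 1);
            /* send_chunk(n, start, count);  <- hypothetical transport call */
        }
        return 0;
    }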

While you may not have actually used a cluster of computers, Linux or not, chances are that you've already heard of distributed computing. Distributed.net is using the idea on a grand scale to attempt a brute-force crack of the RC5-64 cipher challenge. SETI@Home also uses distributed computing to process data from the Arecibo radio telescope in Puerto Rico in its search for extra-terrestrial intelligence. And here's how it works:

In the case of SETI@Home, a group of servers stand by to deliver relatively small packets of data to a client program that runs on another computer, perhaps yours. When the client contacts the server via the Internet, the server sends a packet of data to the client, then closes the connection. The client software then processes the data. When the data is processed, the client opens a connection to the server, uploads its data, then downloads another packet to be processed. The process repeats itself for as long as the client system is running. At the same time, many other clients are running the software, each downloading its own packets of data for processing. In that manner, many thousands of packets can be processed simultaneously, thus distributing the computing process among many computers.
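
In code, the client's life is just a loop. The sketch below is only a toy simulation of the cycle described above; the functions and the fake three-packet "server" are stand-ins, not the real SETI@Home client's API:

    /* fetchloop.c - simulated download/process/upload cycle */
    #include <stdio.h>

    static int packets_left = 3;       /* pretend the server holds 3 packets */

    static int download_packet(void)   /* connect, fetch, disconnect */
    {
        return (packets_left > 0) ? packets_left-- : -1;
    }

    static long process_packet(int id) /* crunch the data locally */
    {
        return (long)id * 42;          /* stand-in for the real analysis */
    }

    static void upload_result(long r)  /* reconnect and report back */
    {
        printf("uploaded result %ld\n", r);
    }

    int main(void)
    {
        int id;

        while ((id = download_packet()) != -1)
            upload_result(process_packet(id));
        return 0;
    }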

The results can be breathtaking. In the case of SETI@Home, roughly 38,000 years' worth of data processing has been done in just a few months. Yet the cost of the project has been minimal, because the data is processed using the spare CPU cycles of many thousands of computers instead of a single dedicated supercomputer.

On a smaller scale, Linux supports distributed computing with clusters. A cluster of as few as four computers can process certain data as much as five times faster than a single computer. Work that is very CPU-intensive, such as graphics rendering, lends itself very well to this type of processing.

Currently, the most popular clustering software for use with Linux is PVM, the Parallel Virtual Machine. PVM is message-passing software that uses rsh to start tasks on the other machines in the cluster. It coordinates the passing of data among the network nodes and provides APIs that let programmers take advantage of Linux's clustering abilities.
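
For the curious, here's roughly what a PVM program looks like. This is a minimal sketch assuming PVM 3 is installed and that a compiled worker binary named "worker" sits where the PVM daemon can find it; doubling a number stands in for real computation:

    /* master.c - spawn workers, hand out one integer each, collect results */
    #include <stdio.h>
    #include "pvm3.h"

    #define NWORK 4

    int main(void)
    {
        int tids[NWORK], i, n, data, result;

        pvm_mytid();                              /* enroll in PVM        */
        n = pvm_spawn("worker", (char **)0,
                      PvmTaskDefault, "", NWORK, tids);

        for (i = 0; i < n; i++) {                 /* send out the work    */
            data = i * 100;
            pvm_initsend(PvmDataDefault);
            pvm_pkint(&data, 1, 1);
            pvm_send(tids[i], 1);                 /* message tag 1: work  */
        }
        for (i = 0; i < n; i++) {                 /* gather the answers   */
            pvm_recv(-1, 2);                      /* tag 2: results       */
            pvm_upkint(&result, 1, 1);
            printf("result: %d\n", result);
        }
        pvm_exit();                               /* leave PVM cleanly    */
        return 0;
    }

    /* worker.c - receive a number, double it, send it back */
    #include "pvm3.h"

    int main(void)
    {
        int ptid = pvm_parent(), data, result;

        pvm_recv(ptid, 1);
        pvm_upkint(&data, 1, 1);
        result = data * 2;
        pvm_initsend(PvmDataDefault);
        pvm_pkint(&result, 1, 1);
        pvm_send(ptid, 2);
        pvm_exit();
        return 0;
    }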

Since networking hardware is becoming less expensive, cluster performance is improving. 100Mbps switches that were once virtually unaffordable are becoming commonplace ways to connect network nodes, and even plain hubs perform well enough to be candidates for a cluster.

Another method of clustering uses a software package called Mosix. Instead of requiring programs to hand work to remote computers explicitly, Mosix makes the whole cluster appear to the user as a single computer. It distributes the work by examining the performance of the individual machines in the cluster, so that higher-performance computers process more data and lower-performance computers process less; the goal is for every machine to spend roughly the same amount of time working.
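
One nice consequence of that transparency is that a program needs no special clustering calls at all. An ordinary fork()-based program like the sketch below can, under Mosix, have its child processes migrated to whichever nodes have cycles to spare (the job count and the busywork loop are made up for illustration):

    /* busyfork.c - ordinary forked workers; Mosix can migrate them */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>

    #define JOBS 8

    int main(void)
    {
        int i;
        long k, sum;

        for (i = 0; i < JOBS; i++) {
            if (fork() == 0) {            /* child: one unit of work */
                sum = 0;
                for (k = 0; k < 10000000L; k++)
                    sum += k % (i + 1);   /* CPU-bound busywork      */
                printf("job %d finished (%ld)\n", i, sum);
                exit(0);
            }
        }
        while (wait(NULL) > 0)            /* parent: reap children   */
            ;
        return 0;
    }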

If you're interested in pursuing distributed computing, the best opportunity is probably through SETI@Home or Distributed.net. In fact, The NOSPIN Group maintains a team on SETI@Home, so you can actually see how your processing contributes to the performance of a very large "virtual" computer.

SETI@Home:
 http://setiathome.ssl.berkeley.edu

Distributed.net:
 http://www.distributed.net

MOSIX:
 http://www.mosix.cs.huji.ac.il

PVM:
 http://www.epm.ornl.gov/pvm/pvm_home.html

Extreme Linux: 
http://www.extremelinux.org

 

Hot Tip of the Week

Here's a quick one. If you have an accelerated video card (most modern video cards are) and your X server is also accelerated (ditto), you can increase your X performance by editing your XF86Config file and adding the line Option "accel" to the Device section of the file.
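
The Device section will end up looking something like this (the Identifier and the other entries will vary with your card; only the Option line is new):

    Section "Device"
        Identifier "Primary Card"
        # ...your existing entries...
        Option     "accel"
    EndSection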

But before you do...BACK UP THE FILE!

Happy computing!

Drew Dunn

 


