Systems Development

One of the research projects / areas of interest I'm involved in deals with Cluster systems (aka "Beowulf" systems to *nix devotees) large enough to handle computational tasks usually performed by Supercomputers or Supercomputer-class systems.

This is a very interesting turn of direction for me, since my very first job in the Computing/Data Processing industry was as a "Digital Computer Operator" at Lawrence Berkeley Lab back in the late 1970s / early 1980s. I was one of the operators who ran LBL's "BKY" system, which comprised two CDC 6000 systems (photos here and here) connected to a CDC 7600Z (photos here). The 6000B & 6000C were basically there to provide I/O services such as the tape library, printing, plotting & card-reader input to the 7600, although the 6000s were also able to run their own tasks. This was back in the time when computers and their associated hardware took up 1500 sq. ft. of floor space; technicians, operators and engineers wore blue or white lab coats and pocket protectors and were revered by their acquaintances as something of a marvel... almost a "priestly" status.

It has been roughly 25 years since those days, and now I'm coming "full-circle" in researching "Massively Parallel" systems, but with a twist: Supercomputer-class systems built with hardware commonly available at your local PC store like Fry's Electronics - Pentium & Athlon CPUs, Abit & ASUS motherboards, etc.

This "commodity-based" Cluster approach gained a huge chunk of popularity when the University of Kentucky's "The Aggregate" organization debuted their KLAT2 Linux "Supercomputer". These folks were not new at this, being the group that came up with the PAPERS system interconnect scheme, which brought per-processor communications speed and latency to where they needed to be to improve what is called "bisection bandwidth": the bandwidth available between the two halves of the machine, and a critical measure of how well a multi-processor system or cluster will perform with a given workload. With their work on KLAT2, they were able to build a system that was as fast as the 200th fastest system in the world as recorded by The Top 500 List - and do it for around US$41,000.00; yes, that's right... just about forty-one thousand US dollars!

The independent research I'm doing builds on the work that The Aggregate has done with KLAT2 and extends their design to a system with twice the machine count (128 nodes) and quadruple the raw number of processors (256 CPUs - two Athlon MP CPUs per node). Two systems will be designed: one running Linux or a *nix variant (*BSD will not be ruled out) and another running Microsoft Windows Server 2003 or Windows 2000 Datacenter Server (the decision will depend on how helpful Microsoft will be once we are ready to begin the work).
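As a quick sanity check on the sizing above, here is a minimal sketch of the arithmetic. The baseline figures it derives (64 nodes, 64 CPUs) are only what the "twice" and "quadruple" ratios in the paragraph imply, not numbers quoted from The Aggregate, and the names used are purely illustrative.

```python
# Sizing arithmetic for the planned cluster, using only figures from the text.
PLANNED_NODES = 128   # "twice the machine count"
CPUS_PER_NODE = 2     # two Athlon MP CPUs per node


def total_cpus(nodes: int, cpus_per_node: int) -> int:
    """Total processor count for a cluster of identical nodes."""
    return nodes * cpus_per_node


planned_cpus = total_cpus(PLANNED_NODES, CPUS_PER_NODE)

# Baseline implied by the stated ratios (an inference, not a quoted spec).
implied_baseline_nodes = PLANNED_NODES // 2
implied_baseline_cpus = planned_cpus // 4

print(f"Planned: {PLANNED_NODES} nodes, {planned_cpus} CPUs")
print(f"Implied baseline: {implied_baseline_nodes} nodes, {implied_baseline_cpus} CPUs")
```

The same function makes it easy to see what a different node count or CPUs-per-node choice would do to the total.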

In addition to the raw system design, the project will include a custom-designed water-cooling system, with chilled water provided by a rather "artistic" waterfall-wall (similar in design to those soothing water-sculpture walls, but encased in plexiglass) and delivered to all 256 of the CPUs.

These systems would be assembled by a non-profit organization to provide additional research to the National Science Foundation and, once completed, to provide low-cost Supercomputer-class computing facilities to local Universities, Colleges and other Institutions of learning & research.

As I was working on the basic design of these systems, a thought popped into my head that came totally out of left field: surplus computers! Companies toss out hundreds of disused systems a year - what better way to keep them in use and out of the landfills than to build an ever-growing Cluster system with them?!?

    Companies donate used systems and get tax breaks for contributing to a non-profit organization.

    The non-profit organization keeps and renovates those that will suit our needs and donates the rest to other charities.

Definitely a win-win-win situation. The Environment wins by not having more disused computer hardware choking up the landfills with toxic components, the non-profit wins by getting systems for this third Cluster, and other charities get computers that have been refurbished and are in decent working order.

This project is in its infancy - more info will be posted as things progress.