Clusters: Open Source Meets Commodity Hardware


Posted

in

, ,

During the last semester I wrote two papers for my Computer Architectures class. I spent quite a bit of time on them and have been thinking about posting them on my weblog for quite some time. I’m a bit worried about plagarism though, and I’m not sure what to do about it. I’m pretty sure that I can submit it to the auto-plagarism-detector service that my university subscribes to, and I’m probably going to do that now that this paper is posted.

Secondly, I’m releasing this paper under the by-nc-sa (Attribution NonCommercial ShareAlike 2.0) license, so unless you can turn in your paper to your teacher with a by-nc-sa license displayed on it, you can’t include it in your paper without proper citation.

PLEASE NOTE: If you are considering plagarising this, please don’t. If your teacher allows you to cite non-academic internet sources, then by all means borrow my ideas and cite me. What I would really suggest doing is taking a look at my primary sources and then heading to your university library or computer system to consult them yourself. All of the ACM journal sources that I cited are available online if your university subscribes to the ACM Portal. This paper was thoroughly researched but there were some late nights involved in the production of it so it is provided WITHOUT WARRANTY against correctness or anything like that. My RAID paper is probably a little better than this one.

Creative Commons License
This work is licensed under a Creative Commons License.

Clusters:

Open Source Meets

Commodity Hardware

Matt Croydon

CMSC 311

May 6, 2005

Before the mid-1980's, most supercomputers were large, monolithic machines. Over time the Top 500 Supercomputers list has seen clusters go from non-existent to being the dominant architecture[1], currently representing 296 out of the top 500 slots (almost 60%)[2]. Compared to monolithic supercomputers such as those from Cray Research, clusters are extremely cheap for the amount of performance realized. When lower cost is combined with cheap off-the-shelf hardware and open source software platforms, clusters can't help but improve and gain popularity.

Different Tools for Different Jobs

The definition of the word “cluster” varies greatly depending on the context in which it is used. A cluster is commonly used in high availability situations when, for example, equipment must gracefully fail over or requests must be divided among the available hardware. For clustered application servers, this can be accomplished by simple round robin DNS entries or more complex load balancing hardware or software.

Clusters are also used when data needs to be stored on multiple machines for redundancy or performance sake. MySQL, an open source database program, can use a cluster of computers for both data replication as well as load balancing[3] in several configurations.

This paper will focus on the most common and most popular form of clustering: clusters used for parallel computing, scientific computing, and simulation in educational, professional, and government organizations. More specifically, it will focus on open source software that is available to make the construction and administration of clusters easier and more powerful.

A Brief History

Cluster computing has its roots in the mid 1980's when developers wanted to tie together multiple computers in order to harness their collective power. In 1985, Amnon Barak developed the first predecessor to Mosix called MOS that ran on a cluster of four Digital Equipment Corporation (DEC) PDP/11 computers[4]. In 1986, DEC decided try try clustering for themselves with VAXCluster[5]. At the time, VAXCluster was able to take advantage of a much higher data rate of 70mbit/sec[5] but because of the proprietary interconnect used, VAXCluster remained much more tightly coupled, while MOS and Mosix decided to use token ring LAN technology[4]. As Mosix was ported to other platforms and improved, it was also available to take advantage of advances in networking technology without its looser coupling being effected. Mosix relied on patches to the Unix kernel in order to allow processes to migrate among nodes in the cluster. Mosix was later ported to the Linux kernel by Moshe Bar[6], where it thrives as an open source project.

Beowulf came on to the scene in the early to mid 1990's with a huge splash. Beowulf allowed users to tie together large numbers of lower cost desktop hardware (486 DXes at the time) rather than the specialized hardware used by Mosix and VAXCluster.[7]. The project originated at NASA's Goddard Space Flight Center and quickly led to a successful open source project[8] as well as a successful business for some of the original developers called Scyld Software.

Beowulf Internals

PVM (Parallel Virtual Machine) and MPI (Message Passing Interface) are standards for preforming parallel operations. Both frameworks have language bindings (for example FORTRAN, C, Perl, and other languages) that abstract the underlying standard in to something easier to work with. The direction of MPI is steered by a group called the MPI Forum. Since the release of the initial specification, the MPI Forum have updated the specification to MPI 2.0, adding features and clarifying issues that were deemed important[9] Each cluster software, tool, or operating system vendor implements their own version of MPI, so cross-platform compability is not guaranteed, but porting between MPI implementations is quite possible.

In contrast to MPI, PVM software is usually provided at the PVM website[10] in either source or binary form. From there users can call the PVM library directly or though third party bindings. PVM provides binaries for Windows, which can allow users to program parallel applications on a platform that they may be more familiar with. However, most Beowulf clusters run standard Linux or some variant thereof. PVM also supports monolithic parallel computers such as Crays and other specialized machines. Further differences and similarities between MPI and PVM can be found in the paper Goals Guiding Design: PVM and MPI by William Gropp and Ewing Lusk[11].

In recent years, another tool called BProc (the Beowulf Distributed Process Space) has expanded the abilities of parallel processing and management of data between nodes. BProc allows a parallel job to be started on the main controlling node and parallel-capable processes are automatically migrated to child nodes[12]. This paradigm is also used by Mosix and OpenMosix, which will be discussed later. BProc is an open source project available at bproc.sourceforge.net.

Parallel processes also need to take in to consideration the amount of time that will be needed for preparation, cleanup, and merging of parallel data. Amdahl's law[26] stipulates that the total execution time for a parallel program is equal to the parallel part of the problem divided by the number of nodes plus the serial part of the program. Even if a cluster contains thousands of nodes, the amount of time ti takes to execute the serial code is going to remain constant.

How to Build a Beowulf[7]

Large Beowulf clusters run complex simulations and crunch teraflops of information per second. At the same time, small 4-16 node clusters are often used in educational settings to teach parallel processing design paradigms to Computer Science students as well as cluster design and implementation to Computer Engineering students.

College Beowulf clusters are often (but not always) comprised of outdated computers and hand-me-down hardware. While extremely fast speeds cannot be obtained with these antiquated clusters, they are valuable in teaching and observing the differences between a program or algorithm written for a single processor machine and the same program/algorithm written for and run on a cluster.

There are several tools available for deploying a Beowulf cluster, but almost all require a basic installation of a compatible Linux distribution on either the mater node or the master and all child nodes. Scyld software makes what is widely considered the easiest to install Beowulf software. All that is needed is a $3 CD[14] containing an unsupported Scyld distribution for the master and each child node. Official copies with commercial support are also available directly from Scyld. Once the CD is booted on the mater note, a simple installation menu is presented. After installing and configuring Scyld on the master node, insert a Scyld CD in each child node and they automatically get their configuration information from the master node and the child nodes can run directly from CD.

Another popular package that runs on top of many modern RPM-based Linux distributions is OSCAR, the Open Source Cluster Application Resources project[15]. OSCAR offers a very simple user interface to install and configure cluster software on the master node. Once that is accomplished, client nodes can be Network booted and the client software automatically installed. OSCAR also supports other installation and boot methods.

While many colleges take the small cluster approach, Virginia Tech has taken advantage of the modern Macintosh platform and created a top 10 supercomputer for a fraction of traditional costs. Virginia Tech started out with desktop machines, but now maintains a cluster of 1100 Apple XServe 1U servers running Mac OS X server (based on an open source BSD-derived core called Darwin).

Another Approach to Clustering: OpenMosix

While most Beowulf clusters are dedicated to cluster-related tasks all the time, clustering does not have to be that way. OpenMosix is a set of patches to the Linux kernel and a few userland monitoring tools for keeping track of where processes are running and how efficiently. OpenMosix is extremely flexible. Nodes can join or leave a cluster whenever they wish. Many programs and algorithms can take advantage of clustering with the automatic node migration built in to OpenMosix. Whenever a new process is spawned or forked (as is common in traditional Unix-like software design) OpenMosix may choose to execute that process locally or on another node.

Many OpenMosix clusters are implemented in a head/client node configuration much like Beowulf clusters, but they are not limited to such configurations. Because OpenMosix is just a patch to the standard kernel, machines in a cluster can have multiple uses. They can run standard graphical window managers and be used as desktop machines while processes are migrated to them if they have computing cycles to spare. OpenMosix does an excellent job at making sure that client nodes still have enough resources to do whatever else they are doing in addition to cluster process execution.

In addition to the mutli-use scenario, OpenMosix cluster nodes can run as true peers. For example, if there are 20 computers currently connected to a dynamic cluster and all but a few of them are idle, processes from the machines being actively used can be automatically migrated for execution throughout the cluster. Similarly, if all computers are heavily used, virtually no process migration will occur since execution will be quicker on the local machine. Also, if 400MHz desktop machine needs to do some complex calculations, as long as the program is written in a way that can take advantage of process migration, those calculations could be run extremely quickly on an idle 3GHz machine. Many of the scenarios above are described in a Linux Journal article entitled Clusters for Nothing and Nodes for Free[16], but also come from my experiences building and experimenting with a 2-3 node OpenMosix cluster a few years ago[17].

Recently the OpenMosix community has embraced “instant clusters,” or the idea that any hardware with local network connections can become a cluster without interfering with its other uses. The OpenMosix website lists a page[18] with several open source “instant cluster” software projects. The most popular project is called ClusterKnoppix[19], a Linux distribution with OpenMosix installed on it that runs directly from CD-ROM. With a minimum of one CD burned on a master node, a 30 seat computer lab can instantly become a 30 node cluster without disturbing the operating system installed on the hard drives.

To share data among nodes, OpenMosix uses the Cluster File System, a concept originally developed for the Mosix project called the Mosix File System[4]. The file system was renamed after the Mosix project closed its source code and Moshe Bar and others began working on the GPL-licensed[20] code which would become OpenMosix between 2001 and 2002. This cluster file system along with the ability to run a cluster as peer-nodes gives OpenMosix quite an advantage over traditional monolithic and cluster systems.

How Open Source Helps Clusters

While some computational clusters run on Windows, the vast majority run on top of an open-source Linux distribution. The Linux Kernel itself is open source and depending on the Linux distribution, all, most, or at least some of the operating system is open source. Sometimes Linux distributions can be open source without being free (as in no cost) such as Red Hat Enterprise Linux. There are many excellent free (open source and no cost) Linux distributions to run Beowulf, OpenMosix, or any other type of clustering software on.

There are many open source applications that help users install, configure, and maintin clusters; many have been mentioned before. These include OSCAR, Beowulf and OpenMosix themselves, various PVM and MPI implmentations, BProc, and more. In addition to the tools already mentioned, there is a suite of open source utilies for OpenMosix called openMosixView[21]. The various programs included in the suite allow for visualization as well as graphical management of the cluster, visual feedback for processes, process migration, load per node, and also allow for logging and analysis of cluster performance.

There are many other interesting open source clustering projects that don't require a Beowulf or OpenMosix frame to run on. One of the most popular examples of this is distcc[22], a program that allows for distributed compilation of C or C++ code. Distcc is quite lightweight and does not require a shared filesystem, it just requires child nodes to be running distcc in daemon mode.

The Future of Clusters

While Robert Lucke considers openMosix the next generation of clustering software because of its flexibility[23], some of the most stunning advances are happening in the world of grid and distributed computing[24]. Grid computing can mean different things to different people, but generally extends computing platforms beyond location and geography.

The SETI@home project[25] has managed to create a very powerful supercomputer by utilizing the spare CPU cycles of thousands of desktop machines spread throughout the world. The program usually runs as a screen saver so that it does not consume computing resources while the machine is being actively utilized. SETI@home and other projects are pushing the envelope of using spare processor cycles to tackle a task that would otherwise require large dedicated clusters or supercomputers.

While grid and distributed computing may take away part of the supercomputing market share that clusters (and particularly those built on open source software using commodity hardware), I believe that clusters are here to stay. Individual component prices continues to drop, network throughput is improving, and cluster software continues to evolve. Expect to hear even more about clusters over the next several years.

References

[1] Top500 Supercomputer Sites, “Charts for November 2004 – Clusters (NOW),” April 2005, http://top500.org/lists/2004/11/overtime.php?c=5.

[2] Top500 Supercomputer Sites, “Highlights from Top500 List for November 2004,” April 2005, http://top500.org/lists/2004/11/trends.php.

[3] J. Zawodny and D. Balling, High Performance MySQL, Sebastapol: O'Reilly and Associates, 2004, chaps. 7 and 8.

[4] A. Barak et al, The Mosix Distributed Operating System: Load Balancing for Unix, Berlin: Springer-Verlag, 1993, pp. 1-18.

[5] N. Kronenberg et al, “VAXcluster: a closely-coupled distributed system,” in ACM Transactions on Computer Systems (TOCS), 1986, pp. 130-146.

[6] The openMosix Project, “openMosix, an Open Source Linux Cluster Project,” April 2005, http://openmosix.sourceforge.net/.

[7] T. Sterling et al, How to Build a Beowulf: A Guide to the Implementation and Application of PC Clusters, Cambridge, Mass: The MIT Press, 1999.

[8] The Beowulf Project, “Beowulf.org: The Beowulf Cluster Site,” April 2005, http://www.beowulf.org/.

[9] The MPI Forum, “Message Passing Interface,” April 2005, http://www-unix.mcs.anl.gov/mpi/.

[10] Computer Science and Mathematics Divison, Oak Ridge National Laboratory, “PVM: Parallel Virtual Machine,” April 2005, http://www.csm.ornl.gov/pvm/pvm_home.html.

[11] W. Gropp and E. Lusk, “Goals Guiding Design: PVM and MPI,” in IEEE International Conference on Cluster Computing (CLUSTER'02), 2002, pp. 257-268.

[12] E. Hendricks, “BProc: The Beowulf Distributed Process Space” in Proceedings of the 16
th
international conference on Supercomputing, 2002, pp. 129-136.

[13] P. Prins, “Teaching Parallel Computing Using Beowulf Clusters: A Laboratory Approach,”in Journal of Computing Sciences in College, 2004, pp. 55-61.

[14] Linux Central, “CDROM with Scyld Beowulf,” April 2005, http://linuxcentral.com/catalog/index.php3?prod_code=L000-089.

[15] Open Source Cluster Application Resources, “OSCAR: Open Source Cluster Application Resources”, April 2005, http://oscar.openclustergroup.org/.

[16] A. Perry et al, “Clusters for Nothing and Nodes for Free,” Linux Journal, Vol 2004, Issue 123, July, 2004.

[17] Matt Croydon, “OpenMosix Success,” April 2005, http://www.postneo.com/2002/11/20/openmosix-success.

[18] openMosix, “Instant openMosix, The Fast Path to an openMosix Cluster,” April 2005, http://openmosix.sourceforge.net/instant_openmosix_clusters.html.

[19] ClusterKnoppix, “ClusterKnoppix: Main Page,” April 2005 http://bofh.be/clusterknoppix/.

[20] Open Source Initiative, “Open Source Initiative – The GPL:Licensing” April 2005 http://www.opensource.org/licenses/gpl-license.php.

[21] openMosixView, “openMosixView: a cluster-management GUI,” April 2005, http://www.openmosixview.com/index.html.

[22] Martin Pool, “distcc: a fast, free distributed C/C++ compiler,” April 2005,

http://distcc.samba.org/.

[23] R. Lucke, Building Clustered Linux Systems (Hewlett-Packard Professional Books), Upper Saddle River, New Jersey: Prentice Hall, 2004.

[24] M. Holliday et al, “A Geographically-distributed, Assignment-structured, Undergraduate Grid Computing Course” in Proceedings of the 36th SIGCSE technical symposium on Computer science education, 2005, pp 206-210.

[25] The SETI@home Project, “SETI@home: Search for Extraterrestrial Intelligence at Home,” April 2005, http://setiathome.ssl.berkeley.edu/.

[26] G. Pfister, In Search of Clusters: The Coming Battle in Lowly Parallel Computing. Upper Saddle River, New Jersey: Prentice Hall, 1998. pp. 184-185.

Comments

96 responses to “Clusters: Open Source Meets Commodity Hardware”

  1. youtube downloader Avatar

    bwjmy youtube downloader ktlcuo

  2. teneryfa Avatar

    Great blog! I genuinely love how it is easy on my eyes as well as the info are well written.

  3. Natural weight loss pills Avatar

    Why there are so many comments in this blog?

  4. vyvoz stroitelnogo musora spb…

    vyvoz musora…

  5. wall safe Avatar

    An cool blog post right there mate ! Cheers for the post .

  6. dieta Avatar

    Danke für die Design-Tipps. Kann ich gut gebrauchen.

  7. versacheck payroll Avatar

    You have done an extraordinary job! we appreciate you your fantastic article as well as blog this makes my day…

  8. pozycjonowanie stron Avatar

    Generally I do not post on blogs, but I would like to say that this post really forced me to do so. Really nice post!

  9. dieta Avatar

    When I read all comments I want cry…

  10. fheiffh Avatar

    This site has plenty of extremely useful information on it! Appreciate informing me. I must admit that I was curious about the feel of your blog but it is really quite appealing. Awesome job, keep it up

  11. budapest reseguide Avatar

    continue with the the great work on the site. I appreciate it. Could use some more frequent updates, but i am sure that you got some better stuff to do , hehe. :p

  12. teds woodworking review Avatar

    When are you going to post again? You really entertain me!

  13. pozycjonowanie Avatar

    Would you be thinking about exchanging links?

  14. link building Avatar

    Have you thought about adding some videos to your article? I think it might enhance viewers’ understanding.

  15. zero friction marketing Avatar

    You made several nice points there. I did a search on the matter and found nearly all people will go along with with your blog.

  16. dentysta Kraków Avatar

    I added your blog to bookmarks. And i’ll read your articles more often! Before this, it would be possible for the government to arrest you just based on whatever you were saying, if they didn’t like it.

  17. zero friction marketing Avatar

    You made several fine points there. I did a search on the matter and found nearly all people will agree with your blog.

  18. zero friction marketing Avatar

    You made certain nice points there. I did a search on the subject and found a good number of people will have the same opinion with your blog.

  19. adult jobs Avatar

    Great posting keep up the good work

  20. paraprotex Avatar

    Wow – now that’s perspective! I think we often react in agreement or disagreement because of our emotions, but hearing another side, passionately presented, really makes us think!

  21. last minute Turcja Avatar

    Your site has helped me a lot to bring back more confidence in myself. Thanks! Ive recommended it to my friends as well.

  22. funny dogs Avatar

    keep up the nice work on the blog. I love it. Could use some more frequent updates, but i’m quite sure that you got better stuff to do , hehe. 😉

  23. sterydy Avatar

    Mate thanks for such a good article. I completely agree with you

  24. noni Avatar

    Good posting, im subscribing to your rss. Many thanks once more.

  25. Issac Maez Avatar

    Hey, I searched for this blog on Bing and just wanted to say thanks for the excellent read. I would have to agree with it, thank you again!

  26. Gregory Gautier Avatar

    I have found your blog on Google and the contents on your post is very elucidative. I pleased reading this one.

  27. Nelly Batcher Avatar

    I have found your blog on Google and the contents on your post is very elucidative. I enjoy reading this one.

  28. Seolinkvine Avatar

    I regard something genuinely special in this website .

  29. dance classes Avatar

    Your articles are very inspiring indeed. I keep coming back!

  30. Paketresor Avatar

    keep up the good work on the blog. I appreciate it. Could maybe use some more updates more often, but i am quite sure you have got other things stuff to do like we all do. 🙂

  31. paraprotex Avatar

    Good share, great article, very usefull for us…thanks.

  32. Lilliana Winikoff Avatar

    Your weblog is excellent. Thank you greatly for delivering a lot of very helpful information and facts. Thanks, turah Wiederin

  33. Hank Fredlund Avatar

    Wow! Thank you! I always wanted to write in my site some thing like that. Can I take part of your publish to my blog? Thanks, rnelius Ruter

  34. Luella Timper Avatar

    Your weblog is excellent. Thank you greatly for delivering a lot of very helpful information and facts. Thanks, rt Thury

  35. Bridgett Mundorf Avatar

    Dude, I must say that overall I am fairly taken with this web page. It is apparent that you know you subject matter and you are passionate about it.

  36. Rick Davids Avatar

    I’ll be subscribing to your feed and I hope you post again soon. Thanks , iah Platek

  37. Kira Vanauken Avatar

    I think that is an interesting point, it made me think a bit. Thanks for sparking my thinking cap.

  38. studia mba Avatar

    Spotted a link to this post over at Delicious. Thanks for posting it.

  39. pozycjonowanie Avatar

    Hi there 😀 Is it okay if I go a bit off topic? I’m trying to view your website on my new mobile but it doesn’t display properly, do you have any suggestions?

  40. deskowania Avatar

    I was suggested this website by my cousin. I’m not sure whether this post is written by him as nobody else know such detailed about my problem. You are amazing! Thanks!

  41. drzwi drewniane warszawa Avatar

    Spot on with this write-up, I actually think this website needs way more consideration. I’ll probably be again to learn far more, thanks for that info.

  42. wpolscemamymocneseo Avatar

    Wow! Thanx! I always wanted to write in my own site something like that. May I take part of your post to my blog?

  43. learn dance online Avatar

    Fantastic write-up, im subscribing to your feed. Cheers yet again.

  44. Paketresorna Avatar

    keep up the good work on the site. I love it. Could use some more frequent updates, but i am quite sure you have got more or better stuff to do like we all do. =p

  45. Temple Jansson Avatar

    Thank you for this intresting article. :-)y Shutters