Popular computing
Posted by arnulfo on 2009/11/02
I am interested in exploring and researching high-performance computing, and in comparing algorithms, platforms, and tools. Example applications are optimization algorithms and Monte Carlo simulations of lattice QCD.
There are several approaches you may take to increase your computing resources. The availability of broadband Internet connections, the still improving price-performance ratio of computing hardware, and the enormous computational potential of Internet-connected infrastructure have sparked interest in dynamically configured computational grids available on an as-needed basis. This is referred to as cloud computing. Cloud computing involves the provision of dynamically scalable and often virtualized resources offered as a service over the Internet. The term "cloud" is used as a metaphor for the Internet, based on how the Internet is depicted in computer network diagrams, and is an abstraction of the underlying infrastructure it conceals. Major players in information technology such as IBM, Sun, Microsoft, and Amazon have offerings of this kind. Some of them might be open to supporting research projects. There are also non-commercial projects, such as SETI@home, that have taken advantage of the computing power of the Internet in what is known as volunteer computing.
One example of the volunteer computing paradigm is the Great Internet Mersenne Prime Search (http://www.mersenne.org/), which has been finding large primes since 1996. In April 2009 a Mersenne prime, a 12,837,064-digit number, was found on a regular office workstation. This calculation took 29 days on a 3.0 GHz Intel Core2 processor. The basic idea here is to have a central server assign tasks to a variable and unreliable computational pool, using idle resources that would otherwise be wasted.
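The idea of a central server feeding an unreliable pool can be sketched in a few lines of Python. This is only a toy simulation under stated assumptions, not how GIMPS or any real project is implemented: the names `compute` and `run_volunteer_pool`, the `failure_rate` parameter, and the "square a number" workload are all hypothetical and chosen for illustration.

```python
import random
from collections import deque


def compute(unit):
    """Hypothetical workload: stands in for a long-running calculation."""
    return unit * unit


def run_volunteer_pool(work_units, failure_rate=0.3, seed=0):
    """Simulate a server handing work units to unreliable volunteers.

    failure_rate is an assumed probability (below 1) that a volunteer
    disappears before returning its result.
    """
    rng = random.Random(seed)
    pending = deque(work_units)  # units still waiting for a valid result
    results = {}                 # unit -> result reported back to the server
    attempts = 0                 # total number of assignments made

    while pending:
        unit = pending.popleft()
        attempts += 1
        if rng.random() < failure_rate:
            # The volunteer went offline: the server puts the unit back
            # in the queue so that another volunteer can pick it up later.
            pending.append(unit)
            continue
        results[unit] = compute(unit)

    return results, attempts


if __name__ == "__main__":
    results, attempts = run_volunteer_pool(range(10))
    print(len(results), "units completed in", attempts, "assignments")
```

The point of the sketch is that the server never relies on any single volunteer: a lost unit simply goes back in the queue, so the job finishes as long as some volunteers keep returning results.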
A different approach is the one taken by Galaxy Zoo 2, an interactive project that lets users take part in a large-scale effort to classify millions of galaxies. Galaxy Zoo 2 is a new version of the highly successful project that enabled members of the public to say whether a galaxy was spiral or elliptical, and which way it was rotating. Galaxy Zoo 2 asks users to delve deeper into 250,000 of the brightest and best galaxies to search for the strange and unusual. Here the project asks people themselves to spend time looking at and analyzing data.
A more general approach grew out of the two original goals of the SETI@home project: to prove the viability and practicality of the ‘distributed grid computing’ concept, and to do useful scientific work by supporting an observational analysis to detect intelligent life outside Earth. The first of these goals is generally considered to have succeeded completely. The current BOINC environment, a development of the original SETI@home, provides support for several computationally intensive projects in a wide range of disciplines.
The Berkeley Open Infrastructure for Network Computing (BOINC) is a non-commercial middleware system for volunteer and grid computing. It was originally developed to support the SETI@home project before it became useful as a platform for other distributed applications in areas as diverse as mathematics, medicine, molecular biology, climatology, and astrophysics. The intent of BOINC is to make it possible for researchers to tap into the enormous processing power of personal computers around the world. BOINC has been developed by a team based at the Space Sciences Laboratory at the University of California, Berkeley, led by David Anderson, who also leads SETI@home. As a “quasi-supercomputing” platform, BOINC had about 570,000 active computers (hosts) worldwide, processing on average 2 petaFLOPS, as of July 2009.
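To put those two figures in perspective, a quick back-of-the-envelope calculation gives the average contribution per host. It uses only the numbers quoted above and assumes the load is spread evenly, which real hosts certainly do not do; the variable names are just for illustration.

```python
# Figures quoted above for BOINC as of July 2009.
total_flops = 2e15       # 2 petaFLOPS on average
active_hosts = 570_000   # about 570,000 active computers

# Assumption: every host contributes equally (a simplification).
gflops_per_host = total_flops / active_hosts / 1e9
print(f"about {gflops_per_host:.1f} GFLOPS per host")
```

That works out to roughly 3.5 GFLOPS per machine, which shows that the aggregate power comes from the number of hosts rather than from any single fast computer.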
In a complementary vein, there have been efforts to use widely available graphics acceleration hardware as a low-cost, high-performance computing platform. General-purpose computing on graphics processing units (GPGPU, also referred to as GPGP and to a lesser extent GP²) is the technique of using a GPU, which typically handles computation only for computer graphics, to perform computation in applications traditionally handled by the CPU. It is made possible by the addition of programmable stages and higher-precision arithmetic to the rendering pipelines, which allows software developers to use stream processing on non-graphics data.
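The stream processing model can be illustrated with a small Python sketch. Note that this runs on the CPU and does not touch a GPU at all; it only mimics the programming model, in which one small function (the "kernel") is applied independently to every element of a data stream. The names `saxpy_kernel` and `run_stream` are hypothetical, and the SAXPY operation (a*x + y) is simply a common textbook example.

```python
def saxpy_kernel(a, x_i, y_i):
    """Kernel: computes one output element from one pair of inputs.

    It never looks at neighbouring elements, which is what lets
    hardware run many copies of it in parallel.
    """
    return a * x_i + y_i


def run_stream(kernel, a, xs, ys):
    """Apply the kernel to every element of the input streams.

    On a GPU each element would be handled by its own hardware thread;
    this sequential loop is only a stand-in for that data-parallel map.
    """
    return [kernel(a, x_i, y_i) for x_i, y_i in zip(xs, ys)]


result = run_stream(saxpy_kernel, 2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
print(result)
```

Because each output depends only on its own inputs, the order of evaluation does not matter, and that independence is what a GPU exploits when it processes thousands of elements at once.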