
CPU appeal

I have reached a point where I would benefit from having access to more CPU time than I do right now. So I have two questions:

  1. If I can get hold of a little money, what would be the cheapest way to get hold of some hardware to run my own cluster? The ability to customise heavily would probably be an advantage, because my experiments are so heavily CPU-bound that I can afford to skimp on things like hard drive speed.
  2. Assuming that I can't get hold of my own hardware, does anyone have some spare computers they could loan me some time on? I just need a C++ compiler, a remote shell and a practical way to transfer files. Bear in mind that any money I can get for this will go on buying my own hardware, so if I'm borrowing time I won't have any funding to pay for it. I think the only suitable resources are spare computers that sit idle for much of the day.

Comments

The Ohio Supercomputing Center looks like what you want.

Posted: February 1, 2006 06:43 PM

Thanks for the suggestion. I would love to play with a compute farm like that, but unfortunately the OSC is for tenured and tenure-track faculty only, and I am a lowly PhD student.

Posted: February 2, 2006 12:08 AM

"the OSC is for tenured and tenure-track faculty only"

I don't think that's really true. I think that if you want to get a new node of the cluster located at your facility, you have to be a tenured or tenure-track faculty member to apply. However, Case already has a bunch of nodes located all over the place. I know that the Mech E department has a node, and that I could get access to it if I wanted to (although I haven't tried).

Posted: February 2, 2006 02:23 AM

Interesting. In that case the website does a poor job of explaining, as it definitely says that one must fit those criteria to apply for an account.

Posted: February 2, 2006 08:33 AM

Why are you doing this? What is it you are calculating that would be worthwhile supporting?


I know at least two PIs personally who have large farms (and I'll point out that David Baker, whose work you reviewed on this site, has a third one).


So, what are you doing, and why is it something that someone might want to support?


I'll give you some strategy for getting time at any of the existing supercomputing facilities. Be prepared to talk about your research: what it is and why it is important. Then ask how to apply. If the answer is that you have to have a sponsor, find a sponsor. If you are a Ph.D. student, your advisor should ideally be able to help; if not, ask members of your committee, or other faculty in your department. Honestly, if your advisor can't pull the strings to get you what you need, I do have to ask why you are working for them, and if no one on your committee can help you with this, then why are you even wasting your life at that institution?


O.K. Step 1. Tell us what you are doing and why it's worth supporting.


Step 2. Ask; if you need a sponsor, find someone who will be a sponsor.


I have found it really helps to be able to talk about the scientific questions I am addressing, to talk about the computer programs I have written and need to run, and to show preliminary results of running those programs.


This approach has worked for me more than once in the past.


Hmm, maybe I should answer your other question: how to build a cluster. Buy computers that are at the sweet spot in terms of cost per unit of CPU power, and buy as many of them as you can afford. Buy new machines, not old used junk, or you will spend all your time fixing broken machines. Put Linux on them, and you've got your cluster and away you go.
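
As a rough sketch of the sweet-spot rule above, one could rank candidate machines by price per unit of throughput and see how many the budget covers. This is Python with a made-up catalogue: the machine names, prices and throughput figures are all hypothetical, and the throughput number is assumed to come from timing your own program on each candidate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    name: str
    price: float       # purchase price per machine (hypothetical units)
    throughput: float  # benchmark score, e.g. your own jobs finished per hour


def best_value(options):
    """Return the option with the lowest price per unit of throughput."""
    return min(options, key=lambda m: m.price / m.throughput)


def plan_purchase(options, budget):
    """Pick the best-value machine and buy as many as the budget allows."""
    choice = best_value(options)
    count = int(budget // choice.price)
    return choice, count, count * choice.throughput


# Made-up catalogue purely for illustration; substitute real quotes and
# throughput measured with your own program.
CATALOGUE = [
    Machine("budget", price=300.0, throughput=40.0),
    Machine("midrange", price=500.0, throughput=100.0),
    Machine("flagship", price=1500.0, throughput=180.0),
]

if __name__ == "__main__":
    choice, count, total = plan_purchase(CATALOGUE, budget=4000.0)
    print(f"buy {count} x {choice.name}: about {total:.0f} jobs/hour in total")
```

This ignores running costs such as power and cooling, which could shift the sweet spot for a cluster that runs around the clock.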


There is one more aspect to this question. Is your problem trivially parallel, i.e. can you run the same program with different input sets on a bunch of separate machines and get your answer?
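
To make that test concrete, here is a minimal Python sketch. It assumes each run depends only on its own input; `run_experiment` is a hypothetical stand-in for the real program, and the "machines" are simulated in one process. If the problem is trivially parallel, the answers come out the same however the input sets are dealt out.

```python
def run_experiment(seed):
    """Stand-in for one independent CPU-bound run (hypothetical workload)."""
    state = seed
    for _ in range(1000):
        state = (state * 1103515245 + 12345) % 2**31
    return state


def split_round_robin(inputs, n_machines):
    """Deal the input sets out to machines like cards: 0, 1, 2, ..., 0, 1, ..."""
    return [inputs[i::n_machines] for i in range(n_machines)]


def run_on_machines(inputs, n_machines):
    """Simulate each machine working through its own share in isolation."""
    results = {}
    for share in split_round_robin(inputs, n_machines):
        # On a real cluster each share would be processed on a separate
        # machine; no share ever needs data from another one.
        for seed in share:
            results[seed] = run_experiment(seed)
    return results


if __name__ == "__main__":
    seeds = list(range(10))
    serial = {s: run_experiment(s) for s in seeds}
    for n in (1, 3, 4):
        assert run_on_machines(seeds, n) == serial
    print("same answers regardless of how many machines share the work")
```

If a run needed intermediate values from another run, this split would no longer work, and that is the case where message passing comes in.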


Some problems (really large linear systems, large systems of partial differential equations, weather models, etc.) are not trivially parallel.


In that case the best solution is MPI, the Message Passing Interface. An MPI library can be built and installed on top of Linux and linked into any C program you are writing. Learning how to use MPI is a graduate-level CS topic and usually takes a semester. Implementing algorithms in MPI is usually a PhD dissertation in CS, though I know a few people who have done it in Applied Math.


If you go with MPI, you really want to get an MPI stack that is built on VMI (the Virtual Machine Interface), which I hope is available from the NCSA at UIUC. (MPI on top of VMI is very much the way to go. I have no idea whether it has made it into open source software; I know that work was in progress three years ago.)


I think that is everything to say about getting started with a computing cluster.

Posted: March 5, 2006 03:43 PM

Chuck,

I'm not convinced that your idea about asking PIs for computers I can borrow is realistic. Perhaps this just says something very positive about your institution, but where I work the people who have large amounts of computing power at their disposal have it because they need it for their own projects, and don't have much to spare. Each PI spends as much money as they can get hold of on equipment, and it's never enough, because there just isn't that much money in the academic science world. If there were, I wouldn't have this problem in the first place.

The specific example of David Baker is instructive in this respect. His lab does have quite staggering amounts of computing power available to it [here are some impressive photos of their server room], but their research is so CPU-hungry that these resources still aren't adequate for the group, so they've set up a distributed computing project to get more.

Maybe this is a really strange thing to do, but I took the approach of finding an advisor whose work is of interest to me, and who has the relevant expertise to be able to give me useful feedback, instead of looking for the professor who had the most money.

As for the technical side, my question is not so much how to build a set of computers (I know how to do that much already, and frankly I'd be worried about a Computer Science PhD candidate who didn't) as how to get them working as one unit, which will save me a lot of time once it's up and running. My lab's existing cluster uses OpenMosix, which is very handy because it lets us distribute jobs evenly between the machines while only having to transfer files to and issue commands from one place. I know there are other things out there, though, and I was wondering if any reader might happen to have experience with one of the alternatives.

What I've found out from asking around in other fora is that ClusterKnoppix might be useful, because it could save me having to put any kind of drives or I/O other than a network card on the cluster nodes. That in turn will save some money up front as well as reducing power consumption and heat & noise output once I get this thing up and running.

Fortunately, my problem is trivially parallel. This saves me having to mess around with parallel programming and the debugging headaches that entails, and it means that the handful of not especially fast machines to which friends have given me shell access have been very useful. It also dramatically reduces the technical requirements for a cluster, because slow inter-CPU communication won't cause many bottlenecks.
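
For the borrowed machines, something like the following Python sketch could farm the runs out over ssh. It rests on several assumptions: the host names, the `./experiment` binary and the input file names are all hypothetical, key-based ssh login is already set up, and the binary and inputs have already been copied to each host. By default it only prints the commands it would run.

```python
import itertools
import shlex
import subprocess


def build_commands(hosts, input_files, binary="./experiment"):
    """Pair each input file with a host (round-robin) and build its ssh argv."""
    jobs = []
    for host, path in zip(itertools.cycle(hosts), input_files):
        remote = f"{shlex.quote(binary)} {shlex.quote(path)}"
        jobs.append(["ssh", host, remote])
    return jobs


def launch(jobs, dry_run=True):
    """Print the commands, or start them all at once when dry_run is False."""
    if dry_run:
        for argv in jobs:
            print(shlex.join(argv))
        return []
    # Every job starts immediately, so keep the number of jobs per host at
    # or below the number of cores that host can spare.
    procs = [subprocess.Popen(argv) for argv in jobs]
    return [p.wait() for p in procs]


if __name__ == "__main__":
    hosts = ["alice-desktop", "bob-laptop"]        # hypothetical machines
    inputs = ["run1.cfg", "run2.cfg", "run3.cfg"]  # hypothetical input sets
    launch(build_commands(hosts, inputs))
```

Round-robin assignment is the simplest choice; with machines of very different speeds, handing each host a new input only when it finishes the previous one would probably balance the load better.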

Posted: March 6, 2006 12:12 AM
