We have about 10 heterogeneous machines we would like to run various jobs on. The current situation is that people log in to a machine with ssh, check whether other people are running jobs on it, and then start their own job under screen.
I'd like to automate this process, but I don't have enough time to install a full-fledged cluster solution. So what's the simplest thing I can do?
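To give an idea, something like this little sketch is roughly what I'm after (the hostnames and the job script are made up, and it assumes passwordless ssh to every machine):

#!/usr/bin/env python
# Sketch only: start a job under screen on the least-loaded machine.
# HOSTS and my_job.sh are placeholders; passwordless ssh is assumed.
import subprocess

HOSTS = ["node01", "node02", "node03"]

def load_of(host):
    # 1-minute load average, read from the remote /proc/loadavg
    out = subprocess.check_output(["ssh", host, "cat", "/proc/loadavg"])
    return float(out.decode().split()[0])

def run_detached(host, command, session="job"):
    # screen -d -m starts a detached session we can reattach to later
    subprocess.check_call(["ssh", host, "screen", "-d", "-m",
                           "-S", session] + command)

best = min(HOSTS, key=load_of)
run_detached(best, ["./my_job.sh"])
print("started on %s" % best)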
Thanks!
Last edited by lardon (2010-01-21 13:47:57)
At my lab, we use torque http://www.clusterresources.com/product … anager.php; it's a bit of a pain in the ass. I was also at a place where they used scripts on top of pmake/customs; that was convenient but didn't support load management. At some point we relied on scripts that used ssh/ssh-keygen and a shared NFS to log in to the remote machines, gather stats (a la top), and let people choose where to run their jobs.
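Conceptually, the stats-gathering side was just a loop like the sketch below (the hostnames are placeholders, and it assumes passwordless ssh is already set up):

#!/usr/bin/env python
# Rough sketch of the idea: poll each machine over ssh and print its
# uptime/load so people can choose where to run. Hostnames are
# placeholders; passwordless ssh is assumed.
import subprocess

HOSTS = ["lab01", "lab02", "lab03"]

for host in HOSTS:
    try:
        # uptime reports logged-in users and load averages in one call
        out = subprocess.check_output(
            ["ssh", "-o", "ConnectTimeout=5", host, "uptime"])
        print("%-8s %s" % (host, out.decode().strip()))
    except subprocess.CalledProcessError:
        print("%-8s unreachable" % host)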
At some point we relied on scripts that used ssh/ssh-keygen and a shared NFS to log in to the remote machines, gather stats (a la top), and let people choose where to run their jobs.
That may well be what I'll end up doing; I just wanted to make sure nobody had already made a nice package that does exactly that.
I've used torque in the past, but that's definitely not something I want to set up myself.
My work uses condor. (We have 72 CPUs and I just submitted 8000 jobs to the queue...)
My work uses condor. (We have 72 CPUs and I just submitted 8000 jobs to the queue...)
Any idea how complex it is to set up and to use?
To set up, medium complexity. To use, quite simple. You can create a wrapper script that submits all your jobs using a default set of parameters.
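Something like this sketch, say. The executable name, file names, and defaults here are made up; condor_submit and the submit-description keywords (universe, executable, queue) are standard Condor, but check the manual for the options your pool needs:

#!/usr/bin/env python
# Sketch of the wrapper idea: write a submit description with default
# parameters and hand it to condor_submit. Names and defaults are
# hypothetical; adjust for your pool.
import subprocess
import sys

def submit(exe, count=1):
    lines = [
        "universe   = vanilla",
        "executable = " + exe,
        "output     = " + exe + ".$(Process).out",
        "error      = " + exe + ".$(Process).err",
        "log        = " + exe + ".log",
        "queue %d" % count,
    ]
    with open("job.submit", "w") as f:
        f.write("\n".join(lines) + "\n")
    subprocess.check_call(["condor_submit", "job.submit"])

if __name__ == "__main__":
    # e.g.: ./submit.py my_experiment 100
    submit(sys.argv[1], int(sys.argv[2]) if len(sys.argv) > 2 else 1)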
I have dug up my old scripts. You can download them at http://www-lium.univ-lemans.fr/~favre/files/mssh.tgz. They don't work anymore, but they should be a good starting point. Note that you need to share everything via NFS/NIS.
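The key setup they depend on boils down to something like this sketch (paths assume the default OpenSSH layout; since home is on NFS, authorizing your own key once gives you passwordless login everywhere):

#!/usr/bin/env python
# One-time setup sketch: with home directories shared over NFS,
# appending your own public key to your authorized_keys makes
# passwordless ssh work on every machine.
import os
import subprocess

ssh_dir = os.path.expanduser("~/.ssh")
if not os.path.isdir(ssh_dir):
    os.makedirs(ssh_dir, 0o700)

key = os.path.join(ssh_dir, "id_rsa")
if not os.path.exists(key):
    # -N "" generates the key without a passphrase, for unattended logins
    subprocess.check_call(["ssh-keygen", "-t", "rsa", "-N", "", "-f", key])

auth = os.path.join(ssh_dir, "authorized_keys")
pub = open(key + ".pub").read()
if not (os.path.exists(auth) and pub in open(auth).read()):
    with open(auth, "a") as f:
        f.write(pub)
    os.chmod(auth, 0o600)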
I also used condor at some point. It's great but it might be overkill.
[edit] wrong url.
Last edited by benob (2010-01-22 10:34:18)
At my lab, we use torque http://www.clusterresources.com/product … anager.php; it's a bit of a pain in the ass.
I used torque during an internship. It worked really well, and I didn't have the impression it was complicated to set up. I mean, I didn't set it up myself, but I probably came across every config file torque used, and it all seemed pretty easy and straightforward; I believe anybody could set it up in a couple of hours.