#1 2014-05-15 09:49:09

Never
Member
Registered: 2008-07-01
Posts: 103

Understanding cluster computing

OK, so I have written a program in Python 2.7 that runs on a desktop machine running Arch Linux. Without getting into the details, the program takes a data point and performs a series of mathematical manipulations on it. Each Process [from multiprocessing.Process] first creates what needs to be written to the database and places it in a queue; the main process takes the items out of the queues and hands them to new processes that write them to the database.
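For illustration, here is a minimal sketch of the flow described above; it is not the actual program. The function names, the `results` table, and the use of SQLite are assumptions made only to keep the example self-contained, and it is written for Python 3 on Linux (it relies on the "fork" start method).

```python
import multiprocessing as mp
import os
import sqlite3
import tempfile

# Assumption: Linux, as in the post, so the "fork" start method is available.
ctx = mp.get_context("fork")


def compute(point, out_queue):
    """Worker process: hypothetical stand-in for the mathematical manipulations."""
    out_queue.put((point, point ** 2))


def write_row(db_path, row):
    """Writer process: opens its own connection and inserts one row."""
    conn = sqlite3.connect(db_path, timeout=30)  # wait if another writer holds the lock
    with conn:  # commits on success
        conn.execute("INSERT INTO results (point, value) VALUES (?, ?)", row)
    conn.close()


def run_pipeline(points, db_path):
    """Compute in worker processes, collect in the main process, write in new processes."""
    conn = sqlite3.connect(db_path)
    with conn:
        conn.execute("CREATE TABLE IF NOT EXISTS results (point INTEGER, value INTEGER)")
    conn.close()

    queue = ctx.Queue()
    workers = [ctx.Process(target=compute, args=(p, queue)) for p in points]
    for w in workers:
        w.start()
    # Drain the queue in the main process before joining the workers.
    rows = [queue.get() for _ in workers]
    for w in workers:
        w.join()

    writers = [ctx.Process(target=write_row, args=(db_path, row)) for row in rows]
    for w in writers:
        w.start()
    for w in writers:
        w.join()
    return len(rows)


if __name__ == "__main__":
    db_file = os.path.join(tempfile.mkdtemp(), "results.db")
    print(run_pipeline([1, 2, 3], db_file), "rows written to", db_file)
```

Every process in this sketch lives on the one machine that runs the script, which is why the question of spreading the work over several nodes comes up at all.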

Everything is working, but the new issue is time: the number of calculations has gotten so high that a single point takes between 7 and 8 seconds to complete before the program can grab a new entry. I know this doesn't sound like a lot of time, but for this environment it is significant. Truthfully, I would like everything to be done in about 0.1 to 0.5 seconds. The question is, how do I get there?

I don't know anything about the world of supercomputing or high-performance computing, and there seem to be several ways to implement it, each for different reasons. I have an idea of what I do not want: I don't want to adjust the program to take on distributed computing at the application layer. I think this is an example of what I don't want.

I have a slight idea of what I do want:

1) Build a server rack, centered around a motherboard like this and components that work with it.

2) Have the OS only on the master of the cluster. Basically, I don't want a "distribution day" where I am stuck updating the distribution on each motherboard/CPU(s) combo in the cluster.

3) I like Arch Linux but am not married to it, so anything that works will do.

4) I can ssh in, run whatever commands are necessary, and then just ./run.py the application, and the OS (or whatever) handles distributing the workload across all the nodes in the cluster, without the application needing any additional code to make use of all that power (beyond the multiprocessing.Process it already uses).

I haven't read anything specific on the Arch Linux site about the OS doing this. I have read this about Gentoo and this other software, which I think is its own OS, called Rock. Am I on the right track looking at these two pieces of software? Does anyone have anything to say about them, or why one would be better suited to my purpose than the other? Or is there another well-documented tutorial I could follow? Has anyone ever done anything like this already and knows exactly what software and configuration I should use? Thank you for your time.

#2 2014-05-15 10:49:35

jakobcreutzfeldt
Member
Registered: 2011-05-12
Posts: 1,041

Re: Understanding cluster computing

http://gridscheduler.sourceforge.net/
https://gnu.org/software/parallel/
https://gnu.org/software/gnubatch/

ProTip: don't use Python for numerical computation unless you don't mind waiting. Consider Fortran or C. If you prefer a Python-like syntax and don't mind using a new language, check out Julia. If you must stick with Python, use PyPy, unless you're using NumPy, which (I think) is still not supported by PyPy (but I might be wrong).

Edit: also look into MPI.

Last edited by jakobcreutzfeldt (2014-05-15 10:53:25)

#3 2014-05-15 11:22:23

Trilby
Inspector Parrot
Registered: 2011-11-29
Posts: 29,523

Re: Understanding cluster computing

Just to echo what jakobcreutzfeldt said, before you spend thousands of dollars on more hardware, work on optimizing the code and rewriting in C (or Fortran as suggested above - I think C has some practicality advantages without sacrificing performance, but that's another conversation).
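As a rough starting point for that optimization work, a sketch like the following can show where the 7 to 8 seconds per point actually go before any rewrite or hardware purchase. It uses the standard-library `cProfile` and `pstats` modules; `process_point` and `profile_call` are hypothetical names, and the loop is only a stand-in for the real per-point calculation.

```python
import cProfile
import io
import pstats


def process_point(x):
    """Hypothetical stand-in for the real per-point calculation."""
    total = 0.0
    for i in range(1, 20000):
        total += (x * i) ** 0.5
    return total


def profile_call(func, *args):
    """Run func(*args) under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    # Show the five entries with the highest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, buf.getvalue()


if __name__ == "__main__":
    value, report = profile_call(process_point, 2.0)
    print(report)
```

If the report shows most of the time in database or browser-automation calls rather than in arithmetic, more CPU cores are unlikely to help much.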

Even if you don't know C, or don't know it well enough, you can definitely hire someone to revise/rewrite the code for much cheaper than what you'd pay for all that hardware.

Or ... use the forums and archers will compete against each other for the fastest implementation of the code for free. Apparently we'll even test the code, and generate performance reports - all for free. Of course I'm being mildly facetious as the code for your project may be much more complicated - but nonetheless it sounds like a project many programmers would enjoy, so contracting someone shouldn't be hard.

Last edited by Trilby (2014-05-15 11:27:44)


"UNIX is simple and coherent..." - Dennis Ritchie, "GNU's Not UNIX" -  Richard Stallman

#4 2014-05-15 21:32:03

Never
Member
Registered: 2008-07-01
Posts: 103

Re: Understanding cluster computing

Thank you. The code is about as optimized as it is going to get, and unfortunately the application borrows heavily from Selenium and SQLAlchemy, so I am not really in a position to start rewriting it in a brand new language. Again, I am looking for pointers on having a cluster with an OS that controls all aspects of that cluster, essentially one 'system' to update and maintain (if that exists), and on having ANY application, regardless of how it is written, make full use of the resources of the entire cluster as the cluster's controller sees fit (again, if that exists).
