F@H Arch Linux Team - Recruitment & Stats thread - HELP US !.

Pudge · 2009-06-16 03:08:58

diesel1 wrote:

I noticed the GPU temperature rise from 65C to 75C within a few minutes.
EDIT: After around 15 minutes my system hard-locked (very rare). I need some sleep so I will update on what happens later......

Most 9800GT video cards are set at the factory to have the fan running at 50% or less to keep it somewhat quiet. To make the GPU run cooler, use nvclock (extra) to set the fan speed to about 65 to 75%. This keeps my 9800GT cooler without being too obnoxious to listen to. On my 9800GTX+ I ended up putting a 90 mm fan blowing on the component side of the card.

http://www.linuxjournal.com/content/adj … phics-card

Pudge

Last edited by Pudge (2009-06-16 03:12:31)

diesel1 · 2009-06-16 10:27:12

Hi all,

I left the system running CPU and GPU clients and everything was ok when I checked this morning. The graphics card seems to be ok at 74C.

Diesel1.

hatten · 2009-06-16 13:24:10

How long is the time limit for a package? I'm considering putting F@H on my family's computers, but they ain't powered on more than maybe 4 hours a day.

diesel1 · 2009-06-16 21:04:16

hatten wrote:

How long is the time limit for a package? I'm considering putting F@H on my family's computers, but they ain't powered on more than maybe 4 hours a day.

Hi hatten,

If you have a high spec system that can complete the unit, then go for it. You could also request smaller units, that is, units with lower memory and complexity requirements.

Diesel1.

Last edited by diesel1 (2009-06-16 21:11:49)

diesel1 · 2009-06-16 21:19:12

Hi all,

The GPU is currently running at 45C.

Nooooo! the GPU client has stalled/crashed!

No wonder the GPU was at 45C, I restarted it and the GPU went straight to 74C with barely a step in fan speed.

Do the W/U times ramp up like the CPU ones do?

Should I use the g80 option?

Diesel1.

PS. I wish the GPU client was more KISS, more Arch, Linux native.

Last edited by diesel1 (2009-06-17 09:14:44)

hatten · 2009-06-18 18:44:32

diesel1 wrote:

hatten wrote:
How long is the time limit for a package? I'm considering putting F@H on my family's computers, but they ain't powered on more than maybe 4 hours a day.
Hi hatten,
If you have a high spec system that can complete the unit, then go for it. You could also request smaller units, that is, units with lower memory and complexity requirements.
Diesel1.

Well, they ain't especially high spec, but there's still a lot of CPU power that ain't used. Even though it won't give me units especially often, it is still better than nothing.
But the question remains: what is the timeout for a package? How fast does it have to be finished in order to help?

diesel1 · 2009-06-18 20:19:26

hatten wrote:

diesel1 wrote:
hatten wrote:
How long is the time limit for a package? I'm considering putting F@H on my family's computers, but they ain't powered on more than maybe 4 hours a day.
Hi hatten,
If you have a high spec system that can complete the unit, then go for it. You could also request smaller units, that is, units with lower memory and complexity requirements.
Diesel1.
Well, they ain't especially high spec, but there's still a lot of CPU power that ain't used. Even though it won't give me units especially often, it is still better than nothing.
But the question remains: what is the timeout for a package? How fast does it have to be finished in order to help?

Hi hatten,

The faq says:

"Are there any limits to how long my machine can take to finish a work unit (WU)?

Yes. Work Units are serial in nature. When a completed WU is sent back, a new work unit is generated from those results. This must happen many times over within each project (group of work units). A generation 1 work unit must be turned in before a generation 2 work unit is created and sent out.
To keep these generations moving along, we have to set expiration deadlines in the event a work unit is not uploaded in a timely manner (lost, deleted, whatever). These unfinished work units "expire" and are reassigned to new machines. You will still receive credit for all WUs completed and uploaded prior to the preferred deadline. However, after the preferred deadline, your contribution is not as useful scientifically because another copy of that work unit had to be sent out to another contributor. Even if you eventually complete the work unit, that other contributor still had to process duplicate work to assure the science moves forward. And it would be unfair not to also credit that second contributor.
Even so, full credit is given up until the final deadline. After the final deadline has expired, the client will automatically discard the work unit and download new work. If you have trouble completing work units before the preferred deadline, it is recommended to either run the FAH client more hours each day, or to run the client on a faster computer.
As we move to larger and longer WUs, we will extend the expiration time as needed. Deadlines vary on the order of a few days to several weeks, depending on the nature of the WU. Turning in a work unit just before the deadline is not the goal. It is most helpful to the project to return work units as quickly as possible. And how these deadlines are determined is explained a few answers below."

Also the faq says this about deadlines:

"How do you set the deadlines for the work units?

Each work unit is benchmarked on a dedicated 2.8 GHz Pentium 4 machine with SSE2 disabled. For most work units (although there may be exceptions, described in the next paragraph), we apply this equation:
timeout = 20 * daysPerWU + 2
deadline = max(30 * daysPerWU + 2, 10)
where daysPerWU is the number of days it took to complete the unit. The "+2" days is there to give an additional buffer for fast WUs (to allow for servers down, etc). If 30*daysPerWU is less than 10 days, we set the deadline to 10 days, as a minimum time for all projects. The timeout is the time at which the WU is resent to another client and the deadline is the last time that we will give stats credit for the WU.
Occasionally, deadlines may be set shorter or longer than the above calculation indicates, but the reason for having deadlines at all is that the sooner we get back work units, the sooner we can put the results to good use. Also, different projects have different requirements server-side and may require shorter or allow longer deadlines (e.g. "pfold" calculations can often be run without any deadlines, whereas MREMD calculations work best with very tight deadlines). The assignment server does take machine performance into account in making assignments, thereby allowing slower machines to receive more appropriate work units."
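
The two formulas quoted above can be tried out with a few lines of Python. This is only a rough sketch of the arithmetic in the FAQ text, not code from the project: the function names are made up for illustration, and calendar_days() assumes (optimistically) that your machine is as fast as the benchmark machine and folds only while it is powered on.

```python
def wu_limits(days_per_wu):
    """Return (timeout, deadline) in days, using the FAQ formulas quoted above.

    days_per_wu is the benchmark time for the work unit, in days.
    """
    timeout = 20 * days_per_wu + 2             # WU is resent to another client after this
    deadline = max(30 * days_per_wu + 2, 10)   # no stats credit after this
    return timeout, deadline


def calendar_days(compute_days, hours_per_day):
    """Calendar days needed if the machine only folds hours_per_day hours a day.

    Assumption (not from the FAQ): the machine needs compute_days of
    continuous running to finish the unit.
    """
    return compute_days * 24 / hours_per_day


if __name__ == "__main__":
    timeout, deadline = wu_limits(1.0)
    needed = calendar_days(1.0, 4)
    print("timeout:", timeout, "days, deadline:", deadline, "days")
    print("calendar days at 4 h/day:", needed)
```

For a unit benchmarked at one day, a machine of benchmark speed that is on 4 hours a day would need 6 calendar days, well inside the 22-day timeout. A slower machine scales that number up accordingly.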

I think you should start folding. You will notice from the client and the stats if you are not getting the units done quickly enough to meet deadlines and then, as the faq says, you should leave your computer on longer!

All the best,

Diesel1.

hatten · 2009-06-19 00:43:51

Ah, thanks, although i should have checked the faq myself =p
I will check the CPU speed of the computers, and if it's too low i won't do anything, but else i will add them

jac · 2009-06-19 23:19:23

I just figured I better drop by and thank everyone for their assistance in setting up the GPU client, it works now, thanks! Also, is it bad to have multiple clients using the same machine ID? What happens when you get more than 16 "machines"?

diesel1 · 2009-06-21 00:11:39

hatten wrote:

Ah, thanks, although i should have checked the faq myself =p
I will check the CPU speed of the computers, and if it's too low i won't do anything, but else i will add them

Hi hatten,

Are you the 'Hatten' on kakao stats?

If so congratulations on joining the fold.

Diesel1.

Pudge · 2009-06-21 00:44:57

jac wrote:

I just figured I better drop by and thank everyone for their assistance in setting up the GPU client, it works now, thanks! Also, is it bad to have multiple clients using the same machine ID? What happens when you get more than 16 "machines"?

You can have the same Machine ID on totally different computers, but you can't use the same Machine ID for multiple clients on the same computer. On one of my computers, I am running 2 SMP clients and 1 GPU client. All three clients on that computer must have different Machine IDs. The term Machine ID is confusing; Client ID would make more sense.

You can have more than 16 different computers, all with a client as Machine ID 1, so that's not a problem. I doubt that anyone would ever have more than 16 clients on a single computer.
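
As a quick sanity check of the rule above, here is a small Python sketch that looks for clashing Machine IDs among the clients on one computer. The client names and the helper function are hypothetical, and the 1 to 16 range is only assumed from the "16 machines" mentioned in this thread.

```python
from collections import Counter

# Assumption taken from the discussion in this thread, not from client docs.
MAX_MACHINE_ID = 16


def machine_id_problems(clients):
    """Return a list of problems for one computer.

    clients maps a (hypothetical) client name to its Machine ID.
    """
    problems = []
    counts = Counter(clients.values())
    for machine_id in sorted(counts):
        if counts[machine_id] > 1:
            problems.append("Machine ID %d is used by more than one client" % machine_id)
        if not 1 <= machine_id <= MAX_MACHINE_ID:
            problems.append("Machine ID %d is outside 1..%d" % (machine_id, MAX_MACHINE_ID))
    return problems


if __name__ == "__main__":
    # Two SMP clients and one GPU client on the same computer.
    print(machine_id_problems({"smp-a": 1, "smp-b": 2, "gpu": 3}))
    print(machine_id_problems({"smp-a": 1, "smp-b": 1, "gpu": 3}))
```

Each computer is checked on its own, so the same ID showing up on two different computers is not reported.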

Pudge

Last edited by Pudge (2009-06-21 00:46:24)

hatten · 2009-06-21 08:10:45

diesel1 wrote:

hatten wrote:
Ah, thanks, although i should have checked the faq myself =p
I will check the CPU speed of the computers, and if it's too low i won't do anything, but else i will add them
Hi hatten,
Are you the 'Hatten' on kakao stats?
If so congratulations on joining the fold.
Diesel1.

That's my main computer, which has been folding for a while. I haven't added the family's computers yet, if I ever will.

jac · 2009-06-21 13:44:30

Pudge:

OK, thank you very much for the information. That makes much more sense now that you've explained it.

diesel1 · 2009-06-23 23:16:25

Hi all,

Since updating my system (kernel, nvidia et al.), my GPU client had been giving me EUE reports. Now, after rebooting, my GPU client is stopping with the UNSTABLE_MACHINE error:

[10:01:02] Entering M.D.
Reading file work/wudata_05.tpr, VERSION 3.1.4 (single precision)
Reading file work/wudata_05.tpr, VERSION 3.1.4 (single precision)
Reading sasa-enabled ir 0 0
Initializing Nvidia gpu library
Run: exception thrown during GuardedRun
[10:01:09] Run: exception thrown during GuardedRun
[10:01:09] Run: exception thrown in GuardedRun -- Gromacs cannot continue further.
[10:01:09] Going to send back what have done -- stepsTotalG=0
[10:01:09] Work fraction=0.0000 steps=0.
[10:01:13] logfile size=4993 infoLength=4993 edr=0 trr=23
[10:01:13] - Writing 5529 bytes of core data to disk...
[10:01:13] Done: 5017 -> 1884 (compressed to 37.5 percent)
[10:01:13] ... Done.
[10:01:13]
[10:01:13] Folding@home Core Shutdown: UNSTABLE_MACHINE

Where should I start to solve this?

Any help would be appreciated,

Diesel1.

Last edited by diesel1 (2009-06-24 13:28:26)

Pudge · 2009-06-25 00:41:48

I would stop the gpu client, remove the work folder and contents, remove queue.dat, remove unitinfo.txt, then restart the client and start with a fresh work unit. The other work unit may have been corrupted when it was shut down. If it still doesn't work with a fresh work unit, then visit the folding@home forums and see if there is an issue with the 2.6.30 kernel and the GPU client.
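
For anyone who wants to script the cleanup described above, a minimal Python sketch follows. It assumes the client has already been stopped and that the work folder, queue.dat and unitinfo.txt all sit directly in the client directory. The function name is made up, and the path you pass in is whatever your own setup uses.

```python
import shutil
from pathlib import Path


def reset_client_state(client_dir):
    """Delete the work folder, queue.dat and unitinfo.txt from client_dir.

    Returns the names of the items that were actually removed.
    Stop the client before calling this; everything else is left alone.
    """
    client_dir = Path(client_dir)
    removed = []

    work = client_dir / "work"
    if work.is_dir():
        shutil.rmtree(work)  # removes the folder and its contents
        removed.append("work")

    for name in ("queue.dat", "unitinfo.txt"):
        path = client_dir / name
        if path.is_file():
            path.unlink()
            removed.append(name)

    return removed
```

After this, restarting the client should make it download a fresh work unit. Whatever progress was in the old work folder is lost.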

Pudge

diesel1 · 2009-06-25 22:23:34

Pudge wrote:

I would stop the gpu client, remove the work folder and contents, remove queue.dat, remove unitinfo.txt, then restart the client and start with a fresh work unit. The other work unit may have been corrupted when it was shut down. If it still doesn't work with a fresh work unit, then visit the folding@home forums and see if there is an issue with the 2.6.30 kernel and the GPU client.
Pudge

Hi Pudge,

Since my last post, I noticed that the 6.24b client had been suffering from the 'Error: Could not get length of results file work/wuresults_03.dat' problem for the last 24 hours. After installing the fah6_static binary the client seemed to begin folding a new WU.

As for the wine/GPU client, I think I must uninstall my combination of the wrapper in the AUR and your very helpful mini-guide (is it in the wiki?). I am using the 2.2 CUDA toolkit and nVidia driver 185.18.14. Do you know exactly what I should remove: just the AUR files, or should I follow that with a rerun of your how-to?

Anyway I will let you know what happens.

Diesel1.

Ouch my PPD is way down!

whaevr · 2009-06-25 23:10:35

I would start with the nvidia driver; I'm almost certain that's what is causing the problem. I used the 2.2 CUDA toolkit fine with the 180.51 driver.

diesel1 · 2009-06-26 00:00:23

whaevr wrote:

I would start with the nvidia driver; I'm almost certain that's what is causing the problem. I used the 2.2 CUDA toolkit fine with the 180.51 driver.

I think you are correct, whaevr. When I run the AUR wrapper installation again all the packages are identical; the nVidia driver is the major change.

Do you mean downgrade the driver? I do have the 180.51 driver in /var/cache/pacman/pkg, but when I try to install it pacman says it needs kernel26! How do you downgrade the driver?

At least the cpu client seems fine now.

Many thanks,

Diesel1.

UPDATE: It seems from pacman that nvidia 180.51 requires kernel < 2.6.30!

What now?

Last edited by diesel1 (2009-06-26 00:15:22)

Pudge · 2009-06-26 01:13:35

@ diesel1

Unfortunately I am not able to try the newest kernel and Nvidia driver to see if they work for me or not. All my Arch computers with the WINE and GPU setups are out of commission for another week or two. I am in the process of having the entire basement remodeled: new drywall, upgraded electrical circuits, lighting, carpets, etc. Until that is done, most of my folding farm will be inactive. That's why my production has dropped drastically.

And yes, while the basement was down to bare wall studs and bare ceiling joists, I took advantage of it and ran Cat 6 cables all through the house.

On the bright side, by the time I put my Arch/WINE/GPU computers back in action, you guys will have this all worked out!

Pudge

diesel1 · 2009-06-26 18:57:38

Pudge wrote:

@ diesel1
Unfortunately I am not able to try the newest kernel and Nvidia driver to see if they work for me or not. All my Arch computers with the WINE and GPU setups are out of commission for another week or two. I am in the process of having the entire basement remodeled: new drywall, upgraded electrical circuits, lighting, carpets, etc. Until that is done, most of my folding farm will be inactive. That's why my production has dropped drastically.
And yes, while the basement was down to bare wall studs and bare ceiling joists, I took advantage of it and ran Cat 6 cables all through the house.
On the bright side, by the time I put my Arch/WINE/GPU computers back in action, you guys will have this all worked out!
Pudge

Wow Pudge, your new setup sounds nice!

I guess having new stuff (kernel etc.) is going to cause my PPD to fall as well.

TTFN,

Diesel1.

whaevr · 2009-06-27 01:59:23

I guess we just either wait for the nvidia driver to be updated, or the wrapper.

*waits*

diesel1 · 2009-06-27 18:32:24

whaevr wrote:

I guess we just either wait for the nvidia driver to be updated, or the wrapper.
*waits*

Or......

We could start up our very own 'OpennVidia' driver!

It would probably be finished by .... Tuesday or Wednesday next week.

*waits* until at least Tuesday.

Diesel1.

Last edited by diesel1 (2009-06-27 18:33:52)

Pudge · 2009-06-27 22:25:28

OK, the suspense was killing me. The Utility room is where my folding farm is. Right now the utility room is temporary storage for most of the stuff from the other rooms in the basement. I moved some things, and climbed over some things, and got to my folding farm.

I fired up one of my Arch/WINE/GPU computers, and did a pacman -Syu. Tried to start the GPU client and it said I had an Unstable machine. Same symptoms you guys are having.

Next, I removed nvidia-185.18.14-1 and nvidia-utils-185.18.14-1, and I think I removed lib32-nvidia-utils. I had to use pacman -Rd nvidia-utils to remove nvidia-utils because of existing dependencies. Using pacman -U, I downgraded the kernel to kernel26 2.6.29.4-1.

Then I used pacman -U to install nvidia-180.51-1, nvidia-utils-180.51-1 and lib32-nvidia-utils-180.51-1. I had all these files in /var/cache/pacman/pkg as I haven't cleared the pacman cache for a while.
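
Before trying a downgrade like the one described above, it may help to check that the old packages are still in the pacman cache. The Python sketch below only looks for files and builds the pacman -U command line; it does not run pacman. The helper names are made up, and the file-name pattern (name-version-arch.pkg.tar.something) is an assumption about how the cached files are named on your system.

```python
from pathlib import Path


def find_cached_packages(cache_dir, wanted):
    """Look up (name, version) pairs in a pacman cache directory.

    Returns (found, missing): found is a list of file paths as strings,
    missing is a list of "name-version" strings with no cached file.
    Assumes files are named like name-version-arch.pkg.tar.gz.
    """
    cache = Path(cache_dir)
    found, missing = [], []
    for name, version in wanted:
        matches = sorted(cache.glob("%s-%s-*.pkg.tar.*" % (name, version)))
        if matches:
            found.append(str(matches[0]))
        else:
            missing.append("%s-%s" % (name, version))
    return found, missing


def build_install_command(package_files):
    """Build (but do not run) the pacman -U command for the given files."""
    return ["pacman", "-U"] + list(package_files)


if __name__ == "__main__":
    wanted = [
        ("kernel26", "2.6.29.4-1"),
        ("nvidia", "180.51-1"),
        ("nvidia-utils", "180.51-1"),
        ("lib32-nvidia-utils", "180.51-1"),
    ]
    found, missing = find_cached_packages("/var/cache/pacman/pkg", wanted)
    print("missing from cache:", missing)
    print(" ".join(build_install_command(found)))
```

If anything shows up as missing, the cache has probably been cleared and the package has to come from somewhere else. The printed command still needs to be run as root by hand.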

Rebooted, and the GPU client started up fine and is currently at 14% on the Work Unit.

Basically, my Arch 64 installation is totally up to date, including the rpcbind thing, except that it is held back at kernel26-2.6.29.4 and the 180.51 nvidia drivers. So there is evidently a problem in the newer kernel, the newer driver, or both, and downgrading them works around the problem for now.

HTH

Pudge

diesel1 · 2009-06-28 12:07:30

Pudge wrote:

OK, the suspense was killing me. The Utility room is where my folding farm is. Right now the utility room is temporary storage for most of the stuff from the other rooms in the basement. I moved some things, and climbed over some things, and got to my folding farm.
I fired up one of my Arch/WINE/GPU computers, and did a pacman -Syu. Tried to start the GPU client and it said I had an Unstable machine. Same symptoms you guys are having.
Next, I removed nvidia-185.18.14-1 and nvidia-utils-185.18.14-1, and I think I removed lib32-nvidia-utils. I had to use pacman -Rd nvidia-utils to remove nvidia-utils because of existing dependencies. Using pacman -U, I downgraded the kernel to kernel26 2.6.29.4-1.
Then I used pacman -U to install nvidia-180.51-1, nvidia-utils-180.51-1 and lib32-nvidia-utils-180.51-1. I had all these files in /var/cache/pacman/pkg as I haven't cleared the pacman cache for a while.
Rebooted, and the GPU client started up fine and is currently at 14% on the Work Unit.
Basically, my Arch 64 installation is totally up to date, including the rpcbind thing, except that it is held back at kernel26-2.6.29.4 and the 180.51 nvidia drivers. So there is evidently a problem in the newer kernel, the newer driver, or both, and downgrading them works around the problem for now.
HTH
Pudge

Hi Pudge, I tried to do this as well but I forgot about the lib32-nvidia-utils package.

My GPU client now seems to be working.

Diesel1.

darkenergy · 2009-06-29 13:44:54

My 700 MHz Pentium III is now folding for Arch. It will take a while for the first unit to complete . . .
