More duff units back in the mix, so check your WU progress, or the lack of it. Have just binned over 17 units, two of which had 10 hrs between them. One crashed on start-up; the others just didn’t show any progress. Off now to check the other rigs. :xfinger:
Update: Binned another dozen or so, yet again some with 10+ hrs with zero progress. Out of the new downloads another 3 were duff.
Was thinking it might have something to do with the new BOINC version, but that’s only on one rig as I haven’t updated the others yet, so it is down to the WUs and not the new proggie version.
Same here, almost everything I had was bad. Picked up QuantumFIRE as a back-up job. It’s in alpha stage, and I thought a few WUs would help them get set up better.
10/03/2010 10:21:04|Docking|Reason: Unrecoverable error for result 1ohr_47_mod0013b_413_268165_0 (aborted by user)
10/03/2010 10:21:05|Docking|Starting 1hvk1ajv_mod0014crossdockinghiv1_3951_283140_1
10/03/2010 10:21:05|Docking|Starting task 1hvk1ajv_mod0014crossdockinghiv1_3951_283140_1 using charmm34 version 623
10/03/2010 10:21:06|Docking|Computation for task 1ohr_47_mod0013b_413_268165_0 finished
That one had been going for 17h35m and was stuck on 0%; it was on Windows.
Found another couple on the Linux boxes: 25 hours of computation time and stuck on 0%.
10/03/2010 10:41:03 Docking Computation for task 1iiq_44_mod0013b_4519_357421_0 finished
10/03/2010 10:41:04 Docking Computation for task 1iiq_44_mod0013b_4527_293724_0 finished
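If you'd rather not eyeball the message log, lines like the ones quoted above can be sorted with a few lines of Python. This is only a rough sketch: the patterns are guessed from the handful of lines in this thread (both the `|`-separated and the space-separated copies), `parse_line` and `summarize` are made-up helper names, and any message worded differently is simply skipped.

```python
import re

# Patterns inferred from the log lines quoted in this thread (an assumption,
# not an official description of the BOINC message format).
LINE_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})[| ]([^| ]+)[| ](.*)$"
)
ABORTED_RE = re.compile(r"Unrecoverable error for result (\S+) \(aborted by user\)")
FINISHED_RE = re.compile(r"Computation for task (\S+) finished")

# Sample lines copied from the posts above.
LOG = [
    "10/03/2010 10:21:04|Docking|Reason: Unrecoverable error for result 1ohr_47_mod0013b_413_268165_0 (aborted by user)",
    "10/03/2010 10:21:05|Docking|Starting 1hvk1ajv_mod0014crossdockinghiv1_3951_283140_1",
    "10/03/2010 10:21:05|Docking|Starting task 1hvk1ajv_mod0014crossdockinghiv1_3951_283140_1 using charmm34 version 623",
    "10/03/2010 10:21:06|Docking|Computation for task 1ohr_47_mod0013b_413_268165_0 finished",
    "10/03/2010 10:41:03 Docking Computation for task 1iiq_44_mod0013b_4519_357421_0 finished",
    "10/03/2010 10:41:04 Docking Computation for task 1iiq_44_mod0013b_4527_293724_0 finished",
]


def parse_line(line):
    """Split one log line into (timestamp, project, message), or None."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return m.group(1), m.group(2), m.group(3)


def summarize(lines):
    """Return (aborted, finished) lists of task names found in the lines."""
    aborted, finished = [], []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # not a line we recognise
        message = parsed[2]
        m = ABORTED_RE.search(message)
        if m:
            aborted.append(m.group(1))
            continue
        m = FINISHED_RE.search(message)
        if m:
            finished.append(m.group(1))
    return aborted, finished


if __name__ == "__main__":
    aborted, finished = summarize(LOG)
    print("aborted by user:", aborted)
    print("computation finished:", finished)
```

Note that a unit you abort also shows up as "finished" a second later, so the aborted list is the one to look at when counting binned units.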
I was just about to ask the more knowledgeable crunchers amongst you about the no-progress-after-15hrs syndrome… I’ve had three machines making no progress for four days. How can I sift out the good work units from the bad ones… or can I? :tiphat:
I just removed all the units on the machine with the same starting code. I did try a few more and left them running for 5-10 mins, and they all had the same issue.
Suspend all WUs, then select and run 1 or 2 at a time, dependent on the number of cores. After 1 minute or so (dependent on the initial run time shown), if they haven’t shown a % increase, they are duff. Work your way through, suspending again those that are shown to work, and then move on down through the units. Once sorted, just un-suspend the workers and bin the rest. Worked for me, and because of the problem I did the same when new WUs were downloaded onto the PC. A bit of a prat-on, I know, but at least you get results.
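For anyone who wants to jot the sorting rule down, here is a minimal Python sketch of the check described above. It does not talk to BOINC at all: you type in the run time and % shown in the manager yourself. The `Task` class, the `triage` function, the 60-second default and the `wu_a`/`wu_b`/`wu_c` names are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str            # work unit name (hypothetical in the sample below)
    elapsed_s: float     # run time so far, in seconds
    percent_done: float  # progress shown by the client, 0 to 100


def triage(tasks, grace_s=60.0):
    """Sort tasks into (working, duff, undecided) lists of names.

    working:   progress has moved off zero
    duff:      still on 0% after at least grace_s seconds of running
    undecided: still on 0% but not yet run for grace_s seconds
    """
    working, duff, undecided = [], [], []
    for t in tasks:
        if t.percent_done > 0:
            working.append(t.name)
        elif t.elapsed_s >= grace_s:
            duff.append(t.name)
        else:
            undecided.append(t.name)
    return working, duff, undecided


if __name__ == "__main__":
    # Illustrative numbers only, not real results.
    sample = [
        Task("wu_a", 90, 0.4),
        Task("wu_b", 90, 0.0),
        Task("wu_c", 20, 0.0),
    ]
    working, duff, undecided = triage(sample)
    print("un-suspend:", working)
    print("bin:", duff)
    print("give it longer:", undecided)
```

Raise `grace_s` if the initial run time shown suggests a unit needs longer before the first % tick.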
:furious: Found 1 machine with a unit at 1% after 91 hours. Working through the units now on all machines; any with no movement from zero in 5 mins I have aborted. :sigh:
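A unit crawling along at 1% slips past a plain "stuck on zero" check, so a rough projection helps decide whether it is worth keeping. The sketch below assumes progress is linear, which real units may not be, and `projected_total_hours`, `worth_aborting` and the `max_total_h` limit are hypothetical names and values you would pick yourself.

```python
import math


def projected_total_hours(elapsed_h, percent_done):
    """Estimate total run time in hours, assuming progress stays linear."""
    if percent_done <= 0:
        return math.inf  # no progress at all, so no finite estimate
    return elapsed_h * 100.0 / percent_done


def worth_aborting(elapsed_h, percent_done, max_total_h):
    """True if the linear estimate exceeds the limit you are willing to wait."""
    return projected_total_hours(elapsed_h, percent_done) > max_total_h


if __name__ == "__main__":
    # The unit mentioned above: 1% after 91 hours.
    print(projected_total_hours(91, 1), "hours projected")
```

At that rate the 91-hour unit would need thousands of hours to finish, so binning it is the sensible call.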