World Community Grid Forums
Category: Beta Testing | Forum: Beta Test Support Forum | Thread: OpenPandemics GPU Beta Test - March 12 2021 [ Issues Thread ]
Thread Status: Active | Total posts in this thread: 160
Vester
Senior Cruncher | USA | Joined: Nov 18, 2004 | Post Count: 323 | Status: Offline
Well, whatever the reason for the pause in this Beta, I'm saving lots of electricity. I switched back to Milkyway@home so my wife won't see a fluctuation in the power bill. ;)
uplinger
Former World Community Grid Tech | Joined: May 23, 2005 | Post Count: 3952 | Status: Offline
Good evening everyone. I apologize for being MIA the past few days. Here are some of the developments that have been going on.
1. I got a new Mac build machine that I was working on getting set up and configured properly. However, I had time-boxed that effort and ran out of time :P So for now, the next version of the application, 7.28, will be built using the old build machine.
2. We did get the code changes, and they required a manual review from myself and another developer on WCG to make sure that everything was in good shape. It passed our review with an A-. It did have an issue on Windows with std::min and std::max... but we corrected those in the version sent to you.
3. I started an Alpha test earlier today and have seen results coming back from those machines, and things are looking proper. There is a very good chance that we will see a Beta tomorrow morning (my time, Texas, y'all [say that with a good Texan accent in your head]).
4. I am increasing the max elapsed time from 20x to 100x, as the differences between a top-of-the-line GPU and some of the Intel GPUs are very large.
5. The next Beta will still include the dlg file that needs to be uploaded, as these results will go back to the researchers to make sure that the rehydration of the results is what they expect.
6. I have been going over the scheduler and looking into fixing the credits granted per result. I still do not have that fixed; however, I will not let points keep me from launching this to production if a solution is not available. I understand points are important to some, but having GPU running sooner rather than later is best in my opinion.
Thanks,
-Uplinger
I seem to have a pretty good trend of end-of-week Beta tests... not exactly the pattern I was going for, but a pattern nonetheless.
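Point 4 above can be sketched numerically. This is a minimal illustration of how a BOINC-style elapsed-time limit scales with the fpops-bound multiplier; the function name and the sample numbers are made up for illustration and are not WCG's actual code.

```python
# Illustration only: how a BOINC-style runtime limit scales with the
# bound multiplier (20x -> 100x). Names and numbers are assumptions.
def elapsed_time_limit(rsc_fpops_est, bound_multiplier, device_flops):
    """Wall-clock seconds a client would allow before flagging a task
    as over-limit: (estimated flops * multiplier) / device speed."""
    rsc_fpops_bound = rsc_fpops_est * bound_multiplier
    return rsc_fpops_bound / device_flops

# The same task gets a proportionally longer limit on a slow iGPU than
# on a top-end card, which is why the spread between devices matters:
slow_limit = elapsed_time_limit(1e12, 100, 1e9)   # slow GPU: 100000.0 s
fast_limit = elapsed_time_limit(1e12, 100, 1e11)  # fast GPU: 1000.0 s
```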
Grumpy Swede
Master Cruncher | Svíþjóð | Joined: Apr 10, 2020 | Post Count: 1886 | Status: Offline
Thanks for the update!
----------------------------------------
Regarding the credits: the first GPU Beta run did have working credit that was at least somewhat proper(ish). Maybe something in the code was changed after that, since after that first run the credit fell drastically, ending up at zero in the latest Beta run. Also, the estimated task size (GFLOPs) was much, much higher in the early Betas than in the later ones, even though the latest ones were supposedly larger, production-size units. Looking forward to the next run.
[Edit 4 times, last edit by Grumpy Swede at Mar 26, 2021 3:04:12 AM]
uplinger
Former World Community Grid Tech | Joined: May 23, 2005 | Post Count: 3952 | Status: Offline
Unfortunately, the credit calculation on the first GPU run didn't have knowledge of previous points, so it was skewed to grant the max points allowed... which didn't seem wrong at the time... but the credit going to 0 is pretty much exactly the thing I don't want to happen. I have toyed with the idea of granting points based on how many ligands are processed per work unit. However, since it's not apples to apples with the CPU version, that seems incorrect. But it's not ruled out entirely.
As for the estimated size, I did use the average size reported by all results; however, the last Beta had older cards able to operate because of the single-precision fix. This caused the difference between the fastest and the slowest to diverge even more. It is on my release plan for tomorrow to have those estimates updated for the new results sent out. Thanks again for the feedback and all the help in the forums. -Uplinger
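The "credit per ligand" idea mentioned above could look something like this sketch. The function name, per-ligand rate, and cap are all hypothetical assumptions for illustration, not anything WCG has implemented or announced.

```python
# Hypothetical sketch of granting credit by ligands processed.
# The per-ligand rate (0.5) and the cap (500) are invented numbers.
def credit_for_result(ligands_processed, per_ligand=0.5, max_credit=500.0):
    """Grant credit proportional to docked ligands, capped at a maximum
    so a runaway count can't grant unbounded points."""
    return min(ligands_processed * per_ligand, max_credit)

# A 100-ligand work unit would earn 50 credits under these assumptions.
```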
Grumpy Swede
Master Cruncher | Svíþjóð | Joined: Apr 10, 2020 | Post Count: 1886 | Status: Offline
Thanks for the explanation, Uplinger. I'm sure the credit issue will be fixed sooner or later. I agree, though, with what you say in your previous post:
----------------------------------------
"...having GPU running sooner than later is best in my opinion"
Edit, added: After all, points are just points, and they're certainly not scientific results. You can't buy anything that I know of with BOINC credits.
[Edit 1 times, last edit by Grumpy Swede at Mar 26, 2021 3:42:08 AM]
Richard Haselgrove
Senior Cruncher | United Kingdom | Joined: Feb 19, 2021 | Post Count: 360 | Status: Offline
Quote:
"As for the estimated size, I did use the average size reported by all results, however, the last beta had older cards able to operate because of the single precision fix. This caused the difference between the fastest and the slowest to diverge even greater. I have it on my release plan for tomorrow to have those updated for the new results sent out."

I don't quite understand your point here. The 'size' of a task (rsc_fpops_est) should be a dimensionless number: the number of floating-point operations needed to compute the required result. Apart from a single experiment at SETI@home, over 10 years ago, no BOINC science application that I know of has ever counted or reported the number of flops performed.

What BOINC can do, and does, is report time. Not unnaturally, the fastest devices report the shortest times. Taking an average of time over these limited test runs is open to distortion: the fastest cards will also complete a greater number of tasks before they run out, and so exaggerate the number of short times in the sample.

Artificially boosting rsc_fpops_bound will be helpful in getting a fuller picture of the results returned, but should not be necessary for regular running. Because the time limit is calculated pro rata for each host, based on the known size of the task and the speed of the device, there should be no need to adjust the bound once a realistic value for rsc_fpops_est is established.

The other situation where the time estimate can be exceeded is when an application stalls and no further mathematical progress is being made. In that case, the placebo effect of BOINC's reported pseudo-progress can direct attention in the wrong direction. Extending the bound in this case will merely waste more electricity before the task is eventually aborted.

I am new to this particular project, and I don't yet have a full understanding of the procedure for assessing points or credits here. I'll refrain from commenting on those until I know more.
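The sampling-distortion point above can be made concrete with made-up numbers: a fast card that returns many results drags a per-result average of runtimes far below what a typical host actually experiences. All values here are invented for illustration.

```python
# Made-up numbers illustrating the bias: averaging per *result*
# over-weights fast hosts, because fast hosts return more results.
fast_host_times = [100.0] * 50   # fast GPU: 50 results at 100 s each
slow_host_times = [5000.0] * 2   # slow GPU: only 2 results in the window

all_times = fast_host_times + slow_host_times
per_result_avg = sum(all_times) / len(all_times)  # pulled toward 100 s
per_host_avg = (100.0 + 5000.0) / 2               # 2550.0 s per host
```

The per-result average lands near 288 s, roughly a ninth of the per-host average, even though half the hosts in this toy sample need 5000 s per task.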
supdood
Senior Cruncher | USA | Joined: Aug 6, 2015 | Post Count: 333 | Status: Offline
Uplinger--Thank you for your continued hard work to get the GPU version ready for production.
----------------------------------------
On credits, I think a good compromise would be to select an average credit value and simply assign that value to all validated WUs, like some other BOINC projects do. Some WUs will be larger, some smaller, but the credits granted would average out in the end. This seems like a better solution than either zero credit or delaying the release.
[Edit 1 times, last edit by supdood at Mar 26, 2021 11:50:47 AM]
cehunt
Senior Cruncher | CANADA | Joined: Oct 10, 2011 | Post Count: 172 | Status: Offline
Hi:
Well, the GeForce GTX 660M VC in my Alienware laptop bombed out again. Has anybody else experienced a similar failure?
Clive Hunt
goben_2003
Advanced Cruncher | Joined: Jun 16, 2006 | Post Count: 145 | Status: Offline
Quote:
"Hi: Well, the GeForce GTX 660M VC in my Alienware laptop bombed out again. Anybody else experienced a similar failure? Clive Hunt"

Hi Clive,
As discussed previously in these Beta threads, it seems that the 660M, along with other older NVIDIA GPUs, only supports OpenCL 1.1 despite the driver saying it supports OpenCL 1.2. This may be because NVIDIA uses the same driver across many generations of GPUs, and some of the newer ones do have OpenCL 1.2. Here is NVIDIA's specifications page for the 660M; it says OpenCL 1.1: https://www.nvidia.com/en-us/geforce/gaming-l...-gtx-660m/specifications/
Also, you may want to move to the newer Beta thread: OpenPandemics GPU Beta Test - March 26 2021 [ Issues Thread ]
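The driver-versus-device mismatch described above is why checking version strings alone can mislead. As a small illustration (the helper names and sample strings are assumptions, not NVIDIA's API or WCG's code), OpenCL version strings take the form "OpenCL <major>.<minor> <vendor info>", and an application might compare the parsed pair against a required minimum:

```python
# Sketch: parse the "OpenCL <major>.<minor>" prefix of an OpenCL
# version string so an app can check a required minimum version.
# Sample strings below are illustrative, not read from real hardware.
def parse_cl_version(version_string):
    """Return (major, minor) from e.g. 'OpenCL 1.1 CUDA 11.2.109'."""
    major, minor = version_string.split()[1].split(".")[:2]
    return int(major), int(minor)

def meets_minimum(version_string, required=(1, 2)):
    """True if the reported version is at least the required one."""
    return parse_cl_version(version_string) >= required

# Caveat from the post above: a driver may report 1.2 at the platform
# level while an older GPU only truly supports 1.1 - the vendor's spec
# sheet, not the driver string, is the authority.
```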
cehunt
Senior Cruncher | CANADA | Joined: Oct 10, 2011 | Post Count: 172 | Status: Offline
Hi:
I have a GTX 660M VC in my laptop as well. I am curious: what VC did you replace it with?
Cehunt