Thread Status: Active
Total posts in this thread: 162
Speedy51
Veteran Cruncher
New Zealand
Joined: Nov 4, 2005
Post Count: 1220
Re: OpenPandemics GPU Beta Test - Feb 27 2021 [ Issues Thread ]

Like the majority of people, I currently have no work. Has anybody else noticed the GPU load percentage? On my RTX 2070 under Windows, my tasks seem to jump between 30%, 50% and 100%. I assume this is because the jobs are processed so quickly that the load cannot stay at any one percentage for long before the task moves on to its next job. I measured the load percentages with GPU-Z 2.36.
[Mar 5, 2021 2:48:46 AM]
Rickjb
Veteran Cruncher
Australia
Joined: Sep 17, 2006
Post Count: 666
Re: OpenPandemics GPU Beta Test - Feb 27 2021 [ Issues Thread ]

My experience with two machines, each an i7-3770K with HD4000 iGPU, Win 7 x64, HD4000 driver 10.18.10.5161:
All 3 GPU betas that started ran the initial AutoGrid part (on CPU?). According to the WCG online log files, one then ran just one GPU sub-task and hung, while the other 2 hung without executing any GPU work.
One WU was server-aborted with a time-limit error, while the other 2 were user-aborted.
This behaviour may be caused by a problem similar to that found here by Crystal Pellet with his old AMD laptop.
------
More details:
WUs that started and pretended to run on the i7-3770K HD4000 iGPU, Win 7 x64, HD4000 driver 10.18.10.5161, on 2 machines:
_ BETA_OPN1_0020015_00040_3 ==> error - exceeded time limit - elapsed time (ET) 7.36h - Device: Deimos
_ _ This WU ran one GPU sub-job, taking 1m55s, then the task hung
_ _ until it got a breakpoint error, probably when resuming from device sleep.
_ BETA_OPNG_0021028_00090_3 ==> user-aborted after ET = 18h - Device: Callisto
_ BETA_OPNG_0021025_00285_3 ==> user-aborted after ET = 17.7h - Device: Deimos
Other received beta WUs were user-aborted before they started.
------
Here is the main part of the log file from BETA_OPNG_0021028_00090_3:
_ <core_client_version>7.16.11</core_client_version>
_ <![CDATA[
_ <message>
_ aborted by user</message>
_ <stderr_txt>
_ projects/www.worldcommunitygrid.org/wcgrid_beta29_autodockgpu_7.25_windows_x86_64__opencl_intel_gpu_102 -jobs OPNG_0021028_00090.job -input OPNG_0021028_00090.zip -seed 44376090 -wcgruns 1600 -wcgdpf 32
_ INFO: Using gpu device from app init data 0
_ INFO:[20:29:34] Start AutoGrid...
_ autogrid4: Successful Completion.
_ INFO:[20:30:29] End AutoGrid...
_ INFO:[20:30:30] Start AutoDock for ZINC001297736733_RX1--6y84_001_gln110-rot--CYS156.dpf(Job #0)...
_ OpenCL device: Intel(R) HD Graphics 4000
_ </stderr_txt>
------
So in the log file above, no AutoDock runs completed. The WU ran the AutoGrid part, but then just sat in memory, apparently doing nothing.
The snippet of the log file for the GPU run in BETA_OPN1_0020015_00040_3 is:
_ INFO:[03:53:22] Start AutoDock for ZINC000255623610_RX1--6lu7_001--CYS145_wcgsplit2.dpf(Job #0)...
_ OpenCL device: Intel(R) HD Graphics 4000
_ INFO:[03:55:17] End AutoDock...
------
General information that other members may find useful:
BOINC Client startup messages when the Intel HD4000 GPU is detected:
_ OpenCL: Intel GPU 0: Intel(R) HD Graphics 4000 (driver version 10.18.10.5161, device version OpenCL 1.2, 1195MB, 1195MB available, 147 GFLOPS peak)
_ 02-Mar-2021 22:38:34 [---] OpenCL CPU: Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz (OpenCL driver vendor: Intel(R) Corporation, driver version 3.0.1.10891, device version OpenCL 1.2 (Build 76427))
If your setup is recognised by BOINC and your WCG Device Profile allows GPU work on your GPU(s), each time your client tries to fetch new work you should see:
_ Requesting new tasks for CPU and Intel GPU
If you have the cc_config.xml flag coproc_debug set, these 2 lines will appear in your event log (stdoutdae.txt) once per minute while the GPU task exists:
_ [coproc] intel_gpu instance 0; 1.000000 pending for BETA_OPN1_0020015_00040_3
_ [coproc] intel_gpu instance 0: confirming 1.000000 instance for BETA_OPN1_0020015_00040_3
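For reference, a minimal sketch of a cc_config.xml that sets this flag. It assumes the file sits in the BOINC data directory and contains no other options you need to keep; if it does, merge the flag into the existing <log_flags> section instead of replacing the file:

```xml
<cc_config>
  <log_flags>
    <!-- log coprocessor (GPU) scheduling details to the event log -->
    <coproc_debug>1</coproc_debug>
  </log_flags>
</cc_config>
```

The client should pick up the change after it re-reads its config files or is restarted.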
------
Questions arising:
The log snippet above in which the single AutoDock job completed shows that the HD4000 is capable of running AutoDock4, but only some of the time.
* Why does it not run for every instance? Is a higher GPU RAM allocation, and/or higher GPU voltage needed?
* Will the debug messages in the upcoming v7.26 betas give more info on why these WUs hung?
----------
Other Info:
The HD4000 iGPUs successfully run all of the OpenCL demo programs accessible via GPU Caps Viewer >> 3D Demos.
----------
Conclusion:
The HD4000 GPU may be too slow to warrant putting much effort into getting it crunching OPN1-GPU WUs.
However, getting it actually working may provide insights into solving problems with other GPUs.
The information I gained in getting it this far will be useful when setting up and running a discrete GPU.

- HTH - Rick -
----------------------------------------
[Edit 1 times, last edit by Rickjb at Mar 5, 2021 6:51:05 AM]
[Mar 5, 2021 6:46:34 AM]
bozz4science
Advanced Cruncher
Germany
Joined: May 3, 2020
Post Count: 104
Re: OpenPandemics GPU Beta Test - Feb 27 2021 [ Issues Thread ]

Speedy51 wrote: "Like the majority of people, I currently have no work. Has anybody else noticed the GPU load percentage? On my RTX 2070 under Windows, my tasks seem to jump between 30%, 50% and 100%. I assume this is because the jobs are processed so quickly that the load cannot stay at any one percentage for long before the task moves on to its next job. I measured the load percentages with GPU-Z 2.36."


I noticed something similar, and my interpretation of these sudden jumps in GPU utilization is very much the same as yours. See this excerpt from one of my earlier posts on March 2:

Has anyone tried running multiple GPU WUs concurrently on the same GPU so far? I was wondering whether you can increase WU output by forcing the GPU to hold its load at a consistently high level instead of these short bursts up to 100% and then back to 0%. [...] However, due to the inherent nature of these WUs, the GPUs' VRMs are getting kicked hard. They continuously have to adjust the voltage of the GPU chip up and down according to the short, intensive bursts of computation. For the 1660 Super, the voltage was all over the place.
----------------------------------------

AMD Ryzen 3700X @ 4.0 GHz / GTX1660S
Intel i5-4278U CPU @ 2.60GHz
----------------------------------------
[Edit 1 times, last edit by bozz4science at Mar 5, 2021 10:37:15 AM]
[Mar 5, 2021 10:36:53 AM]
koschi
Cruncher
Joined: Dec 16, 2007
Post Count: 5
Re: OpenPandemics GPU Beta Test - Feb 27 2021 [ Issues Thread ]

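To run two tasks at once on each GPU, put this into an app_config.xml in your WCG project directory:

<app_config>
  <app>
    <name>beta29</name>
    <gpu_versions>
      <!-- 0.5 GPU per task lets BOINC run two tasks on one GPU -->
      <gpu_usage>0.5</gpu_usage>
      <!-- budget one full CPU core for each GPU task -->
      <cpu_usage>1.00</cpu_usage>
    </gpu_versions>
  </app>
</app_config>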

[Mar 5, 2021 12:30:24 PM]
uplinger
Former World Community Grid Tech
Joined: May 23, 2005
Post Count: 3952
Re: OpenPandemics GPU Beta Test - Feb 27 2021 [ Issues Thread ]

Good morning/evening to everyone.

I am moving batches up to production now and moving version 7.26 to the production environment. I am planning on starting the next round of beta in the next few hours.

Note: the next round will get a new thread, as it uses an updated application.

Thanks,
-Uplinger
[Mar 5, 2021 3:33:39 PM]
Grumpy Swede
Master Cruncher
Svíþjóð
Joined: Apr 10, 2020
Post Count: 1886
Re: OpenPandemics GPU Beta Test - Feb 27 2021 [ Issues Thread ]

Thanks Uplinger. I'm starting my GPU cruncher. (iGPU HD4600 + GTX980 Strix)
Not the most modern computer, but when retired one can't upgrade all the time.
[Mar 5, 2021 4:22:42 PM]
Richard Haselgrove
Senior Cruncher
United Kingdom
Joined: Feb 19, 2021
Post Count: 360
Re: OpenPandemics GPU Beta Test - Feb 27 2021 [ Issues Thread ]

@ Uplinger,
Are you moving all variants to production at the same time?

I think there are still some unanswered issues with the intel_gpu OpenCL variant. Rickjb's comments earlier today match my experience with a pair of HD 4600s right at the beginning. Other users have described excessive runtimes and progress displays that are consistent with the pseudo-progress generated by BOINC before the first checkpoint.

I'd suggest holding the intel_gpu version back for further examination.
[Mar 5, 2021 4:28:34 PM]
Grumpy Swede
Master Cruncher
Svíþjóð
Joined: Apr 10, 2020
Post Count: 1886
Re: OpenPandemics GPU Beta Test - Feb 27 2021 [ Issues Thread ]

All tasks for my Intel HD4600 worked perfectly. The only strange thing was that the elapsed time since the last checkpoint climbed to several hours during the run (much higher than the actual runtime), even though the task had checkpointed (and the checkpoints were counted) for each ligand.

I watched this through BoincTasks, which shows more information at a glance than BOINC Manager. There were no excessive runtimes for my HD4600 either.

Edit: Example here: https://www.worldcommunitygrid.org/ms/device/...og.do?resultId=1532147303

Edit: I run the HD4600 with a pretty old Driver: 10.18.10.3907 (if it ain't broke don't fix it)
[Edit 3 times, last edit by Grumpy Swede at Mar 5, 2021 5:33:38 PM]
[Mar 5, 2021 4:37:20 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: OpenPandemics GPU Beta Test - Feb 27 2021 [ Issues Thread ]

----CENSORSHIP ALERT----

User Peter Hucker is posting to this forum and his posts are not appearing. They are visible to anyone not logged in, or to himself, but not to any logged in user including me.

Cease this puerile, childish nonsense or all machines will be removed from your project.
[Mar 5, 2021 4:47:18 PM]
tegz
Cruncher
Joined: Mar 10, 2021
Post Count: 3
Re: OpenPandemics GPU Beta Test - Feb 27 2021 [ Issues Thread ]

I'm new to this work, so I'm not sure whether I'm running the latest/beta version, but I did get work for my GTX 950 yesterday and it completed OK. Later I noticed some files downloading whose names weren't like those of the work files shown afterwards, viz.:
2705 World Community Grid 07-04-2021 11:26 Started download of 919d0f5d75e85fc2f4ddc66820555e8e.gpf
2704 World Community Grid 07-04-2021 11:26 Finished download of 89d93b7b5839cec1ff7ab63638cac930.pdbqt
2703 World Community Grid 07-04-2021 11:26 Started download of 89d93b7b5839cec1ff7ab63638cac930.pdbqt
2702 World Community Grid 07-04-2021 11:26 Started download of d8d69c0223b76abbf75e161ba6380318.pdbqt
2701 World Community Grid 07-04-2021 11:26 Project requested delay of 121 seconds
2700 World Community Grid 07-04-2021 11:26 Scheduler request completed: got 1 new tasks
2699 World Community Grid 07-04-2021 11:26 Requesting new tasks for CPU and NVIDIA GPU

2698 World Community Grid 07-04-2021 11:26 Sending scheduler request: To fetch work.
2697 World Community Grid 07-04-2021 10:49 Finished download of mip1.MIP1_00331854.2
2696 World Community Grid 07-04-2021 10:48 Finished download of mip1.MIP1_00331854.cst
2695 World Community Grid 07-04-2021 10:48 Started download of mip1.MIP1_00331854.cst

The last 3 MIP1 files are showing in my new work, but the earlier numbered files are not. I wondered whether the zip files were for work or were updates to the client.
[Apr 7, 2021 2:50:54 PM]