World Community Grid Forums
Category: Official Messages Forum: News Thread: 2023-04-06 Update (WU Distribution Update) |
Thread Status: Active Total posts in this thread: 51
Grumpy Swede
Master Cruncher Svíþjóð Joined: Apr 10, 2020 Post Count: 1868 Status: Offline Project Badges: |
OPNG stopped being sent out in any significant numbers many hours ago. The last time I got a batch worth talking about was 18 OPNG tasks yesterday, 2023-04-09 16:59:51 UTC. I dunno about ARP, since I do not ask for those monsters.
----------------------------------------
[Edit 2 times, last edit by Grumpy Swede at Apr 10, 2023 3:12:28 PM] |
||
|
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 738 Status: Offline Project Badges: |
ARP stopped being sent out sometime over the weekend. I'm still getting MCM.
My results list backlog has been shrinking (as it is supposed to). It looks like they got the DB cleanup process working again, and it has a lot to do. My guess is that they have restricted sending out WUs (maybe just the big ones like OPNG and ARP) so they can devote processing power to cleaning up this past mess. |
||
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 1842 Status: Offline Project Badges: |
Unixchick wrote: "ARP stopped being sent out sometime over the weekend. I'm still getting MCM. My results list backlog has been shrinking (as it is supposed to). It looks like they got the DB cleanup process working again, and it has a lot to do. My guess is that they have restricted sending out WUs (maybe just the big ones like OPNG and ARP) so they can devote processing power to cleaning up this past mess."
Well,... The number of PV jail inmates has been drastically reduced, though I still have WUs hanging in there from back on 2/26 (with two returned results, one no reply). And for doing all this (for me, this was probably around 1,500 WUs), the credits don't seem to add up, neither "points" nor runtime... :(
Ralf |
||
|
Aperture_Science_Innovators
Advanced Cruncher United States Joined: Jul 6, 2009 Post Count: 139 Status: Offline Project Badges: |
Unixchick wrote: "ARP stopped being sent out sometime over the weekend. I'm still getting MCM. My results list backlog has been shrinking (as it is supposed to). It looks like they got the DB cleanup process working again, and it has a lot to do. My guess is that they have restricted sending out WUs (maybe just the big ones like OPNG and ARP) so they can devote processing power to cleaning up this past mess."
TPCBF wrote: "Well,... The number of PV jail inmates has been drastically reduced, though I still have WUs hanging in there from back on 2/26 (with two returned results, one no reply). And for doing all this (for me, this was probably around 1,500 WUs), the credits don't seem to add up, neither "points" nor runtime... :( Ralf"
PV jail here is down from about 14k at its highest to right around 1,700. And all of mine (at least at a quick look) dating back to February are waiting on the wingman after a No Reply / Detached from the original send. So progress seems to be pretty good on that front. |
||
|
bfmorse
Senior Cruncher US Joined: Jul 26, 2009 Post Count: 274 Status: Offline Project Badges: |
Cyclops,
System status update, please. |
||
|
Jesse Viviano
Cruncher United States of America Joined: Dec 14, 2007 Post Count: 15 Status: Offline Project Badges: |
Quoted post: "My one box (Linux) fails all retries, and I'm also unable to get new work units on both Linux and Windows. I'm glad I set a 2-day cache on the Linux box when WCG restarted. I may increase that to 3 days until confidence is restored in WCG availability. I'm impressed the validator seems to have caught up Friday-Saturday. That's great! Not impressed at the transient HTTP errors again, even on the new flash storage array. Are we out of disk space, or is this just a stuck process? A self-healing BOINC architecture would be really nice, if such a manual process is simple enough that it can be automated."
I am pretty sure that WCG is not out of disk space. BOINC upload servers on other projects have reported that they are out of disk space when they could not find any free inodes or disk space. Einstein@home once ran into this: its storage ran out of unused inodes (file system structures used to describe files in Unix-style operating systems), and it started taking too long to find used-but-now-free inodes to take uploads with. During that time, it often reported out-of-disk-space errors to BOINC clients. The situation was solved when Einstein@home moved to storage formatted with a later version of its file system, which includes a dynamically built B-tree of used-but-now-free inodes so that inodes that once belonged to deleted files can be quickly recycled for new files. This is how I know that a BOINC upload server will report out-of-space errors to its clients if the storage cannot accept an upload.
These errors feel like the earlier WCG download errors, where the download server(s) ran out of threads to allow clients to download files, and if you did manage to snag a thread to download a file, the download was super slow. The download side's problems appear to have been solved by replacing the HDDs with SSDs. Now the upload side feels like it has the same problems that the download side used to have.
When the download side was the bottleneck, there apparently was no way to have enough work units in the field for result uploads to crush the upload servers. The OpenPandemics and Mapping Cancer Markers projects have very small uploads of several kilobytes that the upload server(s) can easily take. The Africa Rainfall Project has big uploads of around 105 megabytes split into up to 7 files per result, by my estimation from what I can see in BOINC's upload page in the advanced view, so these big uploads are crushing the upload server(s).
I am now remembering that I was having trouble uploading before, not downloading. I think that I misremembered what I had trouble with earlier. I am remembering that downloads were never the problem, but uploads have always been the real pain point. I am posting a correction. However, the rest still stands. I am not editing my old post because it has been quoted by others before this post. |
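The inode point in the post above can be illustrated with a small sketch. This is not BOINC or WCG server code; the names `SpaceReport`, `read_space`, and `can_accept_upload` are hypothetical, and the only real API used is Python's `os.statvfs` (Unix only). It shows how a file system can have plenty of free bytes and still be unable to take new files because no inodes are left, which a server might then report as "out of disk space".

```python
import os
from typing import NamedTuple


class SpaceReport(NamedTuple):
    """Hypothetical container for the figures we care about."""
    free_bytes: int    # bytes available to unprivileged users
    free_inodes: int   # inodes available to unprivileged users
    total_inodes: int  # 0 on file systems that allocate inodes dynamically


def read_space(path: str) -> SpaceReport:
    """Read free bytes and free inodes for the file system holding `path`."""
    st = os.statvfs(path)
    return SpaceReport(
        free_bytes=st.f_bavail * st.f_frsize,
        free_inodes=st.f_favail,
        total_inodes=st.f_files,
    )


def can_accept_upload(report: SpaceReport, upload_bytes: int, upload_files: int) -> str:
    """Return 'ok', 'no_bytes', or 'no_inodes' for a hypothetical upload."""
    if report.free_bytes < upload_bytes:
        return "no_bytes"
    # Some file systems report 0 total inodes because they have no fixed
    # inode table; treat that case as "no fixed inode limit".
    if report.total_inodes > 0 and report.free_inodes < upload_files:
        return "no_inodes"
    return "ok"


if __name__ == "__main__":
    # Figures from the post: roughly 105 MB in up to 7 files per ARP result.
    print(can_accept_upload(read_space("."), 105 * 1024 * 1024, 7))
```

On a Linux box, `df -i` shows the same inode counts from the command line, next to the byte counts from plain `df`.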
||
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 1842 Status: Offline Project Badges: |
Jesse Viviano wrote: "I am now remembering that I was having trouble uploading before, not downloading. I think that I misremembered what I had trouble with earlier. I am remembering that downloads were never the problem, but uploads have always been the real pain point. I am posting a correction. However, the rest still stands. I am not editing my old post because it has been quoted by others before this post."
Well, we/WCG had problems in the past at various times with both uploads and downloads, though most of the time it was the latter that caused all the frustration; most of the time the uploads actually worked fine, and a couple of times or so the problem was with both.
Ralf |
||
|
nivrip
Senior Cruncher North Yorkshire Joined: Sep 13, 2007 Post Count: 258 Status: Offline Project Badges: |
I'm only set up to receive OPN & MCM and I noticed last week that there was a problem with uploads, although not with downloads.
However, over the last 48 hours everything has been running smoothly again. Is this because the system has just about caught up and there is little or no backlog now?
----------------------------------------
ЮРКШИР КРУНЧЕР
|
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 11798 Status: Offline Project Badges: |
Yes, but that is because return of ARP1 units has tailed off. Once they come back there will probably be more problems.
|
||
|
Hans Sveen
Veteran Cruncher Norge Joined: Feb 18, 2008 Post Count: 633 Status: Offline Project Badges: |
Hi
Just got some real fresh SCC units 😍
----------------------------------------
[Edit 1 times, last edit by Hans Sveen at Apr 21, 2023 5:54:17 AM] |
||
|
|