Thread Status: Active
Total posts in this thread: 51
This topic has been viewed 202650 times and has 50 replies
Grumpy Swede
Master Cruncher
Svíþjóð
Joined: Apr 10, 2020
Post Count: 1868
Status: Offline
Re: 2023-04-06 Update (WU Distribution Update)

OPNG stopped being sent out in any significant numbers many hours ago. The last time I got an amount of OPNG worth talking about was 18 tasks yesterday, 2023-04-09 16:59:51 UTC. I dunno about ARP, since I do not ask for those monsters.
----------------------------------------

----------------------------------------
[Edit 2 times, last edit by Grumpy Swede at Apr 10, 2023 3:12:28 PM]
[Apr 10, 2023 3:08:14 PM]
Unixchick
Veteran Cruncher
Joined: Apr 16, 2020
Post Count: 738
Status: Offline
Re: 2023-04-06 Update (WU Distribution Update)

ARP stopped being sent out sometime over the weekend. I'm still getting MCM.

My results list backlog has been shrinking (as it is supposed to). It looks like they got the db cleanup process working again, and it has a lot to do. My guess is that they have restricted sending out WUs (maybe just the big ones like OPNG and ARP) so they can devote processing power to cleaning up this past mess.
[Apr 10, 2023 3:15:11 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1842
Status: Offline
Re: 2023-04-06 Update (WU Distribution Update)

ARP stopped being sent out sometime over the weekend. I'm still getting MCM.

My results list backlog has been shrinking (as it is supposed to). It looks like they got the db cleanup process working again, and it has a lot to do. My guess is that they have restricted sending out WUs (maybe just the big ones like OPNG and ARP) so they can devote processing power to cleaning up this past mess.
Well,...

The number of PV jail inmates has been drastically reduced, though I still have WUs hanging in there from back on 2/26 (with two returned results, one no reply). And for doing all this (for me, this was probably around 1,500 WUs), the credits don't seem to add up, neither "points" nor runtime... :(


Ralf
----------------------------------------

[Apr 10, 2023 4:19:45 PM]
Aperture_Science_Innovators
Advanced Cruncher
United States
Joined: Jul 6, 2009
Post Count: 139
Status: Offline
Re: 2023-04-06 Update (WU Distribution Update)

ARP stopped being sent out sometime over the weekend. I'm still getting MCM.

My results list backlog has been shrinking (as it is supposed to). It looks like they got the db cleanup process working again, and it has a lot to do. My guess is that they have restricted sending out WUs (maybe just the big ones like OPNG and ARP) so they can devote processing power to cleaning up this past mess.
Well,...

The number of PV jail inmates has been drastically reduced, though I still have WUs hanging in there from back on 2/26 (with two returned results, one no reply). And for doing all this (for me, this was probably around 1,500 WUs), the credits don't seem to add up, neither "points" nor runtime... :(


Ralf

PV jail here is down from about 14k at its highest to right around 1700. And all of mine (at least at a quick look) dating back to February are waiting on the wingman after a No Reply / Detached from the original send. So, progress seems to be pretty good on that front.
----------------------------------------

[Apr 10, 2023 4:29:03 PM]
bfmorse
Senior Cruncher
US
Joined: Jul 26, 2009
Post Count: 274
Status: Offline
Re: 2023-04-06 Update (WU Distribution Update)

Cyclops,
System status update, please.
[Apr 10, 2023 8:10:30 PM]
Jesse Viviano
Cruncher
United States of America
Joined: Dec 14, 2007
Post Count: 15
Status: Offline
Re: 2023-04-06 Update (WU Distribution Update)

My one box (Linux) fails all re-tries, and I'm also unable to get new work units in both Linux and Windows. I'm glad I set a 2 day cache on the Linux box when WCG restarted. I may increase that to 3 days until confidence is restored in WCG availability.

I'm impressed the validator seems to have caught up Friday-Saturday. That's great!

Not impressed at the transient HTTP errors again even on the new flash storage array. Are we out of disk space, or is this just a stuck process?

A self-healing BOINC architecture would be really nice, if such a manual process is simple enough that it can be automated.

I am pretty sure that WCG is not out of disk space. BOINC upload servers on other projects have reported that they are out of disk space when they are unable to find any free inodes or disk space.

Einstein@home once hit a situation where its storage ran out of unused inodes (file system structures used to describe files in Unix-style operating systems) and took too long to find previously used but now free inodes to accept uploads with. During that time, it often reported out-of-disk-space errors to BOINC clients. The situation was solved when Einstein@home moved to storage formatted with a later version of its file system, which includes a dynamically built B-tree of freed inodes so that inodes that once belonged to deleted files can be quickly recycled for new files. This is how I know that BOINC upload servers will report out-of-space errors to their clients if the storage cannot accept an upload.
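
To illustrate the point about inodes versus raw disk space, here is a rough Python sketch of how one might check both on a Unix-style box. It only uses the standard `os.statvfs` call; the helper names `storage_headroom` and `can_accept_upload` are made up for this example, and this is not how BOINC's upload handler actually decides anything, just a way to show that a volume can have plenty of free bytes and still be unable to take new files.

```python
import os


def storage_headroom(path="/"):
    """Report free bytes and free inodes for the file system holding `path`.

    Unix-only: os.statvfs is not available on Windows.
    """
    st = os.statvfs(path)
    return {
        "free_bytes": st.f_bavail * st.f_frsize,  # space usable by non-root users
        "free_inodes": st.f_favail,               # inodes usable by non-root users
        "total_inodes": st.f_files,               # 0 on some file systems with dynamic inodes
    }


def can_accept_upload(headroom, upload_bytes, files_needed=1):
    """Hypothetical check: an upload needs both enough bytes and enough inodes."""
    enough_space = headroom["free_bytes"] >= upload_bytes
    # Assumption: a total of 0 means the file system allocates inodes on demand,
    # so the inode count is not treated as a limit in that case.
    inodes_dynamic = headroom["total_inodes"] == 0
    enough_inodes = inodes_dynamic or headroom["free_inodes"] >= files_needed
    return enough_space and enough_inodes


if __name__ == "__main__":
    # Example: could this volume take one large result split into 7 files?
    print(can_accept_upload(storage_headroom("/"), 105 * 10**6, files_needed=7))
```

With numbers like these, a volume with a terabyte free but zero free inodes would still fail the check, which is the kind of state that gets reported to clients as "out of disk space".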

These errors feel like the earlier WCG download errors where the download server(s) ran out of threads to allow clients to download files, and if you did manage to snag a thread to download a file, the download was super slow. The download side's problems appear to be solved by replacing the HDDs with SSDs. Now the upload side feels like it has the same problems that the download side used to have. When the download side was the bottleneck, there apparently was no way to have enough work units in the field for result uploads to crush the upload servers. The OpenPandemics and Mapping Cancer Markers projects have very small uploads of several kilobytes that the upload server(s) can easily take. The Africa Rainfall Project has big uploads of around 105 megabytes split into up to 7 files per result, by my estimation from what I can see in BOINC's upload page in the advanced view, so these big uploads are crushing the upload server(s).

I am now remembering that I was having trouble uploading before, not downloading. I think that I misremembered what I had trouble with earlier. I am remembering that downloads were never the problem, but uploads have always been the real pain point. I am posting a correction. However, the rest still stands. I am not editing my old post because it has been quoted by others before this post.
[Apr 10, 2023 10:25:24 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1842
Status: Offline
Re: 2023-04-06 Update (WU Distribution Update)

I am now remembering that I was having trouble uploading before, not downloading. I think that I misremembered what I had trouble with earlier. I am remembering that downloads were never the problem, but uploads have always been the real pain point. I am posting a correction. However, the rest still stands. I am not editing my old post because it has been quoted by others before this post.
Well, we/WCG had problems in the past at various times with both uploads and downloads. Most of the time it was the latter that caused all the frustration while the uploads actually worked fine, and a couple of times or so the problem was with both.

Ralf
----------------------------------------

[Apr 11, 2023 12:32:08 AM]
nivrip
Senior Cruncher
North Yorkshire
Joined: Sep 13, 2007
Post Count: 258
Status: Offline
Re: 2023-04-06 Update (WU Distribution Update)

I'm only set up to receive OPN & MCM, and I noticed last week that there was a problem with uploads, although not with downloads.

However, over the last 48 hours everything has been running smoothly again. Is this because the system has just about caught up and there is little, or no, backlog now?
----------------------------------------
ЮРКШИР КРУНЧЕР
[Apr 11, 2023 11:35:52 AM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 11798
Status: Offline
Re: 2023-04-06 Update (WU Distribution Update)

Yes, but that is because the return of ARP1 units has tailed off. Once they come back there will probably be more problems.
[Apr 11, 2023 12:13:06 PM]
Hans Sveen
Veteran Cruncher
Norge
Joined: Feb 18, 2008
Post Count: 633
Status: Offline
Re: 2023-04-06 Update (WU Distribution Update)

Hi

Just got some real fresh SCC units😍
----------------------------------------
[Edit 1 times, last edit by Hans Sveen at Apr 21, 2023 5:54:17 AM]
[Apr 21, 2023 5:53:34 AM]