Total posts in this thread: 203
This topic has been viewed 45160 times and has 202 replies
Jean-David Beyer
Senior Cruncher
USA
Joined: Oct 2, 2007
Post Count: 334
Re: 2022-08-19 (Networking Issue Update)

Could you possibly be enticed to read existing forum posts?


The forums here are so poorly organized that it is necessary to read all threads on all topics to be sure of getting the relevant ones. As a practical matter, that is impossible and unreasonable.
----------------------------------------

[Aug 25, 2022 2:58:13 PM]
Sensualpoet
Cruncher
Joined: Nov 23, 2005
Post Count: 5
Re: 2022-08-19 (Networking Issue Update)

Thanks for clarifying the other link vs scrolling down the page.

It would be nice to have personal stats again but the best news, then, is that these are now live WUs, not just test units. The rest of this will get sorted at some point.
[Aug 25, 2022 3:55:07 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1842
Re: 2022-08-19 (Networking Issue Update)

The forums here are so poorly organized that it is necessary to read all threads on all topics to be sure of getting the relevant ones. As a practical matter, that is impossible and unreasonable.
The problem is not that the forums are so poorly organized. The problem is rather that for the last two months, "some people" have been so unreasonable as not to look in the appropriate subforums (like News or Support) first to see if an issue has already been mentioned and/or explained, and instead just post the same thing over and over again wherever they happen to be on the forum...

The answers ARE out there, you just have to read them...

Ralf
----------------------------------------

[Aug 25, 2022 4:33:58 PM]
Richard Haselgrove
Senior Cruncher
United Kingdom
Joined: Feb 19, 2021
Post Count: 360
Re: 2022-08-19 (Networking Issue Update)

I've just downloaded a 102 MB MCM1 dataset file at a painfully slow 350 KBps average (on a 70 MB fibre link). It took a couple of retries to grab a connection, but kept on plodding after that - no further connection drops.
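
For scale, here is a rough back-of-the-envelope calculation of those figures. It assumes that "350 KBps" means 350 kilobytes per second, that the "70 MB" link is 70 megabits per second, and that all units are decimal; those readings are assumptions, and the helper name is hypothetical.

```python
def transfer_seconds(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Time to move size_bytes at a steady rate_bytes_per_s."""
    return size_bytes / rate_bytes_per_s

FILE_SIZE = 102e6        # 102 MB dataset file (decimal megabytes assumed)
OBSERVED_RATE = 350e3    # 350 KB/s observed average (kilobytes/s assumed)
LINK_RATE = 70e6 / 8     # assumed 70 Mbit/s link, converted to bytes/s

observed = transfer_seconds(FILE_SIZE, OBSERVED_RATE)
ideal = transfer_seconds(FILE_SIZE, LINK_RATE)

print(f"observed:         {observed / 60:.1f} min")
print(f"at line rate:     {ideal:.1f} s")
print(f"link utilisation: {OBSERVED_RATE / LINK_RATE:.1%}")
```

Under those assumptions the download takes a little under five minutes where the link alone could manage it in roughly twelve seconds, so the client end is using about 4% of what it has available.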

In the past, BOINC projects have used external caching services for files like this, where the same file will be downloaded by multiple users. That might be worth considering while things sort themselves out.
[Aug 25, 2022 5:01:40 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1842
Re: 2022-08-19 (Networking Issue Update)

I've just downloaded a 102 MB MCM1 dataset file at a painfully slow 350 KBps average (on a 70 MB fibre link). It took a couple of retries to grab a connection, but kept on plodding after that - no further connection drops.

In the past, BOINC projects have used external caching services for files like this, where the same file will be downloaded by multiple users. That might be worth considering while things sort themselves out.
A CDN isn't likely to help with the issues at Krembil/WCG at all.
This 102MB file (and likely those 3x/4xMB sized files of ARP1) are the exception rather than the rule among the files being transmitted to the clients.
The vast majority are <1K files, each of which should fit into a single Ethernet frame, followed by a large number of files of maybe a couple (dozen) KB in size. Those files are unique to one specific WU, so there is no point in trying to cache them; you would just be stuck with the same issue, only duplicated.
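
As a rough sketch of the "fits into one frame" point: assuming a standard 1500-byte Ethernet MTU and IPv4 and TCP headers of 20 bytes each with no options (all assumptions, as are the example sizes and the helper name), each TCP segment carries at most 1460 bytes of payload. Protocol overhead such as HTTP response headers is ignored here, and it would eat into that budget.

```python
import math

MTU = 1500           # standard Ethernet MTU in bytes (assumed, no jumbo frames)
IPV4_HEADER = 20     # IPv4 header without options
TCP_HEADER = 20      # TCP header without options
MSS = MTU - IPV4_HEADER - TCP_HEADER  # 1460 bytes of payload per segment

def segments_needed(payload_bytes: int) -> int:
    """Number of full-size TCP segments needed to carry payload_bytes."""
    return max(1, math.ceil(payload_bytes / MSS))

# Illustrative sizes: a sub-1K file, a couple dozen KB, and the 102 MB dataset.
for size in (1_000, 24_000, 102_000_000):
    print(f"{size:>11} bytes -> {segments_needed(size)} segment(s)")
```

So a sub-1K file is a single packet on the wire, while the 102MB dataset is tens of thousands of them.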

350KB/sec seems slow. When I have been able to check on the transfer of those files, once my client gets a connection, I get those 102MB files at a minimum of 800-900KB/sec, more commonly 4-6MB/sec, all on a 500MBit/sec "Ethernet on premises" line here in the office.

As I mentioned before, I suspect a problem with the server setup at Krembil, either with the number of concurrent connections available on the IP stack or with the number of active file handles on the file system side (I think that was the reason why IBM switched the WCG cluster to ZFS at some point). Neither of these issues is even remotely helped by any caching...
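
To illustrate where a file-handle ceiling of this kind lives on a Unix-like system, the per-process open-file limit can be read with Python's standard `resource` module. This is only a sketch of how to look at such a limit on one's own machine; nothing in this thread confirms what the servers run or which limit, if any, is being hit. The `describe` helper is hypothetical.

```python
import resource  # Unix-only standard library module

# Soft limit: what the process may use now. Hard limit: the ceiling the soft
# limit can be raised to without extra privileges.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

def describe(limit: int) -> str:
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)

print(f"open files: soft={describe(soft)}, hard={describe(hard)}")
```

Every open socket and every open file counts against this limit, so a server process handling many simultaneous transfers can run into it long before bandwidth becomes the constraint.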

Ralf
----------------------------------------

[Aug 25, 2022 5:37:14 PM]
Richard Haselgrove
Senior Cruncher
United Kingdom
Joined: Feb 19, 2021
Post Count: 360
Re: 2022-08-19 (Networking Issue Update)

Fair enough. The examples of caching I was thinking of were for new application/version releases, where everyone wants a new binary at the same time, and - even worse - maybe a new cufft.dll file - some of those are huge. That's when caching can help, and that's why I thought it was worth adding to the brainstorming session.
[Aug 25, 2022 6:03:23 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1842
Re: 2022-08-19 (Networking Issue Update)

Fair enough. The examples of caching I was thinking of were for new application/version releases, where everyone wants a new binary at the same time, and - even worse - maybe a new cufft.dll file - some of those are huge. That's when caching can help, and that's why I thought it was worth adding to the brainstorming session.
CDN (Content Delivery Network) caching only helps for files that are requested multiple times: the more requests there are, and the more different files are requested repeatedly, the greater the savings and, possibly, the better the performance. By and large that doesn't apply to the WUs of WCG (or most other BOINC projects), since the files are mostly unique to each WU. That 102MB MCM file is an exception here; the only other files that would benefit are the initial project downloads, like the screen savers and other files that don't change for as long as you are attached to a project.
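
A minimal sketch of that argument, assuming an idealised cache that never evicts anything and 1000 clients; the file names, the client count, and the `hit_ratio` helper are all hypothetical. It compares per-WU unique files with a single shared file.

```python
from typing import Iterable

def hit_ratio(requests: Iterable[str]) -> float:
    """Fraction of requests served from an unbounded, never-evicting cache."""
    cached: set[str] = set()
    hits = 0
    total = 0
    for name in requests:
        total += 1
        if name in cached:
            hits += 1          # already fetched once: served by the cache
        else:
            cached.add(name)   # first request: must go to the origin server
    return hits / total if total else 0.0

clients = 1000
unique_inputs = [f"wu_{i}.input" for i in range(clients)]  # one file per WU
shared_dataset = ["shared_dataset.bin"] * clients          # same file for all

print(f"unique per-WU files: {hit_ratio(unique_inputs):.1%} hits")
print(f"one shared file:     {hit_ratio(shared_dataset):.1%} hits")
```

With unique files every request still reaches the origin server, so the cache saves nothing; with the shared file only the first request does.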

What has always puzzled me is that those <1KB files always stall at 107 bytes. In my experience that points to a file system issue rather than a bandwidth or connection issue: the connection does in fact get established, but the server can't push the data from the database to the IP stack fast enough to send it out.

Well, let's see when the data center guy is coming back from the summer break...

Ralf
----------------------------------------

[Aug 25, 2022 6:36:56 PM]
Jake1402
Senior Cruncher
USA
Joined: Dec 30, 2005
Post Count: 180
Re: 2022-08-19 (Networking Issue Update)

Well, I think that I will set all my computers to not get new tasks. I'm not sure what they are testing now but whatever it is didn't get fixed. This way I will reduce the load on the network.
----------------------------------------
Join the Chicago-IL-USA team!
2 AMD FX 8320/AMD R9 270X/Win 10
2 AMD FX 8320/AMD RX 560/Linux Mint 20.3
Intel Pentium G240/Win 10
[Aug 25, 2022 6:48:22 PM]
Adrian Ene
Cruncher
România
Joined: Feb 9, 2016
Post Count: 2
Re: 2022-08-19 (Networking Issue Update)

Hi,

Issue still present today




[Aug 25, 2022 10:08:51 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1842
Re: 2022-08-19 (Networking Issue Update)

Hi,

Issue still present today
No kidding...
----------------------------------------
[Edit 1 times, last edit by TPCBF at Aug 25, 2022 10:26:14 PM]
[Aug 25, 2022 10:25:45 PM]