Total posts in this thread: 121
This topic has been viewed 98185 times and has 120 replies.
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: DDDT-2 Work Unit distribution updated

Did you have a lapse, or where in the standard BOINC world is that documented to apply to WCG too?

I would assume WCG is using the code they themselves have developed and later got added to the BOINC source, first in 2006, with various changes to the "reliable" mechanism made in March 2008.

hmmm, from reading latest [possible not yet incorporated code, thus won't copy] it says that the current reliability percent is decremented/incremented by 5% from last percent. Nowhere that I can read does it say it resets on a single error to the 10% "as were it a new device" needing again > 76 valid tasks. Kind of silly if a single task fails because of WCG app code issues, WCG then losing a volunteer device for possibly days from the reliability pool.

At any rate, ran the ropes on it, made a matrix and got a rush job after an error in much much less than 77.

edit: now I'm replying to posts that have been removed while typing. Doubts?
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
----------------------------------------
[Edit 1 times, last edit by Sekerob at Mar 15, 2010 11:16:56 PM]
[Mar 15, 2010 11:14:28 PM]
Ingleside
Veteran Cruncher
Norway
Joined: Nov 19, 2005
Post Count: 974
Re: DDDT-2 Work Unit distribution updated

hmmm, from reading latest [possible not yet incorporated code, thus won't copy] it says that the current reliability percent is decremented/incremented by 5% from last percent.

Don't confuse + with *

On valid results it's multiplied, which decreases it by 5%. But on error it adds 0.05, if you're looking at the old code.
http://boinc.berkeley.edu/trac/changeset/12466

The more recent code uses 0.1. Since it's from August 2008, I would expect WCG upgraded to this code before making the v6.2.xx client available...
http://boinc.berkeley.edu/trac/changeset/15771

Nowhere that I can read does it say it resets on a single error to the 10% "as were it a new device" needing again > 76 valid tasks. Kind of silly if a single task fails because of WCG app code issues, WCG then losing a volunteer device for possibly days from the reliability pool.

Not reset, but increased, so if you're generating multiple validation errors in a row, you'll need more than 77 to get "reliable" again.

At any rate, ran the ropes on it, made a matrix and got a rush job after an error in much much less than 77.

Computation/client error or validation error? Please remember, it is the Validator that increases/decreases the host.error_rate, but the validator only looks at tasks reported as "success", not at any tasks that have been reported as bad by the client...
----------------------------------------


"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
[Mar 16, 2010 1:13:37 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: DDDT-2 Work Unit distribution updated

As the A types are so long, how about sending out the quicker families first? The erlc WUs were around 750 boinc credits long. The ts02 WUs are around 550. If you send whichever family has fastest A type units next, that should tend to allow WCG to generate B and C WUs sooner.
[Mar 16, 2010 2:23:08 AM]
wplachy
Senior Cruncher
Joined: Sep 4, 2007
Post Count: 423
Re: DDDT-2 Work Unit distribution updated

From my small view of this I suggest that 4 copies of each type A be sent to reliable devices without the "fast return" criteria. When the Minimum Quorum is met Server Abort any WUs not in progress.

The suggestion is based on what I saw from the second set of type A that were sent and then resent based on the new criteria. Of the 16 I picked up, 8 were completed by the first distribution before the second group completed them, and 15 were completed by the first within hours of the second group. In fact, 2 of the second distribution were Server Aborted.

I believe this would give more people a chance to obtain WUs and based on this small set reduce the long, long wait for Minimum Quorum to be met that was experienced with the first set.
----------------------------------------
Bill P

[Mar 16, 2010 5:59:50 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: DDDT-2 Work Unit distribution updated

hmmm, from reading latest [possible not yet incorporated code, thus won't copy] it says that the current reliability percent is decremented/incremented by 5% from last percent.

Don't confuse + with *

On valid results it's multiplied, which decreases it by 5%. But on error it adds 0.05, if you're looking at the old code.
http://boinc.berkeley.edu/trac/changeset/12466

The more recent code uses 0.1. Since it's from August 2008, I would expect WCG upgraded to this code before making the v6.2.xx client available...
http://boinc.berkeley.edu/trac/changeset/15771

Nowhere that I can read does it say it resets on a single error to the 10% "as were it a new device" needing again > 76 valid tasks. Kind of silly if a single task fails because of WCG app code issues, WCG then losing a volunteer device for possibly days from the reliability pool.

Not reset, but increased, so if you're generating multiple validation errors in a row, you'll need more than 77 to get "reliable" again.

At any rate, ran the ropes on it, made a matrix and got a rush job after an error in much much less than 77.

Computation/client error or validation error? Please remember, it is the Validator that increases/decreases the host.error_rate, but the validator only looks at tasks reported as "success", not at any tasks that have been reported as bad by the client...

Let's say that you're losing it in translation and testing. You wrote, emphasis mine:
If it has a single validation-error, the computer needs at least 77 valid tasks in a row afterwards to get below 0.002 again ...

Read the Matrix and comments. I might even add the up/down table for you, showing how one gets from 10% to 0.2% in 76 steps. If 0.2% + 1 [validation] error adding 5% of 0.2% = 0.21% gives a different outcome than 0.2%*1.05, give someone a call.

edit:

PS I wrote '[possible not yet incorporated code, thus won't copy]' as in: I was shown the code PRESENTLY used at WCG. We're on server version 6.01 and understand that Dr. A is not taking in any WCG code contributions until WCG has upgraded the server to whatever version is now common... and that's being worked on. Now you need to start understanding and accepting that WCG is NOT your run-of-the-mill standard BOINC grid. We're interested in how things work at WCG... that's what we support. All these reconstructions from code elsewhere of how it might work here are confusing members.

edit2: For your edification, the present WCG rules work such that any rating below 0.1% is held at 0.1%, i.e. it can't get lower. To get there from a new device position needs 90 valid results. Similarly, in the worst case, the rating never goes above 1.0, i.e. 100%. In the latter case the daily quota has long since decreased to 1 per day, per core, anyhow, for a previously regularly performing device. You then need 122 valid results (it needs 48 errors to get from 10% to 100%, at WCG). Think you posted that 122 number... ah yes, here http://www.worldcommunitygrid.org/forums/wcg/...ad,28648_offset,50#271000
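
The 90, 48 and 122 figures quoted above can be checked with a short sketch. It is not WCG or BOINC code: it assumes a purely multiplicative reading of the rule (5% down per valid result, 5% up per error), a floor of 0.1%, a ceiling of 100%, and that the 122 refers to getting from 100% back below the 0.2% "reliable" cutoff. All names are hypothetical.

```python
FLOOR = 0.001     # 0.1%: the rating is held here (assumed from the post above)
CEILING = 1.0     # 100%: the rating never goes above this
STEP = 0.05       # 5% relative change per result (assumed multiplicative rule)
RELIABLE = 0.002  # 0.2% "reliable" cutoff discussed in this thread


def on_valid(rate):
    """One valid result: shrink the rating by 5%, never below the floor."""
    return max(rate * (1 - STEP), FLOOR)


def on_error(rate):
    """One error: grow the rating by 5%, never above the ceiling."""
    return min(rate * (1 + STEP), CEILING)


def steps_until(rate, update, done):
    """Apply update repeatedly; return how many steps until done(rate) holds."""
    steps = 0
    while not done(rate):
        rate = update(rate)
        steps += 1
    return steps


to_floor = steps_until(0.10, on_valid, lambda r: r <= FLOOR)
to_ceiling = steps_until(0.10, on_error, lambda r: r >= CEILING)
back_to_reliable = steps_until(CEILING, on_valid, lambda r: r < RELIABLE)
print(to_floor, to_ceiling, back_to_reliable)
```

Under those assumptions the three counts come out as 90, 48 and 122, matching the numbers in the post. A different error rule (for example adding a fixed 0.05 or 0.1) would give different counts.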

Note to self: Need to check if a quota 1 condition resets the rating to 1.0 (100%) or if it keeps normally incrementing per the 5% per error rule until 1.0 is reached.
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
----------------------------------------
[Edit 2 times, last edit by Sekerob at Mar 16, 2010 9:56:17 AM]
[Mar 16, 2010 6:48:23 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: DDDT-2 Work Unit distribution updated

As the A types are so long, how about sending out the quicker families first? The erlc WUs were around 750 boinc credits long. The ts02 WUs are around 550. If you send whichever family has fastest A type units next, that should tend to allow WCG to generate B and C WUs sooner.

The ts are verification tasks of limited quantity [I think], put through the ABC cycle, and I'm not sure but think all A-types of that set have already left. The meat, the erlc, hivp etc., have fairly stable run times, and the erlc A-types are pretty much complete. The slowing factor is that the results have to be sent to the scientists, not WCG techs who do the replication and feeding.

erlc A type done, we've got 17 targets x 1000 x 2 more of these A types to go.

wplachy,

4 copies means that potentially all 4 are started before any validation can be attempted, which means throwing 30-100+ CPU hours in the bin. The client has to take the contact initiative and do the ready-to-report clearing [on next work fetch, for instance], so depending on settings that can be multiple additional hours. So I'm not in favor of that suggestion.
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Mar 16, 2010 7:07:29 AM]
wplachy
Senior Cruncher
Joined: Sep 4, 2007
Post Count: 423
Re: DDDT-2 Work Unit distribution updated

Sekerob
I understand your point, but without the A completion B+C doesn't happen. I personally have no problem with burning the cycles or time to move this project forward.
----------------------------------------
Bill P

[Mar 16, 2010 8:14:28 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: DDDT-2 Work Unit distribution updated

My interpretation is that the progress, or lack thereof, that comes from sticking to the best grid efficiency was a conscious decision. It might be different were the latest client version stable and widely installed, in combination with the latest BOINC server version. Neither is.
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Mar 16, 2010 8:28:47 AM]
I need a bath
Senior Cruncher
USA
Joined: Apr 12, 2007
Post Count: 347
Re: DDDT-2 Work Unit distribution updated

Also, I might add, throwing cycles at this project takes them away from other projects. Think of how much we are doing for the other projects while we are forced to wait!
----------------------------------------

[Mar 16, 2010 11:05:13 AM]
Ingleside
Veteran Cruncher
Norway
Joined: Nov 19, 2005
Post Count: 974
Re: DDDT-2 Work Unit distribution updated

Shuffling a little...
Let's say that you're losing it in translation and testing. You wrote, emphasis mine:
If it has a single validation-error, the computer needs at least 77 valid tasks in a row afterwards to get below 0.002 again ...

PS I wrote '[possible not yet incorporated code, thus won't copy]' as in: I was shown the code PRESENTLY used at WCG.

Well, I've been using 0.1, since this is the current BOINC code and by far the easiest to look up, and I didn't even remember initially that it had been 0.05 before. AFAIK the vast majority of WCG users aren't getting validation errors anyway, so the important thing to know is the initial 77 to get "reliable".

But anyway, to show the calculations...

Read the Matrix and comments. I might even add the up/down table for you, showing how one gets from 10% to 0.2% in 76 steps. If 0.2% + 1 [validation] error adding 5% of 0.2% = 0.21% gives a different outcome than 0.2%*1.05, give someone a call.

Using % can more easily lead to errors, but since you want it this way...

Start, new host: initial 10%:
76 validated gives 10% * 0.95^76 = 0.202% > 0.2% in other words not reliable.
77 validated gives 10% * 0.95^77 = 0.193% < 0.2% and can be reliable.

For error after 77 validated: Starts at 0.193%:
0.193% * 0.95 + 5% = 5.183%
To become reliable again:
5.183% * 0.95^63 = 0.205% > 0.2% and not reliable.
5.183% * 0.95^64 = 0.194% < 0.2% and reliable again.

Or, if you use the 0.1, to be a little more "conservative" in the estimates:
0.193% + 10% = 10.193%
10.193% * 0.95^77 = 0.196% < 0.2% and reliable again.
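
The arithmetic above can be reproduced with a small sketch. It only encodes the rules as used in these calculations (multiply by 0.95 per valid result; on error either `rate * 0.95 + 0.05` for the old variant or `rate + 0.1` for the newer one) and is not copied from the BOINC source. The function names are hypothetical.

```python
RELIABLE_THRESHOLD = 0.002  # 0.2%: a host must be strictly below this
NEW_HOST_RATE = 0.10        # 10% starting error rate for a new host
DECAY = 0.95                # multiplier applied for each valid result


def valid_results_needed(rate, threshold=RELIABLE_THRESHOLD, decay=DECAY):
    """Count consecutive valid results until rate falls below threshold."""
    count = 0
    while rate >= threshold:
        rate *= decay
        count += 1
    return count


def after_error_old(rate):
    # Old variant as used in the calculation above: rate * 0.95 + 0.05
    return rate * DECAY + 0.05


def after_error_new(rate):
    # Newer variant as used in the calculation above: rate + 0.1
    return rate + 0.1


new_host = valid_results_needed(NEW_HOST_RATE)
rate_when_reliable = NEW_HOST_RATE * DECAY ** new_host
recover_old = valid_results_needed(after_error_old(rate_when_reliable))
recover_new = valid_results_needed(after_error_new(rate_when_reliable))
print(new_host, recover_old, recover_new)
```

This prints 77, 64 and 77: a new host needs 77 valid results, and a single error right after becoming reliable costs 64 more under the 0.05 variant or 77 more under the 0.1 variant.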

edit2: For your edification, the present WCG rules work such that any rating below 0.1% is held at 0.1%, i.e. it can't get lower.

In the latter case the daily quota has long since decreased to 1 per day, per core, anyhow, for a previously regularly performing device.

(it needs 48 errors to get from 10% to 100%, at WCG).

confused

None of these statements seems to be consistent with either the currently checked-in BOINC code or the old code supplied by WCG and checked in in 2008...

So, new code not yet supplied by WCG to BOINC?
Or, if old checked-in code, do you have any exact links to the particular code/version you're looking at?
----------------------------------------


"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
----------------------------------------
[Edit 1 times, last edit by Ingleside at Mar 16, 2010 6:18:34 PM]
[Mar 16, 2010 6:01:42 PM]