Thread Status: Active
Total posts in this thread: 2802
This topic has been viewed 549468 times and has 2801 replies
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 11798
Status: Offline
Re: Work Available

Thank you for the confirmation.

Mike
[Jan 25, 2022 12:12:57 AM]
MJH333
Senior Cruncher
England
Joined: Apr 3, 2021
Post Count: 199
Status: Offline
Re: Work Available

Mike,

I've now picked up ARP1_0031033_092, a triplet with a 36-second timestep.

Also, ARP1_0033231_090, which I mentioned yesterday, has now validated. I was puzzled to see that it was a twin, rather than a triplet. I thought all these early generation tasks would be triplets, but I've probably got the wrong end of the stick here!

Cheers,
Mark
[Jan 25, 2022 1:12:37 PM]
spRocket
Senior Cruncher
Joined: Mar 25, 2020
Post Count: 234
Status: Offline
Re: Work Available

ARP1_0034392_089 has completely crashed out. Eight systems, eight errors.
[Jan 25, 2022 3:17:42 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 11798
Status: Offline
Re: Work Available

Thank you, Mark.

31033 has been recorded.

You were right about triplets. I suspect it is one of the very few units that were unstuck this week. Its next incarnation may be as a triplet.

Mike
[Jan 25, 2022 4:44:44 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 11798
Status: Offline
Re: Work Available

Thank you, spRocket. I thought it would.

Mike
[Jan 25, 2022 4:46:15 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 11798
Status: Offline
Re: Work Available

Mini update.

The leaders have all moved on a generation to 131, 121 & 116.

I suspect that a very few of the stuck units have been set free, but 2 of the previously unstuck units have now become stuck again. I reckon there are about 67 units still stuck.

Mike
[Jan 25, 2022 5:11:22 PM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Status: Offline
Re: Work Available

A quick update on the stuck workunits.

There are 47 stuck workunits left - the rest have been set in motion.

Of these:

  • 3 might have to be re-run from the beginning
  • 3 cannot be processed even after changing to the more granular step_size of 24
Those 6 will be examined by Delft before we can get them running again.

We are re-running the remaining 41 on our servers before we put them back on the grid. This is a slightly slow process, but it allows us to be sure that we understand the issue and that they are running properly before we send them out again. Some have to be re-run for multiple generations, and as a result we have only been able to put 4-5 back into circulation each day. With the help of Delft, we now have a way to detect the issue in the validator, so from now on we will identify it in the first generation in which it occurs. We shouldn't get these stuck jobs again, although we will still have to periodically re-run the jobs with a smaller step size.

I hope to have the remaining 41 running again within the next 7-10 days.
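
As a rough sanity check on that estimate, the arithmetic can be sketched as below. This is only an illustration: it assumes the release rate stays at a constant 4-5 units per day, and the function name `days_to_clear` is made up for the example rather than taken from any WCG tooling.

```python
import math


def days_to_clear(backlog: int, per_day: int) -> int:
    """Whole days needed to release `backlog` units at `per_day` units per day."""
    return math.ceil(backlog / per_day)


stuck_total = 47      # stuck workunits left
needs_delft = 3 + 3   # possible full re-runs + units failing even at step_size 24
backlog = stuck_total - needs_delft  # units being re-run on the servers

# Assumption: the observed rate of 4-5 units per day stays constant.
fastest = days_to_clear(backlog, 5)
slowest = days_to_clear(backlog, 4)
print(f"{backlog} units: about {fastest} to {slowest} days")
```

At a steady 4-5 units per day, 41 units take about 9 to 11 days, which is in the same range as the 7-10 day target.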
[Jan 25, 2022 7:43:31 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 11798
Status: Offline
Re: Work Available

Thank you, Kevin. Now we know.

Have you allowed for 0033711_099 and 0034392_089, which became stuck again this week?

Cheers

Mike
----------------------------------------
[Edit 2 times, last edit by Mike.Gibson at Jan 25, 2022 10:15:22 PM]
[Jan 25, 2022 10:13:57 PM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 1980
Status: Recently Active
Re: Work Available

Another undiscovered triplet, 32-bit, time_step=36:
workunit 130444485
ARP1_0004211_085_0  Linux Fedora  Valid    2022-01-25T15:27:12  2022-01-26T07:03:02  14.12/14.42  576.9/609.0
ARP1_0004211_085_1  Linux Ubuntu  Valid    2022-01-25T15:26:24  2022-01-26T05:56:14  13.25/13.29  641.1/609.0
ARP1_0004211_085_2  Linux Debian  InProgr  2022-01-25T15:33:38  2022-01-30T03:33:38   0.00/0.00     0.0/0.0
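
For anyone keeping their own tally, the result names above can be split into their parts with a small parser. This is a sketch under assumptions drawn only from how names are used in this thread: the first number is taken to be the unit, the second the generation, and the optional trailing number the copy within a twin or triplet. The names `ResultName` and `parse_result_name` are hypothetical, not part of any WCG or BOINC API.

```python
import re
from typing import NamedTuple, Optional


class ResultName(NamedTuple):
    unit: str               # e.g. "0004211" (kept as text to preserve leading zeros)
    generation: int         # e.g. 85
    replica: Optional[int]  # e.g. 2, or None when the name has no copy suffix


# Assumed layout: ARP1_<7-digit unit>_<3-digit generation>[_<copy number>]
_PATTERN = re.compile(r"^ARP1_(\d{7})_(\d{3})(?:_(\d+))?$")


def parse_result_name(name: str) -> ResultName:
    """Split an ARP1 name into unit, generation and optional copy number."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"not an ARP1 name in the assumed layout: {name!r}")
    unit, generation, replica = match.groups()
    return ResultName(
        unit,
        int(generation),
        int(replica) if replica is not None else None,
    )


print(parse_result_name("ARP1_0004211_085_2"))
print(parse_result_name("ARP1_0034392_089"))
```

Names that do not fit the assumed layout raise a `ValueError` rather than being guessed at.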


Cheers,
Adri
[Jan 26, 2022 11:07:45 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Work Available

spRocket wrote: "ARP1_0034392_089 has completely crashed out. Eight systems, eight errors."

I received this one with a 24-second time step. It has been running for about 2 hours so far.
[Jan 26, 2022 2:44:38 PM]