World Community Grid Forums
Category: Support · Forum: Website Support · Thread: Server Errors [ RESOLVED ]
Thread Status: Active · Total posts in this thread: 352
knreed
Former World Community Grid Tech · Joined: Nov 8, 2004 · Post Count: 4504 · Status: Offline
Meanwhile, WCG have not said that they are now running regular scans for large directory files that have a high proportion of entries for deleted files. They have added more RAM to the servers, but unless there is a lid on the size of the problem, they may hit the RAM limit again at some stage.

The filesystem at this point contains about 13.5 million files in a directory structure that has 2 top-level directories and 1,024 subdirectories under each of those two top-level directories. This gives about 6,500 files per directory. We see a turnover of about 50-65% of the files every 2-3 days; it is a very volatile filesystem.

The issue with the large number of directory blocks being assigned to the directory inode occurs because once GPFS assigns blocks to a directory inode, they are never released, even if the number of files in the directory is significantly reduced. We had a bug last year that resulted in a very large number of files being created in the subdirectories (specifically, temp files were created that were not being deleted properly). This resulted in the directories having 60,000-70,000 files in them, so GPFS assigned more blocks to store this data; some of the directory inodes had 14 MB of blocks assigned.

Only about 1/4 of the subdirectories were impacted in this way, and based on our calculations, GPFS needed about 9 GB of RAM to cache enough of the inodes to achieve optimal performance. We only had 2.0 GB of RAM assigned to the cache, so performance was not optimal. Once we reduced the size of the directory inodes down to a maximum of 1 MB, the cache only needs to be about 2.0 GB to perform optimally. We added RAM to the servers so that we could increase the cache and still have additional RAM available. We are also adding monitoring to the servers so that we will automatically become aware if the size of the directory inodes grows excessively again.
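The kind of monitoring described above can be approximated from user space: on most Unix filesystems (GPFS included), a `stat()` on a directory reports the size of the directory file itself, and that size stays large even after most of its entries are deleted. A minimal sketch, assuming only standard library calls (this is not WCG's actual tooling; the threshold and function names are illustrative):

```python
# Hedged sketch: walk a directory tree and flag directories whose
# on-disk directory size exceeds a threshold, since a directory that
# once held tens of thousands of files keeps its oversized blocks.
import os
import sys

THRESHOLD_BYTES = 1 * 1024 * 1024  # flag directory inodes above 1 MB


def oversized_dirs(root, threshold=THRESHOLD_BYTES):
    """Yield (path, size) for directories whose directory file
    exceeds `threshold` bytes."""
    for dirpath, _dirnames, _filenames in os.walk(root):
        size = os.stat(dirpath).st_size  # size of the directory file itself
        if size > threshold:
            yield dirpath, size


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, size in oversized_dirs(root):
        print(f"{size:>12d}  {path}")
```

Run periodically (e.g. from cron) against the two top-level directories, this would surface any subdirectory whose directory inode has grown past the 1 MB ceiling mentioned above; shrinking it back still requires re-creating the directory.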
Rickjb
Veteran Cruncher · Australia · Joined: Sep 17, 2006 · Post Count: 666 · Status: Offline
Thanks, knreed: "We are also adding monitoring to the servers so that we will automatically become aware if the size of the directory inodes grows excessively again."

That's what I was really asking. It would be better if the monitoring and re-creation of directories were handled automatically by the filesystem driver software; that might avoid the need to take the system offline to do it manually. ( ==> IBM wishlist? ) I haven't looked at *ix filesystems since the 16-bit days (e.g. Bell Labs UNIX v6). The directory behaviour you describe is the way it was done back then, and it has probably been done the same way ever since. On to the next WCG hurdle ... HCCGPU?