Has anyone here seen the rename() system call get stuck for 100 seconds, or even up to 7-8 minutes?
I have code that parses files and puts them in output directories according to the configuration. I use rename() to move the files. Sometimes rename() gets stuck and does not return for anywhere from 10 seconds to 10 minutes. Most of the time it works fine. I have no idea why!
The system is an HP Itanium. It is a very busy system with 5 processors and 98% CPU usage, but I don't think that alone would make rename() hang, and certainly not for 10 minutes or even 10 seconds.
I thought I would post here for suggestions or feedback; otherwise, as a last resort, I will have to debug into rename(). :(
It is probably I/O contention, but it could be other things going on in the kernel. Is there a swap partition on the same local disk? Are you able to duplicate the problem in your development or test environment?
In any event, this is where I usually call on our sysadmins for help. They have tools to help identify these sorts of problems.
Debugging rename() isn't going to get you anywhere. You are almost certainly going to find that it is stuck in the kernel syscall.
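That said, it may still be worth recording how long each call actually takes, so the slow ones can be lined up against whatever else the system was doing at the time. Below is a minimal sketch, not a definitive implementation. The helper names timed_rename and move_file_logged and the threshold parameter are hypothetical, made up for illustration; only rename(), clock_gettime(), perror() and fprintf() are real library calls.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: calls rename() and stores the elapsed wall-clock
 * time in *elapsed_sec. Returns whatever rename() returned.
 * CLOCK_REALTIME is used because it is widely available; note that it
 * can jump if the system clock is adjusted during the call. */
int timed_rename(const char *from, const char *to, double *elapsed_sec)
{
    struct timespec start, end;
    int rc;

    clock_gettime(CLOCK_REALTIME, &start);
    rc = rename(from, to);
    clock_gettime(CLOCK_REALTIME, &end);

    *elapsed_sec = (double)(end.tv_sec - start.tv_sec)
                 + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    return rc;
}

/* Hypothetical helper: moves a file and writes a line to stderr when the
 * rename() fails or takes longer than threshold_sec (an assumed,
 * caller-chosen limit). */
int move_file_logged(const char *from, const char *to, double threshold_sec)
{
    double secs = 0.0;
    int rc = timed_rename(from, to, &secs);

    if (rc != 0) {
        perror("rename");
    } else if (secs > threshold_sec) {
        fprintf(stderr, "slow rename: %s -> %s took %.3f s\n", from, to, secs);
    }
    return rc;
}
```

Calling move_file_logged(src, dst, 1.0) in place of the plain rename() would log every move that takes longer than one second. This only measures the delay; it does not say where in the kernel the time went.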
I am sure it will be difficult to find the problem, but I have to find some solution. You are correct, it might be blocking somewhere deep down where it is not possible to debug.
The scenario is that files are generated continuously from switches carrying huge amounts of telecom data. If the files are not parsed quickly, lakhs of files accumulate, and that is what is happening.
We can't simulate the exact production environment, which runs lots of heavy applications. My development machine runs nothing else, and it works fine there.
As for your questions, here is the answer, and lots of it.
It is an HP-UX Itanium machine.
uname -a gives this:
HP-UX optmed B.11.31 U ia64 3312227422 unlimited-user license
And here is the load:
System: optmed Fri May 29 20:04:29 2009
Load averages: 52.00, 51.37, 47.32
273 processes: 154 sleeping, 119 running
Cpu states:
CPU LOAD USER NICE SYS IDLE BLOCK SWAIT INTR SSYS
0 51.82 0.0% 8.7% 91.3% 0.0% 0.0% 0.0% 0.0% 0.0%
5 52.19 0.0% 0.0% 100.0% 0.0% 0.0% 0.0% 0.0% 0.0%
--- ---- ----- ----- ----- ----- ----- ----- ----- -----
avg 52.00 0.0% 4.4% 95.6% 0.0% 0.0% 0.0% 0.0% 0.0%
Today the load is lower. I think it is a holiday where the production system is. My small process runs among all these processes and fights for CPU. :)
Anyway, thanks for your advice.