| Date: Tue, 23 Sep 2008 09:59:21 +0100
| From: "Philip Reynolds" <philip.reynolds at gmail.com>
|
| 2008/9/22 Brian Foster <blf at blf.utvinternet.ie>:
| > this has me puzzled: I want to see the resource limits
| > (rlimits) for some _other_ process. is there a command,
| > file (e.g., in /proc ?), or system call that will let me
| > do this. Linux 2.6.4 (SUSE 9.1). [ ... ]
|
| The short answer is, as far as I'm aware, no. [ ... ]
|
| If you're looking to see if/when the program hits a resource
| limit, an strace(1) should show that up as the program will
| receive an appropriate signal.

actually, I was guessing at why the program wasn't
writing a ‘core’ dump on a signal 11 (the core file size
rlimit was probably 0). I wanted to verify whether that was
the case by working my way through the relevant process tree,
especially after I changed the rlimit without effect! ;-(
(I realise there are many other possible reasons for
no ‘core’.)
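
for illustration only, here is a minimal sketch of the check
I was after. it assumes a kernel that provides
/proc/<pid>/limits, which (as far as I know) only appeared
in Linux 2.6.24, so it would not help on 2.6.4. the function
names are made up for this example.

```python
import re

def parse_limits(text):
    """Parse the text of /proc/<pid>/limits into {name: (soft, hard)}."""
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        # The limit name is followed by a run of spaces, then the
        # soft and hard values as whitespace-separated tokens.
        m = re.match(r"(.+?)\s{2,}(\S+)\s+(\S+)", line)
        if m:
            limits[m.group(1)] = (m.group(2), m.group(3))
    return limits

def core_limit_of(pid):
    """Return (soft, hard) core file size limit strings for pid, or None.

    Assumes /proc/<pid>/limits exists and is readable by the caller.
    """
    with open("/proc/%d/limits" % pid) as f:
        return parse_limits(f.read()).get("Max core file size")
```

a soft value of "0" for "Max core file size" would confirm
the guess; values are left as strings because the file prints
"unlimited" rather than a number.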

the program in question is a worker started by a (non-root)
daemon under circumstances which are Very Difficult (AFAIK)
to control. the signal 11's happen perhaps once a day in
normal usage. (a useful fact is it seems, from examining
the opaque logs, the worker is consistently failing, and
is failing soon after being started; given it can run for
over a day when it works, that's Very Useful .... ;-) )

I wanted to first verify which worker by examining ‘core’,
and possibly also, of course, get a clew as to what the
feck is going wrong. but even with what should be an
inherited non-0 core file rlimit, there's no ‘core’.
(again, I realise there are many other possible reasons
for no ‘core’.)

since I've a reasonable guess which worker it is, what
I'll try now is substituting a front-end shell script
to do a bit of logging, use strace(1) (albeit that
probably won't tell me much since the worker is Very
compute-intensive), and perhaps check/set the core file
rlimit, &tc. if that not-too-invasive front-end works,
then I'll try a rather more invasive front-end; e.g.,
one which runs the worker using gdbserver(1). the idea
is when the worker is fired up, it'll block waiting for
a gdb(1) connection. that connection, which I can make
at my leisure, should allow me to watch in real-time
what is going on ....
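
a rough sketch of the not-too-invasive front-end, written
in Python as a stand-in for the shell script. the WORKER
and LOGFILE paths and the function names are hypothetical
placeholders, not the real set-up. it logs the inherited
core file rlimit, raises the soft limit as far as the hard
limit allows, and then execs the real worker.

```python
import os
import resource
import sys
import time

WORKER = "/path/to/real-worker"       # hypothetical: the renamed real worker
LOGFILE = "/tmp/worker-frontend.log"  # hypothetical log location

def log(message, path=LOGFILE):
    """Append one timestamped line, tagged with our pid, to the log."""
    with open(path, "a") as f:
        f.write("%s pid=%d %s\n" % (time.ctime(), os.getpid(), message))

def raise_core_limit():
    """Raise the soft core rlimit to the hard limit.

    Returns (old_soft, (new_soft, new_hard)). An unprivileged process
    may raise its soft limit up to, but not beyond, the hard limit.
    """
    old_soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return old_soft, resource.getrlimit(resource.RLIMIT_CORE)

def launch(args):
    """Log the rlimit change, then replace this process with the worker."""
    old_soft, new_limits = raise_core_limit()
    log("core rlimit: soft was %r, now %r" % (old_soft, new_limits))
    # The more invasive variant would exec gdbserver(1) here instead,
    # e.g. ["gdbserver", ":2345", WORKER] + args (the port is arbitrary).
    os.execv(WORKER, [WORKER] + args)

if __name__ == "__main__" and len(sys.argv) > 1:
    launch(sys.argv[1:])
```

because exec keeps the same pid and rlimits, the worker
inherits whatever the front-end set. a hard limit of 0
cannot be raised this way, but the log line would at least
show that this is the case.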

( sorry for being a bit vague as to what the programs
are. I want to get a handle on what is wrong before
making any reports/accusations: my set-up is not at
all “standard” and I cannot rule out that I am, in
effect, “causing” the nonetheless poor behaviour.
(and yes, things have changed recently (right about
the time the signal 11's started); but it is, AFAIK,
awkward to revert to the previous working set-up.) )

cheers!
-blf-
--
“How many surrealists does it take to | Brian Foster
change a lightbulb? Three. One calms | somewhere in south of France
the warthog, and two fill the bathtub | Stop E$$o (ExxonMobil)!
with brightly-coloured machine tools.” | http://www.stopesso.com