Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't fork() when profiling already started #335

Closed
alk opened this issue Aug 23, 2015 · 3 comments
Closed

Can't fork() when profiling already started #335

alk opened this issue Aug 23, 2015 · 3 comments

Comments

@alk
Copy link
Contributor

alk commented Aug 23, 2015

Originally reported on Google Code with ID 332

What steps will reproduce the problem?
1. launch program with LD_PRELOAD & CPUPROFILE=/tmp/profile
2. Wait 'till your process perform fork()
3. ... see top / strace - it can't fork (actually can't clone(2))

What is the expected output? What do you see instead?
Expected result: process forked
Got: it eats 100% of CPU and never success with fork

What version of the product are you using? On what operating system?
CentOS release 5.3 (Final) 2.6.18-164

Please provide any additional information below.

The output of `strace -p ...`:

rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], [PROF], 8) = 0
rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], ~[KILL STOP RTMIN RT_1], 8) = 0
getpid()                                = 20942
open("/proc/20942/maps", O_RDONLY)      = 15
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2af177c10000
read(15, "00400000-006d3000 r-xp 00000000 "..., 4096) = 3972
close(15)                               = 0
munmap(0x2af177c10000, 4096)            = 0
rt_sigprocmask(SIG_SETMASK, ~[KILL STOP RTMIN RT_1], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, [PROF], NULL, 8) = 0
rt_sigreturn(0x1ab34d7c)                = 56
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x2af1739e5090)
= ? ERESTARTNOINTR (To be restarted)
------------------------------^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I think that's because of http://lxr.linux.no/linux+v2.6.38/kernel/fork.c#L1264

It seems what delivering of ITIMER_PROF is always enqueued, so clone(2) will never
succeed on high values of "CPUPROFILE_FREQUENCY"; I have used the default (100), and
when I teared it down (-> 50) process forked OK and now I can profile it's child.

Surely, it will be OK if I will use ProfilerStart() _after_ fork(), but can't it be
done automatically? ( because in last case I must explicitly link with -lprofiler).

Reported by jdkrot on 2011-05-13 14:18:53

@alk
Copy link
Contributor Author

alk commented Aug 23, 2015

Hmm, weird.  If I understand the report, it can happen when the cpu profile frequency
is fast enough a clone() call can't finish between the invocations.  I haven't heard
any other reports of this, so it seems like it's pretty uncommon.  (And I would expect
clone to be a lot faster than 10ms.)  Though maybe it's just that most users aren't
using fork.

I don't see how this could be fixed in perftools, without capturing the fork syscall,
which I'd rather not do.  The best thing to do may just be to lower the profile frequency,
like you've done.

Nice catch figuring out what was going on, btw!

Reported by csilvers on 2011-05-15 22:38:34

  • Labels added: Type-Defect, Priority-Low

@alk
Copy link
Contributor Author

alk commented Aug 23, 2015

> And I would expect clone to be a lot faster than 10ms

I think that's because of my very specific case: I'm allocating a lot of memory in
master process (~400 Mb), and then perform fork() [using this memory 'read-only' within
childs, what gives be benefit]. Since clone() copy memory pages references, it may
take long.

I could be wrong, but it seems logical. In addition, I've moved most of allocations
into child code, and now master-process (which became <10Mb) fork()s fine even with
hight profile frequency.

Reported by jdkrot on 2011-05-16 14:48:15

@alk
Copy link
Contributor Author

alk commented Aug 23, 2015

Seems great!  Given there's a good workaround for you, and I'm not sure we want to change
the code too much to handle something like this, I'm going to close this WorkingAsIntended.

Reported by csilvers on 2011-05-24 04:32:18

  • Status changed: NotABug

@alk alk closed this as completed Aug 23, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant