Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Profiling timer always armed on initialization #409

Closed
alk opened this issue Aug 23, 2015 · 19 comments
Closed

Profiling timer always armed on initialization #409

alk opened this issue Aug 23, 2015 · 19 comments

Comments

@alk
Copy link
Contributor

alk commented Aug 23, 2015

Originally reported on Google Code with ID 406

The code currently arms the profiling timer unconditionally in ProfileHandler::RegisterThread
for timer sharing detection. Unfortunately, this means the profiling timer will stay
active, even if profiling is inactive and we have shared timers in case if no other
threads get registered later.

This poses a problem for Chromium, which links the profiling code into the executable.
See http://code.google.com/p/chromium/issues/detail?id=115149

Reported by mnissler@google.com on 2012-02-21 20:50:07

@alk
Copy link
Contributor Author

alk commented Aug 23, 2015

Proposed fix is attached. See http://codereview.chromium.org/9416072 for more information.

Reported by mnissler@google.com on 2012-02-21 21:07:07


- _Attachment: [gperftools_prof_signals.patch](https://storage.googleapis.com/google-code-attachments/gperftools/issue-406/comment-1/gperftools_prof_signals.patch)_

@alk
Copy link
Contributor Author

alk commented Aug 23, 2015

From Mattias:

The patch is a bit tricky, here is some background (taken from the chromium code review):

The issue with the timer setup code is that older kernels and pthread
implementations (linuxthreads) incorrectly implemented timers per-thread instead
of per-process as mandated by POSIX. Arming timers is much harder in that model,
since the setitimer call must actually be issued by each thread that's supposed
to be profiled.

The current gperftools code would just arm the timers on thread startup, no
matter whether profiling was enabled or not, which obviously isn't an option for
us.

The idea behind my patch is to keep timers disabled initially. In order to get
them armed, the code sends SIGPROF signals to the individual threads via
pthread_kill(), at which point the thread check whether they need to arm their
timers.

In theory, this works just fine, but there is a caveat: In order to send
SIGPROFs, I need the thread identifiers of the registered threads. It's no big
deal to record them when they register. However, a problem arises at thread
termination. In theory, the pthread identifier for the thread can become invalid
when the thread terminates (in case of detached threads) or when the
corresponding pthread_join() call completes (for non-detached threads). However,
the profiling code doesn't have a simple way to learn about thread terminating
and/or thread identifiers becoming invalid. So I have added an UnregisterThread
function that threads should call at termination.

In practice, UnregisterThread is mostly not required, since pthread_kill() will
return ESRCH if it doesn't know the thread. That's assuming that the thread
identifier is still usable at that point though, which pthread doesn't
guarantee. For linuxthreads, that assumption holds. For NPTL, it doesn't hold in
all cases, and I have indeed been able to crash pthread_kill for an exotic
scenario using user-supplied stack memory that I unmap after terminating the
thread. Similarly, for recycled thread stacks the thread identifiers might get
recycled, which can lead to a thread being profiled that did never register for
it (I don't think that's much of an issue though since we assume all threads
should register).

I have tested on a modern NPTL-based system (kernel 2.6.38.8 / eglibc 2.11.1) and an
older
LinuxThreads-based system (kernel 2.4.27 / glibc-2.3.2). Unit tests are passing,
and the signal handlers seem to fire at the right times.

Note that a much simpler option for solving the problem would be to assume that we
have a
proper per-process setitimer implementation (that assumption holds for kernels 2.6.10
and later), in which case we could just delete the timer mode detection code and
not do the signal trick. Not sure what the backwards compatibility policy of perftools
is.

Let me know if you have questions!

Reported by chappedm on 2012-03-02 16:18:07

@alk
Copy link
Contributor Author

alk commented Aug 23, 2015

So far the patch sounds reasonable although I can't help but feel that there is some
level of risk in this one. Also, I would like a bit of clarification since I am not
overly familiar with the cpu profiler. In my experience with the cpu profiler, you
just link against the shared library and set CPUPROFILE in your environment. This enables
global profiling of the entire process. Is this patch intended to work in the same
way or would this patch only apply to those who are manually registering which threads
they want to have profiled?

Reported by chappedm on 2012-03-03 19:57:42

@alk
Copy link
Contributor Author

alk commented Aug 23, 2015

Regarding you wariness about the patch, I do agree that it changes the way of doing
things quite a bit, so I very much appreciate a proper review :)

Regarding your questions:

Just linking the shared library and setting CPUPROFILE is sufficient on POSIX-compliant
pthread implementations (i.e. those that have per-process timers), as a random thread
will receive the timer signal. On linuxthreads-based systems, it has always been the
case that a ProfilerRegisterThread() call was necessary to get profiling signals on
that thread, both with the old and the new code (see the unit test).

The patch is intended to keep the semantics of the existing code, but improving it
to only arm the timer(s) in case they're actually in use (i.e. a profiling callback
is installed). The only difference is that in the new model, you should call ProfilerUnregisterThread()
on thread shutdown if you've previously called ProfilerRegisterThread() on thread startup.
If you don't, bad things may happen, but as explained previously that only happens
under very rare circumstances.

I also just noticed that I hadn't properly exposed ProfilerUnregisterThread, which
is fixed in the updated patch I've attached.

Reported by mnissler@google.com on 2012-03-12 15:21:48


- _Attachment: [gperftools_prof_signals.patch](https://storage.googleapis.com/google-code-attachments/gperftools/issue-406/comment-4/gperftools_prof_signals.patch)_

@alk
Copy link
Contributor Author

alk commented Aug 23, 2015

Great thanks! I am doing a bunch of cleanup tonight and this should be making its way
onto the list :)

Reported by chappedm on 2012-05-04 17:30:03

@alk
Copy link
Contributor Author

alk commented Aug 23, 2015

Reported by chappedm on 2012-05-04 17:30:24

@alk
Copy link
Contributor Author

alk commented Aug 23, 2015

Reported by chappedm on 2012-05-04 17:42:17

  • Status changed: Accepted

@alk
Copy link
Contributor Author

alk commented Aug 23, 2015

After applying this patch there is a unit test failure running 'make check'. Here is
the manual run:

david@hatch:~/gperftools$ ./profile_handler_unittest 
threads have shared timers
Running RegisterUnregisterCallback
Check failed: !(IsSignalEnabled())
Aborted

Can you please take a look at this and update the patch or supply a supplemental patch.

Regards,

Dave

Reported by chappedm on 2012-09-18 03:48:20

@alk
Copy link
Contributor Author

alk commented Aug 23, 2015

I'll eventually get back to this, but probably not this week and I'll be on vacation
till October 6th.

Random question: Is gperftools meant to support pthread with non-POSIX-compliant pthread
implementations (i.e. per-thread timers)? If not, we can as well just remove timer
mode detection and assume timers are global. FWIW, this is the conclusion we reached
in the google-internal discussion about this.

Reported by mnissler@google.com on 2012-09-19 09:41:47

@alk
Copy link
Contributor Author

alk commented Aug 23, 2015

Any update now that you are back. As for the pthread question I have successfully run
it on FreeBSD with non-POSIX-compliant pthreads without issue. 

Reported by chappedm on 2012-10-28 14:28:18

@alk
Copy link
Contributor Author

alk commented Aug 23, 2015

I'm quite busy at the moment, but I'll get back to you once I find a few cycles to spend
on this.

Reported by mnissler@google.com on 2012-10-29 08:03:47

@alk
Copy link
Contributor Author

alk commented Aug 23, 2015

Just checking in again. I am looking to get a release out very soon now and would like
to make sure that this is included.

Reported by chappedm on 2012-12-22 20:15:37

@alk
Copy link
Contributor Author

alk commented Aug 23, 2015

So, I realize I'm being rather unresponsive and I apologize for that. Given that is
the case, feel free to time out aggressively on me. I'm still interested in getting
this sorted eventually though, so I will reply eventually, like I do today :)

I'll attach an updated patch that adjusts profile_handler_unittest.cc (not sure how
I could miss that earlier). I basically ripped out all checks for signal handler installation
(it's installed unconditionally now) and shared-vs-non-shared timers (all the timers
should be enabled/disabled correctly no matter whether they're shared or not). What
I haven't tested is whether this works on a system with separate timers (I lost the
VM I installed back then when moving to a new machine). I think we should collect some
test results from a machine that has separate timers before shipping this though :)

Following up on the question from comment 9: What are your thoughts on just dropping
support for separate timers? That would simplify the code significantly and obviate
the need to send signals to other threads just to enable timers. Do you have any indication
on whether people are still running gperftools on systems with separate timers?

I'll be out for a week starting tonight, so I will be slow to respond again (sigh!).
But when I'm back I'll hopefully have time to either produce a reduced patch that drops
support for separate timers entirely or test the current code in a VM that runs a really
old LinuxThreads glibc (depending on your thoughts).

Reported by mnissler@google.com on 2013-01-04 15:52:02


- _Attachment: [gperftools_prof_signals.patch](https://storage.googleapis.com/google-code-attachments/gperftools/issue-406/comment-13/gperftools_prof_signals.patch)_

@alk
Copy link
Contributor Author

alk commented Aug 23, 2015

Sounds great. Thanks for the updated patch. It will be interesting to see how the separate
timer support pans out.

Reported by chappedm on 2013-01-07 13:29:20

@alk
Copy link
Contributor Author

alk commented Aug 23, 2015

Back from vacation...

Regarding your comment: Are you interested in (a) seeing the patch in comment 13 tested
on a separate-timer OS or (b) seeing the alternative patch that assumes shared timers
and drops support for separate timers entirely?

Reported by mnissler@google.com on 2013-01-14 16:40:02

@bsilver8192
Copy link
Contributor

Does anybody know what the current status on this is? I would like this issue fixed, and would be happy to take over on getting it merged.

I think @saittam is the original reporter. Any chance you're still interested in working on this, and if not, do you mind if I start with what you did?

@saittam
Copy link

saittam commented Jan 17, 2016

I'm the original reporter indeed. Unfortunately, this has lingered in the hardly-will-ever-get-to portion of my TODO list since forever. I originally got interested in this because the continuously firing signals tended to confuse gdb (which is a different bug). I've since carried the patch that's attached to the issue in my working trees, but there was nothing else that provided a strong motivation to invest time into pushing for a resolution. At this point, I can't tell whether the problem still exists, and whether the patch still applies... sorry. The original reproduction instructions should still be valid though, so if you're seeing SIGPROF firing in a single-threaded test program after calling ProfileHandler::RegisterThread, the issue is probably still valid.

My lack of spare time to spend is only going to get worse in the upcoming months, so I'd be most delighted if you were to pick this issue up and resolve it.

bsilver8192 pushed a commit to bsilver8192/gperftools that referenced this issue Jan 22, 2016
It causes a noticeable performance hit and can sometimes confuse GDB.

Tested with and without CPUPROFILE_PER_THREAD_TIMERS=1, and on a
linuxthreads system (CPUPROFILE_PER_THREAD_TIMERS=1 isn't supported
there) (I couldn't get all the tests to run there before, but
profile_handler_unittest and profiler_unittest.sh passed).

Originally written by mnissler@google.com. I mostly had to deal with
integrating with CPUPROFILE_PER_THREAD_TIMERS.

Fixes gperftools#409
bsilver8192 added a commit to bsilver8192/gperftools that referenced this issue Jan 22, 2016
It causes a noticeable performance hit and can sometimes confuse GDB.

Tested with and without CPUPROFILE_PER_THREAD_TIMERS=1, and on a
linuxthreads system (CPUPROFILE_PER_THREAD_TIMERS=1 isn't supported
there) (I couldn't get all the tests to run there before, but
profile_handler_unittest and profiler_unittest.sh passed).

Originally written by mnissler@google.com. I mostly had to deal with
integrating with CPUPROFILE_PER_THREAD_TIMERS.

Fixes gperftools#409
@bsilver8192
Copy link
Contributor

Forgot to include "Fixes: #409" in my PR #762 that just got merged which fixes this... Somebody want to close it manually?

@alk
Copy link
Contributor Author

alk commented Jan 27, 2016

Fixed by #762. Thanks to Brian Silverman's great effort.

@alk alk closed this as completed Jan 27, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants