Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test for distinct address spaces is broken #484

Closed
alk opened this issue Aug 23, 2015 · 5 comments
Closed

Test for distinct address spaces is broken #484

alk opened this issue Aug 23, 2015 · 5 comments

Comments

@alk
Copy link
Contributor

alk commented Aug 23, 2015

Originally reported on Google Code with ID 481

What version of the product are you using? On what operating system?

32-bit Linux, tcmalloc 2.0, GCC 4.7.1, eglibc 2.13

Please provide any additional information below.

Whilst trying to debug a crash in heap-checker_unittest.sh, I came across some very
strange code in src/base/linuxthreads.cc, which attempts to detect whether a related
process is a fork or a thread by trying to determine whether the address spaces are
distinct.  The offending code is as follows (lines 446-458):

  if (sys_ptrace(PTRACE_PEEKDATA, pid, &i, &j) || i++ != j ||
      sys_ptrace(PTRACE_PEEKDATA, pid, &i, &j) || i   != j) {
    /* Address spaces are distinct, even though both
     * processes show the "marker". This is probably
     * a forked child process rather than a thread.
     */
    sys_ptrace_detach(pid);
    num_threads--;
    sig_num_threads = num_threads;
  } else {
    found_parent |= pid == ppid;
    added_entries++;
  }

As far as I can tell, what this is trying to do is peek at the other process's address
space to see if the memory at &i is the same as in the current process, then frob i,
and see whether &i in the other process changes as well.  This is broken for two reasons:

1. It appears to assume that the PTRACE_PEEKDATA will load the data at the address
in the 3rd argument, into the variable given as the 4th argument, and return 0 on success
(which will evaluate to false, and continue evaluating the statement after the ||).
 This is not true - according to the man page, the 4th argument is ignored, and the
*return value* is the data, with errno used to indicate success/failure.

2. Even if the PTRACE_PEEKDATA calls are fixed, the test can ever work as desired,
since i is a local variable and stack variables are always thread-local.  The data
at address &i in a related process is not guaranteed to be equal to i, regardless of
whether it is a fork or a thread.

As far as I can tell, the only reason it "works" is because the PTRACE_PEEKDATA calls
are failing, returning zero (which evaluates to false), and setting errno to EFAULT
(14), which is then reflected in the output from my failed test run:

  In main(): heap_check=local
  Thread finding failed with -1 errno=14
  Thread finding callback was interrupted or crashed; can't fix this
  ----
  FAIL: heap-checker_debug_unittest.sh

Please find attached a patch which uses a different approach, based on using the __WCLONE
option with waitpid().  From my testing, this appears to detect relatives correctly,
including the distinction between forks and processes, since the parent process - which
the code checks for as a sanity-check - is a "real" process, not a clone.

Unfortunately, this patch has not fixed my unit test failure, since the crash really
is within the callback, not the thread-finding routine.

Reported by philip.allison@smoothwall.net on 2012-11-01 11:15:26


- _Attachment: [fix-thread-detection.patch](https://storage.googleapis.com/google-code-attachments/gperftools/issue-481/comment-0/fix-thread-detection.patch)_
@alk
Copy link
Contributor Author

alk commented Aug 23, 2015

r167 | chappedm@gmail.com | 2012-11-03 10:52:42 -0400 (Sat, 03 Nov 2012) | 4 lines

issue-481: Replaced test mechanism for distinct address spaces with a more reliable
mechanism

Rather than using sys_ptrace+PTRACE_PEEK_DATA to determine whether address spaces are
distinct, we now use sys_waitpid+__WCLONE. See issue-481 for a more detailed rationale.

Reported by chappedm on 2012-11-03 14:53:38

  • Status changed: Accepted

@alk
Copy link
Contributor Author

alk commented Aug 23, 2015

Reported by chappedm on 2012-11-03 14:53:57

  • Status changed: Fixed

@alk
Copy link
Contributor Author

alk commented Aug 23, 2015

I think this introduced the following race condition which I can reproduce occasionally
trying to use ListAllProcessThreads against a single-threaded application:

- the directory scanner finds the main thread of a single-threaded program (ie not
a clone)
- it PTRACE_ATTACHes to this main thread, causing it to asynchronously stop.
- the new wait call here, with __WCLONE, will return immediately, because the parent
pid is not a clone. So, it returns immediately even though the process is not yet stopped.
- we finish the iteration over threads, and since we only found the parent, we go back
and try to do the /proc/* based matching. In the retry step, we first try to PTRACE_DETACH
from the parent. But, if the async STOP hasn't been delivered yet, this DETACH will
fail.
- on the next loop through, our attempted call to PTRACE_ATTACH now fails because we're
already attached.

This ends up causing the attached parent to be orphaned, and the process ends up SIGSTOPPED
when the tracer process exits.

The fix is to do something like:


                int waited_pid;
                bool is_parent = (pid == ppid);
                int flags = is_parent ? __WALL : __WCLONE;

                while ((waited_pid = sys_waitpid(pid, &status, flags)) < 0) {
                  if (errno != EINTR) {
                    sys_ptrace_detach(pid);
                    num_threads--;
                    sig_num_threads = num_threads;
                    goto next_entry;
                  }
                }
                found_parent |= is_parent;

Reported by tlipcon on 2013-11-27 22:04:12

@alk
Copy link
Contributor Author

alk commented Aug 23, 2015

Have you tried on master or 2.{0,1} ? In master this ticket's change was reverted.

Reported by alkondratenko on 2013-11-27 22:05:40

@alk
Copy link
Contributor Author

alk commented Aug 23, 2015

Ah, I was looking at svn trunk - didn't catch that the project moved to git. Thanks
for noting that this was reverted.

Reported by tlipcon on 2013-12-03 22:08:05

@alk alk closed this as completed Aug 23, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant