Issue 202487: sudo hangs and leaves dump_syms as a zombie
Project Member Reported by davidjames@chromium.org, Aug 19, 2011
From http://chromeos-botmaster.mtv.corp.google.com:8026/builders/x86-alex%20canary/builds/885

Looking at the machine I noticed it was stuck on the archive_build step.

$ ps -eo pid,etime,cputime,cmd | grep -E 'bash ./cros_gen|dump_syms' | grep -v grep
15140    02:15:27 015695    02:15:22 00:00:05 /bin/bash ./cros_generate_breakpad_symbols --board=x86-alex
16636    02:08:33 00:00:00 sudo dump_syms /build/x86-alex/usr/lib/libgmock_main.so.0.0.0 /build/x86-alex/usr/lib/debug/usr/lib
16637    02:08:33 00:00:00 [dump_syms] <defunct>

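Looks like the dump_syms step was started over two hours ago, but the hang is in sudo rather than in dump_syms itself: dump_syms (pid 16637) has already exited and is defunct, while the sudo process that launched it (pid 16636) is still running.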
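
For anyone who wants to spot this state on a bot without eyeballing ps, here is a rough sketch that scans `ps -eo pid,ppid,stat,cmd` output for defunct processes whose parent is sudo. The function name and the column layout are assumptions for illustration and are not part of any build script; the listing above did not include a PPID column, so the parent/child link has to come from a separate ps call.

```python
import os


def find_zombie_children_of_sudo(ps_output):
    """Return (sudo_pid, zombie_pid, zombie_cmd) for each defunct child of sudo.

    Expects text in the layout of `ps -eo pid,ppid,stat,cmd` (assumed here).
    """
    procs = {}
    for line in ps_output.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue
        pid, ppid, stat, cmd = fields
        # Skip the header row and anything else that is not a process line.
        if not (pid.isdigit() and ppid.isdigit()):
            continue
        procs[int(pid)] = (int(ppid), stat, cmd)

    hits = []
    for pid, (ppid, stat, cmd) in sorted(procs.items()):
        parent = procs.get(ppid)
        if parent is None or not stat.startswith("Z"):
            continue
        # The first word of the parent's command line is its executable.
        parent_exe = os.path.basename(parent[2].split()[0])
        if parent_exe == "sudo":
            hits.append((ppid, pid, cmd))
    return hits
```

Passing it the text from `ps -eo pid,ppid,stat,cmd` would return one tuple per stuck sudo/zombie pair, which is enough to decide which sudo pid to inspect or kill.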

$ strace -p 16636
Process 16636 attached - interrupt to quit
select(6, [5], [], NULL, NULL

$ sudo lsof -p 16636
COMMAND   PID USER   FD   TYPE             DEVICE SIZE/OFF     NODE NAME
sudo    16636 root  cwd    DIR                9,1     4096 20644171 /b/cbuild/x86-alex-canary-master/chroot/home/chrome-bot/trunk/src/scripts
sudo    16636 root  rtd    DIR                9,1     4096 15442363 /b/cbuild/x86-alex-canary-master/chroot
sudo    16636 root  txt    REG                9,1   168368 21540052 /b/cbuild/x86-alex-canary-master/chroot/usr/bin/sudo
sudo    16636 root  mem    REG                9,1    47432 21523413 /b/cbuild/x86-alex-canary-master/chroot/lib64/libnss_files-2.10.1.so
sudo    16636 root  mem    REG                9,1    43384 21523433 /b/cbuild/x86-alex-canary-master/chroot/lib64/libnss_nis-2.10.1.so
sudo    16636 root  mem    REG                9,1    88880 21523437 /b/cbuild/x86-alex-canary-master/chroot/lib64/libnsl-2.10.1.so
sudo    16636 root  mem    REG                9,1    31432 21523409 /b/cbuild/x86-alex-canary-master/chroot/lib64/libnss_compat-2.10.1.so
sudo    16636 root  mem    REG                9,1  1399984 21523440 /b/cbuild/x86-alex-canary-master/chroot/lib64/libc-2.10.1.so
sudo    16636 root  mem    REG                9,1    96632 21523304 /b/cbuild/x86-alex-canary-master/chroot/lib64/libz.so.1.2.5
sudo    16636 root  mem    REG                9,1    14512 21523446 /b/cbuild/x86-alex-canary-master/chroot/lib64/libdl-2.10.1.so
sudo    16636 root  mem    REG                9,1    55712 21523459 /b/cbuild/x86-alex-canary-master/chroot/lib64/libpam.so.0.83.0
sudo    16636 root  mem    REG                9,1    10464 21523481 /b/cbuild/x86-alex-canary-master/chroot/lib64/libutil-2.10.1.so
sudo    16636 root  mem    REG                9,1   123168 21523496 /b/cbuild/x86-alex-canary-master/chroot/lib64/ld-2.10.1.so
sudo    16636 root  mem    REG                9,1  1575552 21728251 /b/cbuild/x86-alex-canary-master/chroot/usr/lib64/locale/locale-archive
sudo    16636 root    0r  FIFO                0,8      0t0    32497 pipe
sudo    16636 root    1w   REG                9,1     1062 25019071 /b/cbuild/x86-alex-canary-master/chroot/tmp/sym.bJff
sudo    16636 root    2w   REG                9,1     1339 18726942 /b/cbuild/x86-alex-canary-master/chroot/tmp/err.13To
sudo    16636 root    3r   REG                9,1     1521 21531563 /b/cbuild/x86-alex-canary-master/chroot/etc/passwd
sudo    16636 root    4r   REG                9,1      788 21531541 /b/cbuild/x86-alex-canary-master/chroot/etc/group
sudo    16636 root    5u  unix 0xffff8803229b8000      0t0  8072223 socket


I killed the offending sudo process that was hung and the build continued fine. It's strange that it looked like the sudo process was hung and not the actual dump_syms process.

See also Bug 19386, which affected the same build because cbuildbot didn't handle the hang correctly.
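
As a general illustration of how a caller can avoid waiting forever on a child like this, here is a minimal sketch using Python's `subprocess.run` with a timeout. This is not how cbuildbot is implemented; `run_with_timeout` and `HANG_DEMO` are hypothetical names made up for the example.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return its exit code, or None if it exceeded timeout_s.

    subprocess.run() kills the direct child and reaps it when the
    timeout expires, then raises TimeoutExpired.
    """
    try:
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        return None


# Stand-in for a hung command: a child that sleeps far longer than the limit.
HANG_DEMO = [sys.executable, "-c", "import time; time.sleep(30)"]
```

`run_with_timeout(HANG_DEMO, 1)` returns None after about a second instead of blocking. One caveat: only the direct child is signalled, and if that child is a root-owned sudo the caller may not have permission to kill it, so a real fix would need to handle that case.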

Aug 19, 2011
#1 saintlou@chromium.org
This is the second time this has happened in under 2 hours. Please take a look.
Labels: -Pri-2 Pri-0
Aug 19, 2011
#2 saintlou@chromium.org
(No comment was entered for this change.)
Owner: thieule@chromium.org
Cc: varunj...@chromium.org reveman@chromium.org
Aug 19, 2011
#3 thieule@chromium.org
Hard to tell what happened here.

Can I get a core dump of the hung processes next time this happens?  Dump both the sudo and the actual process.
Aug 19, 2011
#4 mkr...@chromium.org
It's a race condition bug in sudo:

http://blog.famzah.net/2010/11/01/sudo-hangs-and-leaves-the-executed-program-as-zombie/

It looks like it was introduced in sudo version 1.7.3, and fixed in version 1.7.5 (REF: http://www.gratisoft.us/bugzilla/show_bug.cgi?id=447).
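
To check a machine or chroot against that range, a small sketch like the following could be used. It takes the comment above at face value (introduced in 1.7.3, fixed in 1.7.5), ignores any patch suffix such as `p5`, and the function names are made up for illustration.

```python
import re

# Range taken from the comment above; treat it as an assumption, not a
# statement verified against the sudo changelog.
AFFECTED_FROM = (1, 7, 3)
FIXED_IN = (1, 7, 5)


def parse_sudo_version(text):
    """Extract (major, minor, micro) from text like 'Sudo version 1.7.4p5'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no x.y.z version found in %r" % text)
    return tuple(int(part) for part in match.groups())


def is_affected(text):
    """True if the version is at least AFFECTED_FROM and below FIXED_IN."""
    return AFFECTED_FROM <= parse_sudo_version(text) < FIXED_IN
```

For example, `is_affected("sudo-1.7.4_p5")` is True and `is_affected("sudo-1.7.6_p1")` is False, using the ebuild names that appear later in this thread.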

Aug 19, 2011
#5 davidjames@chromium.org
Thanks mkrebs. Scottz, can you upgrade sudo on all the bots to version 1.7.5?
Summary: sudo hangs and leaves dump_syms as a zombie
Owner: scottz@chromium.org
Aug 24, 2011
#6 davidjames@google.com
Moving this to Chrome issue tracker for Chrome infrastructure team
Status: Duplicate
Mergedinto: chromium:94204
Aug 24, 2011
#7 davidjames@google.com
Actually looks like it's sudo inside our chroot that needs to be updated here... I'll work on that
Status: Assigned
Owner: davidjames@chromium.org
Labels: -Pri-0 Pri-1
Mergedinto:
Aug 25, 2011
#8 bugdroid1@chromium.org
Commit: 3b8e3a731b332997fbd73f0b5c7e9c94ca6c9354
 Email: davidjames@chromium.org

Import upstream version of sudo-1.7.6_p1.

BUG=chromium-os:19387
TEST=Check that new sudo is unused because it is masked on all platforms.

Change-Id: I6bf7a373ea4a3c8e70f0b3d2122b09cd5eb7e705
Reviewed-on: http://gerrit.chromium.org/gerrit/6634
Reviewed-by: Darin Petkov <petkov@chromium.org>
Tested-by: David James <davidjames@chromium.org>

A	app-admin/sudo/sudo-1.7.6_p1.ebuild
Aug 25, 2011
#9 bugdroid1@chromium.org
Commit: b42503e616bd0806ff4d3047e05dacb94d16778d
 Email: davidjames@chromium.org

Customize sudo for Chrome OS and unmask new sudo.

BUG=chromium-os:19387
TEST=Full build with cbuildbot including tests.

Change-Id: I4fbacbecb662f53350250784a449f0c8cdfc2250
Reviewed-on: http://gerrit.chromium.org/gerrit/6635
Reviewed-by: Darin Petkov <petkov@chromium.org>
Tested-by: David James <davidjames@chromium.org>

D	app-admin/sudo/sudo-1.7.4_p5.ebuild
M	app-admin/sudo/sudo-1.7.6_p1.ebuild
Aug 25, 2011
#10 davidjames@google.com
FIXED, new builds should not see the hang anymore
Status: Fixed
Aug 26, 2011
#11 or...@chromium.org
Bulk claiming work to iteration-37
Labels: Iteration-37
Sep 2, 2011
#12 davidjames@google.com
Issue 19266 has been merged into this issue.
Cc: an...@chromium.org davidjames@chromium.org hungte@chromium.org
Sep 13, 2011
#13 chromeos...@chromium.org
(No comment was entered for this change.)
Labels: FixedIn-0.15-949.0
Sep 13, 2011
#14 chromeos...@chromium.org
(No comment was entered for this change.)
Labels: -FixedIn-0.15-949.0
Sep 20, 2011
#15 kr...@chromium.org
(No comment was entered for this change.)
Status: Verified
Sep 20, 2011
#16 chromeos...@chromium.org
(No comment was entered for this change.)
Labels: FixedIn-0.15-949.0
Oct 25, 2011
#17 chromeos...@chromium.org
(No comment was entered for this change.)
Labels: -FixedIn-0.15-949.0 FixedIn-949.0.0
Jan 20, 2012
#18 chromeos...@chromium.org
(No comment was entered for this change.)
Labels: FixedInIndex-20
Mar 6, 2013
#19 lafo...@google.com
(No comment was entered for this change.)
Labels: OS-Chrome
Mar 9, 2013
#20 bugdroid1@chromium.org
(No comment was entered for this change.)
Labels: -Area-Build Build