google-coredumper - issue #8

Stacks can't be fully analysed on Sles11


Posted on Jan 21, 2011 by Massive Hippo

I've been successfully using coredumper 1.2.1 on Sles10 (64-bit), but it doesn't work on Sles11 (I applied the patch to remove reliance on linux/dirent.h). gdb can't analyze the stack below the entry to a signal handler. The output from bt is:

    #0 WriteCoreDump (file_name=0x407872 "core.signal.dmp") at src/coredumper.c:192
    #1 0x0000000000400c47 in signalhandler () at test8.cpp:18
    #2 0x00007f30010c76e0 in ?? ()
    #3 0x0000000000000000 in ?? ()

Although the result from the call to WriteCoreDump(...) is success, I also find that somewhere errno gets set to 14 (bad address) - that didn't happen on Sles10.

Is there any fix for this?

Here's the full sequence to demonstrate the problem.

    ajk@(none):/tmp/coredumper> cat test8.cpp
    const int version = 8;

    #include <google/coredumper.h>
    #include <stdio.h>
    #include <string.h>
    #include <errno.h>
    #include <signal.h>
    #include <stdlib.h>
    #include <sys/resource.h>

    int result = 0;
    int lastError = 0;
    void signalhandler(int)
    {
        const char* filename = "core.signal.dmp";
        errno = 0;
        result = WriteCoreDump(filename);
        lastError = errno;
    }
    int main(int argc, char* argv[])
    {
        char* filename = argv[0];
        printf("%s Version %d\n", filename, version);

        signal(SIGRTMIN, signalhandler);

        printf("Raising SIGRTMIN\n");

        raise(SIGRTMIN);

        printf("Exiting; result = %d; last error %d:'%s'\n",
            result, errno, strerror(errno));

        return 0;
    }
    ajk@(none):/tmp/coredumper> g++ -Wall -ggdb test8.cpp -o dumptest8.exe /usr/local/lib/libcoredumper.a
    ajk@(none):/tmp/coredumper> ./dumptest8.exe
    ./dumptest8.exe Version 8
    Raising SIGRTMIN
    Exiting; result = 0; last error 14:'Bad address'
    ajk@(none):/tmp/coredumper> gdb dumptest8.exe core.signal.dmp
    GNU gdb (GDB; SUSE Linux Enterprise 11) 6.8.50.20081120-cvs
    Copyright (C) 2008 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.
    Type "show copying" and "show warranty" for details.
    This GDB was configured as "x86_64-suse-linux".
    For bug reporting instructions, please see:
    <http://bugs.opensuse.org/>...
    Core was generated by `./dumptest8.exe'.
    #0 WriteCoreDump (file_name=0x407872 "core.signal.dmp") at src/coredumper.c:192
    192 ClearCoreDumpParameters(&params);
    (gdb) bt
    #0 WriteCoreDump (file_name=0x407872 "core.signal.dmp") at src/coredumper.c:192
    #1 0x0000000000400c47 in signalhandler () at test8.cpp:18
    #2 0x00007f30010c76e0 in ?? ()
    #3 0x0000000000000000 in ?? ()
    Current language: auto; currently c
    (gdb) q
    Quitting: You can't do that without a process to debug.

Comment #1

Posted on Aug 1, 2011 by Happy Lion

I still have the same issue with version 1.2.1 from April 2008 (http://code.google.com/p/google-coredumper/downloads/detail?name=coredumper-1.2.1.tar.gz&can=2&q=) on SUSE Linux Enterprise Server 11.0 (x86_64). You wrote coredumper.c:192; in version 1.2.1, that is the call to ClearCoreDumpParameters(&params). Has anyone (Andy?) successfully solved this? I would be very thankful for any hint before I resort to gdb to check whether this is the point in the coredumper lib where the error occurs.

Comment #2

Posted on Aug 3, 2011 by Massive Hippo

Since support for this package seems to have evaporated, I considered diagnosing the issue myself, but was faced with an unbounded continuation-engineering task that didn't look very rewarding. Instead I turned to the gdb gcore command.

The sequence is to fork and exec gdb, passing it a parameter file (as the gcore script does), and wait for the child to terminate. This gives much the same results as the coredumper package (except that you can't get inside to influence the dumping of shared store segments), and the gdb team do maintain the package. The other downside is that you have to have gdb installed, which may be an issue for some sites. Note that there's a problem on SLES11 with gdb gcore for which a fix is available (I don't know whether that fix fixes the coredumper issue as well; the symptoms are again unanalysable stacks).

You can also fork and abort to get a dump from the kernel. Shared store can be unmapped to avoid dumping it, and/or the kernel controls can be set up for the process. The downsides are that you have little control over the file name (by default SLES always uses the 'core' pattern; you can create and switch to a directory to manage this, but the site can of course set any pattern they like for the system) and that you lose all the threads apart from the one you fork from, which is not so hot in a multi-threaded environment.
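
For anyone wanting to try the two workarounds described above, here is a rough sketch of what they might look like in C++. It is not the code used in the comment above: the function names dump_with_gdb_gcore and dump_with_fork_abort are made up for illustration, and instead of a parameter file the first helper passes the gcore command to gdb with -ex, on the assumption that gdb is on the PATH and is allowed to attach to the calling process.

```cpp
#include <csignal>
#include <cstdlib>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: fork a child that execs gdb in batch mode, attaches to
// this process and runs gdb's "gcore" command. Returns gdb's exit status, or
// -1 if the child could not be started or did not exit normally.
int dump_with_gdb_gcore(const char* core_path)
{
    // Build the arguments before forking so the child only has to call exec.
    const std::string pid = std::to_string(static_cast<long>(getpid()));
    const std::string cmd = std::string("gcore ") + core_path;

    const pid_t child = fork();
    if (child < 0) return -1;
    if (child == 0) {
        // Assumption: "gdb" is installed and may ptrace-attach to its parent.
        execlp("gdb", "gdb", "-batch", "-p", pid.c_str(),
               "-ex", cmd.c_str(), static_cast<char*>(nullptr));
        _exit(127);  // exec failed
    }
    int status = 0;
    if (waitpid(child, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

// Hypothetical helper: fork a child that aborts, so the kernel dumps a copy
// of this process (calling thread only). Returns true if the child was
// killed by SIGABRT; whether a core file is actually written still depends
// on the core size limit and the system's core pattern.
bool dump_with_fork_abort()
{
    const pid_t child = fork();
    if (child < 0) return false;
    if (child == 0) {
        // Shared memory segments could be unmapped here before aborting.
        std::signal(SIGABRT, SIG_DFL);
        std::abort();
    }
    int status = 0;
    if (waitpid(child, &status, 0) < 0) return false;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGABRT;
}
```

Either helper could be called from the place where WriteCoreDump was called before, for example dump_with_gdb_gcore("core.signal.dmp"). Note that fork and exec are not something you would normally want to do from inside a signal handler in production code, so treat this purely as a starting point.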

Status: New

Labels:
Type-Defect Priority-Medium