all.bash gotest fails on line 164 segmentation fault on debian lenny on xen #386

Closed
gopherbot opened this issue Dec 5, 2009 · 20 comments

@gopherbot

by tntknight:

gotest fails in all.bash with the following error and segfault:

===================================================================
make[2]: Entering directory `/usr/local/go/src/pkg/archive/tar'
8g -o _gotest_.8 common.go reader.go writer.go    reader_test.go 
writer_test.go
rm -f _test/archive/tar.a
gopack grc _test/archive/tar.a _gotest_.8
make[2]: Leaving directory `/usr/local/go/src/pkg/archive/tar'

Message from syslogd@standcanada at Sat Dec  5 15:30:57 2009 ...
standcanada kernel: ------------[ cut here ]------------

Message from syslogd@standcanada at Sat Dec  5 15:30:57 2009 ...
standcanada kernel: invalid opcode: 0000 [#1] SMP
/usr/local/go/bin/gotest: line 164:  8571 Segmentation fault      $E 
./$O.out "$@"

Message from syslogd@standcanada at Sat Dec  5 15:30:57 2009 ...
standcanada kernel: Process 8.out (pid: 8571, ti=ecbfc000 task=ed030410 
task.ti=ecbfc000)
make[1]: *** [test] Error 139
make[1]: Leaving directory `/usr/local/go/src/pkg/archive/tar'
make: *** [archive/tar.test] Error 2
Message from syslogd@standcanada at Sat Dec  5 15:30:57 2009 ...
standcanada kernel: Stack: 08eff20b 00000000 2cb81038 00000000 00000280 
0000000f 00000001 c0109941

Message from syslogd@standcanada at Sat Dec  5 15:30:57 2009 ...
standcanada kernel:        ecbdddd8 ecbddc80 00000007 080b90d8 000fffff 
00000051 90d8ffff 08eff20b

Message from syslogd@standcanada at Sat Dec  5 15:30:57 2009 ...
standcanada kernel:        00000001 bffab83c 0806f043 ecbfc000 c0106d46 
00000001 bffab788 00000010

Message from syslogd@standcanada at Sat Dec  5 15:30:57 2009 ...
standcanada kernel: Call Trace:

Message from syslogd@standcanada at Sat Dec  5 15:30:57 2009 ...
standcanada kernel:  [<c0109941>] write_ldt+0x1a7/0x1c3

Message from syslogd@standcanada at Sat Dec  5 15:30:57 2009 ...
standcanada kernel:  [<c0106d46>] syscall_call+0x7/0xb

Message from syslogd@standcanada at Sat Dec  5 15:30:57 2009 ...
standcanada kernel:  [<c0430000>] hrtimer_nanosleep_restart+0x28/0x51

Message from syslogd@standcanada at Sat Dec  5 15:30:57 2009 ...
standcanada kernel:  =======================

Message from syslogd@standcanada at Sat Dec  5 15:30:57 2009 ...
standcanada kernel: Code: f3 89 fe 89 04 24 e8 d5 00 00 00 89 f1 8b 7c 24 
04 8b 14 24 89 fe 31 ff 89 34 24 8b 34 24 89 7c 24 04 e8 79 dc ff ff 85 c0 
74 04 <0f> 0b eb fe 83 c4 10 5b 5e 5f c3 90 90 64 c6 05 21 b0 5a c0 01

Message from syslogd@standcanada at Sat Dec  5 15:30:57 2009 ...
standcanada kernel: EIP: [<c01034cb>] xen_write_ldt_entry+0x87/0x94 SS:ESP 
0069:ecbfdf64
==========================================================================
8g and 8l were built


What steps will reproduce the problem?
1.  8g helloworld.go 
2.  8l helloworld.8
3. ./8.out

What is the expected output? What do you see instead?

expected output: hello world
instead:
==========================================================================
Message from syslogd@standcanada at Sat Dec  5 18:52:52 2009 ...
standcanada kernel: ------------[ cut here ]------------

Message from syslogd@standcanada at Sat Dec  5 18:52:52 2009 ...
standcanada kernel: invalid opcode: 0000 [#3] SMP

Message from syslogd@standcanada at Sat Dec  5 18:52:52 2009 ...
standcanada kernel: Process 8.out (pid: 17332, ti=ec7d2000 task=ec80e030 
task.ti=ec7d2000)

Message from syslogd@standcanada at Sat Dec  5 18:52:52 2009 ...
standcanada kernel: Stack: 08eff205 00000000 2c8d2038 00000000 00000280 
0000000f 00000001 c0109941

Message from syslogd@standcanada at Sat Dec  5 18:52:52 2009 ...
standcanada kernel:        ec8a0358 ec8a0200 00000007 0805c474 000fffff 
00000051 c474ffff 08eff205
Segmentation fault
standcanada /usr/local/go/test:
Message from syslogd@standcanada at Sat Dec  5 18:52:52 2009 ...
standcanada kernel:        00000001 bfac8b7c 0804886f ec7d2000 c0106d46 
00000001 bfac8ac8 00000010

Message from syslogd@standcanada at Sat Dec  5 18:52:52 2009 ...
standcanada kernel: Call Trace:

Message from syslogd@standcanada at Sat Dec  5 18:52:52 2009 ...
standcanada kernel:  [<c0109941>] write_ldt+0x1a7/0x1c3

Message from syslogd@standcanada at Sat Dec  5 18:52:52 2009 ...
standcanada kernel:  [<c0106d46>] syscall_call+0x7/0xb

Message from syslogd@standcanada at Sat Dec  5 18:52:52 2009 ...
standcanada kernel:  [<c0430000>] hrtimer_nanosleep_restart+0x28/0x51


Message from syslogd@standcanada at Sat Dec  5 18:52:52 2009 ...
standcanada kernel:  =======================

Message from syslogd@standcanada at Sat Dec  5 18:52:52 2009 ...
standcanada kernel: Code: f3 89 fe 89 04 24 e8 d5 00 00 00 89 f1 8b 7c 24 
04 8b 14 24 89 fe 31 ff 89 34 24 8b 34 24 89 7c 24 04 e8 79 dc ff ff 85 c0 
74 04 <0f> 0b eb fe 83 c4 10 5b 5e 5f c3 90 90 64 c6 05 21 b0 5a c0 01

Message from syslogd@standcanada at Sat Dec  5 18:52:52 2009 ...
standcanada kernel: EIP: [<c01034cb>] xen_write_ldt_entry+0x87/0x94 SS:ESP 
0069:ec7d3f64
==========================================================================

What is your $GOOS?  $GOARCH?
GOOS=linux
GOARCH=386


Which revision are you using?  (hg identify)
bdfc3faa253a tip


Please provide any additional information below.
The system is 32-bit Debian lenny running in a Xen slice on 64-bit hardware.

uname -a
Linux standcanada 2.6.27.2-xenU #1 SMP Mon Oct 20 21:19:45 EDT 2008 i686 
GNU/Linux
@rsc
Contributor

rsc commented Dec 10, 2009

Comment 1:

It's hard to say for sure, but it looks like this is a Xen kernel bug.
The stack trace you've shown:
standcanada kernel:  [<c0109941>] write_ldt+0x1a7/0x1c3
standcanada kernel:  [<c0106d46>] syscall_call+0x7/0xb
standcanada kernel:  [<c0430000>] hrtimer_nanosleep_restart+0x28/0x51
is definitely a kernel stack trace.  Perhaps the kernel is unable
or unwilling to accommodate Go's writing to the LDT.
I'll leave this as WaitingForReply because I'd like to know for
sure what's going on, but we'll need someone on a Xen system
to debug this.  Maybe there is documentation somewhere
about Xen disallowing such system calls.
Another thing to try is to run the binary using: strace -f 8.out
and see what system call strace thinks is running when the
kernel gets upset.

Labels changed: added helpwanted, os-linux.

Owner changed to r...@golang.org.

Status changed to WaitingForReply.

@gopherbot
Author

Comment 2 by tntknight:

Here are the results of strace:
strace -f ./8.out
execve("./8.out", ["./8.out"], [/* 24 vars */]) = 0
brk(0)                                  = 0x99008000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
mmap2(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0xb8086000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0xb8085000
set_thread_area({entry_number:-1 -> 6, base_addr:0xb8085690, limit:1048575, 
seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, 
useable:1}) = 0
modify_ldt(1, {entry_number:7, base_addr:0x805c474, limit:1048575, seg_32bit:1, 
contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}, 16) = 
0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
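
As a cross-check, the four words `00000007 0805c474 000fffff 00000051` in the hello world kernel stack dump from the original report appear to be the same user_desc argument that strace prints for modify_ldt here. The following is a minimal Go sketch that decodes them, assuming the i386 `struct user_desc` layout from the Linux headers (three 32-bit fields followed by a packed flags word). The type and function names are illustrative only and are not taken from the Go runtime.

```go
package main

import "fmt"

// userDesc mirrors the four 32-bit words of the i386 struct user_desc
// passed to modify_ldt and set_thread_area. The layout is an assumption
// based on the Linux headers, not on the Go runtime source.
type userDesc struct {
	EntryNumber uint32
	BaseAddr    uint32
	Limit       uint32
	Flags       uint32 // packed bitfields, unpacked by decodeFlags
}

// descFlags holds the bitfields packed into userDesc.Flags.
type descFlags struct {
	Seg32Bit      uint32 // bit 0
	Contents      uint32 // bits 1-2
	ReadExecOnly  uint32 // bit 3
	LimitInPages  uint32 // bit 4
	SegNotPresent uint32 // bit 5
	Useable       uint32 // bit 6
}

// decodeFlags unpacks the low seven bits of the flags word.
func decodeFlags(f uint32) descFlags {
	return descFlags{
		Seg32Bit:      f & 1,
		Contents:      (f >> 1) & 3,
		ReadExecOnly:  (f >> 3) & 1,
		LimitInPages:  (f >> 4) & 1,
		SegNotPresent: (f >> 5) & 1,
		Useable:       (f >> 6) & 1,
	}
}

func main() {
	// Words copied from the hello world stack dump in the report.
	d := userDesc{
		EntryNumber: 0x00000007,
		BaseAddr:    0x0805c474,
		Limit:       0x000fffff,
		Flags:       0x00000051,
	}
	f := decodeFlags(d.Flags)
	fmt.Printf("entry_number:%d base_addr:%#x limit:%d\n",
		d.EntryNumber, d.BaseAddr, d.Limit)
	fmt.Printf("seg_32bit:%d contents:%d read_exec_only:%d "+
		"limit_in_pages:%d seg_not_present:%d useable:%d\n",
		f.Seg32Bit, f.Contents, f.ReadExecOnly,
		f.LimitInPages, f.SegNotPresent, f.Useable)
}
```

With these inputs the decoded fields line up with the modify_ldt arguments in the strace output (entry 7, base 0x805c474, limit 1048575, and the same flag bits), which is consistent with the kernel failing while it writes that LDT entry.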

@rsc
Contributor

rsc commented Dec 11, 2009

Comment 3:

I have attached two C programs. 
The first, x.c, calls modify_ldt.
The second, xx.c, calls set_thread_area
with the same arguments.
Please try compiling and running each of
them under Xen and see whether each works.
When they succeed, they print "ok" and "ok - set_thread_area".
If Xen kills off the modify_ldt version then that's
probably a Xen bug.  But if it lets the set_thread_area
one through, maybe we can switch to using that.

Attachments:

  1. x.c (707 bytes)
  2. xx.c (802 bytes)

@gopherbot
Author

Comment 4 by tntknight:

Both programs ran OK (see below):
standcanada ~/ctest: gcc x.c -o x
standcanada ~/ctest: ./x
ok
standcanada ~/ctest: gcc xx.c -o xx
standcanada ~/ctest: ./xx
ok - set_thread_area
standcanada ~/ctest:
thanks for your efforts.

@nictuku
Contributor

nictuku commented Dec 14, 2009

Comment 5:

I can reproduce this bug, but I got a different result when running "x" (it segfaults
sometimes).
Reproducing the bug:
~/go/src/pkg/archive/tar$ ./8.out 
Segmentation fault
~/go/src/pkg/archive/tar$ strace -f ./8.out -s 100000
execve("./8.out", ["./8.out", "-s", "100000"], [/* 22 vars */]) = 0
brk(0)                                  = 0x9905d000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
mmap2(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb8043000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb8042000
set_thread_area({entry_number:-1 -> 6, base_addr:0xb8042690, limit:1048575,
seg_32bit:1, contents:0, read_exec_only:0, 
limit_in_pages:1, seg_not_present:0, useable:1}) = 0
modify_ldt(1, {entry_number:7, base_addr:0x80b9134, limit:1048575, seg_32bit:1,
contents:0, read_exec_only:0, 
limit_in_pages:1, seg_not_present:0, useable:1}, 16) = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
Process 13739 detached
== dmesg ==
kernel BUG at arch/x86/xen/enlighten.c:436!
invalid opcode: 0000 [#2] SMP 
Pid: 13003, comm: 8.out Tainted: G      D W (2.6.27.2-xenU #1)
EIP: 0061:[<c01034cb>] EFLAGS: 00210282 CPU: 0
EIP is at xen_write_ldt_entry+0x87/0x94
EAX: ffffffea EBX: c4a89038 ECX: 00000001 EDX: 9134ffff
ESI: 08eff20b EDI: 00000000 EBP: 00000008 ESP: c9b75f64
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
Process 8.out (pid: 13003, ti=c9b74000 task=c9a257f0 task.ti=c9b74000)
Stack: 08eff20b 00000000 09a69038 00000000 00000280 0000000f 00000001 c0109941 
       c62e8c18 c62e8ac0 00000007 080b9134 000fffff 00000051 9134ffff 08eff20b 
       00000001 bff6f77c 0806ea1a c9b74000 c0106d46 00000001 bff6f6c8 00000010 
Call Trace:
 [<c0109941>] write_ldt+0x1a7/0x1c3
 [<c0106d46>] syscall_call+0x7/0xb
 [<c0430000>] hrtimer_nanosleep_restart+0x28/0x51
Testing with x.c and xx.c:
~/go/src/pkg/archive/tar$ ./x
Segmentation fault
~/go/src/pkg/archive/tar$ ./xx
ok - set_thread_area
(again)
~/go/src/pkg/archive/tar$ ./x
ok
~/go/src/pkg/archive/tar$ ./x
Segmentation fault
In the end, x.c crashes only sometimes, while "gotest" in pkg/archive/tar crashes every time.

@gopherbot
Author

Comment 6 by tntknight:

I went back and tested again, and I can confirm that x.c also segfaults for me in about 50% of runs (I ran it about 50 times, and the failures appear to be random).
xx.c never crashed in about 50 executions.
"gotest" in pkg/archive/tar crashes every time.

@rsc
Contributor

rsc commented Dec 16, 2009

Comment 7:

Labels changed: added expertneeded, removed helpwanted.

@Luit

Luit commented Jan 18, 2010

Comment 8:

Two weeks and still no change? The problem still seems to exist. What's holding this up?

@rsc
Contributor

rsc commented Jan 18, 2010

Comment 9:

> What's holding this up?
We just have higher priority tasks, sorry.  If you'd like to fix it, please go ahead and send us a patch.
I think that Devon tried to help someone on IRC with this problem, and switching to 
set_thread_area didn't help, so the problem is deeper than that.

Status changed to LongTerm.

@dhobsd
Contributor

dhobsd commented Jan 18, 2010

Comment 10:

This is the person I was helping on IRC. I'll provide my patch here in case anybody
else wants to try it, but it causes segfaults for me on standard amd64 hardware
running i386.
I reverted my tree because it didn't seem to work for anybody, but this is what I
came up with. Perhaps it's a step in the right direction; perhaps setldt is a red
herring.

Attachments:

  1. sys.patch (1163 bytes)

@rsc
Contributor

rsc commented Jan 18, 2010

Comment 11:

I installed Go on an Ubuntu Intrepid xen/x86_64 system but with GOARCH=386 (run 
x86-32 binaries) and it ran fine.  So it's not all Xens that are broken, at least.

@Luit

Luit commented Jan 18, 2010

Comment 12:

That's a segfault for me too. Unpatched it's a Trace/breakpoint trap.
I could try more patches, but I'm afraid I'm way too inexperienced with x86 assembler 
to even try understanding sys.s, let alone patch it myself.

@Luit

Luit commented Jan 18, 2010

Comment 13:

The most annoying thing is that binaries that run fine on my laptop won't run on my 
(xen) virtual server. 
Doesn't that just mean that the virtualisation is the one going wrong?

@rsc
Contributor

rsc commented Jan 18, 2010

Comment 14:

Yes.  Xen doesn't completely virtualize an x86 Linux system and something the Go 
binaries are doing is not supported.

@Luit

Luit commented Jan 18, 2010

Comment 15:

Then shouldn't some Xen-Guru be notified about this?

@rsc
Contributor

rsc commented Jan 18, 2010

Comment 16:

The issue already says ExpertNeeded.  If you know any Xen experts, feel free to loop 
them in.  I don't.

@rsc
Contributor

rsc commented Apr 20, 2011

Comment 17:

Other people seem to be using Xen okay.
No more information has come in about this, so not going to fix.

Status changed to TimedOut.

@djtm

djtm commented Mar 12, 2016

The issue also comes up here for various people:
https://forum.syncthing.net/t/trace-breakpoint-trap-while-start-on-debian-6-32bit/504/6
There is a proposed workaround here:
http://william.shallum.net/random-notes/32-bit-golang-trace-breakpoint-trap-modify_ldt-enosys
But it doesn't work for me, probably because ldt wasn't compiled into my kernel.
It sounds like the call shouldn't be assumed to work any more:
https://lkml.org/lkml/2015/7/21/759

@davecheney
Contributor

@djtm this issue was closed 5 years ago. Please open a new issue if this problem affects you.

@minux
Member

minux commented Mar 12, 2016 via email

@golang golang locked and limited conversation to collaborators Mar 13, 2017
@rsc rsc removed their assignment Jun 22, 2022
This issue was closed.