runtime: exit with useful message on ARM OABI #2533

Closed
lupino3 opened this issue Dec 6, 2011 · 30 comments

Comments

@lupino3

lupino3 commented Dec 6, 2011

What steps will reproduce the problem?
1. Set up a Debian Lenny ARM VM under QEMU using the image and instructions available at
http://people.debian.org/~aurel32/qemu/arm/

The emulated processor is an ARM9 (according to
http://en.wikipedia.org/wiki/List_of_ARM_microprocessor_cores):

debian-arm:~/go/src/pkg/syscall# grep Processor /proc/cpuinfo
Processor       : ARM926EJ-S rev 5 (v5l)

2. Follow the instructions at http://golang.org/doc/install.html until the hg clone
command.

3. Compile go with the following commands 
debian-arm:~/go/src# GOOS=linux GOARCH=arm GOARM=9 ./make.bash
(I have also done the same with GOARM=5)

4. Run gofmt

What is the expected output?
no output, gofmt should wait for input on stdin

What do you see instead?
Illegal instruction

Which compiler are you using (5g, 6g, 8g, gccgo)?
5g (but I don't directly use any compiler)

Which operating system are you using?
Debian GNU/Linux Lenny for ARM

Which revision are you using?  (hg identify)
c1702f36df03 (release-branch.r60) release/release.r60.3
@bradfitz
Contributor

bradfitz commented Dec 7, 2011

Comment 1:

You shouldn't need GOOS=linux or GOARCH=arm if you're within a fully emulated system. 
Those will be detected.
Do other binaries fail, or just gofmt?
You've checked that the gofmt you're running is the one you just built?

@rsc
Contributor

rsc commented Dec 7, 2011

Comment 2:

Can you please try
6nm $(which gofmt) | grep sfloat
If it does not print anything, then the GOARM=5 did not take.
On the system you are using you definitely need the GOARM=5.
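
For anyone without 6nm at hand, the same check can be sketched in Go with the standard debug/elf package. This is only an illustration: the helper names are made up here, and it assumes the binary carries an ELF symbol table that debug/elf can read, which may not hold for every linker output.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
	"strings"
)

// hasSoftFloat reports whether any symbol name contains "sfloat",
// the substring the grep above looks for.
func hasSoftFloat(names []string) bool {
	for _, n := range names {
		if strings.Contains(n, "sfloat") {
			return true
		}
	}
	return false
}

// symbolNames returns the names from the ELF symbol table of the file at path.
func symbolNames(path string) ([]string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	syms, err := f.Symbols()
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(syms))
	for _, s := range syms {
		names = append(names, s.Name)
	}
	return names, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: sfloatcheck path/to/gofmt")
		return
	}
	names, err := symbolNames(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read symbols:", err)
		return
	}
	fmt.Println("soft-float symbols present:", hasSoftFloat(names))
}
```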

@lupino3
Author

lupino3 commented Dec 7, 2011

Comment 3:

Hello,
Brad: I tried without GOOS and GOARCH and the result is the same. I am not adding the
bin directory to PATH, so I am quite sure that I am running the one I just built.
Russ: which won't find gofmt as it is not in the PATH. Both give me just the output
"Illegal instruction".
FYI:
debian-arm:~# ls -l {go-noenv,go-linux-5}/bin/gofmt
-rwxr-xr-x 1 root root 1847596 2011-12-06 22:35 go-linux-5/bin/gofmt
-rwxr-xr-x 1 root root 1843500 2011-12-07 11:03 go-noenv/bin/gofmt
debian-arm:~# ./go-noenv/bin/gofmt
Illegal instruction
debian-arm:~# ./go-linux-5/bin/gofmt
Illegal instruction
In go-noenv I ran make.bash without specifying any environment variable; in go-linux-5 I
ran GOOS=linux GOARCH=arm GOARM=5 ./make.bash.
Thanks,
Andrea

@rsc
Contributor

rsc commented Dec 7, 2011

Comment 4:

Thanks for the information.  It sounds like the ARM 5 build
has regressed.  I have yet to find an ARM 5 system with
storage that can withstand the beating that the Go builder
puts it through.  We need to set one up again.
Russ

@rsc
Contributor

rsc commented Dec 9, 2011

Comment 5:

Labels changed: added priority-later, removed priority-medium.

@davecheney
Contributor

Comment 6:

Hello, 
I think this may be another manifestation of 
https://golang.org/issue/2321
Can you have a look at dmesg and the contents of /proc/cpu/alignment? I have to run with
the value set to fixup (2), mainly because Python generates tonnes of unaligned
accesses, but also because something in the tree in the last three months is generating
unaligned memory accesses.

@lupino3
Author

lupino3 commented Dec 11, 2011

Comment 7:

Hello,
I followed your suggestions, to no avail. dmesg does not change after I execute gofmt.
Here is /proc/cpu/alignment:
User:           0
System:         0
Skipped:        0
Half:           0
Word:           0
DWord:          0
Multi:          0
User faults:    0 (ignored)
I set it to 2:
User:           0
System:         0
Skipped:        0
Half:           0
Word:           0
DWord:          0
Multi:          0
User faults:    2 (fixup)
And I have the same error (Illegal instruction) when executing gofmt (both the one
compiled for linux-arm-5 and the one for which no environment variables were specified
at compilation time).
I can give this VM to any developer, if needed.
Thanks,
Andrea
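
To check the alignment setting from a program rather than by eye, a minimal Go sketch could pull the mode out of the "User faults:" line. It assumes the file looks like the listings in this thread; the function name is hypothetical and other kernels may format the line differently.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// userFaultsMode extracts the numeric mode and its label from the
// "User faults:" line of text shaped like /proc/cpu/alignment.
func userFaultsMode(text string) (mode int, label string, ok bool) {
	for _, line := range strings.Split(text, "\n") {
		line = strings.TrimSpace(line)
		if !strings.HasPrefix(line, "User faults:") {
			continue
		}
		fields := strings.Fields(strings.TrimPrefix(line, "User faults:"))
		if len(fields) == 0 {
			return 0, "", false
		}
		n, err := strconv.Atoi(fields[0])
		if err != nil {
			return 0, "", false
		}
		if len(fields) > 1 {
			// The label is printed in parentheses, e.g. "(fixup)".
			label = strings.Trim(fields[1], "()")
		}
		return n, label, true
	}
	return 0, "", false
}

func main() {
	// Sample shaped like the listing pasted in this thread.
	sample := "User:           0\nSystem:         0\nUser faults:    2 (fixup)\n"
	mode, label, ok := userFaultsMode(sample)
	fmt.Println(mode, label, ok) // prints "2 fixup true"
}
```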

@davecheney
Contributor

Comment 8:

Are you able to upload the arm5 version to this issue for me to run on my system?

@lupino3
Author

lupino3 commented Dec 12, 2011

Comment 9:

Hello,
yes, I can do that if you give me a place where I should put it.
Thanks,
Andrea

@davecheney
Contributor

Comment 10:

You should be able to add it to the issue using attach file.

@lupino3
Author

lupino3 commented Dec 12, 2011

Comment 11:

The maximum attachment file size is 10MB, while the VM size is 1.8GB.

@davecheney
Contributor

Comment 12:

Sorry my mistake. I only need the arm5 gofmt binary that faults.

@lupino3
Author

lupino3 commented Dec 12, 2011

Comment 13:

Sorry for the misunderstanding :)
Please find attached the version of gofmt compiled with GOOS=linux GOARCH=arm GOARM=5,
called gofmt5, and the version compiled without specifying any environment variable,
called gofmt-noenv.
Thanks,
Andrea

Attachments:

  1. gofmt5 (1847596 bytes)
  2. gofmt-noenv (1843500 bytes)

@davecheney
Contributor

Comment 14:

gofmt5 runs perfectly on arm5. gofmt-noenv does not (not unexpected); it blows up on the
first arm6 instruction in math.init:
Program received signal SIGILL, Illegal instruction.
0x000377c0 in math.init·1 ()
(gdb) x/i 0x000377c0
=> 0x377c0 <math.init·1+40>:   vmov.f64        d0, #112        ; 0x70

@lupino3
Author

lupino3 commented Dec 12, 2011

Comment 15:

My doubt is whether the CPU emulated by QEMU is ARM5 or not. I supposed it was ARM9.
Here is my /proc/cpuinfo:
Processor       : ARM926EJ-S rev 5 (v5l)
BogoMIPS        : 537.39
Features        : swp half thumb fastmult vfp edsp java 
CPU implementer : 0x41
CPU architecture: 5TEJ
CPU variant     : 0x0
CPU part        : 0x926
CPU revision    : 5
Cache type      : write-through
Cache clean     : not required
Cache lockdown  : not supported
Cache format    : Harvard
I size          : 4096
I assoc         : 4
I line length   : 32
I sets          : 32
D size          : 65536
D assoc         : 4
D line length   : 32
D sets          : 512
Hardware        : ARM-Versatile PB
Revision        : 0000
Serial          : 0000000000000000

@rsc
Contributor

rsc commented Dec 12, 2011

Comment 16:

Labels changed: added priority-go1.

@gopherbot

Comment 17 by devers49:

This looks like the same problem as that diagnosed by "fango" on golang-nuts at
https://groups.google.com/forum/#!msg/golang-nuts/nkWJKjYOdmU/e66qQu3ZajkJ
Quoting fango:
"""
This inst `vmov immediate` is not available on vfp2, which is for 
ARMv5/6. Go ARM has only two flavors - soft or hardware floating 
point, and in latter case, it is vfp3 (ARMv7). The other inst not in 
vfp2 is `vcvt - convert floating point to fix point`. 
"""
A fix that has worked for me is to have 5l/asm.c's chipzero() and chipfloat()
always say "no" for VFP<3.  This forces the AMOVF and AMOVD cases in
5l/obj.c:ldobj1() to load all fp constants from a literal pool rather than hit
the "VMOV imm 32/64" cases in asmout().
I don't know what the best way is to feed this extra bit of ARM-related entropy
into 5l.  My current patch introduces a new environment variable, GOARMVFP,
and linker flag -3, analogous to GOARM and -F.  But maybe it should be decoded
from the existing GOARM.  Or maybe 5l should target VFPv2 unconditionally.
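
To make the "always say no for VFP<3" idea concrete, here is a rough Go sketch, not the 5l code. It follows my reading of the VFPv3 immediate encoding in the ARM architecture manual (an 8-bit immediate carrying a sign bit, a narrow exponent range, and a 4-bit fraction). The names vmovImm8, canUseVMOVImm, and the vfpVersion parameter are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// vmovImm8 returns the 8-bit VMOV immediate for f, if f fits the
// VFPv3 immediate format (assumed layout: a, b, c, d, e, f, g, h).
func vmovImm8(f float64) (imm8 uint8, ok bool) {
	bits := math.Float64bits(f)
	// Only the top 4 fraction bits may be set.
	if bits&(1<<48-1) != 0 {
		return 0, false
	}
	exp := (bits >> 52) & 0x7ff
	// The top 9 exponent bits must be NOT(b) followed by eight copies of b.
	var b uint64
	switch exp >> 2 {
	case 0x0ff:
		b = 1
	case 0x100:
		b = 0
	default:
		return 0, false
	}
	a := bits >> 63
	cd := exp & 3
	efgh := (bits >> 48) & 0xf
	return uint8(a<<7 | b<<6 | cd<<4 | efgh), true
}

// canUseVMOVImm mirrors the proposed gate: never use the immediate
// form below VFPv3, so the constant comes from a literal pool instead.
func canUseVMOVImm(f float64, vfpVersion int) bool {
	if vfpVersion < 3 {
		return false
	}
	_, ok := vmovImm8(f)
	return ok
}

func main() {
	imm, ok := vmovImm8(1.0)
	fmt.Printf("%#x %v\n", imm, ok) // prints "0x70 true"
	fmt.Println(canUseVMOVImm(1.0, 2), canUseVMOVImm(1.0, 3))
}
```

Under this encoding 1.0 maps to 0x70, which matches the `#112 ; 0x70` operand gdb printed for the faulting vmov.f64 in math.init.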

@gopherbot

Comment 18 by devers49:

Oh, whoops, I've probably misunderstood the OP's problem.  The GOARM=5 binary should be
softfp and run fine on the emulated ARM926EJ, as indeed it does for dcheney.  If it
really doesn't, something's up with the GOARM=5 build as rsc says.
But setting GOARM=_9_ (as in the OP) isn't going to be softfp: it will be VFPv3 and fail
on the emulated armv5 hardware.  The "5" in GOARM=5 is presumably an ISA (architecture)
version, not a chip family.
In which case, perhaps my previous comment could be charitably interpreted as a
suggestion to allow Go to target ARMv6 architectures without giving up hardware fp.
Pre-Cortex hardware is still potentially interesting - Raspberry Pi is an ARM11, for
example.
Apologies for the noise.
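
Since the mix-up is between the core family (ARM9) and the architecture version (v5), a small Go sketch can read the version from the "CPU architecture" line of /proc/cpuinfo as pasted earlier in this thread. The function name is hypothetical, and the field format is assumed from that one listing.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// archVersion returns the leading number of the "CPU architecture"
// field, e.g. 5 for "5TEJ".
func archVersion(cpuinfo string) (int, bool) {
	for _, line := range strings.Split(cpuinfo, "\n") {
		parts := strings.SplitN(line, ":", 2)
		if len(parts) != 2 || strings.TrimSpace(parts[0]) != "CPU architecture" {
			continue
		}
		val := strings.TrimSpace(parts[1])
		// Keep only the leading decimal digits.
		end := 0
		for end < len(val) && val[end] >= '0' && val[end] <= '9' {
			end++
		}
		if end == 0 {
			return 0, false
		}
		n, err := strconv.Atoi(val[:end])
		if err != nil {
			return 0, false
		}
		return n, true
	}
	return 0, false
}

func main() {
	sample := "Processor       : ARM926EJ-S rev 5 (v5l)\nCPU architecture: 5TEJ\nCPU part        : 0x926\n"
	v, ok := archVersion(sample)
	fmt.Println(v, ok) // prints "5 true"
}
```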

@gopherbot

Comment 19 by devers49:

In an attempt to redeem myself, I tried gofmt5 (which _is_ compiled with softfloat) on
the qemu image linked by the OP.  It does indeed get a SIGILL there.
Gdb says this:
Program received signal SIGILL, Illegal instruction.
0x00032544 in runtime.rt_sigaction ()
(gdb) bt
#0  0x00032544 in runtime.rt_sigaction ()
#1  0x0002eb98 in sigaction ()
Backtrace stopped: frame did not save the PC
(gdb) disass
Dump of assembler code for function runtime.rt_sigaction:
0x0003252c <runtime.rt_sigaction+0>:    ldr     r0, [sp, #4]
0x00032530 <runtime.rt_sigaction+4>:    ldr     r1, [sp, #8]
0x00032534 <runtime.rt_sigaction+8>:    ldr     r2, [sp, #12]
0x00032538 <runtime.rt_sigaction+12>:   ldr     r3, [sp, #16]
0x0003253c <runtime.rt_sigaction+16>:   mov     r7, #174        ; 0xae
0x00032540 <runtime.rt_sigaction+20>:   svc     0x00000000
0x00032544 <runtime.rt_sigaction+24>:   add     pc, lr, #0      ; 0x0
I believe the problem is the "svc 0x0" to do the rt_sigaction system call.
It is following the arm linux EABI syscall convention, with the syscall number
in r7.  The qemu image seems to be for the "arm" (not armel) port of debian
lenny, which uses the older OABI convention where the syscall number is
stashed in the svc instruction itself.  If dcheney's armv5 is running an
EABI kernel, that would explain why it works there but not for the OP.
The following alternative renderings of "_exit(2)" give evidence that the
OP's qemu is OABI:
debian-arm:~# objdump -d foo
foo:     file format elf32-littlearm
Disassembly of section .text:
00008054 <_start>:
    8054:       e3a00002        mov     r0, #2  ; 0x2
    8058:       e3a07001        mov     r7, #1  ; 0x1
    805c:       ef000000        svc     0x00000000
    8060:       eafffffe        b       8060 <_start+0xc>
debian-arm:~# ./foo
Illegal instruction
debian-arm:~# objdump -d bar
bar:     file format elf32-littlearm
Disassembly of section .text:
00008054 <_start>:
    8054:       e3a00002        mov     r0, #2  ; 0x2
    8058:       ef900001        svc     0x00900001
    805c:       eafffffe        b       805c <_start+0x8>
debian-arm:~# ./bar
debian-arm:~# echo $?
2
Given that go's runtime/sys_linux_arm.s appears committed to the EABI
convention, I think the OP would have better luck with a different linux.
This might also be what was going on in the golang-nuts thread here:
http://groups.google.com/group/golang-nuts/browse_thread/thread/4b0a8808d33fe8e9
tl;dr: it's not go, and it's not the ARM hardware variant, it's the linux kernel variant.
How anyone ever manages to hit the right combination of variables in ARM-land is
beyond me.  I hope I've not just injected more confusion.
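
The difference between the two conventions is visible in the instruction word itself, so it can be sketched as a small Go classifier. It uses only the two encodings from the objdump output above (ef000000 and ef900001) and treats 0x900000 as the OABI syscall base; the function and constant names are mine.

```go
package main

import "fmt"

// oabiSyscallBase is the offset OABI adds to the syscall number
// inside the svc immediate (svc 0x00900001 is syscall 1).
const oabiSyscallBase = 0x900000

// classifySVC inspects a 32-bit ARM instruction word and reports which
// syscall convention it follows, plus the embedded number for OABI.
func classifySVC(insn uint32) (style string, nr uint32) {
	// svc has bits 27..24 all set; condition 0b1111 is a different encoding space.
	if (insn>>24)&0xf != 0xf || insn>>28 == 0xf {
		return "not svc", 0
	}
	imm := insn & 0xffffff
	switch {
	case imm == 0:
		// EABI: the syscall number travels in r7, not in the instruction.
		return "EABI", 0
	case imm >= oabiSyscallBase:
		return "OABI", imm - oabiSyscallBase
	}
	return "other", imm
}

func main() {
	for _, w := range []uint32{0xef000000, 0xef900001, 0xe3a07001} {
		style, nr := classifySVC(w)
		fmt.Printf("%08x %s %d\n", w, style, nr)
	}
}
```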

@davecheney
Contributor

Comment 20:

Nice catch. A little googling suggests that some recent versions of qemu can handle EABI,
but it's neither simple nor clear cut, so this is very likely the problem.
Sent from my iPad

@robpike
Contributor

robpike commented Jan 13, 2012

Comment 21:

Status changed to Accepted.

@robpike
Contributor

robpike commented Jan 13, 2012

Comment 22:

Owner changed to builder@golang.org.

@rsc
Contributor

rsc commented Jan 24, 2012

Comment 23:

We require EABI; the runtime should die early on OABI.
How to tell the difference?

@rsc
Contributor

rsc commented Feb 8, 2012

Comment 24:

Happy to do this if someone can say how to detect OABI systems.

Labels changed: added priority-later, expertneeded, removed priority-go1.

Status changed to LongTerm.

@minux
Member

minux commented Feb 9, 2012

Comment 25:

I've found a way to both detect OABI and report a diagnosis to the user.
A CL is on the way.
The challenging part is how to make a syscall if we don't know whether
the kernel ABI is OABI or EABI.

@minux
Member

minux commented Feb 9, 2012

Comment 26:

I've found a way to both detect OABI and report a diagnosis to the user.
A CL is on the way.
The challenging part is how to make a syscall when we don't know whether
the kernel ABI is OABI or EABI.

@rsc
Contributor

rsc commented Feb 9, 2012

Comment 27:

It seems easy: exit for OABI and otherwise syscalls are EABI.  No?

@minux
Member

minux commented Feb 9, 2012

Comment 28:

There are complications: an EABI kernel might also support OABI syscalls.
So we must do a real EABI syscall to see if the kernel truly supports EABI.
But an EABI syscall on an OABI system will trigger a SIGILL, so if we want to
give the user a diagnosis, we have to catch that SIGILL on OABI systems. To
do that, we have to make syscalls, so this is a chicken-and-egg problem.
My solution is to switch to Thumb mode and make the syscall there, because both
OABI and EABI systems use the same syscall ABI in Thumb mode.

@rsc
Contributor

rsc commented Feb 9, 2012

Comment 29:

If you can do this in a few lines of code, then we can put
it in for Go 1.  Otherwise I'd like to wait until after Go 1,
for stability.

@rsc
Contributor

rsc commented Feb 9, 2012

Comment 30:

This issue was closed by revision bb40196.

Status changed to Fixed.

@golang golang locked and limited conversation to collaborators Jun 24, 2016
This issue was closed.