Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: some kind of corruption since 02e5cb24c95a #5621

Closed
alberts opened this issue Jun 3, 2013 · 6 comments
Closed

runtime: some kind of corruption since 02e5cb24c95a #5621

alberts opened this issue Jun 3, 2013 · 6 comments

Comments

@alberts
Copy link
Contributor

alberts commented Jun 3, 2013

What steps will reproduce the problem?

Extract attached code. Set GOPATH.

go test -v bloom

What is the expected output?

tests pass

What do you see instead?

&bloom.basicFilter{bits:(*bloom.BitVector)(0xc2100b3940), m:0xea029, n:0x186a0, k:7,
p:0.01, h1:(*bloom.AdlerHash)(0xc21000a5c0), h2:(*bloom.CityHash)(0xc210072228)}
&bloom.basicFilter{bits:(*bloom.BitVector)(0xc2105280e0), m:0xdeaddead,
n:0xdeaddead, k:7, p:0.01, h1:(*bloom.AdlerHash)(0xc21058e660),
h2:(*bloom.CityHash)(0xc21058ae20)}
--- FAIL: TestGob-7 (0.06 seconds)
bloom_test.go:32: Marshal/Unmarshal failed
FAIL

Which compiler are you using (5g, 6g, 8g, gccgo)?

6g

Which operating system are you using?

linux

Which version are you using?  (run 'go version')

go version devel +02e5cb24c95a Tue May 28 17:59:10 2013 -0700 linux/amd64

also broken at tip

works with 81ccdb178fd7

Please provide any additional information below.

I couldn't get rid of the cgo dependency. I think there's some kind of interaction
between cgo and this change.

hash/city also has tests, which need the .sysos to be built with SSE 4.2 enabled to pass.

Attachments:

  1. bork.tar.bz2 (927984 bytes)
@dvyukov
Copy link
Member

dvyukov commented Jun 3, 2013

Comment 2:

I am pretty sure GC does not scan some argument pointers. I've debugged a similar one
for the preemptive scheduler.
Carl, can you write some unit tests for that code? I think it's possible to call some
functions and then call gentraceback with a custom callback that will dump where it sees
pointer, then check that it sees the pointers in correct locations (and only there). You
can see some runtime C tests if you search by runtime·testSchedLocalQueue.

@lexprfuncall
Copy link

Comment 3:

I am having some problems reproducing this crash.  Can anyone tell me what might be
going wrong?
$ env GOPATH=~/bork go test -v bloom
warning: building out-of-date packages:
    hash/city
    uuid
installing these packages with 'go test -i bloom' will speed future tests.
# hash/city
/usr/bin/ld: error: city.syso: multiple definition of 'CityHash64(char const*, unsigned
long)'
/usr/bin/ld: $WORK/hash/city/_obj/city.cc.o: previous definition here
/usr/bin/ld: error: city.syso: multiple definition of 'CityHash64WithSeed(char const*,
unsigned long, unsigned long)'
/usr/bin/ld: $WORK/hash/city/_obj/city.cc.o: previous definition here
/usr/bin/ld: error: city.syso: multiple definition of 'CityHash64WithSeeds(char const*,
unsigned long, unsigned long, unsigned long)'
/usr/bin/ld: $WORK/hash/city/_obj/city.cc.o: previous definition here
/usr/bin/ld: error: city.syso: multiple definition of 'CityHash128WithSeed(char const*,
unsigned long, std::pair<unsigned long, unsigned long>)'
/usr/bin/ld: $WORK/hash/city/_obj/city.cc.o: previous definition here
/usr/bin/ld: error: city.syso: multiple definition of 'CityHash128(char const*, unsigned
long)'
/usr/bin/ld: $WORK/hash/city/_obj/city.cc.o: previous definition here
/usr/bin/ld: error: cityc.syso: multiple definition of 'CityHash64_'
/usr/bin/ld: $WORK/hash/city/_obj/cityc.cc.o: previous definition here
/usr/bin/ld: error: cityc.syso: multiple definition of 'CityHash128_'
/usr/bin/ld: $WORK/hash/city/_obj/cityc.cc.o: previous definition here
/usr/bin/ld: error: cityc.syso: multiple definition of 'CityHashCrc128_'
/usr/bin/ld: $WORK/hash/city/_obj/cityc.cc.o: previous definition here
collect2: ld returned 1 exit status
FAIL    bloom [build failed]

@lexprfuncall
Copy link

Comment 4:

Okay, a "go test -i bloom" seems to have fixed it.  I am not sure why that would be
related but I might be missing something.  Here is what I see now
$ env GOPATH=~/bork go test bloom
&bloom.basicFilter{bits:(*bloom.BitVector)(0xc21003f940), m:0xea029, n:0x186a0, k:7,
p:0.01, h1:(*bloom.AdlerHash)(0xc21000a5c0), h2:(*bloom.CityHash)(0xc210011228)}
&bloom.basicFilter{bits:(*bloom.BitVector)(0xc2105100e0), m:0xdeaddead, n:0xdeaddead,
k:7, p:0.01, h1:(*bloom.AdlerHash)(0xc21051f630), h2:(*bloom.CityHash)(0xc21051de28)}
--- FAIL: TestGob (0.10 seconds)
    bloom_test.go:32: Marshal/Unmarshal failed
FAIL
FAIL    bloom   0.140s
This is very repeatable, thanks!  I am looking into the root cause now.

Owner changed to @lexprfuncall.

Status changed to Accepted.

@ianlancetaylor
Copy link
Contributor

Comment 5:

Perhaps https://golang.org/cl/9988043 will fix this.  I'm not sure.  Can
somebody give it a try?  Thanks.

@lexprfuncall
Copy link

Comment 6:

The pointer maps are correct, the problem seems to be related to unsafe use of uintptr. 
Here is the offending function
// encodeStruct encodes a single struct value.
func (enc *Encoder) encodeStruct(b *bytes.Buffer, engine *encEngine, basep uintptr) {
    state := enc.newEncoderState(b)
    state.fieldnum = -1
    for i := 0; i < len(engine.instr); i++ {
        instr := &engine.instr[i]
        p := unsafe.Pointer(basep + instr.offset)
        if instr.indir > 0 {
            if p = encIndirect(p, instr.indir); p == nil {
                continue
            }
        }
        instr.op(instr, state, p)
    }
    enc.freeEncoderState(state)
}
I suspect this is not the last bug we will have of this sort.  Fortunately, they are
easy to diagnose.  
This is what I do.... Here is the decoding and pointer scanning loop inside
addframeroots.  You want to start by adding the "IGNORE" printf and run the affected
test.
                for((j = (rem < 32) ? rem : 32); j > 0; j--) {
                    if(w & b)
                        addroot((Obj){ap, sizeof(uintptr), 0});
                    else {
                        runtime·printf("IGNORE %X @ %X in %S\n", *(uintptr*)ap, (uintptr)ap, f->name);
                    }
                    b <<= 1;
                    ap += sizeof(uintptr);
Most of the values being ignored will be junk.  That is not interesting to us.  Ignored
pointer values are interesting.  Grep the test output for "^IGNORE 0xc2" to find the
ignored pointers on 6g compiled code.  5g and 8g will need a different pattern.  You
will see something like this
IGNORE 0xc2000186a0 @ 0x7fc8f7ba6d40 in bloom.genUUIDs
IGNORE 0xc2000186a0 @ 0x7fc8f7ba6d40 in bloom.genUUIDs
IGNORE 0xc210438b2c @ 0x7fc8f7ba6a20 in encoding/gob.(*Encoder).encodeArray
IGNORE 0xc21000a980 @ 0x7fc8f7ba6ac0 in encoding/gob.(*Encoder).encodeStruct
IGNORE 0xc21005fba0 @ 0x7fc8f7ba6b28 in encoding/gob.(*Encoder).encodeStruct
IGNORE 0xc210438b2c @ 0x7fc8f7ba6a20 in encoding/gob.(*Encoder).encodeArray
IGNORE 0xc21000a980 @ 0x7fc8f7ba6ac0 in encoding/gob.(*Encoder).encodeStruct
IGNORE 0xc21005fba0 @ 0x7fc8f7ba6b28 in encoding/gob.(*Encoder).encodeStruct
IGNORE 0xc21002c000 @ 0x7fc8f7ba69b0 in reflect.deepValueEqual
IGNORE 0xc200000002 @ 0x7fc8f7ba6b50 in reflect.deepValueEqual
IGNORE 0xc210011090 @ 0x7fc8f7ba6cf0 in reflect.deepValueEqual
IGNORE 0xc21002c000 @ 0x7fc8f7ba69b0 in reflect.deepValueEqual
IGNORE 0xc200000002 @ 0x7fc8f7ba6b50 in reflect.deepValueEqual
IGNORE 0xc210011090 @ 0x7fc8f7ba6cf0 in reflect.deepValueEqual
In the absence of a hunch about what might be going wrong based on this data alone,
instrument the "else" clause as follows
                    else {
                        if (
                            StringCompare(f->name, (byte*)"bloom.genUUIDs") ||
                            StringCompare(f->name, (byte*)"encoding/gob.(*Encoder).encodeArray") ||
                            StringCompare(f->name, (byte*)"encoding/gob.(*Encoder).encodeStruct") ||
                            StringCompare(f->name, (byte*)"reflect.deepValueEqual) ||
                            false
                            )
                            addroot((Obj){ap, sizeof(uintptr), 0});
                    }
Start commenting out the StringCompare lines one by one until something breaks.
Here is my StringCompare definition
static bool StringCompare(String s1, byte* s2) {
    return s1.len == runtime·findnull(s2) && !runtime·mcmp(s1.str, s2, s1.len);
}

@ianlancetaylor
Copy link
Contributor

Comment 7:

This issue was closed by revision 941db1e.

Status changed to Fixed.

@golang golang locked and limited conversation to collaborators Jun 24, 2016
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants