pure-lang - issue #14

shared pure segfaults on startup (0.31 on FreeBSD 7.2 i386 with LLVM 2.5)


Posted on Aug 26, 2009 by Grumpy Dog

What steps will reproduce the problem?

1. LDFLAGS=-L/usr/local/lib ./configure --with-libiconv-prefix=/usr/local --enable-debug --prefix=/opt
2. gmake all
3. LD_LIBRARY_PATH=. PURELIB=./lib ./pure

What is the expected output?

Pure 0.31 (i386-unknown-freebsd7.2) ...

What do you see instead?

Assertion failed: (errorcode == 0), function Mutex, file Mutex.cpp, line 85.

What version of the product are you using? On what operating system?

I see this with 0.27 and 0.31 (on FreeBSD RELENG_7_2); didn't test other versions (or trunk).

Please provide any additional information below.

see attached output of a gdb session (r, bt, bt full)

static pure works as advertised

Comment #1

Posted on Aug 27, 2009 by Massive Panda

Yes, this looks familiar. It hits a failed assertion in LLVM, so this is most likely an LLVM issue. Unfortunately, I don't have FreeBSD installed, so I can't debug the issue myself. Any further input which helps to resolve this issue will be much appreciated.

Interestingly, NetBSD doesn't exhibit that behaviour; LLVM and Pure work fine there.

For the time being, you might be able to work around this by building a statically linked version of the interpreter (configure --disable-shared). Does that work for you?

Which LLVM version do you use? Have you tried the 2.6 release branch or current trunk of LLVM? (For the latter you need Pure 0.32, due to be released today.)

Also, what's the gcc version? (There are some gcc versions which are known to not work with LLVM.)

Comment #2

Posted on Aug 27, 2009 by Grumpy Dog

compiler is the system one:

g++ (GCC) 4.2.1 20070719 [FreeBSD]

the static build does work ok in both 0.27 and 0.31, sorry for not stressing that enough.

LLVM is 2.5 (built from ports, FreeBSD equivalent of a "source package"), haven't tried other versions yet.

i'll post further info (config.log, etc) later today.

Comment #3

Posted on Aug 27, 2009 by Massive Panda

Oops, you already mentioned that it's LLVM 2.5. So it might be worth looking at the LLVM 2.6 release branch and trunk in svn to see whether one of these fixes the problem. Instructions for getting LLVM from svn can be found here: http://llvm.org/docs/GettingStarted.html#checkout

Comment #4

Posted on Aug 27, 2009 by Massive Panda

gcc 4.2.1 should be ok with LLVM.

Comment #5

Posted on Aug 27, 2009 by Massive Panda

The port Makefile looks good, too. So it's not an issue with the build options AFAICS.

Comment #6

Posted on Aug 27, 2009 by Grumpy Dog

the Mutex.cpp assert disappears when I run ./configure with LIBS=-lpthread

about half of the tests still dump core, though. all with:

While deleting: ... An asserting value handle still pointed to this value!

I'm still poking at the second problem...

Comment #7

Posted on Aug 27, 2009 by Grumpy Dog

0.31 + 2.5 works:

LIBS=-lpthread LDFLAGS=-L/usr/local/lib ./configure --prefix=/opt --with-libiconv-prefix=/usr/local --enable-debug && gmake all check

ends in a series of "passed" (save for the expected failure of 020)

2.6.r71086 bombs, as I wrote in #6, will try later revisions tonight.

configure should discover the need to use -lpthread; I'll try to produce a patch to that effect, unless you beat me to it (which I'd really welcome, haven't touched autoconf or m4 in a few years).

Comment #8

Posted on Aug 27, 2009 by Massive Panda

Ok, I fixed up the linker options in r2131. Attached is an updated tarball with the current svn sources. Could you please try again with this version?

Also, can you please post an execution log with those "While deleting:" messages so that I can have a look at them?

Comment #9

Posted on Aug 27, 2009 by Grumpy Dog

re "unless you beat me to it": looks like you did in r2131, thanks!

Comment #10

Posted on Aug 27, 2009 by Massive Panda

This is very good news! :) Eddie Rucker and I worked on that issue a bit some time ago, but I would never have expected it to be a linker issue. Something is very broken with this gcc version if it generates dysfunctional code instead of reporting a linkage error in such a case.

Anyway, I'm happy that it works now (at least with LLVM 2.5).

About the LLVM 2.6 breakage: Could that be related to issue #9? As you're running on a 32 bit system, you should try configuring LLVM >= 2.6 with --disable-pic, as described in Pure's INSTALL file. (--enable-pic is the default since LLVM 2.6, which is unfortunate because AFAICT http://llvm.org/bugs/show_bug.cgi?id=3239 is still unfixed, which causes the LLVM JIT to generate wrong PIC code on x86-32 systems.)

Talking about --enable/disable-pic, in the LLVM 2.5 FreeBSD port I find neither. This probably means that the port won't work on x86-64. Maybe you could resolve that issue with the maintainer of the port? Here are the LLVM configure options that must be used on x86-32/64 systems, respectively:

x86-32: --disable-pic (default with LLVM 2.5)
x86-64: --enable-pic (default with LLVM >= 2.6)

Once http://llvm.org/bugs/show_bug.cgi?id=3239 is resolved, --disable-pic should work on either system, at least that's what the LLVM developers told me.
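The per-architecture choice described above can be sketched as a small shell helper. This is only an illustration: the function name llvm_pic_flag is hypothetical, and the mapping from `uname -m` style machine names to x86-32 and x86-64 is an assumption, not something taken from Pure's INSTALL file.

```shell
# llvm_pic_flag ARCH
# Print the LLVM configure PIC option recommended in this thread for ARCH.
# ARCH is a machine name such as `uname -m` prints (name mapping is assumed).
llvm_pic_flag() {
    case "$1" in
        i386|i486|i586|i686)
            # x86-32: avoid the wrong PIC code generated by the JIT (LLVM bug 3239)
            echo "--disable-pic"
            ;;
        x86_64|amd64)
            # x86-64: PIC is required
            echo "--enable-pic"
            ;;
        *)
            echo "no recommendation for architecture: $1" >&2
            return 1
            ;;
    esac
}

# Possible use when configuring LLVM (not run here):
#   ./configure --enable-optimized $(llvm_pic_flag "$(uname -m)")
```

The helper refuses to guess for other architectures, since the thread only covers x86-32 and x86-64.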

Comment #11

Posted on Aug 27, 2009 by Massive Panda

As luck has it, I can reproduce the "While deleting:" failed assertions on my 32 bit Linux system with LLVM 2.5 now (funny enough, 'make check' works ok there, but compiling pure-gen hits it). Will fix asap.

Comment #12

Posted on Aug 28, 2009 by Massive Panda

Well, turns out that those failed assertions I saw were in the batch compiler, so they're probably not related to what you see using LLVM 2.6. Anyway, I fixed those now (svn r2136).

Do you still have those failed assertions on LLVM 2.6? If so, can you please post a backtrace?

Comment #13

Posted on Aug 28, 2009 by Massive Panda

I've just uploaded Pure 0.33, which has all the latest fixes. Please give that version a try, thanks.

http://pure-lang.googlecode.com/files/pure-0.33.tar.gz

Comment #14

Posted on Aug 29, 2009 by Grumpy Dog

i haven't gotten around to rebuilding llvm-2.6.r71086 without PIC yet, but 0.33 passes many more tests with the same llvm-2.6.r71086 build i reported before (test020 failure is expected):

roman@sachmet ~/install/pure-0.33 1003:0 > gmake check
Running tests.
prelude.pure: passed
test001.pure: passed
test002.pure: passed
test003.pure: passed
test004.pure: passed
test005.pure: passed
test006.pure: passed
test007.pure: passed
test008.pure: passed
test009.pure: passed
test010.pure: passed
test011.pure: passed
test012.pure: passed
test013.pure: passed
test014.pure: passed
test015.pure: Abort trap (core dumped) FAILED
test016.pure: passed
test017.pure: passed
test018.pure: passed
test019.pure: passed
test020.pure: FAILED
test021.pure: passed
test022.pure: passed
test023.pure: passed
test024.pure: passed
test025.pure: Abort trap (core dumped) FAILED
test026.pure: passed
test027.pure: passed
test028.pure: Abort trap (core dumped) FAILED
test029.pure: passed
test030.pure: passed
test031.pure: Abort trap (core dumped) FAILED
test032.pure: passed
test033.pure: passed
test034.pure: passed
test035.pure: passed
test036.pure: Abort trap (core dumped) FAILED
test037.pure: passed
test038.pure: passed
test039.pure: passed
test040.pure: passed
test041.pure: Abort trap (core dumped) FAILED
test042.pure: passed
gmake: *** [check] Error 1

Comment #15

Posted on Aug 30, 2009 by Grumpy Dog

r2155 with llvm-2.7.r80431, same set of failed tests (diffs attached):

Running tests.
prelude.pure: passed
test001.pure: passed
test002.pure: passed
test003.pure: passed
test004.pure: passed
test005.pure: passed
test006.pure: passed
test007.pure: passed
test008.pure: passed
test009.pure: passed
test010.pure: passed
test011.pure: passed
test012.pure: passed
test013.pure: passed
test014.pure: passed
test015.pure: Abort trap (core dumped) FAILED
test016.pure: passed
test017.pure: passed
test018.pure: passed
test019.pure: passed
test020.pure: FAILED
test021.pure: passed
test022.pure: passed
test023.pure: passed
test024.pure: passed
test025.pure: Abort trap (core dumped) FAILED
test026.pure: passed
test027.pure: passed
test028.pure: Abort trap (core dumped) FAILED
test029.pure: passed
test030.pure: passed
test031.pure: Abort trap (core dumped) FAILED
test032.pure: passed
test033.pure: passed
test034.pure: passed
test035.pure: passed
test036.pure: Abort trap (core dumped) FAILED
test037.pure: passed
test038.pure: passed
test039.pure: passed
test040.pure: passed
test041.pure: Abort trap (core dumped) FAILED
test042.pure: passed
gmake: *** [check] Error 1
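For comparing runs like the two above, a short filter that lists only the failing tests may help. This is a rough sketch, not part of Pure's test suite: the function name failed_tests is hypothetical, and it assumes the output format shown here, where each test name ends in ".pure:" and a failure is marked by the word FAILED. It scans word by word, so it does not matter whether the log has one test per line or is run together.

```shell
# failed_tests: read `gmake check` output on stdin and print the name of
# every test that is followed by the marker FAILED.
failed_tests() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i ~ /\.pure:$/) {
                # remember the most recent test name, without the colon
                name = $i
                sub(/:$/, "", name)
            } else if ($i == "FAILED" && name != "") {
                print name
                name = ""
            }
        }
    }'
}

# Possible use (not run here):
#   gmake check 2>&1 | failed_tests
```

Saving the output of two runs and comparing them with diff would show whether the set of failures changed between LLVM builds.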

Comment #16

Posted on Aug 30, 2009 by Grumpy Dog

these segfaults affect both dynamic and static build

Comment #17

Posted on Aug 30, 2009 by Massive Panda

Yes, they're not actually segfaults, but some failed assertions in LLVM. But I just can't reproduce these here. Which options did you configure LLVM with?

Comment #18

Posted on Aug 31, 2009 by Grumpy Dog

I'm using the devel/llvm-devel port (with patches other than files/patch-tools_clang_lib_Headers_Makefile and files/patch-tools_clang_utils_scan-build removed; they were incorporated upstream). the port defines just --enable-optimized, and I might have added --disable-pic for the build. OTOH, I just realized I had leftovers from a previous 2.6 install in the same location, which could break stuff as well.

i'll chime in here as soon as i have anything more conclusive.

Comment #19

Posted on Aug 31, 2009 by Massive Panda

I think that --enable-expensive-checks (which is enabled by default) might be the issue. Since I never have this enabled, this would explain why I don't see these failed assertions. Will try that later today.

NB: --enable-optimized alone is not good enough for a production version of LLVM. I recommend configuring LLVM with --enable-optimized --disable-assertions --disable-expensive-checks (unless you need to debug LLVM itself), otherwise generating LLVM IR will be very slow.
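As a sketch of the recommendation above, the following shell helper assembles an LLVM configure command line with the three flags named in this comment. The function name llvm_configure_cmd is hypothetical, and it only prints the command rather than running it; any extra arguments (for example a PIC option or a prefix) are appended unchanged.

```shell
# llvm_configure_cmd [EXTRA_FLAGS...]
# Print an LLVM configure command using the flags recommended in this thread
# for a production (non-debugging) build, followed by any extra flags given.
llvm_configure_cmd() {
    flags="--enable-optimized --disable-assertions --disable-expensive-checks"
    if [ "$#" -gt 0 ]; then
        flags="$flags $*"
    fi
    echo "./configure $flags"
}

# Possible use (prints the command, does not execute it):
#   llvm_configure_cmd --disable-pic
```

Printing instead of executing keeps the sketch safe to try anywhere; the printed line can be reviewed and then run by hand in the LLVM source directory.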

Comment #20

Posted on Aug 31, 2009 by Massive Panda

Yes, --enable-expensive-checks was the culprit. The failed assertions seem to be bogus, but anyway I worked around that in r2162. Can you please verify?

(I still recommend that LLVM should be configured with --enable-optimized --disable-assertions --disable-expensive-checks to improve performance.)

Comment #21

Posted on Sep 1, 2009 by Massive Panda

Ok, to the best of my knowledge the remaining issues should be fixed in r2162, so I'm setting the status of this bug report to "Fixed" now. Please let me know if it works for you.

Comment #22

Posted on Sep 1, 2009 by Grumpy Dog

yes, this works, thanks!

Comment #23

Posted on Sep 1, 2009 by Massive Panda

Great, thanks for helping to get these issues sorted out!

Comment #24

Posted on Oct 6, 2009 by Massive Panda

Roman, please note that I had to back out r2162 again, as it caused some serious leaks in the JIT which would eventually cause the JIT to abort when running out of memory for function stubs (which could happen, e.g., when repeatedly calling 'eval' on expressions involving local functions). So the code I disabled in r2162 (in order to work around the issues with LLVM's --enable-expensive-checks) is needed after all, to prevent more serious misbehaviour.

The only thing I can recommend right now to deal with this issue is to just build LLVM with --disable-expensive-checks. --enable-expensive-checks is the real culprit here, and doesn't buy you anything (just slows down the compiler) unless you really need to debug LLVM.

Status: Fixed

Labels:
Type-Defect Priority-Medium