Obsolete
Status Update
Comments
en...@google.com <en...@google.com>
ja...@gmail.com <ja...@gmail.com> #2
Note that the bug surfaced when I tried our app on Android 5.
ja...@gmail.com <ja...@gmail.com> #3
I'm also seeing this problem with our library IOCipher (https://github.com/guardianproject/IOCipher). It all builds and works fine until I switch the test suite from targeting android-15 to android-21 and then run the tests on an android-19 (4.4.4) ROM. The NDK build uses r10d, and the IOCipher NDK build targets android-7.
I found this workaround, which seems to work:
LOCAL_LDFLAGS += -fuse-ld=bfd
eu...@google.com <eu...@google.com> #4
please reopen if you still see this with r12.
ja...@gmail.com <ja...@gmail.com> #5
It's troublesome before __tsan_init because we lack __thread keyword support in bionic. I had to hack in a workaround in TSAN that does not use intercepted calls (more on this below).
The majority of the problem actually occurs after __tsan_init has been called.
For example, pthread_create() calls a bunch of routines for the *new thread* before it has had a chance to initialize its ThreadState. All of these routines running on the new thread are intercepted by TSAN and crash because ThreadState has not been initialized yet.
A purely tsan-only fix is to "globally ignore" interceptors during troublesome times. This works fine for two threads, but doesn't scale to multiple threads - now you can miss events because they are being ignored.
I also tried putting in "global partial ignores" (i.e. ignore interceptors only on the new thread), but that got really hairy as well.
The fundamental problem is that TSAN is intercepting calls that it has no business intercepting.
For example, the same thing happens during thread join time, when one thread is being deconstructed.
Similar things happen during process exit time where ALL threads are being deconstructed.
Playing around with a purely TSAN solution, all I managed to do was move around the actual place of the crash/hang/missed events.
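To make the trade-off concrete, here is a minimal sketch of the "ignore interceptors until the thread is ready" idea. It is not TSAN code: `struct demo_thread_state` and `demo_on_intercepted_call` are hypothetical names invented for illustration, and a real ThreadState is far more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a sanitizer's per-thread state. */
struct demo_thread_state {
    bool initialized;     /* set once the new thread has finished its setup */
    int recorded_events;  /* calls the tool actually analysed */
    int ignored_events;   /* calls dropped because the state was not ready */
};

/* Simulated interceptor body. Touching uninitialized state would crash,
 * so the call is dropped instead, which is exactly how events get missed. */
void demo_on_intercepted_call(struct demo_thread_state *ts) {
    if (ts == NULL) {
        return;                 /* no state at all yet: nothing can be recorded */
    }
    if (!ts->initialized) {
        ts->ignored_events++;   /* avoids the crash, loses the event */
        return;
    }
    ts->recorded_events++;
}
```

The guard avoids the crash, but every call that arrives during the unsafe window is lost, which matches the missed events described above.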
ja...@gmail.com <ja...@gmail.com> #6
On x86_64/linux, precisely zero calls are intercepted during pthread_create() on the new thread until it has finished initializing its ThreadState.
That is not the case on aarch64/bionic.
ja...@gmail.com <ja...@gmail.com> #7
Oh, the reason why I had to hack in a non-interceptible TLS workaround was that intercepted calls happen during pthread_key maintenance that occurs both during pthread_join time as well as at exit time.
What was happening when I was using pthread_get/setspecific() was that after a thread was destroyed, an intercepted call was GENERATING a NEW key for the dead thread. Hilarity ensued afterwards.
ja...@gmail.com <ja...@gmail.com> #8
Back to the topic at hand,
does anyone have any ideas on how to limit the bionic header file changes to aarch64 only?
https://code.google.com/p/android/issues/detail?id=180578
I *think* it might be better to limit the new internal symbols to 64bit arm parts, but there does not seem to be a nice way to do this (yet).
While my build currently "works", I don't think it will build for other platforms (e.g. 32-bit arm, mips, etc.).
Thanks
sr...@google.com <sr...@google.com> #9
It looks like __aarch64__ is defined for Clang builds. I am not sure about gcc, nor can I say that the bionic folks are going to be pleased with architecture-specific stuff polluting high-level headers.
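For reference, a minimal sketch of what such an architecture guard could look like. `__aarch64__` is the predefined macro discussed above; `DEMO_TSAN_INTERNAL_SYMBOLS` and `demo_internal_symbols_enabled` are hypothetical names used only for this illustration and are not part of bionic.

```c
#include <assert.h>

/* Hypothetical feature switch: enabled only when compiling for 64-bit ARM. */
#if defined(__aarch64__)
#define DEMO_TSAN_INTERNAL_SYMBOLS 1
#else
#define DEMO_TSAN_INTERNAL_SYMBOLS 0
#endif

/* Reports whether this translation unit was built with the switch on.
 * A 32-bit build of the same source, even for a 64-bit device, reports 0. */
int demo_internal_symbols_enabled(void) {
    return DEMO_TSAN_INTERNAL_SYMBOLS;
}
```

Note that the macro describes the target of the current compilation, not the device, so 32-bit libraries built for a 64-bit device would not see it.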
en...@google.com <en...@google.com> #10
@11: i think he's confused and expecting __aarch64__ to be defined in code that's built 32-bit.
ja...@gmail.com <ja...@gmail.com> #11
No, not confused (at least I think)
So the crux of the matter is this.
I need to touch some .S files, which are necessarily architecture specific, to add in the private symbols we need to support TSAN.
I can either touch ALL .S files (arm, arm64, x86 ...), or just the ones that get used while building a 64-bit device (even if they are 32-bit libraries).
Now, is it OKAY for me to touch all the .S files?
If that is the case, then https://code.google.com/p/android/issues/detail?id=180578 is moot.
OTOH, If you guys want me to scope my changes purely for 64bit arm devices, then
https://code.google.com/p/android/issues/detail?id=180578 DOES matter IF there are ever any cases where 64bit library/executable calls out to 32bit library
I hope this is clear
ja...@gmail.com <ja...@gmail.com> #12
In other words,
if it is okay for me to touch all the variant memset.S and strlen.S files, then I do NOT need to have arch specific exceptions in the new .h files.
The downside is that it's gonna be a bigger patch :-/
What do you guys think?
ja...@gmail.com <ja...@gmail.com> #13
Hi everyone.
I uploaded a set of patches, all tagged with this bug.
I have no idea if your gerrit handles stacked patches well. (It didn't before)
If it does not, I can always upload a merged patch.
The rough order of patches is
1. new files + pthreads + mman.h
2. system calls (all the .S files and the gensyscalls.py)
3. string.h and friends
4. malloc.h and friends
5. ioctl/tcgetattr
It's not readily apparent what the order of the patches should be.
Here are the change-ids:
Support for Thread Sanitizer for Bionic
Change-Id: I724fb8d77059d3e01e1c4336d1aa2c0c550bbb80
Add new system call weak aliases, modified so that it is not arm64 specific
Change-Id: I0011bd344ba9ce977e429081a0cb84d8e59463fc
Touching all of string.h
Change-Id: I826f328a1849331dc39d5ff6847be02aff2ff1ba
Now touching malloc/free and friends
Change-Id: I5a2d07bfcf332556aeb0c3734b5ba122c80ee567
Now touching ioctl/tcgetattr
Change-Id: Ia709a5e7acf22239ad57aa22c06a867da6f6ef14
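As an illustration of the weak-alias idea mentioned in the patch titles, here is a C-level sketch. The actual patches touch .S files and gensyscalls.py, which are not shown in this thread, so this is only an analogy under assumptions: `demo_internal_strlen` and `demo_strlen` are hypothetical names, and the `alias` attribute is a GCC/Clang extension that needs an ELF target.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical internal name: always resolves to the real implementation. */
size_t demo_internal_strlen(const char *s) {
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* Public name, declared as a weak alias of the internal one. A tool that
 * supplies its own strong definition of demo_strlen would take over the
 * public name at link time, while callers of demo_internal_strlen would
 * keep reaching the real code. */
size_t demo_strlen(const char *s)
    __attribute__((weak, alias("demo_internal_strlen")));
```

With no interposer present, both names reach the same code.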
ja...@gmail.com <ja...@gmail.com> #14
Hi.
I just pushed up a new changeset that combines the 5 mentioned above.
https://android-review.googlesource.com/#/c/162104/
(Apparently gerrit still can't handle stacked changesets without manual intervention.
It marked the 5 prior patches as "unmergeable")
ja...@gmail.com <ja...@gmail.com> #16
en...@google.com <en...@google.com>
sa...@google.com <sa...@google.com> #17
Thank you for your feedback. We assure you that we are doing our best to address the issue reported, however our product team has shifted work priority that doesn't include this issue. For now, we will be closing the issue as won't fix obsolete. If this issue currently still exists, we request that you log a new issue along with latest bug report here https://goo.gl/TbMiIO .
Description
TSAN intercepts way more calls on android than on x86_64 linux.
Before main() starts, TSAN/x86_64 intercepts about a dozen calls.
On AArch64/Android, TSAN intercepts some 4300+ calls before main.
The vast majority of these calls are the pthread API, and malloc and friends.
Even worse, these spurious interceptions occur at the most troublesome spots w.r.t. TSAN, i.e. before __tsan_init, thread creation, thread join, and process exit. They expose races/deadlocks between TSAN and bionic in a way that is not amenable to fixing purely within TSAN without completely re-implementing TSAN from scratch.
Therefore, the solution seems to be to introduce a limited set of internal symbols -- something like
__bionic_pthread_mutex_lock()
that are used INSTEAD of the public API for calls within bionic and the linker.
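A minimal sketch of the proposal, assuming a toy library rather than bionic itself. All `demo_` names are hypothetical, and interception is simulated with a counter on the public entry point instead of a real interposer.

```c
#include <assert.h>

/* Stand-in for the bookkeeping a TSAN interceptor would do. */
int demo_intercepted_calls = 0;
/* Number of times the real lock routine ran. */
int demo_lock_count = 0;

/* Hypothetical internal entry point: does the real work and is never
 * routed through an interceptor. */
int demo_internal_mutex_lock(void) {
    demo_lock_count++;
    return 0;
}

/* Public entry point: the name a tool would interpose on. The counter
 * simulates the interceptor firing before the real work happens. */
int demo_mutex_lock(void) {
    demo_intercepted_calls++;
    return demo_internal_mutex_lock();
}

/* Library start-up code that takes the lock many times before main().
 * It uses the internal name, so none of these calls are observed.
 * Returns the number of intercepted calls seen so far. */
int demo_libc_startup(int rounds) {
    for (int i = 0; i < rounds; i++) {
        int rc = demo_internal_mutex_lock();
        assert(rc == 0);
    }
    return demo_intercepted_calls;
}
```

Only calls made through the public name, such as those from application code, would reach the tool, while the library's own start-up traffic stays out of its way.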