Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SIGBUS on ARM32 in utf8statetable.cc:517 #11

Closed
ghost opened this issue Jul 27, 2015 · 13 comments
Closed

SIGBUS on ARM32 in utf8statetable.cc:517 #11

ghost opened this issue Jul 27, 2015 · 13 comments

Comments

@ghost
Copy link

ghost commented Jul 27, 2015

Originally reported on Google Code with ID 11

I'm trying to get CLD2 working on ARM32 inside of Chromium, cross-compiling from a linux
x64 host to arm32. The library loads properly, but the following crash occurs when
calling DetectLanguageSummary:

Program received signal SIGBUS, Bus error.
#0  CLD2::UTF8GenericScan (st=0x61a82104, str=<optimized out>, bytes_consumed=0x5f00f88c)
    at ../../third_party/cld_2/src/internal/utf8statetable.cc:518

I'll attach the full trace as a file. Well, minus the Chromium bits. Anyhow, the problem
appears to be with this snippet of code in utf8statetable.cc:

  // Do fast for groups of 8 identity bytes.
  // This covers a lot of 7-bit ASCII ~8x faster than the 1-byte loop,
  // including slowing slightly on cr/lf/ht
  //----------------------------
  const uint8* Tbl2 = &st->fast_state[0];
  uint32 losub = st->losub;
  uint32 hiadd = st->hiadd;
  while (src < srclimit8) {
    uint32 s0123 = (reinterpret_cast<const uint32 *>(src))[0];
    uint32 s4567 = (reinterpret_cast<const uint32 *>(src))[1];
    src += 8;


Inspecting the pointers in the debugger during the crash, and looking at the "src"
variable, seems to reveal the problem:
(gdb) p src
$32 = (
    const CLD2::uint8 *) 0x58de4bee "\n\n\n百度一下\n地图贴吧视频图片hao123\n新闻应用音乐文库更多\n小说游戏下载\n把百度放到桌面上,
搜索最方便\n触屏版极速版\nBaidu 京ICP证030173号"

Specifically, src is located at 0x58de4bee. Since this isn't a 4-byte (32-bit) aligned
address, the SIGBUS presumably comes from trying to read it as a uint32*. Many thanks
to aberent@chromium.org and anton@chromium.org for the help in diagnosing this, I was
a bit lost in the weeds looking at my dynamic data changes, which turn out to be completely
unrelated (this happens with and without dynamic data mode).

The suggested workaround for this case is to %4 the address and do a one-off scan of
the first 0-3 bytes (as necessary), and then descend into the fast loop; the concern
is that there may be other places in CLD2 that have similar behavior and might be time
bombs. It might be a good idea to add some memory churning code to the unit tests,
and then start running the unit tests themselves on ARM to further diagnose other problems
like this that might arise.

Reported by andrewhayden@chromium.org on 2014-03-20 15:11:07


- _Attachment: [crashstack.txt](https://storage.googleapis.com/google-code-attachments/cld2/issue-11/comment-0/crashstack.txt)_
@ghost
Copy link
Author

ghost commented Jul 27, 2015

Setting priority to high, as Chromium can't use CLD2 on Android or (presumably) iOS
until this is resolved.

Reported by andrewhayden@google.com on 2014-03-20 15:12:01

  • Labels added: Priority-High
  • Labels removed: Priority-Medium

@ghost ghost self-assigned this Jul 27, 2015
@ghost
Copy link
Author

ghost commented Jul 27, 2015

PS, we should fix this for 64-bit while we're here, so replace "%4" in my previous comment
with "%8".

Reported by andrewhayden@google.com on 2014-03-20 15:12:54

@ghost
Copy link
Author

ghost commented Jul 27, 2015

For posterity:
$ cat /proc/cpuinfo | grep ARM
Processor       : ARMv7 Processor rev 10 (v7l)


I'm somewhat surprised by this, because the comments in port.h seem to imply that ARMv7+
can do unaligned reads but they're just slow. Not the case apparently, as the SIGBUS
came from an ARM7 device. Anyhow, I guess we're not using port.h here, and if we were
it would (irritatingly) probably be broken as well. But theoretically we could do UNALIGNED_LOAD32
for this as a short-term fix while we get the right thing done. WDYT?

Reported by andrewhayden@google.com on 2014-03-20 15:25:52

@ghost
Copy link
Author

ghost commented Jul 27, 2015

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0491c/CIHCGCFD.html
"To enable unaligned support, you must:
Clear the A bit, bit 1, of CP15 register 1 in your initialization code.
Set the U bit, bit 22, of CP15 register 1 in your initialization code."

Reported by anton@chromium.org on 2014-03-20 15:33:13

@ghost
Copy link
Author

ghost commented Jul 27, 2015

$ grep -R -5 "reinterpret_cast" . | grep -5 int32 > ~/Desktop/unaligned_landmines.txt

Reported by andrewhayden@google.com on 2014-03-20 15:34:52


- _Attachment: [unaligned_landmines.txt](https://storage.googleapis.com/google-code-attachments/cld2/issue-11/comment-5/unaligned_landmines.txt)_

@ghost
Copy link
Author

ghost commented Jul 27, 2015

."/cldutil_shared.cc-// OVERSHOOTS up to 3 bytes"

What could possibly go wrong?

Reported by anton@chromium.org on 2014-03-20 15:38:56

@ghost
Copy link
Author

ghost commented Jul 27, 2015

It looks to me like we need to make repairs in (at least) these spots:
1. UTF8GenericScan in utf8statetable.cc
2. UTF8GenericScanFastAscii in utf8statetable.cc
3. BiHashV2 in cldutil_shared.cc
4. QuadHashV2Mix in cldutil_shared.cc
5. OctaHash40Mix in cldutil_shared.cc

The data loader has the same problem, but it should be immune because the tables that
are being loaded are already 64-bit aligned by the data extractor. Hooray, forward
thinking (thanks, Dick!). Wouldn't hurt to drop a note in CLD2DynamicDataLoader::loadDataInternal
to this effect, though. Dick, do you see any other spots that need repairs?

Reported by andrewhayden@google.com on 2014-03-20 15:41:34

@ghost
Copy link
Author

ghost commented Jul 27, 2015

Here's a proposed patch that fixes #1. I've tested this on ARM, and it works. The unit
tests pass, and I threw in some logging (which I took out before generating the diff)
that shows that the unit tests as-is very thoroughly execute the range of values 0-7
as initial offsets. Hooray! This seems reasonable to me, but Dick, PTAL.

Reported by andrewhayden@google.com on 2014-03-21 18:09:35


- _Attachment: [cld2_issue11_patch1.diff](https://storage.googleapis.com/google-code-attachments/cld2/issue-11/comment-8/cld2_issue11_patch1.diff)_

@ghost
Copy link
Author

ghost commented Jul 27, 2015

Checking on the remaining functions:
1. UTF8GenericScan in utf8statetable.cc
   Fixed by the patch posted here.

2. UTF8GenericScanFastAscii in utf8statetable.cc
   Not called in any code that I can see. Can we just delete this?

3. BiHashV2 in cldutil_shared.cc
   Uses the UNALIGNED_LOAD macro, and should therefore not have a problem.

4. QuadHashV2Mix in cldutil_shared.cc
   Also uses the UNALIGNED_LOAD macro, and should therefore not have a problem.

5. OctaHash40Mix in cldutil_shared.cc
   Also uses the UNALIGNED_LOAD macro, and should therefore not have a problem.

So I think all we actually need is the patch I've posted here, or a better equivalent.
Dick, does this sound good?

Reported by andrewhayden@google.com on 2014-03-21 19:36:51

@ghost
Copy link
Author

ghost commented Jul 27, 2015

I had a long discussion offline with several subject matter experts about ARM7+ support
for unaligned access, and the takeaway was that under certain circumstances unaligned
access on ARM will cause a SIGBUS or other issues even if unaligned access is supported
and enabled. Their recommendation was that NO code should rely upon unaligned access
if it's going to run on ARM.

The rest of the code here uses the UNALIGNED_LOAD but it's my impression from the comments
in utf8statetable.cc that we need to be fast, so I see no obvious solution to fix this
other than the proposed patch.

Reported by andrewhayden@google.com on 2014-03-27 10:57:41

@ghost
Copy link
Author

ghost commented Jul 27, 2015

Dick made a fix to use UNALIGNED_LOAD:
https://code.google.com/p/cld2/source/detail?r=158

Sadly this doens't work. I believe the reason is that port.h incorrectly assumes that
we don't need aligned loads on ARM7:
https://code.google.com/p/cld2/source/browse/trunk/internal/port.h#57

The crash still occurs in exactly the same place with r158. We either need to fix port.h
or apply my patch to this method (and the others using UNALIGNED_LOAD).

Reported by andrewhayden@google.com on 2014-04-02 15:43:45

@ghost
Copy link
Author

ghost commented Jul 27, 2015

Here's a patch that fixes the problem for ARM. I'm dubious that we should be doing this
at all; we should probably just turn off unaligned access for all of ARM and be done
with it. Subsequently we can investigate the possibility of incrementally killing off
the use of UNALIGNED_LOAD in any critical-path code.

Reported by andrewhayden@google.com on 2014-04-02 15:58:16


- _Attachment: [port.h_patch_cld2_issue_11](https://storage.googleapis.com/google-code-attachments/cld2/issue-11/comment-12/port.h_patch_cld2_issue_11)_

@ghost
Copy link
Author

ghost commented Jul 27, 2015

port.h patch submitted as r159 after consultation with Dick.

Reported by andrewhayden@google.com on 2014-04-03 07:17:53

  • Status changed: Fixed

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

0 participants