Export to GitHub

chromium-os - issue #7327

Chrome_ChromeOS: Crash Report - Stack Signature: -334A56C


Posted on Oct 4, 2010 by Happy Camel

Product: Chrome_ChromeOS Stack Signature: -334A56C New Signature Label: BrowserRenderProcessHost::OnMessageReceived(IPC::Message const&) New Signature Hash: 4df708ba_4ae5f666_15c98d24_cd0f40d6_80a8c3d6

Report link: http://go/crash/reportdetail?reportid=3812b2a6891f5b4a

Meta information: Product Name: Chrome_ChromeOS Product Version: 7.0.531.0 Report ID: 3812b2a6891f5b4a Report Time: 2010/10/04 10:59:17, Mon Uptime: 329625 sec Cumulative Uptime: 0 sec OS Name: Linux OS Version: 0.0.0 Linux 2.6.32.21+drm33.7 #1 SMP Fri Sep 24 15:00:51 EDT 2010 i686 CPU Architecture: x86 CPU Info: GenuineIntel family 6 model 23 stepping 10

Thread 0 CRASHED ( SIGSEGV @ 0x00000000 )

0x6cf43b90 [chrome - ./base/tuple.h:547] BrowserRenderProcessHost::OnMessageReceived(IPC::Message const&) 0x6d9d828b [chrome - ipc/ipc_channel_proxy.cc:211] IPC::ChannelProxy::Context::OnDispatchMessage(IPC::Message const&) 0x6d9d8497 [chrome - ./base/tuple.h:547] RunnableMethod<IPC::ChannelProxy::Context, void (IPC::ChannelProxy::Context::)(IPC::Message const&), Tuple1<IPC::Message> >::Run() 0x6d33a2ba [chrome - base/message_loop.cc:410] MessageLoop::RunTask(Task) 0x6d33b1ad [chrome - base/message_loop.cc:419] MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) 0x6d33b5eb [chrome - base/message_loop.cc:526] MessageLoop::DoWork() 0x6d36e409 [chrome - base/message_pump_glib.cc:215] base::MessagePumpForUI::RunWithDispatcher(base::MessagePump::Delegate*, base::MessagePumpForUI::Dispatcher*) 0x6d33af8a [chrome - base/message_loop.cc:253] MessageLoop::RunInternal() 0x6d33afe8 [chrome - base/message_loop.cc:230] MessageLoopForUI::Run(base::MessagePumpForUI::Dispatcher*) 0x6ca5bcd1 [chrome - chrome/browser/browser_main.cc:468] BrowserMain(MainFunctionParams const&) 0x6ca54c22 [chrome - chrome/app/chrome_dll_main.cc:947] ChromeMain 0x6ca551e8 [chrome - chrome/app/chrome_exe_main_gtk.cc:49] main 0x0117da95 [libc-2.10.1.so + 0x00016a95]
0x6ca52860 [chrome + 0x00199860]
0x6ca5519f [chrome + 0x0019c19f]
0x009708af [ld-2.10.1.so + 0x0000e8af]

Comment #1

Posted on Oct 5, 2010 by Helpful Rhino

Took a look. This one looks hard.

http://src.chromium.org/viewvc/chrome/branches/531/src/chrome/browser/renderer_host/browser_render_process_host.cc?annotate=60177#l808

If I understand right, the crash occurred in BrowserRenderProcessHost::OnMessageReceived(), but Tuple class code is inlined, hence we cannot tell where in the function that the crash occurred. Anyway, I guess the crash was probably caused by some broken Task.

Looking at the crash report, the issue seems to have started from version 7.0.531.0 (i.e. no older reports found):

http://crash/search?query=product:%22Chrome_ChromeOS%22++crashed_thread_function_name:%22BrowserRenderProcessHost::OnMessageReceived(IPC::Message+const%26)%22+crashed_thread_function_name:%22IPC::ChannelProxy::Context::OnDispatchMessage(IPC::Message+const%26)%22

The issue seems to be Chrome OS specific (i.e. no reports from other platforms).

http://crash/search?query=+crashed_thread_function_name:%22BrowserRenderProcessHost::OnMessageReceived(IPC::Message+const%26)%22+crashed_thread_function_name:%22IPC::ChannelProxy::Context::OnDispatchMessage(IPC::Message+const%26)%22

Comment #2

Posted on Oct 5, 2010 by Helpful Rhino

Not much progress, but just for the record, here's some code reading memo:

  1. Check how base/tuple.h:547 looks like first:

template inline void DispatchToMethod(ObjT* obj, Method method, const Tuple1& arg) { (obj->*method)(arg.a); }

  1. OK, but Tuple doesn't seem to be used in BrowserRenderProcessHost::OnMessageReceived(), right?

I's not obvious that Tuple is used in this function. The trick is that the input parameter IPC::Message& msg is actually, a MessageWithTuple, defined in http://src.chromium.org/viewvc/chrome/branches/531/src/ipc/ipc_message_utils.h?revision=60177&view=markup

For instance, ViewHostMsg_UpdatedCacheStats is define like:

http://src.chromium.org/viewvc/chrome/branches/531/src/chrome/common/render_messages_internal.h?revision=60177&view=markup

IPC_MESSAGE_CONTROL1(ViewHostMsg_UpdatedCacheStats, WebKit::WebCache::UsageStats /* stats */)

The macro is defined like: http://src.chromium.org/viewvc/chrome/branches/531/src/ipc/ipc_message_macros.h?revision=60177&view=markup

define IPC_MESSAGE_CONTROL1(msg_class, type1) \

class msg_class : public IPC::MessageWithTuple< Tuple1 > { \ public: \ enum { ID = msg_class##__ID }; \ msg_class(const type1& arg1); \ ~msg_class(); \ static void Log(const Message* msg, std::string* l); \ };

  1. How is DispatchToMethod() called?

Macros in OnMessageReceived() are expanded like:

void BrowserRenderProcessHost::OnMessageReceived(const IPC::Message& msg) { mark_child_process_activity_time(); if (msg.routing_id() == MSG_ROUTING_CONTROL) {

bool msg_is_ok = true;
{ typedef BrowserRenderProcessHost _IpcMessageHandlerClass; const IPC::Message& ipc_message__ = msg; bool& msg_is_ok__ = msg_is_ok; switch (ipc_message__.type()) {
  case ViewHostMsg_UpdatedCacheStats::ID: msg_is_ok__ = ViewHostMsg_UpdatedCacheStats::Dispatch(&ipc_message__, this, &_IpcMessageHandlerClass::OnUpdatedCacheStats); break;

Dispatch is defined in ipc_message_util.h:

// Generic dispatcher. Should cover most cases.
template static bool Dispatch(const Message* msg, T* obj, Method func) { Param p; if (Read(msg, &p)) { DispatchToMethod(obj, func, p); return true; } return false; }

As shown, DispatchToMethod is called, which it the function defined in base/tuple.h.

Comment #3

Posted on Oct 5, 2010 by Helpful Rhino

This one SEGV'ed at 0xce419716 rather than 0x00000000.

http://crash/reportdetail?reportid=38a5d76d86436f48

Thread 0 CRASHED ( SIGSEGV @ 0xce419716 )

Comment #4

Posted on Oct 5, 2010 by Massive Bird

It looks like this could be a race condition on the BrowserRenderProcessHost.

There are three similar crashes: http://go/crash/reportdetail?reportid=3812b2a6891f5b4a http://crash/reportdetail?reportid=38a5d76d86436f48 http://crash/reportdetail?reportid=7f0a397081981847

All of these follow the same code path, but vary slightly. Two SIGSEGV on NULL, but one gets one function call further. The third SIGSEGV's at a random address. This implies a race condition to me.

RenderProcessHost is refcounted via Attach() and Release(), which do not appear to be called when a message is sent. Generally, Attach() and Release() seem to be called on non-UI threads. Since the IPC callback is occurring on the UI thread, it seems that RenderProcessHost, which is the "this" pointer, can be deleted while the message is being processed.

Comment #5

Posted on Oct 5, 2010 by Helpful Rhino

Maybe something like this was happening?

  1. the browser process tell the last render process to quit by sending a message
  2. however, the message is asynchronous hence the render process may keep sending messages to the browser process
  3. the browser process thinks that there are no render processes, so it's safe to delete the render process host (BrowserRenderProcessHost), which is not the case, and deletes it
  4. the browser process crashes while processing messages from the render process, as the render process host is gone.

If I understand code right, ref-counting of the render process host is done in render_widget_host.cc:

RenderWidgetHost::RenderWidgetHost(RenderProcessHost* process, int routing_id) ... process_->Attach(this, routing_id_);

RenderWidgetHost::~RenderWidgetHost() { ... process_->Release(routing_id_);

The object seems to tell the render process to quit, by sending ViewMsg_Close message:

void RenderWidgetHost::Shutdown() { if (process_->HasConnection()) { // Tell the renderer object to close.
process_->ReportExpectingClose(routing_id_); bool rv = Send(new ViewMsg_Close(routing_id_));

Comment #6

Posted on Oct 6, 2010 by Helpful Rhino

Similar crash occurred with 7.0.536.4

http://crash/reportdetail?reportid=b1e6a640a2871215

Comment #7

Posted on Oct 8, 2010 by Helpful Rhino

(No comment was entered for this change.)

Comment #8

Posted on Oct 8, 2010 by Happy Camel

Issue 7321 has been merged into this issue.

Comment #9

Posted on Oct 8, 2010 by Happy Camel

(No comment was entered for this change.)

Comment #10

Posted on Oct 18, 2010 by Happy Camel

(No comment was entered for this change.)

Comment #11

Posted on Oct 19, 2010 by Grumpy Wombat

Still there for 8.0.552.7: http://crash/reportdetail?reportid=315d52fe86304c60

Thread 21 CRASHED ( SIGSEGV @ 0x00000000 )

0x75a7a9df [chrome - ./base/tuple.h:537] RunnableMethod::Run() 0x75243eaa [chrome - base/message_loop.cc:410] MessageLoop::RunTask(Task*) 0x75244d9d [chrome - base/message_loop.cc:419] MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) 0x752451db [chrome - base/message_loop.cc:526] MessageLoop::DoWork() 0x75247367 [chrome - base/message_pump_default.cc:23] base::MessagePumpDefault::Run(base::MessagePump::Delegate*) 0x75244b4f [chrome - base/message_loop.cc:258] MessageLoop::RunInternal() 0x75244c59 [chrome - base/message_loop.cc:230] MessageLoop::RunAllPending() 0x75a7bf30 [chrome - jingle/notifier/listener/mediator_thread_impl.cc:78] notifier::MediatorThreadImpl::Logout() 0x75bab93d [chrome - chrome/browser/sync/notifier/server_notifier_thread.cc:54] sync_notifier::ServerNotifierThread::Logout() 0x75a82ed2 [chrome - jingle/notifier/listener/talk_mediator_impl.cc:54] notifier::TalkMediatorImpl::Logout() 0x75a8379f [chrome - jingle/notifier/listener/talk_mediator_impl.cc:28] notifier::TalkMediatorImpl::~TalkMediatorImpl() 0x75a83a5d [chrome - jingle/notifier/listener/talk_mediator_impl.cc:30] notifier::TalkMediatorImpl::~TalkMediatorImpl() 0x751fd214 [chrome - ./base/scoped_ptr.h:84] sync_api::SyncManager::SyncInternal::InitializeTalkMediator() 0x751fd52c [chrome - chrome/browser/sync/engine/syncapi.cc:1993] sync_api::SyncManager::SyncInternal::TalkMediatorLogin(std::string const&, std::string const&) 0x751fd655 [chrome - chrome/browser/sync/engine/syncapi.cc:1476] sync_api::SyncManager::SyncInternal::UpdateCredentials(sync_api::SyncCredentials const&) 0x74f238a2 [chrome - chrome/browser/sync/glue/sync_backend_host.cc:521] browser_sync::SyncBackendHost::Core::DoUpdateCredentials(sync_api::SyncCredentials const&) 0x74f21ef7 [chrome - ./base/tuple.h:547] RunnableMethod >::Run() 0x75243eaa [chrome - base/message_loop.cc:410] MessageLoop::RunTask(Task*) 0x75244d9d [chrome - base/message_loop.cc:419] MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) 0x752451db [chrome - base/message_loop.cc:526] MessageLoop::DoWork() 0x75247367 [chrome - base/message_pump_default.cc:23] base::MessagePumpDefault::Run(base::MessagePump::Delegate*) 0x75244b4f [chrome - base/message_loop.cc:258] MessageLoop::RunInternal() 0x75244cd9 [chrome - base/message_loop.cc:230] MessageLoop::Run() 0x752665e8 [chrome - ./base/thread.h:140] base::Thread::Run(MessageLoop*) 0x75266e9b [chrome - base/thread.cc:164] base::Thread::ThreadMain() 0x7524f8ed [chrome - base/platform_thread_posix.cc:35] ThreadFunc(void*) 0x73cab6de [libpthread-2.10.1.so + 0x000056de]
0x738e0aad [libc-2.10.1.so + 0x000ccaad]

Comment #12

Posted on Oct 21, 2010 by Happy Bird

I feel like I'm punting this, but the mediator is sync related, right?

Comment #13

Posted on Oct 25, 2010 by Happy Camel

(No comment was entered for this change.)

Comment #14

Posted on Oct 27, 2010 by Happy Ox

Just FYI. The stack trace in the comment 11 seems to be a different bug fixed in http://src.chromium.org/viewvc/chrome?view=rev&revision=62825 and merged to 552: http://src.chromium.org/viewvc/chrome?view=rev&revision=63119

For instance, RunAllPending() is now gone from MediatorThreadImpl::Logout().

http://codereview.chromium.org/3837002/diff/10/3005

0x75244c59 [chrome - base/message_loop.cc:230] MessageLoop::RunAllPending() 0x75a7bf30 [chrome - jingle/notifier/listener/mediator_thread_impl.cc:78] notifier::MediatorThreadImpl::Logout()

Comment #15

Posted on Nov 1, 2010 by Happy Camel

need to figure out if this still happens in the most recent builds

Comment #16

Posted on Nov 1, 2010 by Happy Camel

(No comment was entered for this change.)

Comment #17

Posted on Nov 1, 2010 by Happy Camel

(No comment was entered for this change.)

Comment #18

Posted on Nov 2, 2010 by Swift Wombat

Here is the stats from recent builds

8.0.552.26: 0 8.0.552.21: 2 8.0.552.7: 13 8.0.552.1: 8

Zel, I'm leaning towards lowering the priority of this bug and working on network update issue first for R9. Would you agree?

Comment #19

Posted on Nov 3, 2010 by Happy Camel

i agree

Comment #20

Posted on Nov 6, 2010 by Swift Wombat

Could be crbug.com/53991

Matt, what do you think?

Comment #21

Posted on Nov 8, 2010 by Grumpy Wombat

(No comment was entered for this change.)

Comment #22

Posted on Nov 8, 2010 by Happy Giraffe

@20, the first stack seems like a similar - but not quite the same - issue as 53991. In 53991, the Profile object was being deleted and later referenced by the BrowserRenderProcessHost. Here, based on the crash dump, some NULL pointer is being dereferenced. It could still be a shutdown-related problem, though.

Comment #23

Posted on Nov 9, 2010 by Swift Wombat

quick update on crash numbers.

8.0.552.101 1 8.0.552.47: 0 8.0.552.40: 8 8.0.552.26: 1

Comment #24

Posted on Nov 10, 2010 by Swift Wombat

I've been looking at code path and I just couldn't find any condition that can lead to NULL segfault.

It's failing at the tuple.h:547 which is

template inline void DispatchToMethod(ObjT* obj, Method method, const Tuple1& arg) { (obj->*method)(arg.a); }

In order for this line to fail with NULL segfault, either method or arg has to be NULL. (obj can be NULL as method is not virtual in this case)

Method are defined in macros in BrowserRenderProcessHost::OnMessageReceived, so they can't be NULL. arg is defined on stack in ipc_message_util.h:

// Generic dispatcher. Should cover most cases.
template static bool Dispatch(const Message* msg, T* obj, Method func) { Param p; // This is Tuple if (Read(msg, &p)) { DispatchToMethod(obj, func, p); return true; } return false; }

so arg should't be NULL either. Other possibility is that stack trace is wrong, or it could be memory corruption.

I'm going to add CHECK to see if I can identify which variable is causing this.

Comment #25

Posted on Nov 15, 2010 by Happy Camel

(No comment was entered for this change.)

Comment #26

Posted on Nov 19, 2010 by Swift Wombat

Since I added CHECKs to BrowserRenderProcessHost::OnMessageReceived, there is no crashes that has the same stack nor the one at CHECKs, although there are still crashes in OnMessageReceived. Satoru, can you check this too?

Comment #27

Posted on Nov 19, 2010 by Happy Camel

(No comment was entered for this change.)

Comment #28

Posted on Nov 29, 2010 by Happy Camel

(No comment was entered for this change.)

Comment #29

Posted on Nov 30, 2010 by Swift Wombat

There is one crash report in 552.324

http://crash/reportdetail?reportid=accf4c24431527ce#crashing_thread

It crashes on obj->*method(arg.a) rather than CHECKs, and this could be because i have "&& defined(CHECK), and linker may be picking wrong object that is compiled without CHECK. I'm going to simply include base/logging.h and see if we can better crash report.

Comment #30

Posted on Dec 3, 2010 by Happy Camel

I am not seeing this crash anymore in anything older than 8.0.552.105.

Do you want to close this one?

Comment #31

Posted on Dec 22, 2010 by Quick Giraffe

hi, if this hasn't been seen in a while, please revert r65729

Comment #32

Posted on Jan 10, 2011 by Quick Giraffe

ping?

Comment #33

Posted on Jan 15, 2011 by Quick Giraffe

Oshima: r65729 isn't still used, so can you please revert it? The description said "This is chromeos only and will be removed once I correct data.". We need to remove testing code once it's not needed anymore.

Comment #34

Posted on Jan 15, 2011 by Happy Ox

We haven't seen this and I'll revert it. Sorry for not responding sooner. Somehow email from tracker has been stop and haven't received any update.

Comment #35

Posted on Jan 26, 2011 by Grumpy Wombat

Bulk edit to add ForMerge-648 to 'Iteration:22 Mstone=R10 Team=UI' issues. Please update this label if the fix ends up to be somewhere else.

Comment #36

Posted on Jan 31, 2011 by Swift Wombat

I'm closing for now because we haven't seen this for a while. Please reopen if this happens again.

Comment #37

Posted on Sep 26, 2011 by Happy Ox

Re-opening for De-Hackathon. Comment + hack is in: chromium/src/base/tuple.h

Comment #38

Posted on Sep 26, 2011 by Swift Wombat

I reverted the change, but seems like there is one more change I had to revert. I'm on it.

Comment #39

Posted on Sep 26, 2011 by Grumpy Giraffe

(No comment was entered for this change.)

Comment #40

Posted on Sep 28, 2011 by Swift Wombat

(No comment was entered for this change.)

Comment #41

Posted on Sep 28, 2011 by Swift Wombat

Change was removed in 103063.

Comment #42

Posted on Mar 7, 2013 by Grumpy Hippo

(No comment was entered for this change.)

Comment #43

Posted on Mar 10, 2013 by Quick Rabbit

(No comment was entered for this change.)

Comment #44

Posted on Mar 13, 2013 by Happy Horse

Moved to: Issue chromium:189562

Status: Moved

Labels:
Type-Bug Pri-1 Area-DesktopUI Sev-1 Iteration-15 Iteration-16 Iteration-17 Iteration-18 ForMerge-648 HackRemoval OS-Chrome Cr-Stability M-16