Issue 81449: Chrome: Crash Report - Stack Signature: -19BBA34
Product: Chrome
Stack Signature: -19BBA34
New Signature Label: RunnableFunction<GpuProcessHostUIShim * (*)(int),Tuple1<int> >::Run()
New Signature Hash: b1ffce06_b49643f6_9d2a5c68_06020831_dedb35da
Report link: http://go/crash/reportdetail?reportid=4b8261fef862d40e

Meta information:
Product Name: Chrome
Product Version: 13.0.754.0
Report ID: 4b8261fef862d40e
Report Time: 2011/05/03 19:22:53, Tue
Uptime: 11 sec
Cumulative Uptime: 0 sec
OS Name: Windows NT
OS Version: 5.1.2600 Service Pack 3
CPU Architecture: x86
CPU Info: GenuineIntel family 15 model 4 stepping 9

Thread 9 *CRASHED* ( EXCEPTION_ACCESS_VIOLATION_READ @ 0x00000000 )
0x0254c3c2 [chrome.dll - task.h:459] RunnableFunction<GpuProcessHostUIShim * (*)(int),Tuple1<int> >::Run()
0x021ba1ad [chrome.dll - message_loop.cc:100] `anonymous namespace'::TaskClosureAdapter::Run()
0x021baa20 [chrome.dll - message_loop.cc:458] MessageLoop::RunTask(MessageLoop::PendingTask const &)
0x021baaa5 [chrome.dll - message_loop.cc:476] MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const &)
0x021bae46 [chrome.dll - message_loop.cc:666] MessageLoop::DoWork()
0x7c802607 [kernel32.dll + 0x00002607] WaitForSingleObjectEx
0x021bb88e [chrome.dll - bind_internal.h:1065] base::internal::InvokerStorage1<void ( `anonymous namespace'::TaskClosureAdapter::*)(void),A0x4166a960::TaskClosureAdapter *>::~InvokerStorage1<void ( `anonymous namespace'::TaskClosureAdapter::*)(void),A0x4166a960::TaskClosureAdapter *>()
0x021d1a00 [chrome.dll - utf_string_conversion_utils.cc:101] base::WriteUnicodeCharacter(unsigned int,std::basic_string<wchar_t,std::char_traits<wchar_t>,std::allocator<wchar_t> > *)
0x021d1b23 [chrome.dll - message_pump_default.cc:50] base::MessagePumpDefault::Run(base::MessagePump::Delegate *)
0x01c7492c [chrome.dll - generic_allocators.cc:16] generic_cpp_alloc
0x021ba8e4 [chrome.dll - message_loop.cc:406] MessageLoop::RunHandler()
0x021c652f [chrome.dll - thread.cc:128] base::Thread::Run(MessageLoop *)
0x021c6642 [chrome.dll - thread.cc:164] base::Thread::ThreadMain()
,
May 5, 2011
This crash actually appeared in 13.0.751.0. It has a different signature in each build, probably because the optimizer is merging template code that is the same.
751 -> RunnableFunction<void (*)(net::URLRequestContextGetter *),Tuple1<scoped_refptr<net::URLRequ ...
752 -> RunnableFunction<void (*)(void *),Tuple1<Profile *> >::Run()
753 -> RunnableFunction<void (*)(IOThread *),Tuple1<IOThread *> >::Run()
754 -> RunnableFunction<GpuProcessHostUIShim * (*)(int),Tuple1<int> >::Run()
755 -> RunnableFunction<void (*)(`anonymous namespace'::PluginsDOMHandler::ListWrapper *),Tuple1< ...
756 -> RunnableFunction<void (*)(MessageLoop *),Tuple1<MessageLoop *> >::Run()
In all cases it involves the invocation of a single-argument callback function that leads to a null dereference. It always happens in the browser process. The regression range was probably 83287:83481. I have no leads.
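To illustrate why the signature can drift from build to build, here is a minimal sketch. It assumes a simplified stand-in for the template in base/task.h: RunnableFunctionSketch, the empty-ish Profile and IOThread structs, OnProfile, OnIOThread and RunBoth are all hypothetical names made up for this example, not the real Chromium code. On 32-bit x86 both Run() bodies below would plausibly compile to the same bytes (load a pointer, push a pointer-sized argument, call), so a linker doing identical code folding (for MSVC, /OPT:ICF) may keep one copy and the crash report shows whichever symbol name survived.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for RunnableFunction in base/task.h.
template <typename Function, typename Arg>
class RunnableFunctionSketch {
 public:
  RunnableFunctionSketch(Function function, Arg arg)
      : function_(function), arg_(arg) {}

  // Calls the stored function pointer with the stored single argument.
  void Run() {
    if (function_)
      function_(arg_);
  }

 private:
  Function function_;
  Arg arg_;
};

// Stand-in types; only their pointers are passed around.
struct Profile { int id; };
struct IOThread { int id; };

int g_calls = 0;
void OnProfile(Profile* profile) { g_calls += profile->id; }
void OnIOThread(IOThread* thread) { g_calls += thread->id; }

// Two distinct instantiations whose Run() bodies differ only in type names.
// A folding linker is free to merge them; behavior stays the same either way.
int RunBoth() {
  g_calls = 0;
  Profile profile = {1};
  IOThread io_thread = {2};
  RunnableFunctionSketch<void (*)(Profile*), Profile*> a(&OnProfile, &profile);
  RunnableFunctionSketch<void (*)(IOThread*), IOThread*> b(&OnIOThread,
                                                           &io_thread);
  a.Run();
  b.Run();
  return g_calls;
}
```

If that is what is happening, the template arguments in the reported signature say nothing about which callback actually crashed; only the "single pointer-sized argument" shape is meaningful.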
,
May 9, 2011
Looking at the intersection of plugins and GPU, could it be http://src.chromium.org/viewvc/chrome?view=rev&revision=83442? ----- piman@google.com Rework FlushSync to return early if commands have been processed since the last update BUG=80480
Cc: piman@chromium.org eroman%chromium.org@gtempaccount.com
,
May 9, 2011
I'm guessing the current top browser crasher on the Canary, http://crash/reportdetail?reportid=01feccecee80e026#crashing_thread is a relative of this issue.
,
May 9, 2011
None of the changes in 83442 appear to affect code that runs in the browser process, so I think it is unlikely to be that. I have inspected every change to code that runs on Windows in the probable regression range 83287:83481 and nothing stands out. I particularly looked for code that runs only on Windows and code that invokes callbacks, especially through a variable that might be null. laforge, would it be possible to push a canary build with the compiler / linker setting that merges identical generated code disabled, so that we can get an accurate signature for the related callback? I expect the binary would be larger than usual.
,
May 11, 2011
I landed this at r84869: http://codereview.chromium.org/6982004 It will make PostTask assert in Release builds if null is passed to any of the variants of PostTask. It is not firing in 13.0.762.0, which is built from r84939. Since r84939 contains the new assert, this bug is not caused by null being passed to PostTask. In 13.0.762.0, the signature of the crashing function has changed to this: RunnableFunction<void (*)(int),Tuple1<int> >::Run() I still have no leads.
,
May 13, 2011
The following revision refers to this bug:
http://src.chromium.org/viewvc/chrome?view=rev&revision=85359
------------------------------------------------------------------------
r85359 | apatrick@chromium.org | Fri May 13 18:06:54 PDT 2011
Changed paths:
M http://src.chromium.org/viewvc/chrome/trunk/src/base/task.h?r1=85359&r2=85358&pathrev=85359
Added release build assert on attempt to create a RunnableFunction for a function pointer with address 1.
This is actually happening. See http://crbug.com/81449. The generated code to invoke the callback puts the address of the function in the EAX register before doing CALL EAX. I see 0x00000001 in the EAX register when it crashes in the reported minidumps.
I'll revert this after the next Canary.
TEST=run locally and verify no assertion
BUG=81449
Review URL: http://codereview.chromium.org/7013014
------------------------------------------------------------------------
,
May 16, 2011
The following revision refers to this bug:
http://src.chromium.org/viewvc/chrome?view=rev&revision=85547
------------------------------------------------------------------------
r85547 | apatrick@chromium.org | Mon May 16 15:21:28 PDT 2011
Changed paths:
M http://src.chromium.org/viewvc/chrome/trunk/src/base/task.h?r1=85547&r2=85546&pathrev=85547
Revert 85359 because it did not reveal the site where the crashing task was posted.
Original message:
Added release build assert on attempt to create a RunnableFunction for a function pointer with address 1.
This is actually happening. See http://crbug.com/81449. The generated code to invoke the callback puts the address of the function in the EAX register before doing CALL EAX. I see 0x00000001 in the EAX register when it crashes in the reported minidumps.
I'll revert this after the next Canary.
TEST=run locally and verify no assertion
BUG=81449
Review URL: http://codereview.chromium.org/7013014
TEST=compiles
BUG=81449
------------------------------------------------------------------------
,
May 19, 2011
I checked in a change with r85991 that will provide more information about the site the crashing task was posted from in minidumps. This will not fix the bug and the crashes will look the same but hopefully once the next canary goes out I will have more information about this bug.
,
May 20, 2011
The following revision refers to this bug:
http://src.chromium.org/viewvc/chrome?view=rev&revision=86172
------------------------------------------------------------------------
r86172 | apatrick@chromium.org | Fri May 20 16:29:23 PDT 2011
Changed paths:
M http://src.chromium.org/viewvc/chrome/trunk/src/base/debug/alias.cc?r1=86172&r2=86171&pathrev=86172
Try another way to alias a variable in optimized builds.
The previous way did not fool LTCG optimization.
I tested that this works by doing an LTCG build without this change and verifying that the compiler strips out the assignment to program_counter in MessageLoop::RunTask. Then I repeated with this change and verified that the compiler did not strip it out.
TEST=compiles plus the above
BUG=81449
Review URL: http://codereview.chromium.org/7054025
------------------------------------------------------------------------
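For readers unfamiliar with the aliasing trick in the commit above, this is a rough sketch of the idea, not the actual contents of base/debug/alias.cc. AliasSketch, RunTaskSketch and AddOne are hypothetical names. The assumption is that disabling optimization for one tiny function on MSVC, or using an empty asm statement with a memory clobber on GCC/Clang, is enough to make the optimizer treat the variable as observed, so its value is really written to the caller's stack frame and shows up in a minidump.

```cpp
#include <cassert>
#include <cstdint>

#if defined(_MSC_VER)
#pragma optimize("", off)  // Keep the optimizer away from this one function.
#endif
// Hypothetical stand-in for an Alias() helper: takes the address of a local
// so the compiler cannot prove the local is dead and drop its assignment.
void AliasSketch(const void* var) {
#if !defined(_MSC_VER)
  // GCC/Clang: empty asm that claims to read |var| and to touch memory.
  __asm__ __volatile__("" : : "r"(var) : "memory");
#else
  (void)var;
#endif
}
#if defined(_MSC_VER)
#pragma optimize("", on)
#endif

int AddOne(int x) { return x + 1; }

// Hypothetical stand-in for a RunTask-style function that wants a diagnostic
// value to survive on its stack frame until the task has run.
int RunTaskSketch(int (*task)(int), int arg) {
  // Diagnostic copy that is never read by the program itself.
  std::uintptr_t program_counter = reinterpret_cast<std::uintptr_t>(task);
  AliasSketch(&program_counter);
  return task(arg);
}
```

The behavior of the program does not change; the only goal is that the otherwise unused local is still materialized in an optimized build.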
,
May 23, 2011
I tracked down where the task that crashes is posted. It is here in child_process_launcher.cc:
void Terminate() {
  if (!process_.handle())
    return;
  // On Posix, EnsureProcessTerminated can lead to 2 seconds of sleep! So
  // don't do this on the UI/IO threads.
  BrowserThread::PostTask(
      BrowserThread::PROCESS_LAUNCHER, FROM_HERE,
      NewRunnableFunction(
          &ChildProcessLauncher::Context::TerminateInternal,
#if defined(OS_LINUX)
          zygote_,
#endif
          process_.handle()));
  process_.set_handle(base::kNullProcessHandle);
}
So the good news is this probably isn't causing any harm because it happens at process termination. The bad news is it is not clear why the task calls a "random" address when it crashes since the function is a constant. This code has been around in this form for a long time and there was no problem before.
It is possible the task was destroyed before it was run, perhaps the recent task / closure refactoring is involved.
Cc: ajw...@chromium.org
,
May 24, 2011
The following revision refers to this bug:
http://src.chromium.org/viewvc/chrome?view=rev&revision=86447
------------------------------------------------------------------------
r86447 | apatrick@chromium.org | Tue May 24 10:57:46 PDT 2011
Changed paths:
M http://src.chromium.org/viewvc/chrome/trunk/src/base/debug/alias.h?r1=86447&r2=86446&pathrev=86447
M http://src.chromium.org/viewvc/chrome/trunk/src/base/task.h?r1=86447&r2=86446&pathrev=86447
Store information about invoked RunnableFunction on stack to aid debugging of canary channel crashes.
TEST=compiles
BUG=81449
Review URL: http://codereview.chromium.org/7066006
------------------------------------------------------------------------
,
May 24, 2011
The following revision refers to this bug:
http://src.chromium.org/viewvc/chrome?view=rev&revision=86448
------------------------------------------------------------------------
r86448 | apatrick@chromium.org | Tue May 24 10:57:55 PDT 2011
Changed paths:
M http://src.chromium.org/viewvc/chrome/trunk/src/base/tracked.cc?r1=86448&r2=86447&pathrev=86448
Prevent MSVC from inlining GetProgramCounter for LTCG builds.
This should ensure that it gets the frame where FROM_HERE is used, rather than its caller.
TEST=compiles
BUG=81449
Review URL: http://codereview.chromium.org/7067004
------------------------------------------------------------------------
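A rough sketch of what the commit above is after, for anyone who wants to try it locally. This is not the code from base/tracked.cc; GetProgramCounterSketch, LocationSketch, FROM_HERE_SKETCH, PostSiteA and PostSiteB are hypothetical names. The assumption is that a function which is never inlined can report its own return address (via the MSVC intrinsic _ReturnAddress or the GCC/Clang builtin __builtin_return_address), and that this address lies inside whichever function expanded the macro.

```cpp
#include <cassert>

#if defined(_MSC_VER)
#include <intrin.h>
#define SKETCH_NOINLINE __declspec(noinline)
#else
#define SKETCH_NOINLINE __attribute__((noinline))
#endif

// Must not be inlined: if it were, the "return address" would belong to the
// caller's caller rather than to the function that used the macro.
SKETCH_NOINLINE const void* GetProgramCounterSketch() {
#if defined(_MSC_VER)
  return _ReturnAddress();
#else
  return __builtin_extract_return_addr(__builtin_return_address(0));
#endif
}

// Hypothetical, cut-down location record.
struct LocationSketch {
  const char* function;
  const void* program_counter;
};

#define FROM_HERE_SKETCH LocationSketch{__func__, GetProgramCounterSketch()}

// Two separate "posting sites"; each should record a different address.
SKETCH_NOINLINE LocationSketch PostSiteA() { return FROM_HERE_SKETCH; }
SKETCH_NOINLINE LocationSketch PostSiteB() { return FROM_HERE_SKETCH; }
```

With the recorded address and the symbols for the build, a debugger can map the value back to the function that posted the task.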
,
May 27, 2011
The following revision refers to this bug:
http://src.chromium.org/viewvc/chrome?view=rev&revision=87052
------------------------------------------------------------------------
r87052 | apatrick@chromium.org | Fri May 27 11:30:07 PDT 2011
Changed paths:
M http://src.chromium.org/viewvc/chrome/trunk/src/content/browser/child_process_launcher.cc?r1=87052&r2=87051&pathrev=87052
Turn off optimization for ChildProcessLauncher::Context::TerminateInternal.
This is to try and get more information about a crash.
BUG=81449
Review URL: http://codereview.chromium.org/6976042
------------------------------------------------------------------------
,
May 31, 2011
I'm fairly certain the crash is in ChildProcessLauncher::Context::TerminateInternal now. This function does not do all that much. I believe the crash results from TerminateInternal making a system call, I think most likely TerminateProcess, which has been hooked by a third party DLL called "utcclb.dll":
0x00af7957 [utcclb.dll + 0x00017957]
0x00b8b564 [utcclb.dll + 0x000ab564]
0x00b8b889 [utcclb.dll + 0x000ab889]
0x5f040009
0x026a2aab [chrome.dll - child_process_launcher.cc:256] ChildProcessLauncher::Context::TerminateInternal(void *)
0x02229d4e [chrome.dll - task.h:458] RunnableFunction<void (*)(int),Tuple1<int> >::Run()
0x0213bb66 [chrome.dll - message_loop.cc:367] MessageLoop::RunTask(Task *)
0x0213bbed [chrome.dll - message_loop.cc:376] MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const &)
0x0213bf9a [chrome.dll - message_loop.cc:569] MessageLoop::DoWork()
0x02151bc4 [chrome.dll - message_pump_default.cc:50] base::MessagePumpDefault::Run(base::MessagePump::Delegate *)
0x0213bae7 [chrome.dll - message_loop.cc:342] MessageLoop::RunInternal()
0x0213ba6c [chrome.dll - message_loop.cc:315] MessageLoop::RunHandler()
0x0213b960 [chrome.dll - message_loop.cc:239] MessageLoop::Run()
0x021505e8 [chrome.dll - thread.cc:128] base::Thread::Run(MessageLoop *)
0x021506fb [chrome.dll - thread.cc:164] base::Thread::ThreadMain()
0x02142aed [chrome.dll - platform_thread_win.cc:37] base::`anonymous namespace'::ThreadFunc(void *)
0x7c80b50a [kernel32.dll + 0x0000b50a] BaseThreadStart
In this example the DLL is still loaded. I think in other cases the DLL might be unloaded before the crash, which would prevent it from showing up on the call stack or the modules list.
The third party DLL is part of "Internet Explorer Security Pro". http://www.mybestsoft.com/products.html
There are indications that the DLL might be associated with a key logger:
http://www.emsisoft.de/en/malware/Adware.Win32.Parental_Control_Tool_7.2-remove.aspx
http://pchomesoft.com/
Cc: fin...@chromium.org
,
Jun 27, 2011
Issue 87619 has been merged into this issue.
,
Jul 28, 2011
Any updates? This crash is happening in 14.0.835.8 and it's one of the top crashes. http://crash/reportdetail?reportid=050fbe14293e24dd
,
Jul 28, 2011
I have been unable to determine the cause of the crash. What progress I have made is noted above.
,
Aug 5, 2011
The signature changed again:
0x62baf0ca [chrome.dll - task.h:474] RunnableFunction<void (*)(MessageLoop *),Tuple1<MessageLoop *> >::Run()
0x6303b308 [chrome.dll - child_process_launcher.cc:257] ChildProcessLauncher::Context::SetProcessBackgrounded(bool)
0x62b6c1a7 [chrome.dll - task.cc:57] base::subtle::TaskClosureAdapter::Run()
0x62b630d1 [chrome.dll - message_loop.cc:486] MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const &)
0x6303b282 [chrome.dll - child_process_launcher.cc:250] ChildProcessLauncher::Context::Terminate()
0x62b6344c [chrome.dll - message_loop.cc:677] MessageLoop::DoWork()
0x629e6635 [chrome.dll - bind.h:57] base::Bind<void ( remoting::ChromotingInstance::*)(std::basic_string<char,std::char_traits<char>,std::allocator<char> > const &),base::internal::UnretainedWrapper<remoting::ChromotingInstance>,std::basic_string<char,std::char_traits<char>,std::allocator<char> > >(void ( remoting::ChromotingInstance::*)(std::basic_string<char,std::char_traits<char>,std::allocator<char> > const &),base::internal::UnretainedWrapper<remoting::ChromotingInstance> const &,std::basic_string<char,std::char_traits<char>,std::allocator<char> > const &)
0x6303b282 [chrome.dll - child_process_launcher.cc:250] ChildProcessLauncher::Context::Terminate()
0x62b7b3bd [chrome.dll - message_pump_default.cc:42] base::MessagePumpDefault::Run(base::MessagePump::Delegate *)
0x62b7b411 [chrome.dll - message_pump_default.cc:50] base::MessagePumpDefault::Run(base::MessagePump::Delegate *)
0x6299bab1 [chrome.dll - allocator_shim.cc:124] malloc
0x62b62f3c [chrome.dll - message_loop.cc:410] MessageLoop::RunHandler()
0x62b73485 [chrome.dll - thread.cc:128] base::Thread::Run(MessageLoop *)
0x62b73598 [chrome.dll - thread.cc:164] base::Thread::ThreadMain()
This part is a red herring:
0x629e6635 [chrome.dll - bind.h:57] base::Bind<void ( remoting::ChromotingInstance::*)(std::basic_string<char,std::char_traits<char>,std::allocator<char> > const &),base::internal::UnretainedWrapper<remoting::ChromotingInstance>,std::basic_string<char,std::char_traits<char>,std::allocator<char> > >(void ( remoting::ChromotingInstance::*)(std::basic_string<char,std::char_traits<char>,std::allocator<char> > const &),base::internal::UnretainedWrapper<remoting::ChromotingInstance> const &,std::basic_string<char,std::char_traits<char>,std::allocator<char> > const &)
It only runs in the plugin process whereas the crash only happens in the browser process. The call stack is not to be believed :(
,
Aug 9, 2011
Digging into this some more, just before crashing, it seems to jump to some code residing on the stack. This might just mean the chain of frame pointers is corrupt though. Call stack:
00000000()
034ef79c()
> chrome.dll!RunnableFunction<void (__cdecl*)(v8::Persistent<v8::Context>),Tuple1<v8::Persistent<v8::Context> > >::Run() Line 474 + 0x5 bytes C++
chrome.dll!base::subtle::TaskClosureAdapter::Run() Line 58 C++
chrome.dll!MessageLoop::RunTask(const MessageLoop::PendingTask & pending_task={...}) Line 472 C++
chrome.dll!MessageLoop::DeferOrRunPendingTask(const MessageLoop::PendingTask & pending_task={...}) Line 489 C++
chrome.dll!MessageLoop::DoWork() Line 677 + 0xb bytes C++
The second code address from the top of the call stack is on the stack, as can be seen from the values of the ESP and EBP registers at the time of the crash.
ESP = 034EF78C
EBP = 034EF79C
,
Aug 10, 2011
A little bit of statistical data for the "SetProcessBackgrounded" crashes in 13.0.782.112. Probably the most interesting anomaly is the correlation to "import". Comment #15 is interesting, I will look at some sample minidumps next and see if I can confirm that theory.
(a) This is the single biggest browser crash signature, accounting for 9.27% of crashes (7.25% if you count by user).
(b) The crash happens *very* quickly. In other words, this isn’t some passive cruft/memory corruption that happens over time. That is good news for debugging ;)
83.69% of the crashes happen in under 30 seconds
52.69% of the crashes happen in under 8 seconds
20.80% of the crashes happen in under 4 seconds
10.83% of the crashes happen in under 2 seconds
4.76% of the crashes happen in under 1 second
1.67% of the crashes happen in under 500 milliseconds
(c) There appears to be a bias towards this happening during import. I can see this in the distribution of command line flags. Specifically, 4.79% of these crashes have the flag --import=*, which indicates it happened during an import process (rather than a regular browser session). That may not sound significant, but note that across all the other browser crashes, --import=* is only seen in 0.47% of the crashes. Stated differently, 49% of all the crashes during import are in "SetProcessBackgrounded". This doesn't prove that it is caused by import, but it does suggest that it is more easily hit during the import codepath.
(d) 8% of these crashes occur during shutdown.
(e) This is mainly happening on Windows XP. Note however, that this isn't very far off from the ordinary distribution of crashes by platform, so I wouldn't read too much into it.
62.27% on WinXP
30.05% on Win7
7.68% on WinVista
(f) It is not extension related (I don’t see any meaningful clustering of chrome extensions; in fact the majority have no extensions).
(g) For some users, this is a highly reproducible, chronic crash. For instance, looking at top crasher 5A51363FC5294D3ABC1668646E145671, they hit this crash at the following times today.
2011/08/10 19:31:58, Wed
2011/08/10 19:31:41, Wed
2011/08/10 19:22:32, Wed
2011/08/10 19:22:10, Wed
2011/08/10 18:39:49, Wed
2011/08/10 18:39:25, Wed
2011/08/10 18:38:53, Wed
2011/08/10 18:31:25, Wed
2011/08/10 18:31:23, Wed
2011/08/10 18:23:32, Wed
2011/08/10 18:23:26, Wed
2011/08/10 18:02:52, Wed
2011/08/10 17:54:26, Wed
2011/08/10 17:53:52, Wed
2011/08/10 17:53:22, Wed
2011/08/10 17:39:38, Wed
2011/08/10 17:38:56, Wed
2011/08/10 17:38:50, Wed
2011/08/10 17:30:58, Wed
2011/08/10 17:30:42, Wed
2011/08/10 17:23:50, Wed
2011/08/10 17:23:22, Wed
2011/08/10 17:02:28, Wed
2011/08/10 16:54:30, Wed
2011/08/10 16:53:50, Wed
2011/08/10 16:53:38, Wed
2011/08/10 16:38:53, Wed
2011/08/10 16:38:14, Wed
2011/08/10 16:38:00, Wed
2011/08/10 15:53:44, Wed
2011/08/10 15:53:42, Wed
2011/08/10 15:53:29, Wed
2011/08/10 14:41:48, Wed
The timings above are definitely fishy (suggesting this is the most patient user ever, restarting the browser seconds after crashing). In fact many of those timings are impossible, since the process uptime suggests more time elapsed than what the start time of the next crash implies. It is important to realize that these timestamps are actually the timestamp when the crash server *processed* the report, and not necessarily when the crash was generated by the client. I opened a couple of the minidumps and explored the PID. On Windows, process IDs are monotonically increasing per session, so this gives a better sense of the relative timings. In fact the order above does not match the PIDs, so clearly this listing is out of order. Whatever the case, it is clear that some users are hitting this crash very frequently, hence it is highly reproducible for them. We could probably get more help by prompting users that hit the crash (sorta like I did for the cookiemonster memory corruption in the past).
,
Aug 10, 2011
I think I determined what causes the crash. TerminateInternal calls Process::Terminate. In turn, Process::Terminate calls the Win32 function TerminateProcess, which is stdcall. When TerminateProcess returns, the address in the ESP register is 4 bytes too low and points to the frame pointer for the caller's frame. The RET instruction in Process::Terminate then reads the frame pointer off the stack instead of the address it should return to. It then attempts to jump to the caller's frame, which explains why there is a stack address on the call stack when it crashes. A possibility is that the entry for TerminateProcess in the dispatch table has been hooked to point to a function in a third party DLL and the implementation of the replacement is buggy. It might, for example, not be stdcall or it might not take two 32-bit arguments. To determine if this is the case I will check in some code to read the contents of that entry in the dispatch table and record it on the stack prior to the crash. If it does not point to an address in kernel32.dll, that would indicate it has been hooked.
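The "does it point into kernel32.dll" test described above boils down to a range check, sketched here. ModuleRange and PointsInto are hypothetical names, and the sketch deliberately leaves out how the range is obtained; on Windows the base address and image size could come from the MODULEINFO filled in by GetModuleInformation, but that part is an assumption and is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical description of where a module is mapped in the process.
struct ModuleRange {
  std::uintptr_t base;  // Load address of the module.
  std::size_t size;     // Size of the mapped image in bytes.
};

// Returns true if |address| lies inside the module's mapped image.
// Written as a subtraction so it cannot overflow near the top of memory.
bool PointsInto(const ModuleRange& module, std::uintptr_t address) {
  return address >= module.base && address - module.base < module.size;
}
```

As a usage example with values taken from the reports in this bug: the first report shows kernel32.dll + 0x00002607 at 0x7c802607, which implies a base of 0x7c800000 on that machine, while 0x00af7957 was attributed to utcclb.dll. With an assumed (made-up) image size of 0x100000, PointsInto accepts the first address and rejects the second.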
,
Aug 10, 2011
@apatrick: thanks! I definitely agree with your analysis that the problem is happening somewhere in TerminateInternal, probably due to hooking of TerminateProcess. Capturing that information would be great. To add to the great analysis you have already done, I noticed that we usually have the code for one of the mysterious frames on the callstack. It isn't mapped to any of the loaded DLLs, nor any in the list of unloaded DLLs. However it is valid code, albeit hokey looking. It generally looks the same, something like this:
039bfd6c a0fd9b038a mov al,byte ptr ds:[8A039BFDh]
039bfd71 4f dec edi
039bfd72 c401 les eax,fword ptr [ecx]
039bfd74 90 nop
039bfd75 fd std
039bfd76 9b wait
039bfd77 030e add ecx,dword ptr [esi]
039bfd79 5b pop ebx
039bfd7a c401 les eax,fword ptr [ecx]
039bfd7c f0236c0568 lock and ebp,dword ptr [ebp+eax+68h]
039bfd81 fe ???
039bfd82 9b wait
039bfd83 0370fe add esi,dword ptr [eax-2]
Lastly I sampled a number of minidumps, and found that they almost always contain a keyboard DLL in the recently unloaded modules list. For instance, these are the top names I saw:
KBDLA.DLL
KBDFR.DLL
KBDSP.DLL
KBDUS.DLL
I don't know enough about Windows to know if this is abnormal, but it sounds a bit fishy and I can't explain it. Cheers.
,
Aug 11, 2011
This is now our top browser crash on 14.
Labels: ReleaseBlock-Stable Mstone-14
,
Aug 11, 2011
BTW, apatrick's latest instrumentation code is now live on the Windows canary. So far there has been just 1 crash report: http://crash/reportdetail?reportid=6e591eaf344c0102
According to that report, the address for TerminateProcess (i.e. [chrome.dll!_imp__TerminateProcess]) is in fact pointing to kernel32!TerminateProcess... so no smoking gun yet in terms of hooking. A couple of ideas:
- It could be that the code for kernel32!TerminateProcess is being patched directly. We could try copying a couple of bytes of it into our minidump in case we can spot some rewriting.
- TerminateProcess is really just a wrapper around NtTerminateProcess. We can similarly instrument NtTerminateProcess in case that is the one getting hooked.
- We are getting the address for TerminateProcess as part of the thread start; perhaps the hooking is simply happening later on. However I looked at all the threads in the dump, and they all had the same value, so this is less likely.
,
Aug 11, 2011
This is the series of jumps and calls between Process::Terminate and entry into the kernel.
chrome.dll!base::Process::Terminate:
...
call dword ptr [__imp__TerminateProcess@8 (6046E18Ch)]
...
kernel32.dll!_TerminateProcessStub@8:
75419DE1 mov edi,edi
75419DE3 push ebp
75419DE4 mov ebp,esp
75419DE6 pop ebp
75419DE7 jmp _TerminateProcess@8 (754010A2h)
kernel32.dll!_TerminateProcess@8:
754010A2 jmp dword ptr [__imp__TerminateProcess@8 (75400874h)]
KernelBase.dll!_TerminateProcess@8:
7638E804 mov edi,edi
7638E806 push ebp
7638E807 mov ebp,esp
7638E809 cmp dword ptr [ebp+8],0
7638E80D jne _TerminateProcess@8+15h (7638E819h)
7638E80F push 6
7638E811 call dword ptr [__imp__RtlSetLastWin32Error@4 (76381044h)]
7638E817 jmp _TerminateProcess@8+3Bh (7638E83Fh)
7638E819 push dword ptr [ebp+0Ch]
7638E81C push dword ptr [ebp+8]
7638E81F call _RtlReportSilentProcessExit@8 (763B68A4h)
7638E824 push dword ptr [ebp+0Ch]
7638E827 push dword ptr [ebp+8]
7638E82A call dword ptr [__imp__NtTerminateProcess@8 (763811DCh)]
7638E830 test eax,eax
7638E832 jl _TerminateProcess@8+35h (7638E839h)
7638E834 xor eax,eax
7638E836 inc eax
7638E837 jmp _TerminateProcess@8+3Dh (7638E841h)
7638E839 push eax
7638E83A call _BaseSetLastNTError@4 (763B6CE2h)
7638E83F xor eax,eax
7638E841 pop ebp
7638E842 ret 8
ntdll.dll!_ZwTerminateProcess@8:
7700FC40 mov eax,29h
7700FC45 xor ecx,ecx
7700FC47 lea edx,[esp+4]
7700FC4B call dword ptr fs:[0C0h]
7700FC52 add esp,4
7700FC55 ret 8
I think TerminateProcess could be hooked in a number of places.
1) The chrome.dll!__imp__TerminateProcess@8 import entry could be hooked. This is relatively easy to check.
2) The code of kernel32.dll!_TerminateProcessStub@8 could be modified. This is also relatively easy to check by copying the six bytes referenced by __imp__TerminateProcess on to the stack or by looking up the entry with GetProcAddress.
3) The code of kernel32.dll!_TerminateProcess@8 could be modified. This could be checked by looking up the entry with GetProcAddress.
4) The kernel32.dll!__imp__TerminateProcess@8 import entry could be hooked. Checking this would involve locating the kernel32.dll import table, which is more complicated.
5) The code of KernelBase.dll!_TerminateProcess@8 could be modified. Again, the code could be copied to the stack with GetProcAddress.
6) The KernelBase.dll!__imp__NtTerminateProcess@8 import entry could be hooked. Same problem as with 4). I think this is less likely though, as the function that appears to be hooked in does not clean up the stack correctly and if that is the case, KernelBase.dll!_TerminateProcess@8 would fail to return.
7) The code of ntdll.dll!_ZwTerminateProcess@8 could be modified. I think this is unlikely for the same reason as 6).
If I were trying to intercept calls to TerminateProcess, I would tend to do 4) or 6) because it makes no assumptions about code layout in different versions of DLLs and because it saves hooking the import table of every loaded DLL; only the import tables of kernel32.dll or KernelBase.dll respectively would need to be hooked. I think it is not 6) because of the symptoms of the crash. 4) seems the more likely candidate to me.
As eroman noted, checking these things in base::ThreadFunc might be too early; the hooking might take place later. The latest time we could collect this information and have it visible on the stack at the time of the crash would be TaskClosureAdapter::Run, I think. However, the above is a lot of work to do in a function that is called relatively frequently. I have tried to do some of these things and I have been unable to prevent the optimizer from stripping out the diagnostic information.
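For the "copy a few bytes of the target code and look at them" checks, here is a sketch of how the captured bytes might be classified afterwards. Everything here is hypothetical (PrologueKind, ClassifyPrologue); it only encodes what the disassembly above shows: the usual 32-bit entry sequence mov edi,edi / push ebp / mov ebp,esp is 8B FF 55 8B EC, a relative jmp starts with E9, and an indirect jmp through memory starts with FF 25. A jump at the entry point is not proof of hooking on its own, since kernel32.dll!_TerminateProcess@8 is itself a legitimate indirect jmp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical classification of the first bytes of an x86 function.
enum class PrologueKind {
  kStandardFrame,  // mov edi,edi / push ebp / mov ebp,esp
  kRelativeJump,   // jmp rel32: common for inline patches, worth a look
  kIndirectJump,   // jmp dword ptr [addr]: also how forwarders look
  kOther
};

// Classifies up to |size| bytes copied from a function's entry point.
PrologueKind ClassifyPrologue(const std::uint8_t* code, std::size_t size) {
  static const std::uint8_t kStandard[] = {0x8B, 0xFF, 0x55, 0x8B, 0xEC};
  if (size >= sizeof(kStandard) &&
      std::memcmp(code, kStandard, sizeof(kStandard)) == 0) {
    return PrologueKind::kStandardFrame;
  }
  if (size >= 1 && code[0] == 0xE9)
    return PrologueKind::kRelativeJump;
  if (size >= 2 && code[0] == 0xFF && code[1] == 0x25)
    return PrologueKind::kIndirectJump;
  return PrologueKind::kOther;
}
```

The classification would be done offline on bytes recovered from a minidump, so the expensive part stays out of the frequently called function.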
,
Aug 11, 2011
I suspect we may never get much more useful information from these crashes (as to point the user to some solution, other than to scan the computer for malware). However, it looks like the dump from comment 28 may actually be happening right after returning from CloseHandle, and not TerminateProcess. Of course it is still possible that a hook on TerminateProcess could corrupt the stack enough, but the pattern of this crash is easier to explain if the corruption happens inside CloseHandle, because it looks like the first two stack positions are fine, but there is a null at the third one, and we write that zero at the end of Process::Close():
03d7fd4c 03d7fd64
03d7fd50 01ea6f3a chrome_1c30000!RunnableFunction<void (__cdecl*)
03d7fd54 00000000
So I'd say that we crash attempting to return from Process::Close. In any case, if we want to gather more data, the place to do that would be TerminateInternal... we could check the IAT for CloseHandle and TerminateProcess, and maybe grab a few bytes from the preamble of the target code... but most likely it will just point to a random address not part of any loaded DLL :(.
,
Aug 11, 2011
I'll see what effect this has, if any. If it has no effect then I think it rules out hooking of either TerminateProcess or CloseHandle. http://codereview.chromium.org/7640008/
,
Aug 15, 2011
I landed the patch in #31.
,
Aug 15, 2011
(No comment was entered for this change.)
Cc: kbr@chromium.org
,
Aug 16, 2011
(No comment was entered for this change.)
Labels: Stability-CodeYellow
,
Aug 17, 2011
Looking at the crashes reported for 15.0.854.0 and 15.0.854.1000, which contain the patch mentioned in #31, there have been 315 and 416 browser process crashes respectively at the time of writing. I don't see any instances of this crash. If this is indeed fixed, I still don't know whether it is CloseHandle or TerminateProcess that is being hooked but, given the severity of the crash, I think the patch could still potentially be merged into other branches before narrowing it down further.
,
Aug 17, 2011
Merge requested for this: http://codereview.chromium.org/7640008/
Labels: Merge-Requested
,
Aug 17, 2011
(No comment was entered for this change.)
Labels: -Merge-Requested Merge-Approved
,
Aug 17, 2011
(No comment was entered for this change.)
Status: Fixed
,
Aug 22, 2011
I don't see this merged yet, moving back to started.
Status: Started
,
Aug 22, 2011
Never mind, I see it now.
Status: Fixed
Labels: -Merge-Approved Merge-Merged
,
Aug 23, 2011
Follow up. Only a potential TerminateProcess intercept is bypassed at this point (r97407). The patch appears to be holding with CloseHandle called in the regular way. I still don't know what is hooking TerminateProcess or why.
,
Aug 23, 2011
Issue 73215 has been merged into this issue.
Cc: ana...@chromium.org tommi@chromium.org darin@chromium.org wtc@chromium.org j...@chromium.org willchan@chromium.org
Cc: jam@chromium.org amarinichev%chromium.org@gtempaccount.com vangelis...@gtempaccount.com