Issue 35374: Dead tabs after restoring a multi-tab Chrome session
Chrome Version: 5.0.307.5 dev
URLs (if applicable):
OS version: 10.6.2
Behavior in Safari 3.x/4.x (if applicable): Tabs reopen without problems
Behavior in Firefox 3.x (if applicable): Tabs reopen without problems
Behavior in Chrome for Windows: *Untested*

What steps will reproduce the problem?
1. Enable "On startup: Reopen the pages that were open last" under preferences.
2. Open a window with multiple tabs, say 8 to 10 or so. Then quit Chrome.
3. Reopen Chrome.

What is the expected result?
Expect all tabs to load / reopen properly.

What happens instead?
Very often, one (or more?) of the tabs reopened will not load properly. It will instead remain blank, and the "spinner" will continuously spin slowly in the counterclockwise direction. As far as I can tell, which tab goes bad in this way seems to be random. If you attempt to close this dead tab, the entire browser crashes!
Feb 15, 2010
#1
kr...@chromium.org
Cc:
kr...@chromium.org
Feb 15, 2010
Hmm not that I'm aware of, and not intentionally. Is there some way I can check to be sure? I'm currently on 5.0.322.2 dev and still have the problem.
Feb 16, 2010
Ok, can you please send us the crash reports from the sad tab:
1. Launch Chrome > Preferences > Under the Hood > Enable "Help make Google Chrome better by automatically sending..."
2. Quit Chrome
3. Reproduce the sad tab crashes
4. Open ~/Library/Breakpad/Chrome_Mac/
5. Attach the newest files to this bug with the version number of Chrome you are using.
Feb 16, 2010
I just now recreated this issue (using 5.0.322.2 dev), and no new files were created in the location you specified above. Also, to clarify, the "dead tab" does *not* show any sad face. It is simply blank white with the loading spinner moving slowly and continuously in the counterclockwise direction. Closing said tab causes what I had interpreted as a crash...but since there is no error report, maybe it is simply quitting the browser?
Feb 16, 2010
It sounds like you are not crashing, but DNS lookups are taking a very long time. When you launch Chrome, how many tabs are dead, and if you look in Activity Monitor, how many Google Chrome Helper processes do you have? Have you rebooted today?
Feb 16, 2010
It seems to be just one tab usually. I just now rebooted, reopened my previous Chrome session with six tabs, and one of them is dead. There are six Google Chrome Helper processes running. If this is a DNS lookup issue I would expect it to time out eventually (e.g. stop the backwards-spinning progress indicator, print an error message to the window), or at least not quit the entire browser when I attempt to close that tab. Neither of those is the case, however. Also if it were connectivity-related, I would expect to be able to refresh the tab and attempt a reload, but this is not possible. In fact, I just now tried to click the Home button on this dead tab and that crashed (or quit) Chrome entirely in the same way closing the tab would have. (Luckily my input into this text field was repopulated when I reopened. +1!)
Feb 18, 2010
If you did get a crash when you clicked the Home button can I get that crash report? If you can reproduce with 1 tab can you get a sample of that process and attach it?
Feb 18, 2010
No, a crash report was not generated when the browser disappeared after clicking Home on the dead tab. However, I reproduced a session with a dead tab, then closed all other non-dead tabs in order to isolate which Google Chrome Helper process was responsible for the dead one. When I attempted to take a sample of this process, I got an error saying the process could not be sampled because it did not exist, after which the process disappeared from Activity Monitor.
Feb 18, 2010
Here is some console output given on startup of a Chrome session that had a dead tab. Based on a couple of quick tests, I only see this output generated on restored sessions that contain a dead tab. I'm not sure if it's helpful though.
2/18/10 13:10:24 [0x0-0xc10c1].com.google.Chrome[806] [806:21511:25873378289101:ERROR:/b/slave/chrome-official-mac/build/src/base/process_util_posix.cc(341)] parent WaitForMessage() failed: 0x10004003 (ipc/rcv) timed out
Mar 4, 2010
I spoke too soon... I'm still seeing this after today's update to 5.0.342.1 dev. Is there any other information you need or any suggestions on things I could try?
Mar 7, 2010
I have the same issue with sometimes getting a dead tab on startup that brings down the whole browser if I try to close it or terminate its process in the task manager. It also happens occasionally with extension processes for me. If I notice that a particular extension hasn't loaded and I try to kill its process with the task manager, the browser crashes. The process shows in the task manager as active but using 0 bytes of memory. I've experienced this with numerous dev builds, including the current 5.0.342.1 (on Mac OS X 10.6.2).
Mar 14, 2010
I deleted my profile data and got a fresh start on Chrome but am still able to reproduce this problem on 5.0.342.3 dev.
Mar 17, 2010
I tried a complete uninstall and reinstall of Chrome, but the issue still occurs. Since it doesn't appear to be a very common problem, I'm starting to think that maybe an external factor is the cause. I had almost forgotten that I have SIMBL installed for A-Dock, and I guess it's possible that it has a conflict with Chrome. I'm going to try uninstalling SIMBL to see if it has any effect.
Mar 17, 2010
Fwiw, I don't have SIMBL installed nor do I have any kexts or other low-level OS mods going on. And yes I can still reproduce this on 5.0.342.5 dev as of just now.
Mar 18, 2010
Not sure what is going on here. If there are no crash dumps then the tabs aren't crashing. Maybe they are failing to initialize, or for some reason they are losing their connection to the browser process.
Status:
Untriaged
Mar 18, 2010
Yeah, I quickly figured out that SIMBL has nothing to do with it. I'm going to try a process of elimination over the next few days and try to rule out anything I have installed that could conceivably interfere with Chrome.
Mar 18, 2010
PS, I really think the Console.app output I posted above (comment 9) is worth looking at...maybe the devs already have. I reproduced that error again several times just now -- on launches where no tabs were dead, the error is not output. On launches with one tab dead, it is output once; two dead tabs I see it twice, etc. Only notable diff is that the line number has changed to 344.
Mar 19, 2010
Can we find out what tabs are being restored, and if blowing away prefs helps?
Status:
Assigned
Owner: kr...@chromium.org
Mar 27, 2010
What do you mean by "what tabs are being restored" ? And blowing away prefs doesn't seem to have helped (comments 13, 14). Thanks for looking into this one though!
Mar 27, 2010
If I'm interpreting the question correctly, he's asking whether the specific sites in the tabs that are being restored have any effect on whether the tab process starts up dead. My answer to that question would be no. It seems to be random which tab it affects. It also happens with extension processes, and not always the same one. In addition to a complete uninstall and reinstall, I've tried disabling all extensions to no effect. I've also tried shutting off every external application that affects other apps in some way (e.g., Default Folder, FinderPop, Zooom, etc.) to no effect. I have two apps that install third-party kernel extensions, SteerMouse and Virus Barrier X6. Deactivating these apps has no effect. I don't know if there's any point in unloading the kernel extensions to see if that does anything, but if dfoosh1 has neither of these apps, then it's probably safe to say they're not responsible.
Mar 28, 2010
I would agree that which tab process starts up dead is random. I just quit/restarted a six-tab session four times; three of those four times I had a single dead tab, and each time it was a different one.
Mar 28, 2010
Are the dead tabs still crashing the browser when you try to close them? I've definitely seen renderers crash en masse when I restart a debug build of Chromium. And I've seen the forever-backwards-spinner in a few tabs here and there, but closing those tabs has never caused a crash.
Mar 28, 2010
It's not a crash per se -- at least, there is no .dmp file generated in ~/Library/Breakpad/Chrome_Mac -- but yes, the browser quits completely when you close a dead tab. Reopening the browser restores a session with all previous tabs except the one that you attempted to close.
Mar 28, 2010
Yes, I have precisely the same behaviour as dfoosh1. Also, if it's an extension process that's dead, closing it with the Task Manager will bring down the whole browser.
Apr 8, 2010
Took four attempts to restore a 6-tab session without any dead tabs after installing today's update. This isn't necessarily unusual, just pointing out that it is still happening. The Console.app output unique to a restore with dead tabs has changed line numbers again too:
4/8/10 20:42:26 [0x0-0x138138].com.google.Chrome[2342] [2342:22275:109843580654330:ERROR:/b/slave/chrome-official-mac/build/src/base/process_util_posix.cc(325)] parent WaitForMessage() failed: 0x10004003 (ipc/rcv) timed out
Apr 26, 2010
This issue seems to be fixed in 5.0.375.17 dev. It's been a few days, and I haven't had any problems with dead processes yet. I noticed that the names of the Chrome helper processes changed, so I'm assuming there were also some changes to the way the processes work themselves. I hope I haven't spoken too soon, but I'm glad to finally have this issue apparently resolved.
Apr 26, 2010
Unfortunately I am still getting dead tabs; there's one spinning above as I type this. Hmm.
May 3, 2010
@morepower~, have you still not encountered any dead tabs since that particular release? i'm still having them. just had two chrome launches in a row with dead tabs, in fact.
May 3, 2010
Actually, I spoke too soon. I'm still seeing the problem, though it seemed to occur less frequently with 375.17. It's now happening a bit more frequently again with 375.28. However, lately I'm getting mostly dead extension processes rather than tabs. I still get a dead tab once in a while, but in the vast majority of cases it's an extension.
May 20, 2010
Still getting plenty of dead tabs. Additionally noticed that attempting to type an address into the address bar of a dead tab and pressing return leads to the same browser "crash" as attempting to close the dead tab.
Jun 3, 2010
FWIW, I have now actually seen this in person. One of my tabs was unresponsive and spinning last night, and when I went to close it, the browser crashed. No idea what triggered it, or how to reproduce. Restarting and restoring all tabs worked fine. Also, there was no crash dump. I wonder if getting official symbols and attaching to the process in gdb will give us anything useful.
Cc:
rohit...@chromium.org
Labels: -Area-Undefined Area-UI Feature-Browser
Jun 17, 2010
Saw it again and caught it in the debugger. gdb came back with "Program exited normally," which would explain why there is no crash dump. No idea how we're managing to quit the app instead of closing the tab. Until next time :)
Jun 17, 2010
Thanks for the update and thanks for checking into it!
Jun 22, 2010
Rohit is making more progress with this, so I will give it to him.
Owner:
rohit...@chromium.org
Cc: -rohit...@chromium.org
Jul 23, 2010
Just saw this in a local Chromium build, but I wasn't restoring tabs. Chromium was simply opening to my home page as usual when I got a forever backwards spinner. I actually had this in the debugger, but I thought it was a deadlock and didn't realize what was going on until after I had tried to close the tab =( Once again, the program seems to have exited normally rather than crashed.
Aug 12, 2010
This happens habitually to me on Mac 5.0.375.126
Aug 27, 2010
Gah, I was able to consistently reproduce this for half an hour, but then Xcode beachballed and crashed and now I can't even get Chromium to run under gdb without crashing. I am not optimistic that I will be able to reproduce it anytime soon. While I had it in the debugger, it looked like pressing the close tab button ended up triggering CloseAllBrowsersAndExit(), but this function was being called through the message loop rather than directly. I think the only place where CloseAllBrowsersAndExit() is scheduled is in the ShutdownDetector, but that doesn't really make much sense.
Aug 27, 2010
Rohit saw the child (renderer) process stuck in bootstrap_look_up called by base::MachPortSender::MachPortSender in between its fork and exec. It’s a little bit suspicious that in thakisfork(), the ReceivePort (parent_recv_port) is created before the fork, as opposed to in the parent after the fork.
Cc:
m...@chromium.org
Labels: allays
Aug 30, 2010
Issue 53833 has been merged into this issue.
Sep 3, 2010
The following revision refers to this bug:
http://src.chromium.org/viewvc/chrome?view=rev&revision=58558
------------------------------------------------------------------------
r58558 | rohitrao@chromium.org | 2010-09-03 16:09:41 -0700 (Fri, 03 Sep 2010) | 10 lines
Changed paths:
M http://src.chromium.org/viewvc/chrome/trunk/src/base/process_util_posix.cc?r1=58558&r2=58557
[Mac] Move the reset of signal handlers to be very soon after the fork, before we do any mach IPC.
The MachPortSender constructor can sometimes hang forever (gets stuck in bootstrap_look_up()), so it is important to reset the child's signal handlers as early as possible.
Nico would like me to mention that this was his idea.
BUG=35374
TEST=If in the forever backwards spinner state, closing the tab should not quit the browser.
TEST=In general, renderers and extensions and plugins should still work.
Review URL: http://codereview.chromium.org/3302009
------------------------------------------------------------------------
Sep 3, 2010
The following revision refers to this bug:
http://src.chromium.org/viewvc/chrome?view=rev&revision=58560
------------------------------------------------------------------------
r58560 | rohitrao@chromium.org | 2010-09-03 16:38:16 -0700 (Fri, 03 Sep 2010) | 13 lines
Changed paths:
M http://src.chromium.org/viewvc/chrome/trunk/src/base/process_util_posix.cc?r1=58560&r2=58559
Revert 58558 - [Mac] Move the reset of signal handlers to be very soon after the fork, before we do any mach IPC.
The MachPortSender constructor can sometimes hang forever (gets stuck in bootstrap_look_up()), so it is important to reset the child's signal handlers as early as possible.
Nico would like me to mention that this was his idea.
BUG=35374
TEST=If in the forever backwards spinner state, closing the tab should not quit the browser.
TEST=In general, renderers and extensions and plugins should still work.
Review URL: http://codereview.chromium.org/3302009
TBR=rohitrao@chromium.org
Review URL: http://codereview.chromium.org/3322013
------------------------------------------------------------------------
Sep 3, 2010
The following revision refers to this bug:
http://src.chromium.org/viewvc/chrome?view=rev&revision=58572
------------------------------------------------------------------------
r58572 | rohitrao@chromium.org | 2010-09-03 19:32:59 -0700 (Fri, 03 Sep 2010) | 18 lines
Changed paths:
M http://src.chromium.org/viewvc/chrome/trunk/src/base/process_util_posix.cc?r1=58572&r2=58571
[Mac] Move the reset of signal handlers to be very soon after the fork, before we do any mach IPC.
The MachPortSender constructor can sometimes hang forever (gets stuck in bootstrap_look_up()), so it is important to reset the child's signal handlers as early as possible.
Nico would like me to mention that this was his idea.
This is take two. I couldn't reproduce the unit_tests failures either locally or on the release try bots.
BUG=35374
TEST=If in the forever backwards spinner state, closing the tab should not quit the browser.
TEST=In general, renderers and extensions and plugins should still work.
Review URL: http://codereview.chromium.org/3302009
TBR=rohitrao@chromium.org
Review URL: http://codereview.chromium.org/3322013
TBR=rohitrao@chromium.org
Review URL: http://codereview.chromium.org/3360009
------------------------------------------------------------------------
Sep 6, 2010
I just submitted a CL that resets all existing signal handlers immediately after the renderer forks. Once r58572 makes its way to the dev channel, closing a dead tab should, in theory, no longer cause the browser to quit.

The root cause of the quit was that the renderers inherited the signal handlers from the browser and we had not yet reset them. Therefore, when you tried to close the dead tab, we somehow ended up sending a SIGTERM to the browser (I don't really understand why). My CL resets the signal handlers immediately after the fork, so there is a smaller chance of this happening.

The root cause of the dead tabs appears to be in the bootstrap_look_up() call, which for some reason never returns. As a result, the renderer hangs forever in a weird state where it has already fork()ed but has yet to exec(). I have no idea what is causing bootstrap_look_up() to hang.
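For readers following along, here is a minimal sketch of the inheritance problem described in this comment. It is written in Python rather than the C++ of process_util_posix.cc, and the names `browser_sigterm_handler` and `child_sigterm_disposition` are made up for illustration. It only shows that a forked child keeps the parent's SIGTERM handler until something resets it; it does not try to explain how the signal ended up on the browser's shutdown path, which the comment itself leaves open.

```python
import os
import signal


def browser_sigterm_handler(signum, frame):
    # Hypothetical stand-in for the browser's "shut everything down" handler.
    pass


def child_sigterm_disposition(reset_early: bool) -> str:
    """Fork a child and report whether it still has the parent's SIGTERM handler."""
    previous = signal.signal(signal.SIGTERM, browser_sigterm_handler)
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this is the window between fork() and exec().
        os.close(read_fd)
        if reset_early:
            # The idea of the CL: reset handlers before any work that could hang.
            signal.signal(signal.SIGTERM, signal.SIG_DFL)
        inherited = signal.getsignal(signal.SIGTERM) is browser_sigterm_handler
        os.write(write_fd, b"inherited" if inherited else b"default")
        os._exit(0)
    # Parent: read the child's report, reap it, and restore our own handler.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        report = reader.read().decode()
    os.waitpid(pid, 0)
    signal.signal(signal.SIGTERM, previous if previous is not None else signal.SIG_DFL)
    return report
```

Calling `child_sigterm_disposition(False)` models the old ordering, where a child stuck before exec() would still run the browser's handler; `child_sigterm_disposition(True)` models the reordered version. The function must be called from the main thread, since Python only allows signal handlers to be installed there.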
Sep 9, 2010
Issue 54876 has been merged into this issue.
Sep 9, 2010
Mark and I figured out what the problem code is: bootstrap_look_up2() in http://www.opensource.apple.com/source/launchd/launchd-329.3/launchd/src/libbootstrap.c This function tries to grab a mutex lock, but that mutex is used on other threads as well (for one example, getaddrinfo() is run on the network thread and calls bootstrap_look_up2()). If we get unlucky, the fork() happens while another thread is holding the mutex. In that case, we effectively deadlock, because other threads are not copied or run in the child.
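The deadlock described here can be reproduced in miniature. The sketch below is in Python, not the launchd C code, and `resolver_lock` and `child_can_take_lock` are hypothetical names: a helper thread plays the network thread holding the mutex, the main thread forks while the lock is held, and the child (which has no copy of the helper thread) tries to take the same lock. A timeout is used so the child gives up instead of hanging forever as the real renderer did.

```python
import os
import threading

# Hypothetical stand-in for the mutex taken inside bootstrap_look_up2().
resolver_lock = threading.Lock()


def child_can_take_lock(hold_during_fork: bool, timeout: float = 0.5) -> bool:
    """Fork, then report whether the child managed to acquire resolver_lock."""
    locked = threading.Event()
    release = threading.Event()

    def network_thread():
        # Plays the role of getaddrinfo() running on another thread.
        with resolver_lock:
            locked.set()
            release.wait()

    worker = None
    if hold_during_fork:
        worker = threading.Thread(target=network_thread)
        worker.start()
        locked.wait()  # make sure the lock is held before we fork

    pid = os.fork()
    if pid == 0:
        # Child: only the forking thread exists here, so nobody can ever
        # release a lock that was held by another thread at fork time.
        got_it = resolver_lock.acquire(timeout=timeout)
        os._exit(0 if got_it else 1)

    _, status = os.waitpid(pid, 0)
    if worker is not None:
        release.set()
        worker.join()
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
```

With `hold_during_fork=False` the child takes the lock immediately; with `hold_during_fork=True` it times out, which corresponds to the "unlucky" case in the comment. Recent Python versions may print a deprecation warning about calling fork() in a multi-threaded process, which is the same hazard this bug ran into.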
Sep 9, 2010
http://www.linuxprogrammingblog.com/threads-and-fork-think-twice-before-using-them I hesitate to suggest it, but maybe we should consider the Linux zygote? fork() before spawning any threads. Except that OSX seems to like to inject threads for lots of reasons, so it would have to be VERY early.
Sep 9, 2010
I'm assuming this comes up at tab-restore time because we're busy spawning renderers and the first renderer starts doing a name resolution and at some point we hit the lock.

It's a general-purpose edge case, but we could maybe mitigate it by hanging the I/O thread until all of the renderers are spawned. And plug-ins, and extensions, but you get my drift. OK, so it won't work.

Could add a histogram (to browser) to track how often the child fails to connect. If it's not that often, have the parent kill the child and respawn. This sounds gross, because it is, but if it's only ever doing the respawn once or twice, does that really matter? It's a pretty reasonable amount of code, I think. [I'm assuming the child cannot exit, because it's hung waiting for a lock.]

On IRC, we discussed overriding the bootstrap port. This has the flaw of other threads requesting the bootstrap port while we're forking. This could be addressed by replacing the bootstrap port on a full-time basis, and having a thread which forwards to the real bootstrap port, diverting the child messages to somewhere (handling them directly seems dicey, but forwarding them to a different mach port might be reasonable, as it already has to be able to do that operation). This thread could be constrained to only make Mach calls, perhaps from a limited set of primitives. I love it!

Since we're the only thread in the child process, could we inline bootstrap_look_up2()?
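The kill-and-respawn idea floated in this comment could look roughly like the following. This is a Python sketch under stated assumptions, not anything that landed in Chromium: a pipe stands in for the mach IPC handshake, a long sleep stands in for the hung lookup, and `launch_once` and `launch_with_respawn` are hypothetical names.

```python
import os
import select
import signal
import time


def launch_once(child_hangs: bool, timeout: float) -> bool:
    """Fork a child and wait up to `timeout` seconds for it to check in."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(read_fd)
        if child_hangs:
            time.sleep(60)  # stand-in for a lookup that never returns
        os.write(write_fd, b"ready")
        os._exit(0)
    os.close(write_fd)
    ready, _, _ = select.select([read_fd], [], [], timeout)
    connected = bool(ready) and os.read(read_fd, 5) == b"ready"
    if not connected:
        # The child cannot exit on its own, so the parent kills it.
        os.kill(pid, signal.SIGKILL)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return connected


def launch_with_respawn(hang_pattern, timeout=1.0):
    """Try one launch per entry in hang_pattern; return the attempt that connected.

    hang_pattern is a list of booleans saying which attempts should hang
    (purely for demonstration). Returns None if every attempt failed.
    """
    for attempt, hangs in enumerate(hang_pattern, start=1):
        if launch_once(hangs, timeout):
            return attempt
    return None
```

For example, `launch_with_respawn([True, False])` simulates one dead child followed by a healthy one. The attempt count returned is the kind of number the proposed histogram would record.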
Sep 10, 2010
Register to handle breakpoint exceptions, and have the child break. The parent snags the child task's port and returns success (so that it continues).
Sep 14, 2010
Fixing this would kill some flakiness. Adding Yet Another Label.
Labels:
GreenTreeTaskForce
Sep 17, 2010
The following revision refers to this bug:
http://src.chromium.org/viewvc/chrome?view=rev&revision=59782
------------------------------------------------------------------------
r59782 | rohitrao@chromium.org | Fri Sep 17 05:28:32 PDT 2010
Changed paths:
M http://src.chromium.org/viewvc/chrome/trunk/src/chrome/browser/child_process_launcher.cc?r1=59782&r2=59781&pathrev=59782
M http://src.chromium.org/viewvc/chrome/trunk/src/base/process_util.h?r1=59782&r2=59781&pathrev=59782
M http://src.chromium.org/viewvc/chrome/trunk/src/base/process_util_posix.cc?r1=59782&r2=59781&pathrev=59782
M http://src.chromium.org/viewvc/chrome/trunk/src/chrome/app/chrome_dll_main.cc?r1=59782&r2=59781&pathrev=59782
M http://src.chromium.org/viewvc/chrome/trunk/src/chrome/browser/mach_broker_mac_unittest.cc?r1=59782&r2=59781&pathrev=59782
M http://src.chromium.org/viewvc/chrome/trunk/src/chrome/browser/mach_broker_mac.cc?r1=59782&r2=59781&pathrev=59782
M http://src.chromium.org/viewvc/chrome/trunk/src/chrome/browser/mach_broker_mac.h?r1=59782&r2=59781&pathrev=59782
M http://src.chromium.org/viewvc/chrome/trunk/src/base/mach_ipc_mac.h?r1=59782&r2=59781&pathrev=59782
M http://src.chromium.org/viewvc/chrome/trunk/src/base/mach_ipc_mac.mm?r1=59782&r2=59781&pathrev=59782
[Mac] Replace the existing browser-child mach ipc with a long-lived listener on a well-known port.
Before this CL:
Before fork()ing a child, the browser process creates a mach receive port with a random name. After the fork() but before exec(), the child uses mach ipc to transmit send rights to its task port. The child has access to the random name because it inherits it from the browser process. Unfortunately, some of the library functions involved in sending a mach message are not safe to call after fork().
After this CL:
Before forking the first child, the browser spins off a new thread that listens on a well-known port for mach ipc from any process. This well-known port is "com.google.Chrome.<browserpid>". When a child process starts up, it sends a mach message to its parent browser's well-known port. On the browser side, we listen for said message, extract the pid of the sending process, and ignore any messages from processes we did not personally fork(). This check is necessary because any arbitrary process on the system could send mach ipc to that port.
BUG=35374
TEST=Browser should still start up. The task manager should still show correct cpu/memory data. There should be no perf regressions.
TEST=Mac ui_tests and browser_tests should be less flaky.
Review URL: http://codereview.chromium.org/3443002
------------------------------------------------------------------------
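The "After this CL" design in r59782 above boils down to a listener that accepts messages only from pids the browser itself forked. The sketch below models just that bookkeeping in Python; it does no real mach IPC, and `BrokerSketch`, `register_child`, `handle_message`, and `task_port_for` are invented names, not the API in mach_broker_mac.h. Only the port-name format is taken from the commit message.

```python
class BrokerSketch:
    """Toy model of a long-lived listener that filters messages by sender pid."""

    def __init__(self, browser_pid: int):
        # Port name format quoted in the commit message.
        self.port_name = f"com.google.Chrome.{browser_pid}"
        self._forked_pids = set()
        self._task_ports = {}

    def register_child(self, pid: int) -> None:
        # Called by the browser for each child it fork()s.
        self._forked_pids.add(pid)

    def handle_message(self, sender_pid: int, task_port) -> bool:
        # Any process on the system can send to the well-known port,
        # so messages from pids we did not fork are ignored.
        if sender_pid not in self._forked_pids:
            return False
        self._task_ports[sender_pid] = task_port
        return True

    def task_port_for(self, pid: int):
        return self._task_ports.get(pid)
```

A short usage example: after `broker = BrokerSketch(1234)` and `broker.register_child(42)`, a message from pid 42 is stored while one from pid 99 is dropped. The real code also has to deal with ordering between the fork and the child's first message, which this sketch does not attempt.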
Sep 17, 2010
Fixed!
Status:
Fixed
Sep 17, 2010
Verified label updated by AutoAllocator, contact AmolK or KrisR for details
Labels:
Verifier-Deepakg
Sep 17, 2010
Wow, what a ride. Congrats and thanks!
Sep 17, 2010
(No comment was entered for this change.)
Labels:
rohitfork thakisfork
Sep 21, 2010
Verified in 7.0.529.0 (Official Build 59911) dev.
Status:
Verified
Mar 18, 2011
Labels:
-GreenTreeTaskForce bulkmove TaskForce-GreenTree
Oct 12, 2012
This issue has been closed for some time. No one will pay attention to new comments. If you are seeing this bug or have new data, please click New Issue to start a new bug.
Labels:
Restrict-AddIssueComment-Commit
Blocking: -chromium:52858 chromium:52858
Mar 10, 2013
(No comment was entered for this change.)
Labels:
-Area-UI -Feature-Browser Cr-UI-Browser-Core Cr-UI