My favorites | Sign in
Project Home Downloads Wiki Issues Code Search
New issue   Search
for
  Advanced search   Search tips   Subscriptions
Issue 201372: ProfileImpl may call a callback on LoginUtuil object which has been deleted by AtExitManager (as it's singleton)
5 people starred this issue and may be notified of changes. Back to list
 
Project Member Reported by oshima@chromium.org, Jul 26, 2011
Chrome OS Version  :  tot

This happens when chrome gets SIGTERM right during startup time. Looks like ProfileImpl is calling
OnProfileCreated on deleted delegate_ object (LoginUtilsImpl).

#0  0x73c2f686 in chromeos::LoginUtilsImpl::OnProfileCreated(Profile*, ProfileManagerObserver::Status) ()
#1  0x73ef2aa9 in ProfileManager::OnProfileCreated(Profile*, bool) ()
#2  0x73ed91e8 in ProfileImpl::DoFinalInit() ()
#3  0x73ed979f in ProfileImpl::OnPrefsLoaded(bool) ()
#4  0x73ed99b0 in ProfileImpl::Observe(int, NotificationSource const&, NotificationDetails const&) ()
#5  0x74bd11d5 in NotificationService::Notify(int, NotificationSource const&, NotificationDetails const&) ()
#6  0x74413ffb in PrefNotifierImpl::OnInitializationCompleted(bool) ()
#7  0x73ea2a3a in PrefValueStore::CheckInitializationCompleted() ()
#8  0x748e3777 in JsonPrefStore::OnFileRead(base::Value*, PersistentPrefStore::PrefReadError, bool) ()
#9  0x748e457f in (anonymous namespace)::FileThreadDeserializer::ReportOnOriginThread() ()
#10 0x748e0800 in RunnableMethod<(anonymous namespace)::FileThreadDeserializer, void ((anonymous namespace)::FileThreadDeserializer::*)(), Tuple0>::Run() ()
#11 0x763e59c1 in (anonymous namespace)::TaskClosureAdapter::Run() ()
#12 0x763e5c3b in base::internal::Invoker1<false, base::internal::InvokerStorage1<void ((anonymous namespace)::TaskClosureAdapter::*)(), (anonymous namespace)::TaskClosureAdapter*>, void ((anonymous namespace)::TaskClosureAdapter::*)()>::DoInvoke(base::internal::InvokerStorageBase*)
    ()
#13 0x763e8d5b in MessageLoop::RunTask(MessageLoop::PendingTask const&) ()
#14 0x763eb594 in MessageLoop::DoWork() ()
#15 0x764312f6 in base::MessagePumpGlib::RunWithDispatcher(base::MessagePump::Delegate*, base::MessagePumpDispatcher*) ()
#16 0x763ed08b in MessageLoop::RunInternal() ()
#17 0x763ed2d9 in MessageLoopForUI::Run(base::MessagePumpDispatcher*) ()
#18 0x73bc2a8e in BrowserMain(MainFunctionParams const&) ()
#19 0x73bbac08 in ChromeMain ()
#20 0x73bbba25 in main ()

Mar 16, 2013
#1 bugdroid1@chromium.org
(No comment was entered for this change.)
Blocking: chromium-os:17606 chromium-os:17898 chromium-os:19491 chromium-os:20100
Jul 26, 2011
#2 cmasone@chromium.org
I figure this is due to some of Zel's recent changes.  Maybe Nikita can find someone to look at it in the next couple of hours?
Status: Assigned
Owner: nkostylev@chromium.org
Cc: zelidrag@chromium.org
Jul 27, 2011
#3 oshima@chromium.org
(No comment was entered for this change.)
Summary: ProfileImpl may call a callback on LoginUtuil object which has been deleted by AtExitManager (as it's singleton)
Jul 27, 2011
#4 oshima@chromium.org
(No comment was entered for this change.)
Blocking: 17606
Jul 27, 2011
#5 oshima@chromium.org
Repro step
1) Apply http://codereview.chromium.org/7464013/ to chrome
2) Apply http://gerrit.chromium.org/gerrit/#change,4579 to login_manager
3) run_remote_tests.sh <necessary arguments here> suite_Smoke

I made the patch http://codereview.chromium.org/7470039/ to prove that this is an issue. With this patch,
test no longer crashes, although desktopui_ChromeFirstRender timesout and logging_UserCrash still fails.
It could be different reason or maybe the patch is causing different type of issue. 

Jul 27, 2011
#6 nkostylev@chromium.org
(No comment was entered for this change.)
Status: Started
Labels: Iteration-35 Mstone-R14
Jul 27, 2011
#7 nkostylev@chromium.org
(No comment was entered for this change.)
Status: Assigned
Jul 29, 2011
#8 nkostylev@chromium.org
(No comment was entered for this change.)
Owner: avayvod@chromium.org
Labels: -Pri-2 Pri-1
Aug 9, 2011
#9 avayvod@chromium.org
(No comment was entered for this change.)
Status: Started
Aug 9, 2011
#10 avayvod@chromium.org
(No comment was entered for this change.)
Owner: altimof...@chromium.org
Aug 9, 2011
#11 altimof...@chromium.org
oshima@: Do you understand, why UI thread is processing tasks when AtExitManager has been destroyed? 

As for the patch, we can use Singleton with Traits::RAE equals to false (one line fix).
Cc: oshima@chromium.org
Aug 9, 2011
#12 davemo...@chromium.org
(No comment was entered for this change.)
Blocking: 17898
Aug 9, 2011
#13 davidjames@chromium.org
This seems to be causing a lot of crashes. It crashed twice during b840 of the x86-mario canary and once during zgb canary.

Here are two stack traces, attached. (for http://chromeos-botmaster.mtv.corp.google.com:8026/builders/x86-mario%20canary/builds/840 )


stack-mario-b840.txt
85.3 KB   View   Download
stack-zgb-b630.txt
40.5 KB   View   Download
Labels: -Pri-1 Pri-0 TreeCloser
Aug 9, 2011
#14 tlamb...@google.com
This bug looks like it needs some love? This problem is getting more prevalent.
Aug 10, 2011
#15 de...@google.com
[reposting with a typo corrected]

Both stack traces from #12 show a segfault dereferencing 0x0 at:

 0  chrome!chromeos::LoginUtilsImpl::OnProfileCreated [login_utils.cc : 561 + 0x0]

which is:

  delegate_->OnProfilePrepared(user_profile);

I know nothing about the order in which the login code is usually executed*, but...

- LoginUtilsImpl's c'tor initializes |delegate_| to NULL.
- LoginUtilsImpl::PrepareProfile() takes a |delegate| pointer param and sets |delegate_| to it.

Any chance that OnProfileCreated() is getting called before PrepareProfile() (which looks like it gets run by ExistingUserController::OnLoginSuccess())?  I'll add an "if (delegate_)" conditional in OnProfileCreated(); it seems like it can't hurt, at least.

* What's the deal with ExistingUserController?  It's kind of a singleton but not really: it has a static |current_controller_| member that gets |this| assigned to it in its c'tor.  This seems very fragile from an object-lifetime / memory-ownership perspective.
Aug 10, 2011
#16 derat@chromium.org
(No comment was entered for this change.)
Owner: derat@chromium.org
Aug 10, 2011
#17 derat@chromium.org
(Evan said basically the same thing yesterday in  issue 17899  comment #42 that I just said here.)
Aug 10, 2011
#18 derat@chromium.org
I committed http://codereview.chromium.org/7590026/ as r96208.  Moving a conversation from the review to this bug:

altimofeev wrote:
> LoginUtilsImpl::PrepareProfile() has set |delegate_| when Chrome is 
> killed Actually, only LoginUtilsImpl:: PrepareProflie does invoke
> CreateDefaultProfileAsync(this), which starts profile loading and makes
> ProfileManager to callback in the end (note, it passes 'this' as
> ProfileManager::Observer).  So |delegate_| should be initialized before
> ProfileManager even gets to know about 'this' observer.
>
> Why do you think, that Oshima's suggestion is wrong - singleton is 
> deleted in the main thread, but UI thread is alive and tries to use it?

What do you mean by "main thread" vs. "UI thread"?  The browser process's UI thread runs main().

The segfault appears to be happening as a result of dereferencing 0x0, and I didn't see anything that would be zeroing |delegate_|.  I agree with you that LoginUtilsImpl::PrepareProfile() looks like it should be getting called before LoginUtilsImpl::OnProfileCreated(), but there are parts of this code that confuse me enough that I would be hesitant to make guarantees about its behavior in uncommonly-exercised cases.  For example, ExistingUserController::OnLoginSuccess() passes itself to LoginUtils::Get()->PrepareProfile() as a delegate, but the delegate never gets unregistered.  Maybe ExistingUserController's lifetime is always longer than LoginUtil's, but depending on relationships like this between quasi-singleton objects (with no documentation or asserts to enforce them, as far as I can see) seems like it could introduce bugs.
Owner: altimof...@chromium.org
Cc: derat@chromium.org
Aug 10, 2011
#19 altimof...@chromium.org
derat wrote:
> What do you mean by "main thread" vs. "UI thread"?  The browser process's UI thread runs main().

By 'main thread' I mean thread, which has AtExitManager, which in turn owns the singletons. And UI thread is the thread on which ProfileManager::OnProfileCreated is called. AFAIU, these two threads are different. Moreover, main thread creates all other threads. Correct me please, if I'm wrong.

Anyway, if your fix has helped then great - I'll investigate why delegate was NULL. If not, I'll sugest to also try Oshima's fix - http://codereview.chromium.org/7470039/.
Aug 11, 2011
#20 altimof...@chromium.org
BTW, I can't reproduce this crash locally (I'm running 'login_CryptohomeMounted' test). It crashes with sig 11, but the stack trace is way different.
Aug 11, 2011
#21 derat@chromium.org
If this is still happening on the builders after r96208, please post more stack traces.
Aug 11, 2011
#22 oshima@chromium.org
UI thread and main threads are the same. See browser_main.cc:BrowserMain.
I agree that it may not be the case that object is deleted by AtExitManager. (It was my best guess). Another possibility is that profile might have already been deleted by the time that it gets callback. You may need to listen to PROFILE_DESTORYED notification and skip processing callback if that happens.
Aug 11, 2011
#23 bugdroid1@chromium.org
Commit: 6db3f80ab78c4cb2d3b09378add10236dd412153
 Email: derat@chromium.org

autotest: Wait for login to finish before logging out.

This makes login.attempt_logout() call
wait_for_initial_chrome_window() before asking the session
manager to stop the session.  I'm hoping that this will
avoid a bunch of Chrome and window manager crashes that
we've seen when the UI job is stopped soon after it's been
started.

BUG=chromium-os:17899,chromium-os:18269
TEST=ran suite_Smoke a few times without seeing any chrome crashes (i was seeing them consistently in login_CryptohomeMounted and login_CryptohomeUnmounted before)

Change-Id: I80d3d6aaa0db5b7a12f973d114270a7f23d324a1
Reviewed-on: http://gerrit.chromium.org/gerrit/5797
Reviewed-by: Dave Moore <davemoore@chromium.org>
Tested-by: Daniel Erat <derat@chromium.org>

M	client/cros/login.py
Aug 11, 2011
#24 falken@chromium.org
I'm attaching a stack trace from a crash on x86-pineview-full  
http://chromegw/i/chromiumos/builders/x86%20pineview%20full/builds/82 that seems to show this signature.
b82.txt
51.0 KB   View   Download
Aug 12, 2011
#25 derat@chromium.org
The stack trace from #24 is from a version of Chrome that doesn't include the change described in #18, r96208 (judging by the line number that it's reporting for the segfault).
Aug 12, 2011
#26 su...@chromium.org
Bulk edit, adding iteration=36, as all these bugs are still marked R14. If Iteration-36/Mstone=R14 is not the right labels, please change accordingly.
Aug 12, 2011
#27 su...@chromium.org
(No comment was entered for this change.)
Labels: Iteration-36
Aug 15, 2011
#28 derat@chromium.org
Zel, can I merge r96208 to R14?
Labels: Merge-Requested
Aug 15, 2011
#29 davemo...@chromium.org
(No comment was entered for this change.)
Labels: Merge-Approved
Aug 15, 2011
#30 bugdroid1@chromium.org
Commit: dd1ecd80d2b320ac25cb382342321f84fc098283
 Email: derat@chromium.org

autotest: Wait for login to finish before logging out.

This makes login.attempt_logout() call
wait_for_initial_chrome_window() before asking the session
manager to stop the session.  I'm hoping that this will
avoid a bunch of Chrome and window manager crashes that
we've seen when the UI job is stopped soon after it's been
started.

BUG=chromium-os:17899,chromium-os:18269
TEST=ran suite_Smoke a few times without seeing any chrome crashes (i was seeing them consistently in login_CryptohomeMounted and login_CryptohomeUnmounted before)

Change-Id: I80d3d6aaa0db5b7a12f973d114270a7f23d324a1
Reviewed-on: http://gerrit.chromium.org/gerrit/5797
Reviewed-by: Dave Moore <davemoore@chromium.org>
Tested-by: Daniel Erat <derat@chromium.org>
Reviewed-on: http://gerrit.chromium.org/gerrit/6022
Reviewed-by: Daniel Erat <derat@chromium.org>

M	client/cros/login.py
Aug 15, 2011
#31 derat@chromium.org
Merged r96208 to 835 as r96847.

Alexey, any progress?
Labels: -Pri-0 -Merge-Requested -Merge-Approved Pri-1 Merge-Merged
Aug 16, 2011
#32 tbroch@chromium.org
Looks like this is still occuring ( x86-generic PFQ w/ chrome 854 )
http://chromegw.corp.google.com/i/chromiumos/builders/x86%20generic%20PFQ/builds/373/steps/Test/logs/stdio


----------------------------------------------------
Total PASS: 17/17 (100%)

Crashes detected during testing:
----------------------------------------------------
chrome sig 11
  suite_Smoke/login_CryptohomeUnmounted
----------------------------------------------------
Total unique crashes: 1
>>> Details stored under /tmp/cbuildbot_ZRhPc/test_harness/SimpleTestVerify/1_verify
Persist requested.  Use --ssh_port 9227  --kvm_pid /tmp/kvm.9227 to re-connect to it.

Aug 16, 2011
#33 derat@chromium.org
ITYM "looks like there is still at least one bug still in Chrome". :-P

At least, all I can find in the build artifacts is the following (SimpleTestVerify/1_verify/suite_Smoke/login_CryptohomeUnmounted/sysinfo/iteration.1/var/spool/crash/chrome.20110816.172033.11549.dmp.txt).  Is there a symbolized stack trace somewhere else?

Operating system: Linux
                  0.0.0 Linux 2.6.38.3+ #1 SMP Tue Aug 16 15:12:11 PDT 2011 i686
CPU: x86
     GenuineIntel family 6 model 2 stepping 3
     1 CPU

Crash reason:  SIGSEGV
Crash address: 0x0

Thread 0 (crashed)
 0  chrome + 0x1133aa8
    eip = 0x7499faa8   esp = 0x7096ea60   ebp = 0x7096ea78   ebx = 0x7798aff4
    esi = 0x000006d9   edi = 0x00000db2   eax = 0x7150a91e   ecx = 0x7096eadc
    edx = 0x00000000   efl = 0x00210246
    Found by: given as instruction pointer in context
 1  0xabababaa
    eip = 0xabababab   esp = 0x7096ea80   ebp = 0xabababab
    Found by: previous frame's frame pointer
Aug 18, 2011
#34 altimof...@chromium.org
Daniel: sorry, but no luck yet.
Aug 22, 2011
#35 nkostylev@chromium.org
(No comment was entered for this change.)
Labels: Iteration-37
Aug 22, 2011
#36 dmazzoni@chromium.org
(No comment was entered for this change.)
Blocking: 19491
Aug 25, 2011
#37 altimof...@chromium.org
Clarification.
This stack trace can't be found on the crash server, so probably it was test-related issue. So, hopefully, Daniel's patch fixed the problem. Also, hot-fix with NULL checking before calling should give double guarantee, that it shouldn't fail in that place again. 

My feeling is that this bug wasn't related to LoginUtilsImpl, but to the unnatural test case invocation.

Thereby, I'm closing the issue.
Status: Fixed
Aug 25, 2011
#38 oshima@chromium.org
Could you please test using the repro step in #4?
Status: Assigned
Aug 25, 2011
#39 derat@chromium.org
#37: Just to clarify, was there really actually something unnatural about the way that the test was using this code?  Wasn't it just running a regular, unmodified build of Chrome?  I think that the only possibly-unnatural thing about it was that it could stop the UI job abruptly, but Chrome ought to be able to handle that without crashing.
Aug 25, 2011
#40 oshima@chromium.org
#37: Additional info. Please run the test using VM instead of device. This happened only when I run on VM probably because it runs test faster.
Aug 30, 2011
#41 oshima@chromium.org
ping? Please run repro step and give us an update. UMA fix is blocked by this and we'd like to include this in R14.

Thanks,
Aug 31, 2011
#42 altimof...@chromium.org
Sorry for the delay.

Update: I've tried to reproduce using Oshima's steps, but it doesn't crashes in LoginUtils. At the same time some tests crash with the signal 5. Since this issue blocks R14, probably someone more experienced with Chrome shutdown process should take a look.
Also, bug summary is confusing, so probably it should be changed.
Owner: oshima@chromium.org
Sep 2, 2011
#43 davidjames@google.com
No longer a TreeCloser thanks to http://gerrit.chromium.org/gerrit/5797 which worked around the issue (even if still a bug in Chrome)
Labels: -TreeCloser
Sep 6, 2011
#44 oshima@chromium.org
(No comment was entered for this change.)
Blocking: 20100
Sep 6, 2011
#45 oshima@chromium.org
This crash has been fixed, but i've got new problem. Please see 20100.
Status: Fixed
Sep 12, 2011
#47 nkostylev@chromium.org
(No comment was entered for this change.)
Status: Assigned
Cc: altimof...@chromium.org
Sep 13, 2011
#48 altimof...@chromium.org
davidjames: am I correct, that stumpy build bot crashed only once, and since then (3 days) it works correctly? Probably, it is not TreeCloser anymore.

Also I tried to analyse the core dump, but symbols for stumpy-0.16.1019.0 didn't fit. Don't you know the place where one can find exact debug symbols for your coredump?
Cc: davidjames@chromium.org
Sep 13, 2011
#49 oshima@chromium.org
Alex, looks like dan's fix wasn't enough, which is understandable as Dan's change was to wait a bit more instead of fixing real cause. Could you please look into this again?
Owner: altimof...@chromium.org
Sep 16, 2011
#50 oshima@chromium.org
steven, have you seen this crash while investigating shutdown issue?

alex, do you have any update?
Cc: stevenjb@chromium.org
Sep 16, 2011
#51 stevenjb@chromium.org
I have not seen this crash, and I'm not sure it's directly related to anything I've been looking at. 

The stack from comment #46 indicates that despite the addition of a NULL check, we are crashing on calling delegate_->OnProfilePrepared. This implies to me that this code is getting called after ExistingUserController is getting destroyed, possibly when BaseLoginDisplayHost is getting destroyed? Maybe we need to be calling something like LoginUtils::Get()->set_delegate(NULL) when we destroy ExistingUserController?

Sep 16, 2011
#52 oshima@chromium.org
steven, thank you for look into it! Will you be able to create a hot fix for this?
Sep 16, 2011
#53 stevenjb@chromium.org
I could, although I'm not super familiar with the LoginUtils code. Alexey, go ahead and assign this to me if you're plate is full.

Sep 16, 2011
#54 oshima@chromium.org
That's fair. Alexey, could you please work on the fix?
Sep 19, 2011
#55 altimof...@chromium.org
Sorry, for the silence - I was OOO.
I like Steven's proposal. Will make corresponding fix ASAP.

I have wrongly connected #46 crash with earlier crashes, where |delegate_| was NULL. So I thought that memory was corrupted in some way, that also invalidated |delegate_|. But I have failed to get symbols for the core dump, so I got stuck.
Sep 19, 2011
#57 nkostylev@chromium.org
Let's merge latest fix into R15 branch.
Labels: -Mstone-R14 -Merge-Merged Mstone-R15 Merge-Requested ReleaseBlock-Stable Crash
Sep 19, 2011
#58 kar...@google.com
(No comment was entered for this change.)
Labels: -Merge-Requested Merge-Approved
Sep 20, 2011
#59 altimof...@chromium.org
Merged: http://src.chromium.org/viewvc/chrome?view=rev&revision=101937
Status: Fixed
Labels: Merge-Merged
Sep 21, 2011
#60 aban...@chromium.org
Any Tests for QA team?
Labels: PendingDevFeedback
Sep 26, 2011
#61 bhaves...@chromium.org
Closing it considering its fixed. Reopen if issue persist. 
Status: Verified
Sep 26, 2011
#62 chromeos...@chromium.org
(No comment was entered for this change.)
Labels: FixedIn-0.15-884.0 FixedIn-0.14-811.46
Oct 5, 2011
#63 or...@chromium.org
Claiming work to iteration-39
Labels: Iteration-39
Oct 25, 2011
#64 chromeos...@chromium.org
(No comment was entered for this change.)
Labels: -FixedIn-0.15-884.0 -FixedIn-0.14-811.46 FixedIn-884.0.0 FixedIn-811.46.0
Jan 20, 2012
#65 chromeos...@chromium.org
(No comment was entered for this change.)
Labels: FixedInIndex-1e FixedInIndex-32b_4
Mar 6, 2013
#66 bugdroid1@chromium.org
(No comment was entered for this change.)
Labels: OS-Chrome
Blocking: -chromium-os:17606 -chromium-os:17898 -chromium-os:19491 -chromium-os:20100 chromium-os:17606 chromium-os:17898 chromium-os:19491 chromium-os:20100
Mar 9, 2013
#67 bugdroid1@chromium.org
(No comment was entered for this change.)
Labels: -Area-Auth -TreeCloser -Crash Cr-Stability Cr-UI-SignIn Hotlist-TreeCloser
Mar 10, 2013
#68 bugdroid1@chromium.org
(No comment was entered for this change.)
Labels: -Mstone-R15 M-15
Mar 20, 2013
#69 lafo...@google.com
(No comment was entered for this change.)
Labels: -Cr-UI-SignIn Cr-Services-SignIn
Aug 10, 2013
#70 lafo...@google.com
Removing Merge-Approval for requests not in current milestones (28, 29, 30)
Labels: -Merge-Approved
Sign in to add a comment

Powered by Google Project Hosting