Issue 6478: Screen Freezes on Dev Phone with a SIGSEGV
- Steps to reproduce the problem.
1. Cannot reproduce. Had instances where the phone rebooted while playing the game Bebbled, but never an instance where it went into a recursive loop and stayed there.
- What happened.
Screen froze while playing the game Abduction.
- What you think the correct behavior should be.
It should relaunch the home screen if constrained by memory, or kill the offending app/process, but not freeze the screen. It seems like it went into an infinite loop, as it keeps reloading the native libs (Native.c) and throws up a SIGSEGV.
Bug report and trace.txt attached.
Cheers
Chander
Feb 5, 2010
Decoded stack trace is attached. The adb bugreport output didn't capture a crash, but it appears several times in the "checkin service" area. It appears to be a failure during initialization of the ICU library as the zygote process is starting up. Failures here are usually a bad sign -- the zygote init is just preloading classes, so if something is crashing the world is in a strange state. Could be damaged values in the Linux file cache. I'm guessing this cleared up when the device was rebooted?
Build fingerprint is android-devphone1/dream_devphone/dream/trout:1.6/DRC83/14721:userdebug/adp,test-keys
Feb 5, 2010
It did clear up on reboot via "adb reboot".
Feb 5, 2010
BTW, thanks for the decoded trace. I generated the bug report several hours later and not at the time of the crash, as I had to dash off somewhere. But I assumed the kernel reports are preserved across crashes, and I could see the same messages in the report that I had seen during the crash. So I guess that's OK! Is there anything I could do the next time this happens that might help you identify the issue more accurately? Or is this a sign of a more severe problem with my device itself?
Feb 5, 2010
Could be that a cosmic ray struck a piece of RAM just the right way and knocked a bit loose. You shouldn't worry about it unless weird stuff keeps happening -- if your hardware is going bad it will continue to happen, probably more and more often. If it was a random event you might not see it again.
Apr 22, 2010
I can confirm that this happens at least once a week on my Milestone with 2.0.1 if I play a lot. Typically it will just hang. Pressing the Home key will "work" in the sense that it gives the usual tactile feedback the first few times, but then even that stops. Several times the phone has auto-rebooted after a short time; once I was distracted, and when I came back 10 minutes later the phone seemed OK again, not rebooted. Several other times I've had to pull the battery.
I should mention that I have only ever seen this since I started playing in "online" mode, and that I typically lose the server connection a few times a day, very often at almost midnight sharp, which makes me suspect an automatic restart. So there MIGHT just be a correlation with the lost connection.
Another suspect is that a couple of times the game has become seriously slower. On these occasions (too few to be sure, really) it appears that if I quit and restart the game all is well, but if I insist on going on it will hang solid after a while.
I've not really found a pattern in whether I can resume the hung game after a reboot. It seems that it doesn't work most of the time, but I have a feeling that it has worked a couple of times (though it might actually have been a previous game).
I should also mention that I've never experienced this on my ADP1 with 1.6. Hopefully I'll get my Nexus One free from Google soon - I'll let you know if there is a difference. I also hope to get the 2.1 upgrade on the Milestone any day now.
Hope this can help in finding the issue. The game is great when it works.
Best / Jonas
Apr 24, 2010
Incidentally, I just had the semi-hung state and managed to take a look in logcat - it contained sh*tloads of lines like this one:
W/AudioTrack( 5686): obtainBuffer timed out (is the CPU pegged?) 0x3765e8 user=00005000, server=00003000
Also a fair number of these:
W/WindowManager( 1174): No window to dispatch pointer action 1
I/InputDevice( 1174): Dropping bad point #0: newY=259 closestDy=751 maxDy=420
Possibly supporting my theory were a couple of these:
D/NotificationMgr( 1235): updateNetworkSelection()...state = 0 new network
D/NetworkLocationProvider( 1174): onDataConnectionStateChanged 10
D/PhoneApp( 1235): mReceiver: ACTION_ANY_DATA_CONNECTION_STATE_CHANGED
D/PhoneApp( 1235): - state: CONNECTED
D/PhoneApp( 1235): - reason: null
D/NotificationMgr( 1235): hideDataDisconnectedRoaming()...
... plus a handful of these:
E/JavaBinder( 1235): java.lang.RuntimeException: No memory in memObj
(and of course heaps of GC lines as always)
After a while things calmed down and I managed to exit the game, so no reboot was needed. Sadly I was unable to get anything useful from /data/anr/traces.txt
Was this useful at all? To me it suggests a bad low-memory state.
Best / Jonas
Apr 27, 2010
Today I was able to capture the entire log from when Bebbled started to act slowly, then seemingly hung for quite some time, and then (about when I pressed the Home key) decided to reboot my Milestone (still running 2.0.1). As before, the "CPU pegged?" message is repeated a lot during the hang/slowness. Maybe Bebbled is stressing the system with sound effects? A quick look in the source tree seems to indicate that the error message comes from:
frameworks/base/media/libmedia/AudioTrack.cpp
AudioTrack::obtainBuffer()
Apr 27, 2010
"obtainBuffer timed out (is the CPU pegged?)" These log lines appear because the audioflinger thread has not read the data from the driver and has not signalled the lock. That is caused by a bug in your driver, not by the Android framework.
May 7, 2010
I don't know if this is related, but I get freezes and this message on 2.1-update1:
W/SharedBufferStack(29487): waitForCondition(LockCondition) timed out (identity=1310, status=0). CPU may be pegged. trying again.
Phones: Nexus One 2.1-update1 and VZW Droid 2.1-update1
Here is what I posted on the dev group:
It just happened again to me in the middle of testing a new game, which is the first time I've seen it happen during one of my games. The game had been loaded for about a minute and I was playing and everything was fine, and then the phone locked up and this was in the log:
I/ActivityThread(29551): Publishing provider com.android.deskclock: com.android.deskclock.AlarmProvider
I/ActivityManager( 74): Process com.amazon.mp3 (pid 26508) has died.
W/BackupManagerService( 74): dataChanged but no participant pkg='com.android.providers.settings' uid=10025
I/ActivityManager( 74): Process com.google.android.apps.uploader (pid 26714) has died.
W/SharedBufferStack(29487): waitForCondition(LockCondition) timed out (identity=1310, status=0). CPU may be pegged. trying again.
W/SharedBufferStack(29487): waitForCondition(LockCondition) timed out (identity=1310, status=0). CPU may be pegged. trying again.
W/SharedBufferStack( 74): waitForCondition(LockCondition) timed out (identity=4, status=0). CPU may be pegged. trying again.
W/SharedBufferStack(29487): waitForCondition(LockCondition) timed out (identity=1310, status=0). CPU may be pegged. trying again.
W/SharedBufferStack( 74): waitForCondition(LockCondition) timed out (identity=4, status=0). CPU may be pegged. trying again.
The waitForCondition messages kept logging every second until the phone (Nexus One) rebooted itself. As you can imagine, process 29487 is my game. 74 is (I think) the system service. Both got hung up, and my game certainly didn't do anything to cause this. I just use a GLSurfaceView and a normal logic thread. Nothing special, no hacks.
This is not the first time I've seen it - it can be reliably reproduced when flipping orientations on a live wallpaper service and doing the proper context/display destroy and new EGL init on orientation change. It happens almost instantly there. It's actually plaguing my live wallpaper because I haven't found a workaround for it yet. I've also seen this happen once before to RenderScript, so perhaps it is OpenGL-related?
Here are discussion threads revolving around the issue I'm having:
http://groups.google.com/group/android-developers/browse_thread/thread/63e012edd3a714b3/45cd348c4139fe97?lnk=gst&q=waitForCondition#45cd348c4139fe97
http://groups.google.com/group/android-developers/browse_thread/thread/f7b6b17b04f93ef3/f43da16d6d1c038e?lnk=gst&q=waitForCondition#f43da16d6d1c038e
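For anyone comparing logs, here is a minimal sketch (in Java, assuming a saved logcat dump in the brief format quoted in this thread) of tallying the "timed out" warnings per process ID, so you can see which processes are stuck. The class and method names are hypothetical and not part of any Android API; the pattern only covers the two tags quoted here, SharedBufferStack and AudioTrack.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading a saved logcat dump; not an Android API.
final class PeggedLogScanner {
    // Matches lines such as
    //   W/SharedBufferStack(29487): waitForCondition(LockCondition) timed out ...
    //   W/AudioTrack( 5686): obtainBuffer timed out (is the CPU pegged?) ...
    // Group 2 captures the process ID, which may be padded with spaces.
    private static final Pattern TIMEOUT =
            Pattern.compile("^W/(SharedBufferStack|AudioTrack)\\(\\s*(\\d+)\\):.*timed out");

    /** Returns a map from process ID to the number of timeout warnings it logged. */
    static Map<Integer, Integer> countTimeoutsByPid(List<String> lines) {
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = TIMEOUT.matcher(line.trim());
            if (m.find()) {
                counts.merge(Integer.parseInt(m.group(2)), 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

If both an application's process and the system server's process show up in the result, that would match the pattern described above, where both sides are waiting on the same lock.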
Aug 28, 2010
Dear rbgrn.net, I am having the same problem in Froyo. (It happens during boot-up and while running the CTS tests.) https://code.google.com/p/android/issues/detail?id=10505&q=waitforcondition&colspec=ID%20Type%20Status%20Owner%20Summary%20Stars Have you experienced those cases before?
Oct 6, 2010
This issue can occur with no particular prompting in any application using GLSurfaceView -- possibly any application using OpenGL. This is a rare, difficult-to-reproduce bug that's catastrophic when it happens. It happens most often on HTC devices in my experience. So far as I can tell it's below the level I can do anything about for my Live Wallpapers, which is terrifying in the extreme -- no matter how much error checking or thread-safety I try to reinforce, it's still there.
The easiest case to reproduce that I have is on the T-Mobile G2, when switching between portrait and landscape mode with one of my live wallpapers active. It might take a couple hundred tries, but eventually you'll start getting the waitForCondition(LockCondition) error and the wallpaper will stop rendering. I've done extensive logging and it's always at one of two locations when it locks:
eglSurface = egl.eglCreateWindowSurface(display, config, nativeWindow, null);
or
mEgl.eglSwapBuffers(mEglDisplay, mEglSurface);
PLEASE, someone in a position to investigate, take a look at this bug. It's plaguing the Android OpenGL development scene and hasn't seen any progress in over a year. Note that this happens with a bog-standard usage of GLSurfaceView; it doesn't require any sort of hackery.
Oct 6, 2010
Perhaps we should create a new bug with good evidence and the target set to 2.2? They may just be ignoring all 1.6 bugs with the idea that "it's probably fixed now" though this is totally catastrophic (I get an alarming number of reports of this happening on many devices) and it's all over 2.1 and 2.2.
Dec 19, 2010
I took both a top-down and a bottom-up approach to debug the issue.
Bottom-up: in the frame buffer driver, I was able to put breakpoints at fb_ioctl and fb_update. Both are hit at regular intervals to update the frame buffer. This shows that there is no issue on the SurfaceFlinger server side; that is, SurfaceFlinger is able to grab the front buffer and render it onto the LCD.
Top-down: I was able to attach gdb to the respective threads. While debugging, I found that the application is requesting a surface, i.e. the SurfaceComposerClient is requesting createSurface(), which calls new Layer() in SurfaceFlinger. The call to createNormalSurfaceLocked() finally reaches the eglCreateWindowSurface() function, and that call does not return. The client is waiting to get a surface from the EGL hardware and is locked on this.
If possible, eglCreateWindowSurface(), or any surface-creation function that ends up accessing the underlying hardware, should have a timed wait. Once the timeout happens, the function should return and the error code should be passed up to the higher level.
I also tried to disable hardware rendering by setting the debug property
setprop debug.egl.hw 0
but the issue is still reproducible.
My debugging is on Froyo.
Cheers, Srinivas Kalbhavi
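The timed wait suggested above would have to live inside the EGL implementation itself. Purely as an illustration of the idea, here is a minimal client-side sketch in Java that bounds a blocking call with `Future.get(timeout)`. The class and method names are hypothetical, and it assumes the call can run on a worker thread, which may not hold for real EGL calls because they depend on per-thread state. It also does not unblock a thread that is wedged in native code; it only lets the caller stop waiting and report an error.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper; not part of Android or EGL.
final class TimedCall {
    /**
     * Runs the call on a daemon worker thread and returns its result,
     * or the fallback value if it has not finished within timeoutMs.
     */
    static <T> T callWithTimeout(Callable<T> call, long timeoutMs, T fallback) throws Exception {
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "timed-call");
            t.setDaemon(true); // a wedged call must not keep the process alive
            return t;
        });
        try {
            Future<T> result = worker.submit(call);
            return result.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return fallback; // the caller sees an error value instead of hanging
        } finally {
            // Interrupts the worker; a call stuck in native code may ignore this.
            worker.shutdownNow();
        }
    }
}
```

A caller could pass a sentinel as the fallback (for example a "no surface" value) and treat it as the error code to hand up to the higher level.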
Apr 2, 2011
A possible workaround is to call glFinish just before eglSwapBuffers. http://email@example.com/2010-09/01612/(android-developers)-Re-OpenGL-lockups-in-2.2.html
May 27, 2011
glFinish is not an option for games because it tanks the frame rate. This is a serious issue that still hasn't been resolved.
May 27, 2011
What fixed this for me on HTC phones was making sure to delete old textures.
May 27, 2011
Sorry, can you clarify what you mean by deleting old textures?
May 27, 2011
In my experience, this issue starts happening when you have loaded a lot of textures. It never happened to me, even in the stress tests, when I used tiny placeholder textures, but once I loaded all the real textures the rate went up sharply (with ~150 MB worth of textures, this bug happens almost once every 20 minutes). glFinish is not the solution, but it has worked flawlessly for me so far, with only a small noticeable fps drop on lower-end devices. Obviously that's not even close to enough to flag this as solved, but until then (and sadly, this bug is still open a year later... :\ ) it can be a life saver, imho. (Sorry for any engRish.)
Jun 4, 2011
I found that it correlates with non-square textures. We removed the only asset in our game with a non-square (rectangular) texture and the problem went away.
Apr 17, 2012
By non-square, do you mean non-power-of-two textures? I'm only using power-of-two textures and I'm experiencing this issue.
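To make the distinction in this question concrete, here is a minimal sketch in Java (helper names are hypothetical) of the two separate properties: a texture can be power-of-two without being square (512x256), or square without being power-of-two (100x100).

```java
// Hypothetical helper for checking texture dimensions; not an Android API.
final class TextureDims {
    /** True if n is a positive power of two (1, 2, 4, ..., 512, ...). */
    static boolean isPowerOfTwo(int n) {
        // A power of two has exactly one bit set, so clearing the lowest
        // set bit with n & (n - 1) leaves zero.
        return n > 0 && (n & (n - 1)) == 0;
    }

    /** True if both dimensions are powers of two; they need not be equal. */
    static boolean isPowerOfTwoTexture(int width, int height) {
        return isPowerOfTwo(width) && isPowerOfTwo(height);
    }

    /** True if the texture is square, regardless of size. */
    static boolean isSquare(int width, int height) {
        return width == height;
    }
}
```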
May 18, 2012
Non-square textures, glFinish, VBO vs. non-VBO code are all merely masking the problem, which is some form of locking race condition in the operating system. By changing relative timings you can go from "never seeing" to "happens a lot". I have spent a huge amount of time on this, and I believe there is no client-side workaround that actually circumvents it in the general case. If your application has a very predictable OpenGL execution and an even load, then sticking in the odd glFinish, or making a small change in textures, can be sufficient to hide the bug. The problem is that, even then, putting the app on a different device can cause the bug to pop up again. THIS NEEDS TO BE FIXED - anything that can cause your expensive smart phone to hang or reboot is VERY serious.
May 31, 2012
"Non-square textures, glFinish, VBO vs. non-VBO code are all merely masking the problem"
Correct, after 3 months of research this is my conclusion as well.
"some form of locking race condition in the operating system. By changing relative timings you can go from "never seeing" to "happens a lot"."
I think you hit the nail on the head; see bugs 7432 and 20833. The problem seems to have been in the Linux kernel's futex implementation, with no possibility of a workaround. I think that from Android 4.0 it should no longer be an issue (Linux kernel 3.x and later). For older devices I think it is unlikely they'll ever receive the patch at all, let alone within a useful timespan.