Issue 6478: Screen Freezes on Dev Phone with a SIGSEGV
23 people starred this issue and may be notified of changes.
Status:  New
Owner:  ----


Reported by ChanderS...@gmail.com, Feb 5, 2010
- Steps to reproduce the problem.
1. Cannot reproduce. Had instances where the phone rebooted while playing 
the game Bebbled, but never an instance where it went into a recursive loop 
and stayed there.
- What happened.
  Screen Froze while playing the game Abduction.

- What you think the correct behavior should be.

  It should relaunch the home screen if constrained by memory or kill the 
offending app/process, but not freeze the screen. It seems like it went 
into an infinite loop, as it keeps reloading the native libs (Native.c) and 
throwing a SIGSEGV.


Bug report and trace.txt attached.

Cheers
Chander

bug_report_and_traces.zip
206 KB
Feb 5, 2010
#1 fadden%a...@gtempaccount.com
Decoded stack trace is attached.

The adb bugreport output didn't capture a crash as it happened, but the crash
appears several times in the "checkin service" area.

It appears to be a failure during initialization of the ICU library as the zygote
process is starting up.  Failures here are usually a bad sign -- the zygote init is
just preloading classes, so if something is crashing, the world is in a strange state.
Could be damaged values in the Linux file cache.  I'm guessing this cleared up when
the device was rebooted?


Build fingerprint is
android-devphone1/dream_devphone/dream/trout:1.6/DRC83/14721:userdebug/adp,test-keys

decoded.txt
5.0 KB
Feb 5, 2010
#2 ChanderS...@gmail.com
It did clear up on reboot via "adb reboot".
Feb 5, 2010
#3 cspeche...@gmail.com
BTW, thanks for the decoded trace. I generated the bug report several hours later and
not at the time of the crash, as I had to dash off somewhere. But I assumed the kernel
reports are preserved across crashes, and I could see the same messages in the report
that I had seen during the crash. So I guess that's OK!

Is there anything I could do the next time this happens that might help you
identify the issue more accurately?

Or is this a sign of a more severe problem with my device itself?


Feb 5, 2010
#4 fadden%a...@gtempaccount.com
Could be that a cosmic ray struck a piece of RAM just the right way and knocked a
bit loose.  You shouldn't worry about it unless weird stuff keeps happening -- if
your hardware is going bad it will continue to happen, probably more and more often.
If it was a random event you might not see it again.

Apr 22, 2010
#5 catel...@gmail.com
I can confirm that this happens at least once a week on my Milestone with 2.0.1 if I
play a lot. Typically it will just hang. Pressing the Home key will "work" in the
sense that it gives the usual tactile feedback the first few times, but then even
that stops. Several times the phone has auto-rebooted after a short time; once I was
distracted, and when I came back 10 minutes later the phone seemed OK again - not
rebooted. Several other times I've had to pull the battery.

I should mention that I have only ever seen this since I started playing in "online"
mode, and that I typically lose the server connection a few times a day - very often
at almost exactly midnight, which to me suggests an automatic restart. So there
MIGHT just be a correlation with the lost connection.

Another suspect is that a couple of times the game has become seriously slower. On
these occasions (too few to be sure, really) it appears that if I quit and restart the
game all is well, but if I insist on going on it will hang solid after a while.

I've not really found a pattern in whether I can resume the hung game after a
reboot. It seems that it doesn't work most of the time, but I have a feeling that it
has worked a couple of times (though it might actually have been a previous game).

I should also mention that I've never experienced this on my ADP1 with 1.6. 

Hopefully I'll get my Nexus One free from Google soon - I'll let you know if there is
a difference. I also hope to get the 2.1 upgrade on the Milestone any day now.

Hope this can help in finding the issue. The game is great when it works.

                         Best / Jonas

Apr 24, 2010
#6 catel...@gmail.com
Incidentally, I just had the semi-hung state and managed to take a look in logcat - it
contained sh*tloads of lines like this one:

W/AudioTrack( 5686): obtainBuffer timed out (is the CPU pegged?) 0x3765e8 user=00005000, server=00003000

Also a fair number of these:
W/WindowManager( 1174): No window to dispatch pointer action 1 
I/InputDevice( 1174): Dropping bad point #0: newY=259 closestDy=751 maxDy=420 

Possibly supporting my theory, there were a couple of these:
D/NotificationMgr( 1235): updateNetworkSelection()...state = 0 new network
D/NetworkLocationProvider( 1174): onDataConnectionStateChanged 10
D/PhoneApp( 1235): mReceiver: ACTION_ANY_DATA_CONNECTION_STATE_CHANGED
D/PhoneApp( 1235): - state: CONNECTED
D/PhoneApp( 1235): - reason: null
D/NotificationMgr( 1235): hideDataDisconnectedRoaming()...

... plus a handful of these:
E/JavaBinder( 1235): java.lang.RuntimeException: No memory in memObj 


(and of course heaps of GC lines as always)

After a while things calmed down and I managed to exit the game, so no reboot was needed.

Sadly I was unable to get anything useful from /data/anr/traces.txt

Was this useful at all? To me it suggests a bad low memory state.

                           Best / Jonas
Apr 27, 2010
#7 catel...@gmail.com
Today I was able to capture the entire log from when Bebbled started to act slowly,
then seemingly hung for quite some time and then (about when I pressed the Home key)
decided to reboot my Milestone (still running 2.0.1).
As before, the "CPU pegged?" message is repeated a lot during the hang/slowness.
Maybe Bebbled is stressing the system with sound effects? A quick look in the source
tree seems to indicate that the message is logged in:
frameworks/base/media/libmedia/AudioTrack.cpp  AudioTrack::obtainBuffer()


bebbledrebootlog.zip
23.7 KB
Apr 27, 2010
#8 anish198...@gmail.com
"obtainBuffer timed out (is the CPU pegged?)"
This happens because the AudioFlinger thread has not read the data from the driver
and has not signaled the lock, which causes these log messages to appear.
The cause is a bug in your driver, not the Android framework.

May 7, 2010
#9 rbgrn....@gmail.com
I don't know if this is related but I get freezes and this message on 2.1-update1:

W/SharedBufferStack(29487): waitForCondition(LockCondition) timed out (identity=1310, status=0). CPU may be pegged. trying again.

Phones:  Nexus One 2.1-update1 and VZW Droid 2.1-update1

Here is what I posted on the dev group:

 It just happened again to me in the middle of testing a new 
game, which is the first time I've seen it happen during one of my 
games.  The game had been loaded for about a minute and I was playing 
and everything was fine and then the phone locked up and this was in 
the log: 
I/ActivityThread(29551): Publishing provider com.android.deskclock: com.android.deskclock.AlarmProvider
I/ActivityManager(   74): Process com.amazon.mp3 (pid 26508) has died.
W/BackupManagerService(   74): dataChanged but no participant pkg='com.android.providers.settings' uid=10025
I/ActivityManager(   74): Process com.google.android.apps.uploader (pid 26714) has died.
W/SharedBufferStack(29487): waitForCondition(LockCondition) timed out (identity=1310, status=0). CPU may be pegged. trying again.
W/SharedBufferStack(29487): waitForCondition(LockCondition) timed out (identity=1310, status=0). CPU may be pegged. trying again.
W/SharedBufferStack(   74): waitForCondition(LockCondition) timed out (identity=4, status=0). CPU may be pegged. trying again.
W/SharedBufferStack(29487): waitForCondition(LockCondition) timed out (identity=1310, status=0). CPU may be pegged. trying again.
W/SharedBufferStack(   74): waitForCondition(LockCondition) timed out (identity=4, status=0). CPU may be pegged. trying again.
The waitForCondition messages kept logging every second until the 
phone (Nexus One) rebooted itself.  As you can imagine, process 
29487 is my game.  74 is (I think) system service.  Both got hung up, 
and my game certainly didn't do anything to cause this.  I just use a 
GLSurfaceView and a normal logic thread.  Nothing special, no hacks. 

This is not the first time I've seen it - it can be reliably 
reproduced when flipping orientations on a live wallpaper service and 
doing the proper context/display destroy and new EGL init on 
orientation change.  It happens almost instantly there.  It's actually 
plaguing my live wallpaper because I haven't found a workaround for it 
yet. 

I've also seen this happen once before to RenderScript, so perhaps it 
is OpenGL-related? 
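
To see at a glance which processes are stuck in a log like the one above, the timed-out warnings can be tallied per tag and PID. This is only a minimal Java sketch: it assumes logcat's "brief" line format with each entry on a single (unwrapped) line, and `LogcatTally` and `countByTagAndPid` are hypothetical names, not part of any Android tool.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LogcatTally {
    // Matches logcat "brief" lines such as: W/SharedBufferStack(29487): message
    // Groups: 1 = priority, 2 = tag, 3 = pid, 4 = message.
    private static final Pattern BRIEF =
            Pattern.compile("^([VDIWEF])/(.+?)\\(\\s*(\\d+)\\):\\s?(.*)$");

    /** Counts lines whose message contains needle, keyed by "tag(pid)". */
    static Map<String, Integer> countByTagAndPid(List<String> lines, String needle) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = BRIEF.matcher(line);
            if (!m.matches() || !m.group(4).contains(needle)) {
                continue; // not a brief-format line, or not the message we want
            }
            String key = m.group(2).trim() + "(" + m.group(3) + ")";
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }
}
```

Calling `LogcatTally.countByTagAndPid(lines, "timed out")` on the excerpt above would group the waitForCondition warnings under the game process (29487) and process 74 separately, which makes it easier to see that both are blocked.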

Here are discussion threads revolving around the issue I'm having:
http://groups.google.com/group/android-developers/browse_thread/thread/63e012edd3a714b3/45cd348c4139fe97?lnk=gst&q=waitForCondition#45cd348c4139fe97
http://groups.google.com/group/android-developers/browse_thread/thread/f7b6b17b04f93ef3/f43da16d6d1c038e?lnk=gst&q=waitForCondition#f43da16d6d1c038e
Aug 28, 2010
#10 ericgo1...@gmail.com
Dear rbgrn.net,

I am having the same problem in Froyo.
It happens during boot-up and while running the CTS tests:
https://code.google.com/p/android/issues/detail?id=10505&q=waitforcondition&colspec=ID%20Type%20Status%20Owner%20Summary%20Stars

Have you experienced those cases before?
Oct 6, 2010
#11 jst...@gmail.com
This issue can occur with no particular prompting in any application using GLSurfaceView -- possibly any application using OpenGL.  This is a rare, difficult-to-reproduce bug that's catastrophic when it happens.

It happens most often on HTC devices in my experience.  So far as I can tell it's below the level I can do anything about for my Live Wallpapers, which is terrifying in the extreme -- no matter how much error checking or thread-safety I try to reinforce, it's still there.

The easiest case to reproduce that I have is on the T-Mobile G2, when switching between portrait and landscape mode with one of my live wallpapers active.  It might take a couple hundred tries, but eventually you'll start getting the waitForCondition(LockCondition) error and the wallpaper will stop rendering.  I've done extensive logging and it's always at one of two locations when it locks:

eglSurface = egl.eglCreateWindowSurface(display, config, nativeWindow, null);

or

mEgl.eglSwapBuffers(mEglDisplay, mEglSurface);

PLEASE someone in a position to investigate, take a look at this bug.  It's plaguing the Android OpenGL development scene and hasn't seen any progress in over a year.  Note that this happens with a bog-standard usage of GLSurfaceView; it doesn't require any sort of hackery.
Oct 6, 2010
#12 rbgrn....@gmail.com
Perhaps we should create a new bug with good evidence and the target set to 2.2?  They may just be ignoring all 1.6 bugs with the idea that "it's probably fixed now", even though this is totally catastrophic (I get an alarming number of reports of this happening on many devices) and it's all over 2.1 and 2.2.
Dec 19, 2010
#13 Srinivas...@gmail.com
I took both a top-down and a bottom-up approach to debug the issue. Bottom-up, from the frame buffer driver, I was able to set breakpoints at fb_ioctl and fb_update. Both are hit at regular intervals to update the frame buffer. This shows that there is no issue on the SurfaceFlinger server side; that is, SurfaceFlinger is able to grab the front buffer and render it onto the LCD.

The second approach is top-down, where I was able to attach gdb to the respective threads. While debugging, I found that the application is requesting a surface, i.e. the SurfaceComposerClient is requesting createSurface(), which calls new Layer() in SurfaceFlinger. The call to createNormalSurfaceLocked() finally reaches the eglCreateWindowSurface() function, and that function call does not return.

The client is waiting to get a surface from the EGL hardware and is blocked on this. If possible, eglCreateWindowSurface(), or any surface-creation function that ends up accessing the underlying hardware, should use a timed wait. Once the timeout expires, the function should return and the error code should be passed up to the higher level.
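
As a rough illustration of the timed-wait idea above, here is a minimal Java sketch that runs a blocking call on a worker thread and gives up after a deadline. `TimedCall` and `callWithTimeout` are hypothetical names, not Android or EGL APIs. The sketch only bounds how long the caller waits: a native call stuck inside a driver would not be interrupted, so a real fix would still have to live inside the EGL implementation as proposed.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimedCall {
    /**
     * Runs blockingCall on a daemon worker thread and waits at most timeoutMs.
     * Returns the call's result, or null if the deadline passes first.
     */
    static <T> T callWithTimeout(Callable<T> blockingCall, long timeoutMs) throws Exception {
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "timed-call");
            t.setDaemon(true); // a stuck worker must not keep the process alive
            return t;
        });
        Future<T> result = worker.submit(blockingCall);
        try {
            return result.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            result.cancel(true); // best effort: only interrupts interruptible Java code
            return null;         // caller treats null as "timed out" and reports an error
        } finally {
            worker.shutdownNow();
        }
    }
}
```

The caller can then check for null and pass an error to the higher level instead of hanging, which is the behavior suggested above.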

  I also tried to disable hardware rendering by setting the debug property
    setprop debug.egl.hw 0
but the issue is still reproducible.

My debugging is on Froyo.

Cheers,
Srinivas Kalbhavi
May 27, 2011
#15 ja...@citizen12.com
glFinish is not an option for games because it tanks the frame rate.  This is a serious issue that still hasn't been resolved.
May 27, 2011
#16 vqc...@gmail.com
What fixed this for me on HTC phones was making sure to delete old textures.
May 27, 2011
#17 ja...@citizen12.com
Sorry, can you clarify what you mean by deleting old textures?
May 27, 2011
#18 cpt.barb...@gmail.com
In my experience, this issue starts happening once you have loaded a lot of textures. It never happened to me, even in the stress tests, when I used some tiny placeholder textures, but once I loaded all the real textures the rate of the bug increased exponentially (with ~150 MB worth of textures, it happens almost once every 20 minutes).

glFinish is not the solution, but it has worked flawlessly for me so far, with only a small noticeable fps drop on lower-end devices. Obviously it's not even close to enough to flag this as solved, but until then (and sadly, this bug is still open a year later... :\ ) it can be a life saver, IMHO.  (Sorry for any engRish.)
Jun 4, 2011
#19 ja...@citizen12.com
I found that it correlates with non-square textures.  We removed the only asset in our game with a non-square (rectangular) texture and the problem went away.
Apr 17, 2012
#20 timo.hei...@gmail.com
By non-square, do you mean non-power-of-two textures? I'm only using power-of-two textures and I'm experiencing this issue.
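
Since "non-square" and "non-power-of-two" are easy to mix up, here is a small Java sketch of the two checks. The class and method names are made up for illustration; nothing here is part of any Android or OpenGL API.

```java
final class TextureDims {
    /** True if n is a positive power of two (1, 2, 4, 8, ...). */
    static boolean isPowerOfTwo(int n) {
        // A power of two has exactly one bit set, so n & (n - 1) clears it to zero.
        return n > 0 && (n & (n - 1)) == 0;
    }

    /** True if width and height are equal. */
    static boolean isSquare(int width, int height) {
        return width == height;
    }

    /** True if both dimensions are powers of two, whether or not they are equal. */
    static boolean isPowerOfTwoTexture(int width, int height) {
        return isPowerOfTwo(width) && isPowerOfTwo(height);
    }
}
```

By these definitions a 512x256 texture is power-of-two but not square, while a 300x300 texture is square but not power-of-two.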
May 18, 2012
#21 alancald...@googlemail.com
Non-square textures, glFinish, VBO vs. non-VBO code are all merely masking the problem, which is some form of locking race condition in the operating system. By changing relative timings you can go from "never seeing" to "happens a lot". I have spent a huge amount of time on this, and I believe there is no client-side workaround that actually circumvents this < in the general case >. If your application has a very predictable OpenGL execution and an even load, then sticking in the odd glFinish or making a small change in textures can be sufficient to hide the bug. The problem is that, even then, putting the app on a different device can cause the bug to pop up again. THIS NEEDS TO BE FIXED - anything that can cause your expensive smartphone to hang or reboot is VERY serious.
May 31, 2012
#22 bartnl...@gmail.com
"Non-square textures, glFinish, VBO vs. non-VBO code are all merely masking the problem"
Correct, after 3 months of research this is my conclusion as well.

"some form of locking race condition in the operating system. By changing relative timings you can go from "never seeing" to "happens a lot"."

I think you hit the nail on the head, see bugs 7432 and 20833

The problem seems to have been in the Linux kernel's futex implementation, with no possibility of a workaround. I think that from Android 4.0 on it should no longer be an issue (Linux kernel 3.0 and later); for older devices I think it is unlikely they'll ever receive the patch at all, let alone within a useful timespan.
