Beside of 0xdroid on beagleboard and Devkit8000, I tested some other platforms and find out actually almost all of them having this problem. Therefore I suspected it's not a hardware related problem. Maybe a framework or HAL issue. We noticed that when the screen is flipping, the logcat message will complain as following
E/SurfaceFlinger( 768): eglSwapBuffers: EGL error 0x3002 (EGL_BAD_ACCESS) E/gralloc ( 768): handle 0x13f8c0 not locked ~E/gralloc ( 768): handle 0x13f8c0 already locked for write E/libagl ( 768): eglSwapBuffers() failed to lock buffer 0x1368e0 (640x480) E/SurfaceFlinger( 768): eglSwapBuffers: EGL error 0x3002 (EGL_BAD_ACCESS) E/gralloc ( 768): handle 0x13f8c0 not locked E/gralloc ( 768): handle 0x13f8c0 already locked for write E/libagl ( 768): eglSwapBuffers() failed to lock buffer 0x1368e0 (640x480) E/SurfaceFlinger( 768): eglSwapBuffers: EGL error 0x3002 (EGL_BAD_ACCESS) E/gralloc ( 768): handle 0x13f8c0 not locked E/gralloc ( 768): handle 0x13f8c0 already locked for write E/libagl ( 768): eglSwapBuffers() failed to lock buffer 0x1368e0 (640x480) E/SurfaceFlinger( 768): eglSwapBuffers: EGL error 0x3002 (EGL_BAD_ACCESS) E/gralloc ( 768): handle 0x13f8c0 not locked
Therefore I checked the libgralloc and adding some debug message. The libgralloc plugin 0xdroid used is branched from original eclair source tree. I took few hours created the omap3/libgralloc at the first day when I got eclair source code months ago. Since it works well for the most of time, I didn't pay too much attention to it, until I found the deadlock issue goes crazy on frozen bubble.
After noticing the lock log and swap error, I took a close look of the gralloc_lock and gralloc_unlock in hardware/omap3/libgralloc/mapper.c
After noticing the lock log and swap error, I took a close look of the gralloc_lock and gralloc_unlock in hardware/omap3/libgralloc/mapper.c
int gralloc_lock(gralloc_module_t const* module, buffer_handle_t handle, int usage, int l, int t, int w, int h, void** vaddr) { if (private_handle_t::validate(handle) < 0) return -EINVAL; int err = 0; private_handle_t* hnd = (private_handle_t*)handle; int32_t current_value, new_value; int retry; do { current_value = hnd->lockState; new_value = current_value; if (current_value & private_handle_t::LOCK_STATE_WRITE) { // already locked for write LOGE("handle %p already locked for write", handle); return -EBUSY; } else if (current_value & private_handle_t::LOCK_STATE_READ_MASK) { // already locked for read if (usage & (GRALLOC_USAGE_SW_WRITE_MASK | GRALLOC_USAGE_HW_RENDER)) { LOGE("handle %p already locked for read", handle); return -EBUSY; } else { // this is not an error //LOGD("%p already locked for read... count = %d", // handle, (current_value & ~(1<<31))); } } // not currently locked if (usage & (GRALLOC_USAGE_SW_WRITE_MASK | GRALLOC_USAGE_HW_RENDER)) { // locking for write new_value |= private_handle_t::LOCK_STATE_WRITE; } new_value++; retry = android_atomic_cmpxchg(current_value, new_value, } while (retry); if (new_value & private_handle_t::LOCK_STATE_WRITE) { // locking for write, store the tid hnd->writeOwner = gettid(); } if (usage & (GRALLOC_USAGE_SW_READ_MASK | GRALLOC_USAGE_SW_WRITE_MASK)) { if (!(current_value & private_handle_t::LOCK_STATE_MAPPED)) { // we need to map for real pthread_mutex_t* const lock = &sMapLock; pthread_mutex_lock(lock); if (!(hnd->lockState & private_handle_t::LOCK_STATE_MAPPED)) { err = gralloc_map(module, handle, vaddr); if (err == 0) { android_atomic_or(private_handle_t::LOCK_STATE_MAPPED, (volatile int32_t*)&(hnd->lockState)); } } pthread_mutex_unlock(lock); } *vaddr = (void*)hnd->base; } return err; } int gralloc_unlock(gralloc_module_t const* module, buffer_handle_t handle) { if (private_handle_t::validate(handle) < 0) return -EINVAL; private_handle_t* hnd = (private_handle_t*)handle; int32_t current_value, new_value; do { current_value = hnd->lockState; new_value = current_value; if (current_value & private_handle_t::LOCK_STATE_WRITE) { // locked for write if (hnd->writeOwner == gettid()) { hnd->writeOwner = 0; new_value &= ~private_handle_t::LOCK_STATE_WRITE; } } if ((new_value & private_handle_t::LOCK_STATE_READ_MASK) == 0) { LOGE("handle %p not locked", handle); return -EINVAL; } new_value--; } while (android_atomic_cmpxchg(current_value, new_value, (volatile int32_t*)&hnd->lockState)); return 0; }
The code looks reasonably for the first look. Lock and unlock pair looks good. However there is a very tricky part "android_atomic_cmpxchg may fail". Understanding this, it is not hard to see there is a potential bug in gralloc_unlock. If android_atomic_cmpxchg fails, it will run the do while loop for more than once. However for the first run, the hnd->writeOwner will be changed to 0. And then the new_value will not be changed anymore. This lock will goes crazy here after.
The patch solves this problem.
The patch solves this problem.
diff --git a/mapper.cpp b/mapper.cpp index 16ebcc2..1f3e722 100644 --- a/mapper.cpp +++ b/mapper.cpp @@ -267,13 +267,13 @@ int gralloc_unlock(gralloc_module_t const* module, if (current_value & private_handle_t::LOCK_STATE_WRITE) { // locked for write if (hnd->writeOwner == gettid()) { - hnd->writeOwner = 0; new_value &= ~private_handle_t::LOCK_STATE_WRITE; } } if ((new_value & private_handle_t::LOCK_STATE_READ_MASK) == 0) { LOGE("handle %p not locked", handle); + hnd->writeOwner = 0; return -EINVAL; } @@ -282,5 +282,6 @@ int gralloc_unlock(gralloc_module_t const* module, } while (android_atomic_cmpxchg(current_value, new_value, (volatile int32_t*)&hnd->lockState)); + hnd->writeOwner = 0; return 0; }
It make sure the value hnd->writeOwner is the same as the first loop, if android_atomic_cmpxchg fails.
This issue comes from the original eclair source tree, and it is still there, and had been inherited to many different platforms. If you encounter two frames crazily flipping and having the lock message, you may try to take a look of your libgralloc.
This issue comes from the original eclair source tree, and it is still there, and had been inherited to many different platforms. If you encounter two frames crazily flipping and having the lock message, you may try to take a look of your libgralloc.