1.0 The Mystery
Why does the ServiceRecord object representing a non-existent Service still exist in the Java heap of the System process?
2.0 The Object Graph: A Recap
The object graph shows two references to the ServiceRecord object [0x406f89b8] from other objects in the graph, but those objects are themselves referenced only from the ServiceRecord object, so they cannot be responsible for its continued existence.
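The reasoning here is the usual mark-phase argument: an object is live only if it can be reached from a GC root, so references that lead back to the object through a cycle do not count. A minimal sketch of that argument follows, using a toy heap in which strings stand in for object addresses. The names ReachabilitySketch, childA and childB are hypothetical and exist only for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of a heap: object names stand in for addresses.
class ReachabilitySketch {
    // Outgoing references: holder -> objects it refers to.
    private final Map<String, List<String>> refs = new HashMap<>();

    void addRefs(String from, String... to) {
        refs.put(from, List.of(to));
    }

    // Mark phase: everything reachable from the given roots is live.
    Set<String> liveFrom(String... roots) {
        Set<String> live = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(List.of(roots));
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (live.add(obj)) {
                pending.addAll(refs.getOrDefault(obj, List.of()));
            }
        }
        return live;
    }

    public static void main(String[] args) {
        ReachabilitySketch heap = new ReachabilitySketch();
        // The ServiceRecord refers to two objects which refer back to it.
        heap.addRefs("ServiceRecord", "childA", "childB");
        heap.addRefs("childA", "ServiceRecord");
        heap.addRefs("childB", "ServiceRecord");

        // With no roots the cycle keeps nothing alive.
        System.out.println(heap.liveFrom().contains("ServiceRecord"));                // false
        // Once something roots the ServiceRecord, all three objects are live.
        System.out.println(heap.liveFrom("ServiceRecord").contains("childA"));        // true
    }
}
```

In this model the two back-references make no difference to the result, which is why something outside the cycle must be holding on to the ServiceRecord object.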
3.0 MAT: A Recap
Using the MAT Path To GC Roots > with all references option on the errant ServiceRecord object shows that there is a reference to the ServiceRecord object from a Native Stack GC Root.
The MAT Garbage Collection Roots documentation says
- Native Stack
In or out parameters in native code, such as user defined JNI code or JVM internal code. This is often the case as many methods have native parts and the objects handled as method parameters become GC roots. For example, parameters used for file/network I/O methods or reflection.
The implication is that the ServiceRecord object has been passed as an argument to a native method which was executing at the point the Java heap dump was taken. This is unlikely but not impossible. However, we have seen exactly the same thing in a different process. Again, this is not impossible, but it is suspicious, and it suggests that it is worth trying to find out exactly what is going on in this particular case. One way to find out is to look at the contents of the hprof file directly.
4.0 The Heap Dump vs. MAT
Looking directly at the contents of the hprof file containing the dump of the Java heap turns up a single JNI Global reference (HPROF_GC_ROOT_JNI_GLOBAL) rather than a native stack reference (HPROF_GC_ROOT_NATIVE_STACK), which is odd.
A JNI Global reference is not the same as a reference from a native stack frame, and in fact there is a separate entry for JNI Global in the MAT documentation for Garbage Collection Roots which reads
Global variable in native code, such as user defined JNI code or JVM internal code.
so why has MAT decided to conflate the two?
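For anyone who wants to repeat the check against the raw dump, the following is a minimal sketch of scanning the root sub-records of a heap dump body for a given object ID. It assumes the standard HPROF binary layout (big-endian, one tag byte followed by the object ID and a fixed number of extra bytes per root kind) and that the ID size has already been read from the file header. The class HprofRootScan and its method names are hypothetical, and the JNI global ref ID and other object IDs in the sample buffer are made up. The sketch stops at the first sub-record that is not a root, so it is not a complete parser.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Sketch only: walks the GC root sub-records at the start of a heap dump body.
class HprofRootScan {
    static String name(int tag) {
        switch (tag) {
            case 0x01: return "ROOT_JNI_GLOBAL";
            case 0x02: return "ROOT_JNI_LOCAL";
            case 0x03: return "ROOT_JAVA_FRAME";
            case 0x04: return "ROOT_NATIVE_STACK";
            case 0x05: return "ROOT_STICKY_CLASS";
            case 0x06: return "ROOT_THREAD_BLOCK";
            case 0x07: return "ROOT_MONITOR_USED";
            case 0x08: return "ROOT_THREAD_OBJ";
            case 0xFF: return "ROOT_UNKNOWN";
            default:   return null; // not a root sub-record
        }
    }

    // Number of bytes that follow the object ID in each root sub-record.
    static int extraBytes(int tag, int idSize) {
        switch (tag) {
            case 0x01: return idSize; // JNI global ref ID
            case 0x02:                // thread serial + frame number
            case 0x03: return 8;      // thread serial + frame number
            case 0x04:                // thread serial
            case 0x06: return 4;      // thread serial
            case 0x08: return 8;      // thread serial + stack trace serial
            default:   return 0;      // sticky class, monitor used, unknown
        }
    }

    // Returns the names of all root sub-records that refer to objectId.
    static List<String> rootsFor(ByteBuffer body, int idSize, long objectId) {
        List<String> found = new ArrayList<>();
        while (body.hasRemaining()) {
            int tag = body.get() & 0xFF;
            String name = name(tag);
            if (name == null) {
                break; // class/instance/array dumps need a full parser
            }
            long id = (idSize == 4) ? (body.getInt() & 0xFFFFFFFFL) : body.getLong();
            body.position(body.position() + extraBytes(tag, idSize));
            if (id == objectId) {
                found.add(name);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Hand-built sample with 4-byte IDs; only 0x406f89b8 comes from the text.
        ByteBuffer buf = ByteBuffer.allocate(32);
        buf.put((byte) 0x05).putInt(0x1000);                    // sticky class
        buf.put((byte) 0x01).putInt(0x406f89b8).putInt(0x1234); // JNI global
        buf.put((byte) 0x04).putInt(0x2000).putInt(1);          // native stack
        buf.flip();
        System.out.println(rootsFor(buf, 4, 0x406f89b8L)); // [ROOT_JNI_GLOBAL]
    }
}
```

Run against a real dump, a scan of this kind reports the root kind exactly as it was written to the file, before any tool has had a chance to reinterpret it.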
The answer appears to lie in the following method, which is defined in the class org.eclipse.mat.hprof.Pass1Parser (code slightly re-formatted)
    private void readDumpSegments(long length) throws IOException, SnapshotException
    {
        long segmentStartPos = in.position();
        long segmentsEndPos = segmentStartPos + length;
        while (segmentStartPos < segmentsEndPos)
        {
            long workDone = segmentStartPos / 1000;
            if (this.monitor.getWorkDone() < workDone)
            {
                if (this.monitor.isProbablyCanceled())
                    throw new IProgressListener.OperationCanceledException();
                this.monitor.totalWorkDone(workDone);
            }
            int segmentType = in.readUnsignedByte();
            if (verbose)
                System.out.println(" Read heap sub-record type " + segmentType + " at position 0x"
                                + Long.toHexString(segmentStartPos)); //$NON-NLS-1$ //$NON-NLS-2$
            switch (segmentType)
            {
                case Constants.DumpSegment.ROOT_UNKNOWN:
                    readGC(GCRootInfo.Type.UNKNOWN, 0);
                    break;
                case Constants.DumpSegment.ROOT_THREAD_OBJECT:
                    readGCThreadObject(GCRootInfo.Type.THREAD_OBJ);
                    break;
                case Constants.DumpSegment.ROOT_JNI_GLOBAL:
                    readGC(GCRootInfo.Type.NATIVE_STACK, idSize);
                    break;
                case Constants.DumpSegment.ROOT_JNI_LOCAL:
                    readGCWithThreadContext(GCRootInfo.Type.NATIVE_LOCAL, true);
                    break;
                case Constants.DumpSegment.ROOT_JAVA_FRAME:
                    readGCWithThreadContext(GCRootInfo.Type.JAVA_LOCAL, true);
                    break;
                case Constants.DumpSegment.ROOT_NATIVE_STACK:
                    readGCWithThreadContext(GCRootInfo.Type.NATIVE_STACK, false);
                    break;
                case Constants.DumpSegment.ROOT_STICKY_CLASS:
                    readGC(GCRootInfo.Type.SYSTEM_CLASS, 0);
                    break;
                case Constants.DumpSegment.ROOT_THREAD_BLOCK:
                    readGC(GCRootInfo.Type.THREAD_BLOCK, 4);
                    break;
                case Constants.DumpSegment.ROOT_MONITOR_USED:
                    readGC(GCRootInfo.Type.BUSY_MONITOR, 0);
                    break;
                case Constants.DumpSegment.CLASS_DUMP:
                    readClassDump(segmentStartPos);
                    break;
                case Constants.DumpSegment.INSTANCE_DUMP:
                    readInstanceDump(segmentStartPos);
                    break;
                case Constants.DumpSegment.OBJECT_ARRAY_DUMP:
                    readObjectArrayDump(segmentStartPos);
                    break;
                case Constants.DumpSegment.PRIMITIVE_ARRAY_DUMP:
                    readPrimitiveArrayDump(segmentStartPos);
                    break;
                default:
                    throw new SnapshotException(MessageUtil.format(
                                    Messages.Pass1Parser_Error_InvalidHeapDumpFile,
                                    segmentType, segmentStartPos));
            }
            segmentStartPos = in.position();
        }
        ...
    }
Curiously, the class org.eclipse.mat.snapshot.model.GCRootInfo does not define a NATIVE_GLOBAL constant for use in this case, which would seem to imply that this is intentional, if rather unhelpful, behaviour.
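The practical effect of the mapping in readDumpSegments() is that the reported root type can no longer be traced back to a single kind of root in the dump. A minimal sketch of that loss of information follows. It models only four of the cases from the switch statement, and the class RootTypeMapping, its enums and its method names are hypothetical stand-ins rather than MAT types.

```java
import java.util.EnumSet;
import java.util.Set;

// Sketch only: models four cases of the switch in readDumpSegments().
class RootTypeMapping {
    // Root kinds as they appear in the hprof file.
    enum HprofRoot { JNI_GLOBAL, JNI_LOCAL, JAVA_FRAME, NATIVE_STACK }

    // Root types as they are reported after parsing.
    enum ReportedType { NATIVE_STACK, NATIVE_LOCAL, JAVA_LOCAL }

    static ReportedType reported(HprofRoot root) {
        switch (root) {
            case JNI_GLOBAL:   return ReportedType.NATIVE_STACK; // the conflation
            case JNI_LOCAL:    return ReportedType.NATIVE_LOCAL;
            case JAVA_FRAME:   return ReportedType.JAVA_LOCAL;
            case NATIVE_STACK: return ReportedType.NATIVE_STACK;
            default: throw new IllegalArgumentException("unmapped root: " + root);
        }
    }

    // Which hprof root kinds could have produced a given reported type?
    static Set<HprofRoot> possibleSources(ReportedType type) {
        Set<HprofRoot> sources = EnumSet.noneOf(HprofRoot.class);
        for (HprofRoot root : HprofRoot.values()) {
            if (reported(root) == type) {
                sources.add(root);
            }
        }
        return sources;
    }

    public static void main(String[] args) {
        // Two different root kinds collapse into one reported type.
        System.out.println(possibleSources(ReportedType.NATIVE_STACK)); // [JNI_GLOBAL, NATIVE_STACK]
        System.out.println(possibleSources(ReportedType.JAVA_LOCAL));   // [JAVA_FRAME]
    }
}
```

A Native Stack root in the MAT output is therefore ambiguous, and only the hprof file itself can say which of the two kinds it really is.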
5.0 Hunt The JNI Global Reference
So the ServiceRecord object is still in the Java heap because there is a JNI Global reference to it. Why?
At this point it looks as though we may be stuck. There is obviously no record in the heap dump of how or why individual JNI Global references were created by native code nor could there be.
On the other hand, for native code to have created a JNI Global reference to the ServiceRecord object, the object must have been passed to a native method at some point. None of the ActivityManagerService methods discussed to date have been native. What about the ServiceRecord class? It does not have any native methods, but its superclass android.os.Binder does, so what about them?
6.0 android.os.Binder Native Methods
The first native method to be invoked in the lifecycle of a Binder object is the init() method, which is invoked at the very start of the default constructor
    public Binder()
It is declared as follows
    private native final void init();
The actual implementation can be found in the file frameworks/base/core/jni/android_util_Binder.cpp, lines 633-643, and it looks like this
    static void android_os_Binder_init(JNIEnv* env, jobject clazz)
    {
        JavaBBinderHolder* jbh = new JavaBBinderHolder(env, clazz);
        if (jbh == NULL) {
            jniThrowException(env, "java/lang/OutOfMemoryError", NULL);
            return;
        }
        LOGV("Java Binder %p: acquiring first ref on holder %p", clazz, jbh);
        jbh->incStrong(clazz);
        env->SetIntField(clazz, gBinderOffsets.mObject, (int)jbh);
    }
Note that init() is an instance method, so the jobject argument clazz is the Java Binder object itself, which is neither a class nor a clazz.
At first glance it does not look very promising. Certainly there is no direct call to the NewGlobalRef() JNI function, which would result in the creation of a JNI Global reference. On the other hand, the Java Binder object is being passed to the constructor of the C++ class JavaBBinderHolder. Maybe that creates one.
The C++ class JavaBBinderHolder is defined in the file frameworks/base/core/jni/android_util_Binder.cpp, lines 322-359.
Although none of the JavaBBinderHolder methods invoke NewGlobalRef(), a JavaBBinderHolder object does retain the pointer to the Java Binder object passed to the constructor, so there is still hope.
In the get() method the pointer to the Java Binder object is passed to the constructor of the C++ class JavaBBinder.
The C++ class JavaBBinder is also defined in the file frameworks/base/core/jni/android_util_Binder.cpp, lines 234-318.
The constructor is defined like this
    JavaBBinder(JNIEnv* env, jobject object)
        : mVM(jnienv_to_javavm(env)), mObject(env->NewGlobalRef(object))
    {
        LOGV("Creating JavaBBinder %p\n", this);
        android_atomic_inc(&gNumLocalRefs);
        incRefsCreated(env);
    }
and as we can see it duly creates a JNI Global reference to the Java Binder object.
Note
NewGlobalRef() is defined to return NULL if the system runs out of memory, so in general the usage in this constructor is bad practice. However, the Dalvik implementation will blow up the VM if it cannot allocate the reference (see dalvik/vm/Jni.c, NewGlobalRef() lines 2195-2207 and addGlobalReference() lines 742-866), so this code is safe for some value of safe.
Copyright (c) 2012 By Simon Lewis. All Rights Reserved.