groovy

MetaClassRegistryImpl constantMetaClasses map is leaking resources

Details

  • Type: Bug Bug
  • Status: Closed Closed
  • Priority: Blocker Blocker
  • Resolution: Fixed
  • Affects Version/s: 1.5.6
  • Fix Version/s: 1.5.7, 1.6-beta-2
  • Component/s: groovy-jdk
  • Labels:
    None
  • Environment:
    java 1.5.0_14 and java 1.6.0_04
  • Testcase included:
    yes
  • Number of attachments :
    0

Description

The GroovySystem is a class that resides in the AppClassLoader. It's final and only contains static attributes and methods.
Inside it, we find a MetaClassRegistry static property, that is initialized with a MetaClassRegistryImpl instance.
The MetaClassRegistryImpl contains a ConcurrentReaderHashMap wich holds references to all Expando classes, and metaclasses.

And that's the problem. Since the ExpandoClasses and MetaClasses are usually created via script by a GroovyClassLoader, this map prevents them be garbage collected, as well as its GroovyClassLoader.

In our application, we use a schema similar to a web server. We create a different class loader for each executed script. This is necessary since users want to modificate scripts and see the results imediatelly. We've even tried to use the GroovyScriptEngine in the past, but it has a very big flaw. its performance is terrible in a network (I even opened another issue for this). Also, we like the benefit of having a different class loader per script, since we may have different execution environments.

As a result, we are suffering the "the dreaded "java.lang.OutOfMemoryError: PermGen space" exception", as described in this article:
http://blogs.sun.com/fkieviet/entry/classloader_leaks_the_dreaded_java

To simulate the problem, run the above java program with -Xmx500m option:

Test.java
public class Test {
    public static void main(String[] args) throws Exception {
        for (int i = 0; i < Integer.MAX_VALUE; i++) {
            GroovyClassLoader gcl = new GroovyClassLoader();
            Leak leak = (Leak)gcl.parseClass(new File("ResourceLeak.groovy")).newInstance();
            leak.leakResources();
            if (i % 1000 == 0) {
                System.out.println(i);
                System.gc();
            }
        }
    }
    
    public static interface Leak {
        void leakResources();
    }
}

It uses this script:

ResourceLeak.groovy
class ResourceLeak implements Teste.Leak {
    void leakResources() {
		ResourceLeak.metaClass.'static'.test = { 
			//Do nothing 
		}
		ResourceLeak.test()		
    }
}

The problem will occur after 7000 executions. In this sample, we need 500Mb heap, otherwise, the heap space will go out of memory before the permgen space. But in our real situation (with bigger classes, and more classes in the class loader) 200Mb heap is enough, since the permgen space is consumed faster.

Activity

Hide
blackdrag blackdrag added a comment -

This is a non trivial problem in respect to many cases. Currently the MetaClassImpl is designed to keep the default MetaClass as long as it is needed, that means the default can be collected. But as soon as a user sets a custom MetaClass the default is to keep it until the dead come back or the VM is terminated.

Our problem is that we can not know if the MetaClass will be needed or not. The standard case in Groovy is that you execute many scripts that somehow interact with each other. So there is no defined process of script creation, execution and removing resources which I could use to clear the registry. I could now for example say, that the MetaClass should not prevent ResourceLeak from being collected and the MetaClass should be collected along with it. That's true, but for this we would need a reference from the class to the MetaClass. Which could be done for classes created by Groovy, but not for classes from Java. So you could for example create a Java class each time the script is run and change its MetaClass like above and we would get the same problem again. Any solution that tries some kind of tracking or referencing the class as well as the MetaClass failed in more earlier attempts.

But there are solutions to your problem as well. One way would be to use different class loaders loading the Groovy classes with them, so that you can discard all of them together with the registry. Another solution would be to remove each MetaClass you added. To keep track of this I would suggest to exchange the getMetaClass method on Class with a version that allows you to track the added classes. The second version is not threadsafe of course.

ah, yes... why don't you cache the class? There wouldn't be a leak in that case.

Show
blackdrag blackdrag added a comment - This is a non trivial problem in respect to many cases. Currently the MetaClassImpl is designed to keep the default MetaClass as long as it is needed, that means the default can be collected. But as soon as a user sets a custom MetaClass the default is to keep it until the dead come back or the VM is terminated. Our problem is that we can not know if the MetaClass will be needed or not. The standard case in Groovy is that you execute many scripts that somehow interact with each other. So there is no defined process of script creation, execution and removing resources which I could use to clear the registry. I could now for example say, that the MetaClass should not prevent ResourceLeak from being collected and the MetaClass should be collected along with it. That's true, but for this we would need a reference from the class to the MetaClass. Which could be done for classes created by Groovy, but not for classes from Java. So you could for example create a Java class each time the script is run and change its MetaClass like above and we would get the same problem again. Any solution that tries some kind of tracking or referencing the class as well as the MetaClass failed in more earlier attempts. But there are solutions to your problem as well. One way would be to use different class loaders loading the Groovy classes with them, so that you can discard all of them together with the registry. Another solution would be to remove each MetaClass you added. To keep track of this I would suggest to exchange the getMetaClass method on Class with a version that allows you to track the added classes. The second version is not threadsafe of course. ah, yes... why don't you cache the class? There wouldn't be a leak in that case.
Hide
Vinícius Godoy de Mendonça added a comment -

Okay, I see that this problem is really tricky.

But, the second situation you mentioned above (creating a Java class, changing it's metaclass) is requires a much more explicit action by the programmer. The sample I gave you when I created this issue is a very simple one, and no one is expecting that in such trivial case groovy will leak resources.

Since this can be done for classes created by groovy (and that's my case), I think it's a good idea to do it.

Now, talking about the solution you proposed. The creation of a new classloader I could understand, but the caching solution not. How can I cache a class?

Show
Vinícius Godoy de Mendonça added a comment - Okay, I see that this problem is really tricky. But, the second situation you mentioned above (creating a Java class, changing it's metaclass) is requires a much more explicit action by the programmer. The sample I gave you when I created this issue is a very simple one, and no one is expecting that in such trivial case groovy will leak resources. Since this can be done for classes created by groovy (and that's my case), I think it's a good idea to do it. Now, talking about the solution you proposed. The creation of a new classloader I could understand, but the caching solution not. How can I cache a class?
Hide
blackdrag blackdrag added a comment -

much more explicit action for java classes?

Integer.metaClass.'static'.test = { 
			//Do nothing 
		}
		Integer.test()
that is not any more explicit than your code.

About caching... you do

GroovyClassLoader gcl = new GroovyClassLoader();
Leak leak = (Leak)gcl.parseClass(new File("ResourceLeak.groovy")).newInstance();
Which means you create a new class loader and a new class each time. gcl.parseClass(new File("ResourceLeak.groovy")) will give you the class which you could cache... for example you could use a map and say for this file I use this class. I think GCL supports this after enabling it. Even with recompilation... but with recompilation comes the real problem here: you have to track what classes are used by the script and test your class if it needs recompilation if any class it depends on has been changed. That's one ofthe reasons GroovyScriptEngine is as slow as it is.

Show
blackdrag blackdrag added a comment - much more explicit action for java classes?
Integer.metaClass.'static'.test = { 
			//Do nothing 
		}
		Integer.test()
that is not any more explicit than your code. About caching... you do
GroovyClassLoader gcl = new GroovyClassLoader();
Leak leak = (Leak)gcl.parseClass(new File("ResourceLeak.groovy")).newInstance();
Which means you create a new class loader and a new class each time. gcl.parseClass(new File("ResourceLeak.groovy")) will give you the class which you could cache... for example you could use a map and say for this file I use this class. I think GCL supports this after enabling it. Even with recompilation... but with recompilation comes the real problem here: you have to track what classes are used by the script and test your class if it needs recompilation if any class it depends on has been changed. That's one ofthe reasons GroovyScriptEngine is as slow as it is.
Hide
Vinícius Godoy de Mendonça added a comment -

Oh... now I see, you are talking about java classes in the groovy code. I thought about Java classes created in Java, and sent to the script. Sorry, but I can't see the other problems as fast as you can do, since my groovy knowledge is limited to a little more than I use here.

The caching has a problem, which made me create multiple classloaders in the first place. If the script changes, a rerun will not see this change.
I tried to recompilate the class without success. How can I do it?

I'll try creating a new classloader, as you suggested.

About the issue. Will you close it? How are you planning to solve it?

Show
Vinícius Godoy de Mendonça added a comment - Oh... now I see, you are talking about java classes in the groovy code. I thought about Java classes created in Java, and sent to the script. Sorry, but I can't see the other problems as fast as you can do, since my groovy knowledge is limited to a little more than I use here. The caching has a problem, which made me create multiple classloaders in the first place. If the script changes, a rerun will not see this change. I tried to recompilate the class without success. How can I do it? I'll try creating a new classloader, as you suggested. About the issue. Will you close it? How are you planning to solve it?
Hide
Graeme Rocher added a comment - - edited

This issue is not really resolvable by Groovy, you could have similar issues in a Java program if you compiled java classes at runtime. At the end of the day in this case you are wanting to add dynamic behaviour to classes that get loaded dynamically, this dynamic behaviour has to be kept somewhere by Groovy in order for the behaviour of your program to stay consistent.

Groovy is doing the right thing by keeping this behaviour around. Your program is doing the wrong thing by not cleaning up this new behaviour you add to dynamically loaded classes. If you added the following code to your example program the issue would go away:

def theClass = gcl.parseClass(new File("ResourceLeak.groovy"))
Leak leak = (Leak)theClass.newInstance()
            leak.leakResources();

GroovySystem.metaClass.registry.removeMetaClass(theClass)

We have to do this for example in Grails in the destroy() method of the Grails servlet to make sure Grails classes don't hang around when you undeploy a Grails web application

Show
Graeme Rocher added a comment - - edited This issue is not really resolvable by Groovy, you could have similar issues in a Java program if you compiled java classes at runtime. At the end of the day in this case you are wanting to add dynamic behaviour to classes that get loaded dynamically, this dynamic behaviour has to be kept somewhere by Groovy in order for the behaviour of your program to stay consistent. Groovy is doing the right thing by keeping this behaviour around. Your program is doing the wrong thing by not cleaning up this new behaviour you add to dynamically loaded classes. If you added the following code to your example program the issue would go away:
def theClass = gcl.parseClass(new File("ResourceLeak.groovy"))
Leak leak = (Leak)theClass.newInstance()
            leak.leakResources();

GroovySystem.metaClass.registry.removeMetaClass(theClass)
We have to do this for example in Grails in the destroy() method of the Grails servlet to make sure Grails classes don't hang around when you undeploy a Grails web application
Hide
Graeme Rocher added a comment -

On a side note Groovy 1.6 may solve this issue for you if you add new behaviour using per instance meta classes rather than global meta classes as you are doing here. In Groovy 1.6 you could do something like:

leak.metaClass.mixin MyNewMethods

Then when the instance goes away so does its meta class

Show
Graeme Rocher added a comment - On a side note Groovy 1.6 may solve this issue for you if you add new behaviour using per instance meta classes rather than global meta classes as you are doing here. In Groovy 1.6 you could do something like:
leak.metaClass.mixin MyNewMethods
Then when the instance goes away so does its meta class
Hide
Vinícius Godoy de Mendonça added a comment -

Couldn't you create a clear method of the metaclassregistry?
This whould be much simpler than calling removing class-by-class.

This is useful since I don't know exactly what other metaclasses the script may have created. Also, it allow me to call the method without changing the scripts. Most part of the scripts that my app runs are not written by me, or by any member of my team.

Show
Vinícius Godoy de Mendonça added a comment - Couldn't you create a clear method of the metaclassregistry? This whould be much simpler than calling removing class-by-class. This is useful since I don't know exactly what other metaclasses the script may have created. Also, it allow me to call the method without changing the scripts. Most part of the scripts that my app runs are not written by me, or by any member of my team.
Hide
blackdrag blackdrag added a comment -

but a clear method is not thread safe... for example imagine you change the MetaClass in one thread (A) and then execute code based on that in the same thread, while in another thread (B) you call clear and it is executed right after the first thread (A) has changed the MetaClass. The result would be that execution of the code in the first thread (A) will fail, because the expected MetaClass got deleted!

So do you have a safe point where you can simply clear the registry?

Show
blackdrag blackdrag added a comment - but a clear method is not thread safe... for example imagine you change the MetaClass in one thread (A) and then execute code based on that in the same thread, while in another thread (B) you call clear and it is executed right after the first thread (A) has changed the MetaClass. The result would be that execution of the code in the first thread (A) will fail, because the expected MetaClass got deleted! So do you have a safe point where you can simply clear the registry?
Hide
Vinícius Godoy de Mendonça added a comment -

But this same problem can occur with the suggested remove implementation, right?

Show
Vinícius Godoy de Mendonça added a comment - But this same problem can occur with the suggested remove implementation, right?
Hide
blackdrag blackdrag added a comment -

you mean if a metaclass is set for an arbitrary class? Yes, I guess that's true. That's why using Groovy on a per Thread and class loader base would be better for your app. But ok, given that that should not be used, what could be done?

If the removal should be done manually and not be triggered by garbage collection, but by an explicit method call of some kind and if this should try to not to affect other threads as best as possible, then we need a better way to track the metaclass setting actions.

what would be if we had a listener at the registry that informs you when a new global metaclass is set? You would then have a reference to the new MetaClass and the Class that the MetaClass is used for. You could then for example store in what thread that happened and after the script is done you could remove all the classes in the registry. Or you could track the classloader that was used for this and remove all classes that are using a certain classloader.

Another way would be that we change Groovy to allow the user to set a new registry. Then you could have a thread local registry and simply call clear later. But I think this solution is probably more difficult.

Show
blackdrag blackdrag added a comment - you mean if a metaclass is set for an arbitrary class? Yes, I guess that's true. That's why using Groovy on a per Thread and class loader base would be better for your app. But ok, given that that should not be used, what could be done? If the removal should be done manually and not be triggered by garbage collection, but by an explicit method call of some kind and if this should try to not to affect other threads as best as possible, then we need a better way to track the metaclass setting actions. what would be if we had a listener at the registry that informs you when a new global metaclass is set? You would then have a reference to the new MetaClass and the Class that the MetaClass is used for. You could then for example store in what thread that happened and after the script is done you could remove all the classes in the registry. Or you could track the classloader that was used for this and remove all classes that are using a certain classloader. Another way would be that we change Groovy to allow the user to set a new registry. Then you could have a thread local registry and simply call clear later. But I think this solution is probably more difficult.
Hide
Vinícius Godoy de Mendonça added a comment -

Both of the latest solutions are fine for me. We also think that the listener solution is the finest one.

Show
Vinícius Godoy de Mendonça added a comment - Both of the latest solutions are fine for me. We also think that the listener solution is the finest one.
Hide
Alex Tkachman added a comment -

It is by-design behavior.
You should remove meta class manually when you don't need it any more.

Show
Alex Tkachman added a comment - It is by-design behavior. You should remove meta class manually when you don't need it any more.
Hide
Vinícius Godoy de Mendonça added a comment -

How about the listener we discussed before? Will it be implemented?
Otherwise, it's almost impossible to know what classes should be manually removed!

Show
Vinícius Godoy de Mendonça added a comment - How about the listener we discussed before? Will it be implemented? Otherwise, it's almost impossible to know what classes should be manually removed!
Hide
Vinícius Godoy de Mendonça added a comment -

For those who have the same problem of me, there's a dangerous, yet usefull workaround.
You can use reflection to manually clear the groovy class cache. This is sensitive to groovy code changes, and should only be used while the listener solution is not implemented.

protected boolean clearGroovyClassesCache() {
        try {
            // Retrieve the fields we will manipulate
            Class<MetaClassRegistryImpl> mcriClass = MetaClassRegistryImpl.class;

            Field constantMetaClassesField = mcriClass.getDeclaredField("constantMetaClasses");
            constantMetaClassesField.setAccessible(true);

            Field constantMetaClassCountField = mcriClass.getDeclaredField("constantMetaClassCount");
            constantMetaClassCountField.setAccessible(true);

            // Retrieve the object to be manipulate
            Class<GroovySystem> gsClass = GroovySystem.class;
            Field metaClassRegistryField = gsClass.getDeclaredField("META_CLASS_REGISTRY");
            metaClassRegistryField.setAccessible(true);
            MetaClassRegistryImpl mcri = (MetaClassRegistryImpl) metaClassRegistryField.get(null);
            ConcurrentReaderHashMap constantMetaClasses = (ConcurrentReaderHashMap) constantMetaClassesField.get(mcri);

            // Clears the map
            constantMetaClasses.clear();
            constantMetaClassCountField.set(mcri, 0);
            
            return true;
        }  catch (Exception e) {
            //Log this exception. It indicates that the code changed.
            return false;
        }
}

This code is obviously not thread-safe.
So users must ensure that it is safe to use it before doing so.

Show
Vinícius Godoy de Mendonça added a comment - For those who have the same problem of me, there's a dangerous, yet usefull workaround. You can use reflection to manually clear the groovy class cache. This is sensitive to groovy code changes, and should only be used while the listener solution is not implemented.
protected boolean clearGroovyClassesCache() {
        try {
            // Retrieve the fields we will manipulate
            Class<MetaClassRegistryImpl> mcriClass = MetaClassRegistryImpl.class;

            Field constantMetaClassesField = mcriClass.getDeclaredField("constantMetaClasses");
            constantMetaClassesField.setAccessible(true);

            Field constantMetaClassCountField = mcriClass.getDeclaredField("constantMetaClassCount");
            constantMetaClassCountField.setAccessible(true);

            // Retrieve the object to be manipulate
            Class<GroovySystem> gsClass = GroovySystem.class;
            Field metaClassRegistryField = gsClass.getDeclaredField("META_CLASS_REGISTRY");
            metaClassRegistryField.setAccessible(true);
            MetaClassRegistryImpl mcri = (MetaClassRegistryImpl) metaClassRegistryField.get(null);
            ConcurrentReaderHashMap constantMetaClasses = (ConcurrentReaderHashMap) constantMetaClassesField.get(mcri);

            // Clears the map
            constantMetaClasses.clear();
            constantMetaClassCountField.set(mcri, 0);
            
            return true;
        }  catch (Exception e) {
            //Log this exception. It indicates that the code changed.
            return false;
        }
}
This code is obviously not thread-safe. So users must ensure that it is safe to use it before doing so.
Hide
blackdrag blackdrag added a comment -

once I have my internet connection back, you will get the listener and a thread safe way of iterating them too.

Show
blackdrag blackdrag added a comment - once I have my internet connection back, you will get the listener and a thread safe way of iterating them too.
Hide
blackdrag blackdrag added a comment -

I added an iterator that you can use, and I added listener that you could use as well.

Show
blackdrag blackdrag added a comment - I added an iterator that you can use, and I added listener that you could use as well.

People

Vote (0)
Watch (2)

Dates

  • Created:
    Updated:
    Resolved: