Details
Description
Each DescribeFeatureType leaks small amounts of memory. Using a VM configured to have only 64M or memory and with the default data dir configuration the following test:
ab -n 10000 "http://localhost:8080/geoserver/wfs?version=1.0.0&request=DescribeFeatureType&typename=sf%3Aarchsites"
which tries to run that DFT request 10.000 times starts failing with OOM errors between the 4000th and 5000th request.
Given the nature of the requests the schema cache is filling up, YourKit reports memory being filled with org.eclipse.xsd.impl.XSDElementDeclarationImpl objects
-
- data.tar.bz2
- 09/May/10 11:18 AM
- 78 kB
- Andrea Aime
-
- dft-archsites.sh
- 09/May/10 11:18 AM
- 0.8 kB
- Andrea Aime
-
- FeatureTypeSchemaBuilder.java.patch
- 11/May/10 10:46 AM
- 6 kB
- Vitali Diatchkov
-
- GEOS-3534.patch
- 11/May/10 7:42 PM
- 2 kB
- Justin Deoliveira
-
- insert-archsites.sh
- 09/May/10 11:18 AM
- 2 kB
- Andrea Aime
Issue Links
Activity
Interesting, seems small enough to be applicable to 2.0.x as well (Justin has a more definitive fix in the pipe but it comes with a much larger set of changes so it's not applicable to a stable series).
Justin, what do you think?
Nice work Vitali!! I searched for quite some time and could not find the source of the leak. I am all for applying the patch with a few caveats.
(a) It seems there some other stuff in the patch that is not relevant? Or at least changes some behavior. Talking about the initial block that comments out some stuff.
(b) Your ide is set up to use tabs instead of spaces which makes the patch hard to read.
(c) Some lines that are changed are simply reformatting. If you can avoid these it would be appreciated.
(d) Your patch mentions some side affects? Can you explain?
If you could clean the patch up a bit it would be most appreciated and we can get it applied asap.
Hi Vitali, noticed you were not watching this issue so I guess you haven't seen Justin's comment. I've added you to the watch list now
Hi,
I've tried the patch as-is (forgetting about the formatting issues), unfortunately it did not provide the expected benefits, as far as I can tell the leak is still there.
I've used Luca Morandini's scripts to setup a data directory with 600 copies of a small shapefiles (attached) and then started running a loop of unqualified DescribeFeatureType against it (so that all feature types are loaded into the schema cache) and I see the memory used by Xerces and XSD related classes grow up at each call by a few megabytes (having that many feature types helps in that respect, with just one it would grow by a few kb at a time).
To try it out on your machine just edit the script with the right process id (easily found with jps) and geoserver url.
If you want you can also try the transaction loop (insert-archsites.sh), it provides the same results, but the leak seems to build up faster.
As far as I can see the DFT related leak remains whether one sets up a big feature type cache, or not.
Which brings us back to square one. My improvements on managing schema objects are more or less blocked due to the app schema problems discussed on the list. I guess I could potentially just implement the fixes for the simple case... it would make the implementation a lot easier but I am not too keen on having two paths for creating schema objects.
I have updated patch. There were actually even 2 places where inverted references were cached in a singlton XSDSchema object. The first one was:
element.setSubstitutionGroupAffiliation(schema.resolveElementDeclaration(gmlNamespace,
substitutionGroup));
and the second one:
schema.getContents().add(element);
In current patch both cases are "hacked" and cached XSDSchema doesn't collect anymore references to temporary XSDElementDeclarationImpl objects.
Please format patch , I have too different settings in Eclipse IDE, probably the length of a line.
@Vitali: Aside from formatting issues this patch contains some blatant things that will cause errors, but I have no idea if they are relevant to the patch. You can't really expect someone to effectively use it as is.
That said I took a shot at pulling out the relevant parts and attached a new patch (GEOS-3534.patch) and had some success with it. I ran the transaction case and the dft case and am seeing no longer a leak. Although I may be running into a false positive I am not sure. Andrea: If you could try it out that would be great.
I've just tried both of the latest patches (from Justin and Vitali) against the tests that I've attached and the leak still seems to be there.
But... I'm trying them on trunk, Justin, might that affect the results?
Did anyone besides me try out the tests? Do you see the memory used by XSD and Xerces reach a stable value and stay there? The scripts use jmap to get a histogram of the memory usage by live objects and to compute the total memory consuption by the objects contained in the in the xsd and xerces packages.
Admittedly both tests (transaction and global dft) are much heavier than the simple test that was suggestd in the bug description.
The way the tests work seems fine to me, but maybe you can double check them and assess whether we're actually testing the rigth thing
My patch fixes the problem with caching of temporary XSDElementDeclarationImpl objects created during DescribeFeatureType request. I run GeoServer, connect java VisualVM and profile memory. Well, I remember to explicitely call GC() from that tool when a number of requests are issued (just from a browser manually) and after that the number of live XSDElementDeclarationImpl objects drops to the same value - no leaks in this particular case.
But the problem of caching inverted references in XSDSchema framework can appear in other places of GeoServer codebase..
So, I took a step back and run again the original test again 2.0.x + Justin's patch. The patch does indeed solve the problem, so +1 on committing as it's an improvement.
I'll make more test on 2.0.x and report on my findings
Some more information.
I've modified the transaction loop to work against the release data dir and found no memory leaks either.
So at least one problem is solved.
Then I took the data dir with 600 feature types, removed the last element from global.xml to make it load on 2.0.x, and started running the loops again.
In those the memory leak is there. Where's the difference? The 600 types data directory triggers the flushing of the WFS schema, the release one does not.
The flushing of the wfs schemas happen when a datastore is disposed.
Fine, so I went in ResourcePool and modified the data store cache and the feature type cache to host 1000 entries. No more flushing.
Run the loop of transaction inserts, no leaks -> there is one leak due to us throwing away the wfs schemas
Run the loop of whole server DescribeFeatureType, and there is still a leak (2MB per DFT call more or less) -> so there is yet another leak that affects only DFT it seems
Justin, it seems that with your patch we're one down, two to go ![]()
As another interesting bit, running a loop of a DFT against a single feature type shows no leak either -> it's the big baby that goes against all the feature types that seems to be triggering the last leak I was seeing
Confirmed, I just run a loop of 600 different, but type specific, DFT against the GS modified to have a big enough resource pool, no leak.
The leak occurs only if I do an unqualified DFT
Confirmed there is still a memory leak.
@Vitali: Can you point me at the eclipse xsd sources where the back reference to the created elements lives? I have looked through and can't seem to find it.
There is a number of places in codebase where a call is made:
element.setSubstitutionGroupAffiliation(element2);
It causes in XSD internals an inverted caching of element in element2 in its a substitution group content list.
If element2 is an XSDElementDeclaration instance resolved from a XSDSchema (http://www.opengis.net/gml) like "gml:_Feature" then it caches all XSDElementDeclaration objects in a way described above. If this XSDSchema is a singleton cached object then it infinitely collects XSDElementDeclaration elements substituting "gml:_Feature".
In Eclipse put a breakpoint on any element.setSubstitutionGroupAffiliation(element2) line and look into element2's substitutionGroup list. This is a source of memory leaks in GeoServer (at least that one had been detected).
Make search for setSubstitutionGroupAffiliation method to find several places in GeoServer codebase.
It seems the scenario when application XSDSchema object is created for the temporary purposes and contains internally XSDImport with a singleton cached XSDSchema (like GML schema in GeoServer) doesn't fit well into the architecture of XSD library. The workaround is to use hacks like in FeatureTypeSchemaBuilder buildSchemaContent() or somehow introduce "smart" destroying of temporary XSDSchema referencing cached XSDSchema objects in the end of request lifecycle to clean up this invertion of elements in substitutionGroup/substituionGroupAffiliation.
In some particular cases even an order of calls makes sense. For example:
schema.getContents().add(element); element.setSubstitutionGroupAffiliation(_featureElementClone);
does not cause invertion of references but
element.setSubstitutionGroupAffiliation(_featureElementClone);
schema.getContents().add(element);
does (see patch source code).
FYI, i have committed a fix that I hope will resolve this issue. I am just waiting to hear back from a user who is going to be testing the fix shortly.
Hi there. Noticed the commends and wondering if we need some more managements on the _Feature substitution group. E.g., what happens when the feature type changes structure because of someone changed a sql view definition or added a schema.xml file? In general, should we attach a listener to the catalog so that the registered substitution groups are kept in synch with the feature types evolution?
Yeah, it would probably be a good idea. However the link is only by name. So as long as the qualified name of the feature type remains the same there won't be any problem. However if the qualified name changes then a stale name will be left in the _Feature substitution group.
Going to resolve this one for now. Customer has put the patch through some rigorous testing with no leak.
Mass closing all issues that have been in "resolved" state for 2 months or more without any feedback or update
Memory leak is because of internal implementation in Elcipse xsd framework and logic dealing with XML schemas in GeoServer.
FeatureTypeSchemaBuilder is a singleton component caching GML schema XSDSchema object. At the same time many XSDSchema and XSDElementDeclaration objects are created only for the time of the particular standalone request like DescribeFeatureType request.
In buildSchemaContent(..) method we bind together temporary XSDElementDeclaration object with permanently cached XSDElementDeclaration from gmlSchema related to "gml:_Fetaure" by calling the following:
element.setSubstitutionGroupAffiliation(schema.resolveElementDeclaration(gmlNamespace, substitutionGroup));
This internally causes an inverted caching of element in substitutionGroup property of an object that is returned by schema.resolveElementDeclaration(gmlNamespace, substitutionGroup).
The patch demonstrating a workaround to this problem is attached. I do not know deeply a GeoServer codebase how this hack can affect other use cases then just DescribeFeatureType request, but I have tested various requests and everything works well and memory is not leaked anymore.
We need to have "gml:_Feature" element being set in substitutionGroupAffiliation property of a target element for printing a substituionGroup attribute in output XML schema:
... <xsd:element name="a4224al" substitutionGroup="gml:_Feature" type="paikkaoppi:a4224alType" /> ..