Key: CC-426
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Jason Yip
Reporter: Daniel Patterson
Votes: 2
Watchers: 2
CruiseControl

Leaking SAXParsers causes OUTOFMEMORY errors

Created: 16/Mar/06 03:32 PM   Updated: 20/Nov/06 01:05 PM
Component/s: Core Application
Affects Version/s: 2.4.1
Fix Version/s: 2.5

Original Estimate: Unknown Remaining Estimate: Unknown Time Spent: Unknown
File Attachments: None
Image Attachments: heapdump-screenshot.png (94 kb)
Environment:
Linux buildbox 2.4.21-20.ELsmp #1 SMP Wed Aug 18 20:46:40 EDT 2004 i686 i686 i386 GNU/Linux

Red Hat Enterprise Linux AS release 3 (Taroon Update 3)

java version "1.4.2"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2)
Classic VM (build 1.4.2, J2RE 1.4.2 IBM build cxia321420-20040626 (JIT enabled: jitc))

Running cruisecontrol 2.4.1 with:
/opt/WebSphere/AppServer/java/bin/java -Xms128m -Xmx1024m -Djava.awt.headless=true -cp /etc/cruisecontrol:/opt/WebSphere/AppServer/java/lib/tools.jar:/opt/cruisecontrol/lib/cruisecontrol.jar:/opt/cruisecontrol/lib/log4j.jar:/opt/cruisecontrol/lib/jdom.jar:/opt/cruisecontrol/lib/ant.jar:/opt/cruisecontrol/lib/ant-launcher.jar:/opt/cruisecontrol/lib/jasper-compiler.jar:/opt/cruisecontrol/lib/jasper-runtime.jar:/opt/cruisecontrol/lib/xercesImpl-2.7.0.jar:/opt/cruisecontrol/lib/xml-apis-2.7.0.jar:/opt/cruisecontrol/lib/xmlrpc-2.0.1.jar:/opt/cruisecontrol/lib/xalan-2.6.0.jar:/opt/cruisecontrol/lib/jakarta-oro-2.0.3.jar:/opt/cruisecontrol/lib/mail.jar:/opt/cruisecontrol/lib/junit.jar:/opt/cruisecontrol/lib/activation.jar:/opt/cruisecontrol/lib/commons-net-1.1.0.jar:/opt/cruisecontrol/lib/starteam-sdk.jar:/opt/cruisecontrol/lib/mx4j.jar:/opt/cruisecontrol/lib/mx4j-tools.jar:/opt/cruisecontrol/lib/mx4j-remote.jar:/opt/cruisecontrol/lib/smack.jar:/opt/cruisecontrol/lib/comm.jar:/opt/cruisecontrol/lib/x10.jar:/opt/cruisecontrol/lib/fast-md5.jar:/opt/cruisecontrol/lib/javax.servlet.jar:/opt/cruisecontrol/lib/org.mortbay.jetty.jar:/opt/cruisecontrol/lib/commons-logging.jar:/opt/cruisecontrol/lib/commons-el.jar:/opt/cruisecontrol:. -Djavax.management.builder.initial=mx4j.server.MX4JMBeanServerBuilder CruiseControlWithJetty -webport 8080 -cchome /opt/cruisecontrol -configfile /etc/cruisecontrol/config.xml -jmxport 8000


Description
It looks like there is a memory leak buried deep somewhere in the XML parsing code. I'm not 100% sure where. Something
seems to be leaking SAXParsers (more detail below).

We're running with -Xmx1024m and after about 48 hours, we receive OUTOFMEMORY errors. I have JDK core and heap
dumps that I can supply, but they're big (20MB bzip2'd for the heap dump). I'll attach the javacore file produced by the IBM
JDK, but I'm not sure it helps.

A quick summary of what the heap dump shows: at the point where the JDK dies, the following objects are using the
most memory. I'll start at the highest point in the graph where we see a cruisecontrol class:

Object : net/sourceforge/cruisecontrol/jmx/CruiseControlControllerJMXAdaptor
Number of children : 4
Owner address: 0x10365ca8
Owner object: array of java/lang/Object
Size : 32
Total size : 1,020,576,480

From this object, there is a pretty clear path that leads to a Hashtable that is full of 50MB objects. The full path
is quite long, so I'll skip most of it. Following up the heap graph from the offending Hashtable, the first
interesting-looking class is:

Object : org/apache/xml/utils/XMLReaderManager
Number of children : 2
Owner address: 0x1fe759a8
Owner object: org/apache/xml/dtm/ref/DTMManagerDefault
Size : 24
Total size : 1,020,147,456

which contains the offending hashtable:

Address : 0x1277b1e0
Object : java/util/Hashtable
Number of children : 1
Owner address: 0x1277b210
Owner object: org/apache/xml/utils/XMLReaderManager
Size : 48
Total size : 1,020,147,416

This hashtable contains a single child:
Object : array of java/util/Hashtable$Entry
Number of children : 86
Owner address: 0x1277b1e0
Owner object: java/util/Hashtable
Size : 784
Total size : 1,020,147,368

And each of the 86 children looks something like this:

Address : 0x19dc7d78
Object : java/util/Hashtable$Entry
Number of children : 3
Owner address: 0x1f5c0e28
Owner object: array of java/util/Hashtable$Entry
Size : 32
Total size : 50,316,080

and each of these entries contains:

Object : org/apache/xerces/parsers/SAXParser
Number of children : 12
Owner address: 0x19dc7d78
Owner object: java/util/Hashtable$Entry
Size : 96
Total size : 42,097,856

I can provide the heapdump on request (I'm happy to attach it, if a 20MB attachment isn't a problem).

Comments
Daniel Patterson [16/Mar/06 03:40 PM]
Screenshot of the interesting components of the heap dump. Hope this helps.

Daniel Patterson [20/Mar/06 04:15 PM]
We were using the <htmlemail> publisher as part of this project. Yesterday, I turned that off, and the OUTOFMEMORY problems that we get every day seem to have temporarily abated. Memory usage seems to be growing, but a lot slower. I'll update this defect further in a couple of days.

Daniel Patterson [23/Mar/06 07:58 PM]
Ok, after a couple of days running without <htmlemail>, it appears that we're no longer leaking. Certainly, memory usage has stabilised well below our maximum heap size (currently around 294 of 1024 MB).
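
For anyone wanting to track the same "used of max" figure over time, here is a minimal sketch using only the standard java.lang.Runtime API. The class name HeapUsage is made up for illustration, and the numbers only describe the JVM the code runs in, so to watch CruiseControl itself it would have to be called from inside the CC process (or the equivalent values read over JMX).

```java
public class HeapUsage {
    private static final long MB = 1024L * 1024L;

    /** Heap currently in use by this JVM, in whole megabytes. */
    static long usedMegabytes() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MB;
    }

    /** Upper bound the heap may grow to (the -Xmx ceiling), in whole megabytes. */
    static long maxMegabytes() {
        return Runtime.getRuntime().maxMemory() / MB;
    }

    /** One-line summary in the same "used of max MB" form quoted above. */
    static String summary() {
        return usedMegabytes() + " of " + maxMegabytes() + " MB";
    }

    public static void main(String[] args) {
        System.out.println("Heap in use: " + summary());
    }
}
```

Logging this after each build gives a rough trend line; a figure that keeps climbing after garbage collection is the symptom described in this issue.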

Dan Rollo [24/Mar/06 07:34 AM]
I did some memory profiling (using NetBeans Profiler) to debug memory problems with contrib/distributed (CCDist) a while back, and I found the same behavior: removing the HTMLEmail publisher significantly improved memory usage. Small spikes occurred during the build lifecycle only while HTMLEmail was being processed, and some of that memory was never collected after each spike. (CCDist tends to perform many more builds than non-Dist, and all the publishing work is done on the Master CC instance, so the memory issue shows up quickly and regularly.)

Have you tried using a newer version of the XML libs to see if that improves the situation?

Dan

Dan Rollo [06/Apr/06 02:09 PM]
FWIW, I finally had a chance to try upgrading the xml libs, and had some unexpected results:

I upgraded to version 2.8.0 of "xercesImpl" and "xml-apis". All CC unit tests passed, but after a week or so running, I saw no improvement in the memory leak. (Upgraded all copies in /lib, /main/lib, /reporting/jsp/lib, and even my tomcat endorsed dir for good measure).


When I also upgraded the xalan jar to v2.7.0, I got the following unit test failure in the CC main tree.

    [junit] Testcase: testTransformWithParameter(net.sourceforge.cruisecontrol.publishers.HTMLEmailPublisherTest): Caused an ERROR
    [junit] org/apache/xml/serializer/SerializerTrace
    [junit] java.lang.NoClassDefFoundError: org/apache/xml/serializer/SerializerTrace
    [junit] at java.lang.ClassLoader.defineClass1(Native Method)
    [junit] at java.lang.ClassLoader.defineClass(ClassLoader.java:620)
    [junit] at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:124)
    [junit] at java.net.URLClassLoader.defineClass(URLClassLoader.java:260)
    [junit] at java.net.URLClassLoader.access$100(URLClassLoader.java:56)
    [junit] at java.net.URLClassLoader$1.run(URLClassLoader.java:195)
    [junit] at java.security.AccessController.doPrivileged(Native Method)
    [junit] at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    [junit] at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    [junit] at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:268)
    [junit] at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
    [junit] at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
    [junit] at org.apache.xalan.processor.ProcessorStylesheetElement.getStylesheetRoot(ProcessorStylesheetElement.java:121)
    [junit] at org.apache.xalan.processor.ProcessorStylesheetElement.startElement(ProcessorStylesheetElement.java:72)
    [junit] at org.apache.xalan.processor.StylesheetHandler.startElement(StylesheetHandler.java:623)
    [junit] at org.apache.xerces.parsers.AbstractSAXParser.startElement(Unknown Source)
    [junit] at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
    [junit] at org.apache.xerces.impl.XMLNSDocumentScannerImpl$NSContentDispatcher.scanRootElementHook(Unknown Source)
    [junit] at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
    [junit] at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    [junit] at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    [junit] at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    [junit] at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    [junit] at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    [junit] at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    [junit] at org.apache.xalan.processor.TransformerFactoryImpl.newTemplates(TransformerFactoryImpl.java:920)
    [junit] at org.apache.xalan.processor.TransformerFactoryImpl.newTransformer(TransformerFactoryImpl.java:774)
    [junit] at net.sourceforge.cruisecontrol.publishers.HTMLEmailPublisher.transformFile(HTMLEmailPublisher.java:444)
    [junit] at net.sourceforge.cruisecontrol.publishers.HTMLEmailPublisherTest.testTransformWithParameter(HTMLEmailPublisherTest.java:259)



BUILD FAILED


I found that xalan had separated some classes into a new jar: serializer.jar, so I then added serializer-2.7.0.jar to various paths, and the CC Main unit tests were happy.
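
A quick way to confirm whether a class such as org.apache.xml.serializer.SerializerTrace is visible on a given classpath, and which jar it comes from, is sketched below. It uses only standard JDK calls (Class.forName and ProtectionDomain.getCodeSource); the class name ClasspathCheck and the method locate are hypothetical names for illustration.

```java
import java.security.CodeSource;

public class ClasspathCheck {
    /**
     * Returns a description of where the named class would be loaded from,
     * or null if it cannot be loaded from this classpath.
     * The class is located but not initialized.
     */
    static String locate(String className) {
        try {
            Class<?> c = Class.forName(className, false,
                    ClasspathCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // JDK-provided classes usually report no code source.
            return src == null
                    ? "(bootstrap or unknown source)"
                    : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException e) {
            return null;
        } catch (LinkageError e) {
            // Covers NoClassDefFoundError when a dependency of the class is missing.
            return null;
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0
                ? args[0]
                : "org.apache.xml.serializer.SerializerTrace";
        String where = locate(name);
        System.out.println(name + " -> " + (where == null ? "NOT FOUND" : where));
    }
}
```

Run with the same -cp that CruiseControl or the unit tests use; a NOT FOUND result for SerializerTrace would point to the missing serializer jar.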

I will attempt running with this setup for a while and report what happens regarding any memory leak changes.

Dan

Daniel Patterson [06/Apr/06 04:51 PM]
Just a quick note to possibly narrow down the problem. Looking at the heap dump, it looks
like the most interesting thing closest to the big collection of SAXParsers is an XMLReaderManager:

http://xml.apache.org/xalan-j/apidocs/org/apache/xml/utils/XMLReaderManager.html

"Creates XMLReader objects and caches them for re-use. This class follows the singleton pattern."

Looks like someone is calling "getXMLReader()" without ever calling "releaseXMLReader(XMLReader reader)"

If I get time, I'll try and hunt down where.
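
To illustrate the suspected pattern, here is a small self-contained sketch of a pool that tracks every object it hands out. It is not Xalan's implementation: Pool, Reader, acquire, release and trackedCount are hypothetical stand-ins for XMLReaderManager, the SAX parser, getXMLReader() and releaseXMLReader(), and the count of 86 is borrowed from the heap dump purely as an example.

```java
import java.util.HashMap;
import java.util.Map;

public class ReaderPoolSketch {
    /** Hypothetical stand-in for an expensive parser object. */
    static final class Reader { }

    /** Hypothetical pool that remembers every reader currently handed out. */
    static final class Pool {
        private final Map<Reader, Boolean> inUse = new HashMap<Reader, Boolean>();
        private Reader idle; // one cached reader, to keep the sketch short

        synchronized Reader acquire() {
            Reader r = (idle != null) ? idle : new Reader();
            idle = null;
            inUse.put(r, Boolean.TRUE); // strong reference held until release
            return r;
        }

        synchronized void release(Reader r) {
            if (inUse.remove(r) != null) {
                idle = r; // make it available for re-use
            }
        }

        synchronized int trackedCount() {
            return inUse.size();
        }
    }

    /** Acquires n readers and never releases them: the table keeps growing. */
    static int leakyRuns(Pool pool, int n) {
        for (int i = 0; i < n; i++) {
            pool.acquire();
        }
        return pool.trackedCount();
    }

    /** Acquires n readers and always releases them in a finally block. */
    static int carefulRuns(Pool pool, int n) {
        for (int i = 0; i < n; i++) {
            Reader r = pool.acquire();
            try {
                // parse or transform with r here
            } finally {
                pool.release(r);
            }
        }
        return pool.trackedCount();
    }

    public static void main(String[] args) {
        System.out.println("leaky:   " + leakyRuns(new Pool(), 86));   // prints 86
        System.out.println("careful: " + carefulRuns(new Pool(), 86)); // prints 0
    }
}
```

If each tracked reader also holds on to a large parsed document, a table of entries that is never emptied would look much like the Hashtable of SAXParsers in the heap dump.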

Dan Rollo [14/Apr/06 02:02 PM]
Finally some good news about this SAX/HTMLEmailPublisher memory leak: Updated xml jars appear to have fixed the problem.

I replaced the following xml jars (xercesImpl, xml-apis) with:
xercesImpl-2.8.0.jar
xml-apis-2.8.0.jar

in lib, main/lib, and reporting/jsp/lib (and of course updated all paths to use the new jars), but these jars alone didn't solve the problem. After a few days, CC was still eating upwards of 800 MB of RAM.

I then replaced xalan-2.6.0.jar with:
xalan-2.7.0.jar
AND
serializer-2.7.0.jar (the new xalan has separated some classes CC needs into this new jar).

in main/lib and reporting/jsp/lib. This appears to have fixed the leak: after a week of continuous builds, CC peaked at 250 MB and has been averaging around 150 MB of usage.

Also, I haven't seen any other side effects using the new jars.

If it is useful, I can put together a diff of all the paths I had to change, but it's pretty much a global search and replace (except for adding the new serializer-2.7.0.jar wherever xalan.jar was used).


Dan

Jeff Jensen [17/Apr/06 11:17 AM]
This seems like quite a find. Can we get this one into 2.5?

We have CC OOMing 1-2 times per week, and are also using HTMLEmail.

Jeffrey Fredrick [19/Apr/06 09:01 AM]
this will make 2.5 for sure.

Jeffrey Fredrick [24/Apr/06 11:28 PM]
all jars and .bat/.sh and build.xml files updated appropriately (I think).