jatetoolkit - issue #1

Out of memory problem


Posted on Jun 12, 2012 by Grumpy Horse

What steps will reproduce the problem?

java -Xmx1024m -classpath \
/Users/sarnobat/trash/jatetoolkit-read-only/dist/:\
/Users/sarnobat/trash/jatetoolkit-read-only/libs/apache-log4j-1.2.15/log4j-1.2.15.jar:\
/Users/sarnobat/trash/jatetoolkit-read-only/libs/apache-opennlp-1.51/jwnl-1.3.3.jar:\
/Users/sarnobat/trash/jatetoolkit-read-only/libs/apache-opennlp-1.51/opennlp-maxent-3.0.1-incubating.jar:\
/Users/sarnobat/trash/jatetoolkit-read-only/libs/apache-opennlp-1.51/opennlp-tools-1.5.1-incubating.jar:\
/Users/sarnobat/trash/jatetoolkit-read-only/libs/dragon/dragontool.jar:\
/Users/sarnobat/trash/jatetoolkit-read-only/libs/hsqldb2.2.3/hsqldb.jar:\
/Users/sarnobat/trash/jatetoolkit-read-only/libs/hsqldb2.2.3/sqltool.jar:\
/Users/sarnobat/trash/jatetoolkit-read-only/libs/wit-commons/wit-commons.jar: \
uk.ac.shef.dcs.oak.jate.test.AlgorithmTester \
/Users/sarnobat/trash/jatetoolkit-read-only/nlp_resources/ test/example/ test/output

What is the expected output? What do you see instead? Don't know.

What version of the product are you using? On what operating system? trunk

Please provide any additional information below.

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2734)
    at java.util.ArrayList.ensureCapacity(ArrayList.java:167)
    at java.util.ArrayList.add(ArrayList.java:351)
    at uk.ac.shef.dcs.oak.jate.core.npextractor.NGramExtractor.getNGram(NGramExtractor.java:123)
    at uk.ac.shef.dcs.oak.jate.core.npextractor.NGramExtractor.extract(NGramExtractor.java:67)
    at uk.ac.shef.dcs.oak.jate.core.npextractor.NGramExtractor.extract(NGramExtractor.java:49)
    at uk.ac.shef.dcs.oak.jate.core.feature.indexer.GlobalIndexBuilderMem.build(GlobalIndexBuilderMem.java:53)
    at uk.ac.shef.dcs.oak.jate.test.AlgorithmTester.main(AlgorithmTester.java:83)

Comment #1

Posted on Jun 12, 2012 by Happy Dog

Hi there

This is likely due to the large number of candidate terms extracted by the n-gram extractor; perhaps 1 GB of memory isn't enough. Can you try one thing:

In AlgorithmTester, lines 70-76 are:

//Three CandidateTermExtractors are implemented:
//1. An OpenNLP noun phrase extractor that extracts noun phrases as candidate terms
//CandidateTermExtractor npextractor = new NounPhraseExtractorOpenNLP(stop, lemmatizer);
//2. A generic N-gram extractor that extracts n-grams (n defaults to 5, see the property file)
CandidateTermExtractor npextractor = new NGramExtractor(stop, lemmatizer);
//3. A word extractor that extracts single words as candidate terms.
//CandidateTermExtractor wordextractor = new WordExtractor(stop, lemmatizer);

Disable the NGramExtractor and use the noun phrase extractor instead, i.e., option 1; see the sketch below.
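For concreteness, the toggled snippet would look roughly like this (same variable names as the excerpt above; the exact surrounding lines in your working copy may differ slightly):

//1. An OpenNLP noun phrase extractor that extracts noun phrases as candidate terms
CandidateTermExtractor npextractor = new NounPhraseExtractorOpenNLP(stop, lemmatizer);
//2. The generic N-gram extractor, now commented out to avoid the flood of n-gram candidates
//CandidateTermExtractor npextractor = new NGramExtractor(stop, lemmatizer);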

If that fixes the problem, the cause is likely insufficient allocated memory for the n-gram extractor.
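If you do need n-gram candidates, the other knob is the heap limit in the original command; a sketch (2048m is only a guess, and the amount actually required will depend on the corpus size):

java -Xmx2048m -classpath <same classpath as in the reproduce step> \
uk.ac.shef.dcs.oak.jate.test.AlgorithmTester \
/Users/sarnobat/trash/jatetoolkit-read-only/nlp_resources/ test/example/ test/output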

Comment #2

Posted on Jul 25, 2013 by Happy Dog

Issue closed

Status: Done

Labels: Type-Defect, Priority-Medium