What steps will reproduce the problem?
1. TestFrequency counter for verb phrases
2. I've created a class analogous to NounPhraseExtractorOpenNLP and changed the tag strings to B-VP and I-VP
3. The algorithm still counts/extracts the noun phrases
What is the expected output? What do you see instead? Verb Phrase + counter
What version of the product are you using? On what operating system? 1.11
Please provide any additional information below.
What properties must be further altered so that only verb phrases are counted? The OpenNLP parser supports this type of annotation.
Comment #1
Posted on Sep 25, 2013 by Happy Dog
Comment deleted
Comment #2
Posted on Sep 25, 2013 by Happy Dog
Hi, JATE uses OpenNLP 1.5.1, and verb phrases are a little tricky to handle. You are right to look at "B-NP" and "I-NP" in the "chunkNP" method of the "NounPhraseExtractorOpenNLP" class, but I think you need to write a separate method that implements a slightly different process.
Example:
Tokens = They have replaceable teeth .
Chunker output = B-NP,B-VP,B-NP,I-NP,O
Tokens = Humans kill around 26 to 73 million sharks every year ...
Chunker output = B-NP,B-VP,B-ADVP,B-NP,B-PP,B-NP,I-NP,I-NP,B-NP
As you can see, B-VP identifies the beginning of a VP, but no "I-VP" tags follow it to mark the "inside" of the VP. What follows instead are noun phrases or adverb/prepositional phrases, so your code needs to handle these cases.
This will be added in the next version of this tool.
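For anyone who needs this before the next release, here is a minimal sketch of the separate method suggested above. It is not JATE code: the class name VerbPhraseSketch, the method extractVerbPhrases and the absorbFollowing flag are all hypothetical, and it works on plain token/tag arrays rather than calling OpenNLP. With absorbFollowing set to false it keeps only B-VP/I-VP tokens. With true it assumes a verb phrase also takes in the chunks that follow it (NP, ADVP, PP and so on) up to the next "O" or "B-VP" tag, which is only one possible reading of the examples above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JATE or OpenNLP.
final class VerbPhraseSketch {

    /**
     * Collects verb phrases from parallel token and chunk-tag arrays.
     *
     * absorbFollowing = false: a phrase is a B-VP token plus any I-VP tokens after it.
     * absorbFollowing = true:  (assumption) the phrase also absorbs later chunks
     *                          until an "O" tag or the next "B-VP" tag.
     */
    static List<String> extractVerbPhrases(String[] tokens, String[] tags,
                                           boolean absorbFollowing) {
        if (tokens.length != tags.length) {
            throw new IllegalArgumentException("tokens and tags must have the same length");
        }
        List<String> phrases = new ArrayList<>();
        StringBuilder current = null; // the verb phrase being built, if any

        for (int i = 0; i < tokens.length; i++) {
            String tag = tags[i];
            if (tag.equals("B-VP")) {
                // A new verb phrase starts; close the previous one first.
                if (current != null) {
                    phrases.add(current.toString());
                }
                current = new StringBuilder(tokens[i]);
            } else if (current != null
                    && (tag.equals("I-VP") || (absorbFollowing && !tag.equals("O")))) {
                // Continue the open verb phrase.
                current.append(' ').append(tokens[i]);
            } else if (current != null) {
                // Any other tag ends the open verb phrase.
                phrases.add(current.toString());
                current = null;
            }
        }
        if (current != null) {
            phrases.add(current.toString());
        }
        return phrases;
    }
}
```

For the first example above ("They have replaceable teeth ."), the strict mode yields "have" and the absorbing mode yields "have replaceable teeth". The resulting strings could then be fed to the frequency counter in place of the noun phrases.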
Status: Accepted
Labels:
Type-Defect
Priority-Medium