GettingStarted
Getting Started:
Prepare a browse index: A browse index is a Lucene index with field description information. Bobo Browse does not currently provide any extensive indexing wrappers on top of Lucene, so indexing with the standard Lucene API (http://lucene.apache.org/java/2_4_1/api/core/index.html) is sufficient. See Creating a Browse Index for details.
Search and Browse: The Bobo Browse Engine is released as a library; the API is published here. Familiarity with the Lucene API is helpful. Here are some of the basic concepts or objects to understand:
Example:
// define facet handlers
// color facet handler
SimpleFacetHandler colorHandler = new SimpleFacetHandler("color");
// category facet handler
SimpleFacetHandler categoryHandler = new SimpleFacetHandler("category");
List<FacetHandler> handlerList = Arrays.asList(new FacetHandler[]{colorHandler,categoryHandler});
// opening a lucene index
Directory idx = FSDirectory.open(new File("myidx"));
IndexReader reader = IndexReader.open(idx,true);
// decorate it with a bobo index reader
BoboIndexReader boboReader = BoboIndexReader.getInstance(reader,handlerList);
// creating a browse request
BrowseRequest br=new BrowseRequest();
br.setCount(10);
br.setOffset(0);
// add a selection
BrowseSelection sel=new BrowseSelection("color");
sel.addValue("red");
br.addSelection(sel);
// parse a query
QueryParser parser = new QueryParser("contents",new StandardAnalyzer(Version.LUCENE_CURRENT));
Query q=parser.parse("cool car");
br.setQuery(q);
// add the facet output specs
FacetSpec colorSpec = new FacetSpec();
colorSpec.setOrderBy(FacetSortSpec.OrderHitsDesc);
FacetSpec categorySpec = new FacetSpec();
categorySpec.setMinHitCount(2);
categorySpec.setOrderBy(FacetSortSpec.OrderHitsDesc);
br.setFacetSpec("color",colorSpec);
br.setFacetSpec("category",categorySpec);
// perform browse
Browsable browser=new BoboBrowser(boboReader);
BrowseResult result=browser.browse(br);
int totalHits = result.getNumHits();
BrowseHit[] hits = result.getHits();
Map<String,FacetAccessible> facetMap = result.getFacetMap();
FacetAccessible colorFacets = facetMap.get("color");
List<BrowseFacet> facetVals = colorFacets.getFacets();
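For readers new to faceted browsing, the following is a conceptual sketch of the kind of result the request above asks for: restrict the documents with a selection, count the values of another field over the remaining hits, drop values below a minimum hit count, and order by hits descending. It is plain Java over in-memory maps and does not use or reproduce any Bobo Browse or Lucene code. The class name FacetCountSketch, the method countFacet, and the sample documents are all hypothetical, and the text query ("cool car") is left out for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Conceptual sketch only: plain Java, no Bobo Browse or Lucene classes.
// All names here are hypothetical and exist only for illustration.
class FacetCountSketch {

    // Counts the values of facetField over the documents where
    // selField equals selValue, drops values with fewer than minHitCount
    // hits, and returns the rest ordered by descending hit count.
    static Map<String, Integer> countFacet(List<Map<String, String>> docs,
                                           String selField, String selValue,
                                           String facetField, int minHitCount) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map<String, String> doc : docs) {
            if (!selValue.equals(doc.get(selField))) {
                continue; // document is excluded by the selection
            }
            String value = doc.get(facetField);
            if (value != null) {
                counts.merge(value, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.removeIf(e -> e.getValue() < minHitCount);
        entries.sort(Map.Entry.<String, Integer>comparingByValue().reversed());

        Map<String, Integer> ordered = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : entries) {
            ordered.put(e.getKey(), e.getValue());
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = List.of(
            Map.of("color", "red", "category", "sedan"),
            Map.of("color", "red", "category", "sedan"),
            Map.of("color", "red", "category", "suv"),
            Map.of("color", "blue", "category", "sedan"));
        // Selection color=red, facet on category, minimum hit count 2.
        System.out.println(countFacet(docs, "color", "red", "category", 2)); // prints {sedan=2}
    }
}
```

In the real API the same roles are played by BrowseSelection (the selection), FacetSpec.setMinHitCount (the threshold) and FacetSortSpec.OrderHitsDesc (the ordering), as in the example above.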
I tried this code, but it only worked when I passed some FacetHandlers to BoboIndexReader.getInstance. Am I missing something?
That's correct. You either have to pass them in or have a bobo.spring file in the index define the facet handlers.
What would be the best way to display the results? Get the document ID out of every BrowseHit, then get the Lucene Document? When I tried to get the data directly out of BrowseHit with getField, it gave me a NullPointerException. I am using Lucene 2.9.0 and Bobo Browse 2.0.6.
Code sample:
BrowseHit[] hits = result.getHits();
for(int i=0;i<hits.length;++i) {
}
Is this the only way to display the results?
We don't currently support Lucene 2.9.0 (we do have a branch for the Lucene 2.9 migration, but it is still a work in progress).
One reason you are getting an NPE is that you are not requesting field values on the BrowseRequest. Call BrowseRequest.setFetchStoredFields(true).
We should probably throw a better exception.
Actually, if you have "color" already defined in a FacetHandler, you don't need to call setFetchStoredFields.
I updated the wiki with a better code snippet
Hi John, thanks for your quick reply. Your updated snippet and your explanation really got me 80% of the way there. However, I still have a little problem left. Since it's a longer question, I sent you an email about it. Your help is truly appreciated.
Does anyone know why I always get java.lang.OutOfMemoryError: Java heap space at the line BrowseResult result=browser.browse(br), with about 3-4 SimpleFacetHandler instances on a search over about 5,000,0000 documents?
Honestly guys, you need far more documentation here!
I am coming from the Solr corner, looking for a technology to implement a faceted browser in a standalone Java application. All I get when downloading the release is the sources jar file, and the online Javadoc isn't working...
Please help new users by providing a step-by-step tutorial on how to install and get started, along with more code examples and more conceptual information.
Map<String,FacetAccessible> facetMap = result.getFacetMap();
FacetAccessible colorFacets = facetMap.get("color");
List<BrowseFacet> facetVals = colorFacets.getFacets();
I tried this locally, but facetVals comes back null.
I got it. Before you call getFacets, you should call collectAll, as in the following code:
SimpleFacetHandler.SimpleFacetCountCollector colorFacets=(SimpleFacetHandler.SimpleFacetCountCollector)facetMap.get("color");
colorFacets.collectAll();
List<BrowseFacet> facetVals = colorFacets.getFacets();
Suppose I have 2 fields, both indexed without tokenization. One is datetime and the other is color. I want to know how color is distributed over the 12 months of 2009, for example how many distinct colors appeared in January, how many in February, and so on. It is like doing a distinct on color on top of a distinct on month. I only need the number of colors, not the specific colors, although getting the specific colors would be even better. John Wang, can bobo-browse do this?
When I try to specify addNotValue() on my BrowseSelection, I get an exception looking for some classes in the com.kamikaze.... package. Which version of the kamikaze jar file works best with the current version of Bobo? Thanks.
@yueming314: What you need can easily be done with bobo-browse. I don't read Chinese very well; if you can provide more information, I can help you get started.
@farshadk: With latest bobo release, you need Kamikaze-1.0.8, if you check out the release branch, it should be in the lib/ directory.
@dipl.inf.matthias.schmidt: Feel free to send questions to the discussion group. I'd be happy to help in any way. We are working on the documentation. Thanks for the input.
@finalfantasy22: What is your VM memory setting? For such a large index, you need to increase your vm heap size with -Xms and -Xmx options.
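As a quick way to confirm that the heap options actually took effect, a small check like the one below can be run with the same JVM flags as the application. This is a minimal sketch: the class name HeapCheck is hypothetical, and the flag values in the usage note are illustrative rather than a recommendation for any particular index size.

```java
// Minimal sketch: reports the maximum heap the running JVM may use.
// The class name is hypothetical; Runtime.maxMemory() is standard Java.
class HeapCheck {

    // Maximum heap size in MiB, as reported by the JVM.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap available to this JVM: " + maxHeapMiB() + " MiB");
    }
}
```

For example, running it as java -Xms1g -Xmx4g HeapCheck should report a value close to 4096 MiB; if it reports the JVM default instead, the options are not reaching the process that runs the browse.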
I am sorry. I thought you knew Chinese. Here is the case: there is a table-selling store with many tables. Each table has two fields, size and colour. I wonder how to figure out the number of colours for each size. Thank you!
Hello John, we are now analyzing some posters. For example, if there are 10 posters posted by 4 different authors, can bobo-browse rank these authors by how frequently they use certain key words?
@ yuweiming314, it can be done rather easily. Give a selection with size, e.g. size="small":
BrowseRequest req = new BrowseRequest();
BrowseSelection sel = new BrowseSelection("size");
sel.addValue("small");
req.addSelection(sel);
// get the color values
FacetSpec fspec = new FacetSpec();
fspec.setMaxCount(10); // get top 10 colors
req.setFacetSpec("color",fspec);
BrowseResult res = browser.browse(req);
The color values are available in the res object.
Hope this works for you.
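To make the expected answer concrete, here is a plain-Java sketch of the "number of colours for each size" computation over a small in-memory list. It does not use Bobo Browse; it only illustrates the result one would expect when the request above is repeated once per size value. The class name ColoursPerSize, the method coloursBySize, and the sample data are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Conceptual sketch only: plain Java, no Bobo Browse classes.
// All names and data are hypothetical.
class ColoursPerSize {

    // Each table is a {size, colour} pair. Returns, for every size,
    // the sorted set of distinct colours seen for that size.
    static Map<String, Set<String>> coloursBySize(List<String[]> tables) {
        Map<String, Set<String>> result = new TreeMap<>();
        for (String[] table : tables) {
            result.computeIfAbsent(table[0], k -> new TreeSet<>()).add(table[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> tables = Arrays.asList(
            new String[] {"small", "red"},
            new String[] {"small", "blue"},
            new String[] {"small", "red"},
            new String[] {"large", "red"});
        for (Map.Entry<String, Set<String>> e : coloursBySize(tables).entrySet()) {
            // The set size is the number of distinct colours for that size.
            System.out.println(e.getKey() + ": " + e.getValue().size() + " colour(s) " + e.getValue());
        }
    }
}
```

With Bobo, the per-size loop corresponds to issuing one request per size selection and reading the "color" facet from each result.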
Can you clarify what you mean by "some key words"?
I want to count how often the key word "flower" is used by each author in the posters. How can I get that? I will set up a Query that gathers all the posters containing the key word "flower". What I want to know is whether bobo-browse can analyze the frequency of the word "flower" per author.
I don't think you need Bobo to do this. Lucene supports term vectors out of the box, and you can just use those; e.g. when you create the Field object, turn on TermVector.
I get it but since I am not so good at Lucene, I will work hard on it~ Thank you for your patience and professional knowledge~
Hello,
I have a question regarding this example code and bobo-browse. Is there a way to access the document IDs or the underlying indexed document from the result?
The purpose of getting the indexed document is to show additional stored values to the user which are not facets.
thanks and best regards..
Having a BrowseResult, how can I get/calculate the top N hits for each of the BrowseHits?
Thanks, farshadk
Hello, john.wang
How many records can bobo-browse support? Thanks
and
QueryParser parser = new QueryParser("contents",new StandardAnalyzer(Version.LUCENE_CURRENT)); aren't working anymore with Lucene 3.0.1 and bobo-browse 2.5.0 RC1. Could you please provide a working, compiling example?
thx
To tino.bino: you must use bobo-browser-2.0.7 with Lucene 2.3 or lower; these APIs are supported in those versions.
Which version of Lucene does it support? I'm using Lucene 3.0; can I use bobo-browse 2.5.0?
Please help new users by providing a step-by-step tutorial on how to install and get started, along with more code examples and more conceptual information. Thanks.
In my program using bobo-browse (which is modeled after the sample program above), I am getting facet values that are lower-cased and truncated (probably because of stemming in the index). I need to get the original value of the faceted field. When I browse the index with Luke, the original value is there. How can I get the original field value (e.g., "Active" instead of "activ", "Deprecated" instead of "deprec"), not the stems?
Thank you,
Michael