Getting Started:

Prepare a browse index:

A browse index is a Lucene index with field description information. Bobo Browse does not currently provide extensive indexing wrappers on top of Lucene, so indexing with the standard Lucene API (http://lucene.apache.org/java/2_4_1/api/core/index.html) is sufficient.

See Creating a Browse Index for details.

Search and Browse:

The Bobo Browse Engine is released as a library, and its API is published here. Familiarity with the Lucene API is helpful.

Here are some of the basic concepts or objects to understand:

  • BrowseSelection: A selection or filter to be applied, e.g. Color=Red
  • FacetSpec: Specifies how facets are to be returned on the result object, e.g. Top 10 facets of car types ordered by count with a min count of 5
  • BrowseRequest: A set of BrowseSelections, a keyword text query, and a set of FacetSpecs.
  • BrowseFacet: A facet (a string value with a hit count)
  • FacetCollection: A Collection object of BrowseFacets
  • BrowseResult: Result of a browse operation.
  • FacetHandler: A plugin into the browse engine that knows how to manipulate facet data.
  • BoboIndexReader: A Lucene IndexReader containing a List of FacetHandlers.
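To make the relationship between a selection, a facet spec, and the returned facets more concrete, here is a minimal plain-Java sketch. It does not use Bobo or Lucene at all: the class `FacetCountSketch`, its methods, and the sample documents are all hypothetical names made up for illustration, and the real engine works against a Lucene index rather than in-memory maps. It only shows the idea of filtering by a selection (color=red) and then returning facet counts for another field, ordered by hits descending with a min hit count and a max count.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; none of these names are Bobo classes.
public class FacetCountSketch {

    // Hypothetical helper: builds a two-field "document".
    static Map<String, String> doc(String color, String category) {
        Map<String, String> d = new HashMap<>();
        d.put("color", color);
        d.put("category", category);
        return d;
    }

    // Hypothetical sample data standing in for an index.
    static List<Map<String, String>> sampleDocs() {
        List<Map<String, String>> docs = new ArrayList<>();
        docs.add(doc("red", "sedan"));
        docs.add(doc("red", "sedan"));
        docs.add(doc("red", "suv"));
        docs.add(doc("blue", "sedan"));
        return docs;
    }

    // Applies one selection (selField = selValue), then counts the values of
    // facetField among the matching documents. Returns "value(count)" strings
    // ordered by count descending, limited by minHitCount and maxCount.
    static List<String> browse(List<Map<String, String>> docs,
                               String selField, String selValue,
                               String facetField, int minHitCount, int maxCount) {
        Map<String, Integer> counts = new HashMap<>();
        for (Map<String, String> d : docs) {
            if (!selValue.equals(d.get(selField))) {
                continue; // document does not match the selection
            }
            String value = d.get(facetField);
            if (value != null) {
                counts.merge(value, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // Order by hit count descending; break ties by value so output is stable.
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> facets = new ArrayList<>();
        for (Map.Entry<String, Integer> e : entries) {
            if (facets.size() == maxCount) {
                break;
            }
            if (e.getValue() >= minHitCount) {
                facets.add(e.getKey() + "(" + e.getValue() + ")");
            }
        }
        return facets;
    }

    public static void main(String[] args) {
        // Selection color=red, category facets with a min hit count of 2.
        System.out.println(browse(sampleDocs(), "color", "red", "category", 2, 10));
    }
}
```

With the sample data, three documents match color=red; "sedan" occurs twice and "suv" once, so a min hit count of 2 keeps only the "sedan" facet.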

Example:

    // define facet handlers

    // color facet handler
    SimpleFacetHandler colorHandler = new SimpleFacetHandler("color");

    // category facet handler
    SimpleFacetHandler categoryHandler = new SimpleFacetHandler("category");

    List<FacetHandler> handlerList = Arrays.asList(new FacetHandler[]{colorHandler, categoryHandler});

    // opening a lucene index
    Directory idx = FSDirectory.open(new File("myidx"));
    IndexReader reader = IndexReader.open(idx, true);

    // decorate it with a bobo index reader
    BoboIndexReader boboReader = BoboIndexReader.getInstance(reader, handlerList);

    // creating a browse request
    BrowseRequest br = new BrowseRequest();
    br.setCount(10);
    br.setOffset(0);

    // add a selection
    BrowseSelection sel = new BrowseSelection("color");
    sel.addValue("red");
    br.addSelection(sel);

    // parse a query
    QueryParser parser = new QueryParser("contents", new StandardAnalyzer(Version.LUCENE_CURRENT));
    Query q = parser.parse("cool car");
    br.setQuery(q);

    // add the facet output specs
    FacetSpec colorSpec = new FacetSpec();
    colorSpec.setOrderBy(FacetSortSpec.OrderHitsDesc);

    FacetSpec categorySpec = new FacetSpec();
    categorySpec.setMinHitCount(2);
    categorySpec.setOrderBy(FacetSortSpec.OrderHitsDesc);

    br.setFacetSpec("color", colorSpec);
    br.setFacetSpec("category", categorySpec);

    // perform browse
    Browsable browser = new BoboBrowser(boboReader);
    BrowseResult result = browser.browse(br);

    int totalHits = result.getNumHits();
    BrowseHit[] hits = result.getHits();

    Map<String,FacetAccessible> facetMap = result.getFacetMap();

    FacetAccessible colorFacets = facetMap.get("color");
    List<BrowseFacet> facetVals = colorFacets.getFacets();

Comment by mathieu....@gmail.com, Sep 22, 2009

I tried this code but it only worked when I passed some FacetHandlers to BoboIndexReader.getInstance. Am I missing something?

Comment by project member john.w...@gmail.com, Sep 25, 2009

That's correct. You either have to pass them in or have a bobo.spring file define the facet handlers in the index.

Comment by tommyng2...@gmail.com, Oct 8, 2009

What would be the best way to display results? Get the document ID out of every BrowseHit, then get the Lucene Document? When I tried to get the data directly out of BrowseHit with getField, it gave me a null pointer exception. I am using Lucene 2.9.0 and Bobo Browse 2.0.6.

Code Sample:

    BrowseHit[] hits = result.getHits();
    for (int i = 0; i < hits.length; ++i) {
        BrowseHit browseHit = hits[i];
        Document d = reader.document(browseHit.getDocid());
        System.out.println(d.get("color"));
    }

Is this the only way to display results?

Comment by project member john.w...@gmail.com, Oct 10, 2009

We don't currently support Lucene 2.9.0 (we do have a branch for the Lucene 2.9 migration, but we are still working on it).

One reason you are getting an NPE is that you are not requesting field values on the BrowseRequest. Do: BrowseRequest.setFetchStoredFields(true)

We should probably throw a better exception.

Comment by project member john.w...@gmail.com, Oct 10, 2009

Actually, if you have "color" already defined in a FacetHandler, you don't need to call setFetchStoredFields.

Comment by project member john.w...@gmail.com, Oct 10, 2009

I updated the wiki with a better code snippet

Comment by tommyng2...@gmail.com, Oct 21, 2009

Hi John, thanks for your quick reply. Your new snippet update and your explanation really got me 80% there. However, I still have a little problem left. Since it's a longer question, I sent you an email regarding the problem. Your help is truly appreciated.

Comment by finalfan...@gmail.com, Oct 21, 2009

Does anyone know why I always get java.lang.OutOfMemoryError: Java heap space at the line BrowseResult result=browser.browse(br), with about 3-4 SimpleFacetHandlers on a search over about 5,000,0000 documents?

Comment by dipl.inf...@gmail.com, Oct 27, 2009

Honestly guys, you need far more documentation here!

I am coming from the Solr corner, searching for a technology to implement a faceted browser in a standalone Java application. All I get when downloading the release is the sources jar file, and the online JavaDoc isn't working...

Please help new users by providing a step-by-step tutorial on how to install and get started, along with more code examples and more conceptual information.

Comment by jay88...@gmail.com, Dec 25, 2009

    Map<String,FacetAccessible> facetMap = result.getFacetMap();
    FacetAccessible colorFacets = facetMap.get("color");
    List<BrowseFacet> facetVals = colorFacets.getFacets();

I tried this locally, but facetVals comes back null.

Comment by jay88...@gmail.com, Dec 26, 2009

I got it. Before you call the getFacets method, you should call "collectAll".

Like the following code:

    SimpleFacetHandler.SimpleFacetCountCollector colorFacets = (SimpleFacetHandler.SimpleFacetCountCollector) facetMap.get("color");
    colorFacets.collectAll();
    List<BrowseFacet> facetVals = colorFacets.getFacets();

Comment by yuweimin...@gmail.com, Dec 29, 2009

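[Translated from Chinese] Suppose I have two fields, both indexed without tokenization: one is datetime and the other is color. I want to know how color appears across the 12 months of 2009, for example how many distinct colors appeared in January, how many in February, and so on. It is like doing a distinct on color on top of a distinct on month. I only need the number of colors, not the specific colors, although getting the specific colors would be even better. John Wang, can bobo-browse do this?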

Comment by farsh...@gmail.com, Dec 30, 2009

When I try to specify an AddNotValue() on my BrowseSelection, I get an exception looking for some classes in the com.kamikaze.... package. What version of the kamikaze jar file works best with the current version of BoBo? Thanks.

Comment by project member john.w...@gmail.com, Dec 31, 2009

@yueming314: What you need can easily be done with bobo-browse. I don't read Chinese very well; if you can provide more information, I can help you get started.

Comment by project member john.w...@gmail.com, Dec 31, 2009

@farshadk: With the latest bobo release, you need Kamikaze-1.0.8. If you check out the release branch, it should be in the lib/ directory.

Comment by project member john.w...@gmail.com, Dec 31, 2009

@dipl.inf.matthias.schmidt: Feel free to send questions to the discussion group. I'd be happy to help in any way. We are working on the documentation. Thanks for the input.

Comment by project member john.w...@gmail.com, Dec 31, 2009

@finalfantasy22: What is your VM memory setting? For such a large index, you need to increase your vm heap size with -Xms and -Xmx options.
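As a quick way to confirm which heap limit the JVM actually picked up, a small standalone check along these lines may help. The class name `HeapCheck` is made up for illustration, and the sizes in the command below are placeholders rather than recommended values for any particular index.

```java
// Standalone check; HeapCheck is a hypothetical name, not part of Bobo.
public class HeapCheck {

    // Maximum heap the JVM will attempt to use, in megabytes.
    // This reflects the -Xmx setting when one is given.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap available to this JVM: " + maxHeapMegabytes() + " MB");
    }
}
```

Run it with the same options as the application, for example `java -Xms512m -Xmx2g HeapCheck`, and compare the printed value with what the index is expected to need.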

Comment by yuweimin...@gmail.com, Dec 31, 2009

I am sorry. I thought you knew Chinese. Here is the case: there is a table-selling store with many tables. Each table has two fields, size and colour. I wonder how to figure out the number of colours for each size. Thank you!

Comment by yuweimin...@gmail.com, Jan 1, 2010

Hello John, now we are analyzing some posters. For example, if there are 10 posters posted by 4 different authors, can bobo-browse rank these authors by how frequently they use certain key words?

Comment by project member john.w...@gmail.com, Jan 2, 2010

@yuweiming314, it can be done rather easily. Give a selection with size, e.g. size="small":

    BrowseRequest req = new BrowseRequest();
    BrowseSelection sel = new BrowseSelection("size");
    sel.addValue("small");
    req.addSelection(sel);

    // get the color values
    FacetSpec fspec = new FacetSpec();
    fspec.setMaxCount(10); // get top 10 colors
    req.setFacetSpec("color", fspec);

    BrowseResult res = browser.browse(req);

The color values are available in the res object.

Hope this works for you.
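For readers who want to see what "the number of colours for each size" amounts to, here is a minimal plain-Java sketch of the computation itself. It is not Bobo code: `ColorsPerSize`, `sampleTables`, and the sample rows are hypothetical, and in Bobo the equivalent would presumably be one browse per size value, counting the entries of the returned color facet list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Plain-Java illustration only; not part of the Bobo API.
public class ColorsPerSize {

    // Hypothetical sample data: each row is {size, colour}.
    static List<String[]> sampleTables() {
        List<String[]> tables = new ArrayList<>();
        tables.add(new String[] {"small", "red"});
        tables.add(new String[] {"small", "blue"});
        tables.add(new String[] {"small", "red"});
        tables.add(new String[] {"large", "black"});
        return tables;
    }

    // Groups rows by size and counts the distinct colours seen for each size.
    static Map<String, Integer> distinctColorsPerSize(List<String[]> tables) {
        Map<String, Set<String>> colorsBySize = new TreeMap<>();
        for (String[] table : tables) {
            colorsBySize.computeIfAbsent(table[0], k -> new TreeSet<>()).add(table[1]);
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : colorsBySize.entrySet()) {
            result.put(e.getKey(), e.getValue().size());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(distinctColorsPerSize(sampleTables()));
    }
}
```

The intermediate map keeps the colour sets themselves, so the specific colours for each size are available as well as their count.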

Comment by project member john.w...@gmail.com, Jan 2, 2010

Can you clarify what you mean by "some key words"?

Comment by yuweimin...@gmail.com, Jan 2, 2010

I want to count how often the key word "flower" is used by each author in their posters. How can I get that? I will set up a Query that gathers all the posters containing the key word "flower". What I want to know is whether bobo-browse can analyze the frequency of the word "flower" for each author.

Comment by project member john.w...@gmail.com, Jan 4, 2010

I don't think you need bobo to do this. Lucene supports term vectors out of the box. You can just use that, e.g. when you create the Field object, turn on TermVector.
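The quantity being asked for can be sketched in plain Java as follows. This deliberately does not use Lucene term vectors: it simply tokenizes in-memory text, and the class `KeywordFrequencyByAuthor`, its methods, and the sample posts are hypothetical. With Lucene, the per-document term frequencies would come from the stored term vectors instead of the naive split used here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Plain-Java illustration only; it does not use Lucene term vectors.
public class KeywordFrequencyByAuthor {

    // Hypothetical sample data: each entry is {author, text}.
    static List<String[]> samplePosts() {
        List<String[]> posts = new ArrayList<>();
        posts.add(new String[] {"ann", "Flower power, flower shop"});
        posts.add(new String[] {"bob", "a flower"});
        posts.add(new String[] {"ann", "no match here"});
        posts.add(new String[] {"cid", "nothing relevant"});
        return posts;
    }

    // Sums, per author, how many tokens equal the keyword (case-insensitive).
    static Map<String, Integer> countByAuthor(List<String[]> posts, String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        Map<String, Integer> counts = new HashMap<>();
        for (String[] post : posts) {
            int hits = 0;
            for (String token : post[1].toLowerCase(Locale.ROOT).split("\\W+")) {
                if (token.equals(needle)) {
                    hits++;
                }
            }
            counts.merge(post[0], hits, Integer::sum);
        }
        return counts;
    }

    // Authors ordered by keyword count descending; ties broken by name.
    static List<String> rankAuthors(List<String[]> posts, String keyword) {
        Map<String, Integer> counts = countByAuthor(posts, keyword);
        List<String> authors = new ArrayList<>(counts.keySet());
        authors.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return authors;
    }

    public static void main(String[] args) {
        System.out.println(rankAuthors(samplePosts(), "flower"));
    }
}
```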

Comment by yuweimin...@gmail.com, Jan 4, 2010

I get it but since I am not so good at Lucene, I will work hard on it~ Thank you for your patience and professional knowledge~

Comment by Everlas...@gmx.de, Feb 3, 2010

Hello,

I have a question regarding this example code and bobo-browse. Is there a way to access the document IDs or the underlying indexed document from the result?

The purpose of getting the indexed document is to show additional stored values to the user which are not facets.

thanks and best regards..

Comment by farsh...@gmail.com, Feb 24, 2010

Having a BrowseResult, how can I get/calculate the top N hits for each of the BrowseHits?

Thanks, farshadk

Comment by zhangxh0...@gmail.com, Mar 10, 2010

Hello, john.wang

How many records can bobo-browse support? Thanks

Comment by tino.b...@gmail.com, Apr 7, 2010
BoboIndexReader boboReader = BoboIndexReader.getInstance(reader,handlerList);

and

QueryParser parser = new QueryParser("contents",new StandardAnalyzer(Version.LUCENE_CURRENT));

aren't working anymore with lucene 3.0.1 and bobo-browse 2.5.0 RC1. Could you please provide a working/compiling example?

thx

Comment by quzis...@gmail.com, Jan 4, 2011

To tino.bino: you must use bobo-browser-2.0.7 and Lucene 2.3 or a lower version; these APIs are supported there.

Comment by Anson....@gmail.com, Jun 14, 2011

Which version of Lucene does it support? I'm using Lucene 3.0; can I use bobo-browse 2.5.0?

Comment by ska...@gmail.com, Nov 14, 2011

Please help new users by providing a step-by-step tutorial on how to install and get started, along with more code examples and more conceptual information. Thanks.

Comment by msmolyak...@gmail.com, May 18, 2012

In my program using bobo-browse (which is modeled after the sample program above) I am getting facet values that are lower-cased and truncated (probably because of stemming in the index). I need to get the original value of the faceted field. When I browse the index with Luke, the original value is there. How can I get the original field value (e.g., "Active" instead of "activ", "Deprecated" instead of "deprec"), not the stems?

Thank you,

Michael

