Updated Apr 16, 2008 by harry.chen
Labels: Phase-Design, Phase-Implementation
DevSearchTermSuggestion  
Suggests search terms as the user types in the search box.

Problem

The purpose of search term suggestion is to improve the user's search experience.

This feature can help to overcome several problems. First, a user often doesn't know whether an input search query will return any matching results until the user actually clicks on the 'search' button. A trial-and-error approach to search is inefficient. Second, searching bookmarks by tags requires prior knowledge about how tags are used. Users can't possibly remember all available tags used in the system. If the user doesn't know what tags exist in the system, the user can't effectively use tags to find bookmarks. Finally, machine tag syntax can be difficult for some users to remember. Without knowing what values are available for a given machine tag predicate, users can't search for machine-tagged bookmarks.

Solution

As the user enters a search term in the search box, the system suggests relevant search terms.

Design

The design for search term suggestion consists of the following components:

  1. A JavaScript program that monitors and interacts with the user input in the search box.
  2. A WebWork action that takes any search input and returns a list of suggestions to be displayed.
  3. A lookup table used to find matching suggestions for any given search term.

Implementation

The current implementation (2.4.0-M2) works as follows.

The JavaScript program is based on AJAX Suggestions, developed by Robert Nyman. Some modifications were made to customize the UI layout generated by the program.

The JavaScript program monitors text entered into the search box. It triggers suggestions only when the length of the text is greater than or equal to 3 (i.e., fo will not trigger the suggestion, but foo will).
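The length check described above can be sketched as follows. This is a Python illustration, not the actual JavaScript; the names MIN_QUERY_LENGTH and should_suggest are hypothetical, and the sketch measures the raw text because this page does not say whether whitespace is trimmed first.

```python
MIN_QUERY_LENGTH = 3  # threshold taken from the description above


def should_suggest(text):
    """Return True when the input is long enough to request suggestions."""
    return len(text) >= MIN_QUERY_LENGTH
```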

The WebWork action implementation uses a search suggestion class implemented in the gnizr-core package. After receiving a suggestion request from the client JavaScript program, the action runs the suggestion computation. If more than one matching suggestion is returned, the suggestions are formatted into a partial HTML page and returned to the client JavaScript program. This HTML fragment is dynamically inserted into the page in the client browser. AJAX Suggestions displays the result in a drop-down list.
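As a rough illustration of the formatting step, the sketch below turns a list of suggestions into an HTML fragment. It is written in Python rather than as a WebWork action, and the function name render_suggestions and the ul/li markup are assumptions; the markup that gnizr actually returns is not described on this page.

```python
from html import escape


def render_suggestions(suggestions):
    """Format suggestions as an HTML fragment (hypothetical markup)."""
    # Escape each term so that user-created tags cannot inject markup.
    items = "".join("<li>{}</li>".format(escape(s)) for s in suggestions)
    return "<ul>{}</ul>".format(items)
```

Escaping matters here because part of the dictionary comes from user-created tags.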

The computation for finding matching search term suggestions is built on the Lucene API. The matching algorithm tries to perform an "auto-complete" operation on the input term using a set of dictionary words. For example, foo could match food, foot, foobar, etc.

Those dictionary words come from two different sources -- a static word dictionary file and a dynamic set of tags used to label bookmarks. The default static word dictionary is the SCOWL (Spell Checker Oriented Word Lists) file from the wordlist project. The dynamic dictionary words are collected from the gnizr tag database of the local installation. When the server starts, the dynamic dictionary is built by querying for the 2000 most frequently used tags in the system.
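Combining the two sources might look like the sketch below. It is a Python illustration under stated assumptions: build_dictionary is a hypothetical name, the static words are passed in as a plain iterable, and tag usage is assumed to be available as a tag-to-count mapping. The real implementation queries the gnizr tag database and stores the words in a Lucene index.

```python
from collections import Counter

MAX_TAGS = 2000  # number of most frequently used tags, from the text above


def build_dictionary(static_words, tag_usage, max_tags=MAX_TAGS):
    """Merge static dictionary words with the most frequently used tags.

    static_words: iterable of words, e.g. read from a word-list file.
    tag_usage: mapping of tag -> usage count (assumed shape).
    """
    top_tags = [tag for tag, _ in Counter(tag_usage).most_common(max_tags)]
    # A sorted, de-duplicated list keeps later prefix lookups simple.
    return sorted(set(static_words) | set(top_tags))
```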

The Lucene API is used to maintain a search index database of those dictionary words.

The search algorithm works as follows.

  1. Given a term t, find all words that match t* (e.g., if the term is foo, find foo*).
  2. For each w in the matching words, check if the usage frequency of w in the bookmark search index database is greater than 0. This index database is the Lucene index database that stores search information about all bookmarks.
  3. Add each w that has a non-zero usage frequency to the result list until the list size reaches a defined maximum threshold.
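The three steps above can be sketched in Python as follows. This is not the Lucene-based code: a sorted list stands in for the dictionary word index, a caller-supplied function stands in for the frequency lookup against the bookmark index, and the name suggest and the default max_results=10 are assumptions (the actual threshold is not given on this page).

```python
from bisect import bisect_left


def suggest(term, dictionary, usage_frequency, max_results=10):
    """Return up to max_results words that start with term and are in use.

    dictionary: sorted list of words (stands in for the word index).
    usage_frequency: callable word -> int (stands in for the lookup
        against the bookmark search index).
    """
    results = []
    # Step 1: in a sorted list, all words matching term* are contiguous
    # and begin at the insertion point of term.
    start = bisect_left(dictionary, term)
    for word in dictionary[start:]:
        if not word.startswith(term):
            break
        # Step 2: keep only words that occur in at least one bookmark.
        if usage_frequency(word) > 0:
            # Step 3: collect words until the maximum is reached.
            results.append(word)
            if len(results) >= max_results:
                break
    return results
```

For instance, with the dictionary foo, foobar, food, foot, fork and usage counts of 2 for foobar and 4 for food (0 for everything else), the term foo yields foobar and food.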

Related Classes

