|
Project Information
Members
Featured
Downloads
Links
|
MontezumaMontezuma is a full-text indexing/search engine library written entirely in Common Lisp. Montezuma is a Common Lisp port of Ferret. Ferret is a Ruby port of Lucene. Lucene is sort of Doug Cutting's Java version of Text Database (TDB), which he and Jan Pedersen developed at Xerox PARC, and which, to complete the circle, was written in Common Lisp (see "An Object-Oriented Architecture for Text Retrieval"). We are hoping Montezuma will have better performance than both Ferret and Lucene, not because of doing anything fancy, but because of relying on native code-generating Common Lisp implementations. Currently performance approaches that of Lucene, but is far from Ferret's relatively blazing speed. See PerformanceComparisons. Montezuma is named after the Montezuma Oropendola. Current StatusThe latest release is the Montezuma 0.1.3 release candidate featuring UTF-8 support and is available for download. The first release was made July 13, 2006. To install it: (asdf-install:install '#:montezuma) You can also download a tarball or check out the latest SVN source from this site. Montezuma is also installable via clbuild. Indexing is complete, with both in-memory indices and indices on disk. Single-term queries, boolean queries, phrase queries and wildcard queries basically work (though there are [report:1 bugs]). Queries can be constructed via the API as well as by parsing a simple query language. RequirementsMontezuma uses CL-PPCRE and CL-FAD, both of which are ASDF-installable. I've tested it with OpenMCL, SBCL (version 0.9.12 or later, due to a bug in earlier versions) and ACL. Loading, Running, TestingLoad via ASDF: CL-USER> (asdf:oos 'asdf:load-op '#:montezuma)
; loading system definition from
; /Users/wiseman/src/montezuma/montezuma.asd into #<PACKAGE "ASDF4465">
; registering #<SYSTEM #:MONTEZUMA {106891F1}> as MONTEZUMA
; registering #<SYSTEM #:MONTEZUMA-TESTS {10A489A1}> as MONTEZUMA-TESTS
; ...Load and run unit tests via ASDF: CL-USER> (asdf:oos 'asdf:test-op '#:montezuma) ; compiling file "/Users/wiseman/src/montezuma/tests/unit/tests.lisp" (written 23 FEB 2006 11:37:38 AM): ; ... ;; MONTEZUMA::TEST-PRIORITY-QUEUE ................ ;; MONTEZUMA::TEST-PRIORITY-QUEUE-CLEAR .. ;; MONTEZUMA::TEST-PRIORITY-QUEUE-STRESS . ;; MONTEZUMA::TEST-RAM-STORE ............................................................................. ;; MONTEZUMA::TEST-FS-STORE ............................................................................. ;; MONTEZUMA::TEST-STANDARD-FIELD ........... ; ... Run individual tests (or test fixtures): CL-USER> (montezuma::run-test-named 'montezuma::test-segment-info) ;; MONTEZUMA::TEST-SEGMENT-INFO ...... Development Strategy1. Make an initial hack & slash port of Ferret to Lisp. This is basically a transliteration of the Ruby code to Lisp. It might be ugly, it might not be idiomatic, but the goal is to get something working quickly. Unit tests will be ported as well. Some features may be left unimplemented (e.g., thread- and process-level locking). 2. Clean it up. Turn it into code that Lispers will enjoy. Craft a nice external API and let it out. A comprehensive library of unit tests will be critical to avoid breakage. 3. Tune performance and finish out the feature list. Once the code reaches a point where it is both adequately useful and adequately pretty, begin the process of attacking the the last 20%, which is a task that will presumably never be complete. Add benchmark tests, fix bugs, etc. Ideally I will find someone who thinks this is fun--and dump the project on them. We're somewhere between (2) and (3) now. Mailing list/Discussion GroupThere is a Montezuma discussion group/mailing list for users and developers alike. |