Export to GitHub

xappy - issue #24

Support MultiValueSorter


Posted on Oct 21, 2008 by Happy Wombat

Xapian provides MultiValueSorter, which seems to make it possible to sort by more than one value. Currently, there is apparently now way to make use of this through Xappy.

Comment #1

Posted on Oct 21, 2008 by Happy Wombat

The patch adds the possibility to pass a tuple to the "sortby" argument for search(). As MultiValueSorter does not support an asc/desc order request for each individual field, it needs to be specified as a separate entry in the tuple:

sortby=('field1', 'field2') sortby=('-', 'field1', 'field2')

Attachments

Comment #2

Posted on Oct 21, 2008 by Quick Elephant

Thanks for the patch. However, MultiValueSorter does allow an ascending or descending direction to be specified for each value: in the C++ header, for example:

void add(Xapian::valueno valno, bool forward = true) {
    valnos.push_back(std::make_pair(valno, forward));
}

ie: the "add()" method takes a second parameter indicating the direction.

I'm not sure what the best API for this is: perhaps the cleanest would be to allow a "-" or "+" to be specified as the first character of the field name, so we'd have values like:

sortby=('-field1', '+field2')

If the fieldname didn't start with either '+' or '-', we would assume ascending order, I think.

I'm open to suggestions for a better API, though!

(Also, note that before this patch can be merged, some unittests need to be written to check its behaviour. Thanks for your work so far, though.)

Comment #3

Posted on Oct 21, 2008 by Happy Wombat

sortby=('-field1', '+field2')

I think that's clearly the best solution. I totally missed the fact that MultiValueSorter supports this, which was the only reason I opted for the somewhat awkward first-tuple-element syntax in the first place.

I'll see if I get some unittests written, too. They would be added as doctests to introduction.rst I presume?

Comment #4

Posted on Oct 21, 2008 by Quick Elephant

Great - thanks!

Unit tests should go in xappy/unittests/ - make a new file to test this feature. Copy one of the existing ones to get a head-start.

Comment #5

Posted on Oct 23, 2008 by Happy Wombat

Attached patch uses ('-field1', '+field2') format and includes a testcase.

Attachments

Comment #6

Posted on Nov 18, 2008 by Quick Elephant

I have committed support for this feature. I used your testcase (with some small tidy-ups), but I've rewritten the patch in searchconnection.py for a couple of reasons:

  • I prefer to test for strings, and allow any sequence, rather than test for a few specific sequence types, so I've done that instead. This is partly for consistency with the rest of xappy (I've done this whereever a sequence-or-string is accepted), but mainly because all strings have a common base class (ie, "basestring") whereas sequences don't. (Though the equivalent will come along in python3k, in the form of abstract base classes, as I understand it.)

  • I prefer not to use nested functions quite so freely. Just a hangover from being a C coder originally, really, but the rest of xappy tends to be in that style. I have a limited memory, and find it easier to read code without too many nested functions!

Your patch was a very helpful prototype, though, and it's very nice not to have to write the test, too! Many thanks, and apologies for taking so long to sort this out.

Status: Fixed

Labels:
Type-Defect Priority-Medium