Xapian provides MultiValueSorter, which seems to make it possible to sort by more than one value. Currently, there is apparently now way to make use of this through Xappy.
Comment #1
Posted on Oct 21, 2008 by Happy WombatThe patch adds the possibility to pass a tuple to the "sortby" argument for search(). As MultiValueSorter does not support an asc/desc order request for each individual field, it needs to be specified as a separate entry in the tuple:
sortby=('field1', 'field2') sortby=('-', 'field1', 'field2')
- 24-multisort.diff 4.44KB
Comment #2
Posted on Oct 21, 2008 by Quick ElephantThanks for the patch. However, MultiValueSorter does allow an ascending or descending direction to be specified for each value: in the C++ header, for example:
void add(Xapian::valueno valno, bool forward = true) {
valnos.push_back(std::make_pair(valno, forward));
}
ie: the "add()" method takes a second parameter indicating the direction.
I'm not sure what the best API for this is: perhaps the cleanest would be to allow a "-" or "+" to be specified as the first character of the field name, so we'd have values like:
sortby=('-field1', '+field2')
If the fieldname didn't start with either '+' or '-', we would assume ascending order, I think.
I'm open to suggestions for a better API, though!
(Also, note that before this patch can be merged, some unittests need to be written to check its behaviour. Thanks for your work so far, though.)
Comment #3
Posted on Oct 21, 2008 by Happy Wombatsortby=('-field1', '+field2')
I think that's clearly the best solution. I totally missed the fact that MultiValueSorter supports this, which was the only reason I opted for the somewhat awkward first-tuple-element syntax in the first place.
I'll see if I get some unittests written, too. They would be added as doctests to introduction.rst I presume?
Comment #4
Posted on Oct 21, 2008 by Quick ElephantGreat - thanks!
Unit tests should go in xappy/unittests/ - make a new file to test this feature. Copy one of the existing ones to get a head-start.
Comment #5
Posted on Oct 23, 2008 by Happy WombatAttached patch uses ('-field1', '+field2') format and includes a testcase.
- 24-multisort-2.diff 5.79KB
Comment #6
Posted on Nov 18, 2008 by Quick ElephantI have committed support for this feature. I used your testcase (with some small tidy-ups), but I've rewritten the patch in searchconnection.py for a couple of reasons:
I prefer to test for strings, and allow any sequence, rather than test for a few specific sequence types, so I've done that instead. This is partly for consistency with the rest of xappy (I've done this whereever a sequence-or-string is accepted), but mainly because all strings have a common base class (ie, "basestring") whereas sequences don't. (Though the equivalent will come along in python3k, in the form of abstract base classes, as I understand it.)
I prefer not to use nested functions quite so freely. Just a hangover from being a C coder originally, really, but the rest of xappy tends to be in that style. I have a limited memory, and find it easier to read code without too many nested functions!
Your patch was a very helpful prototype, though, and it's very nice not to have to write the test, too! Many thanks, and apologies for taking so long to sort this out.
Status: Fixed
Labels:
Type-Defect
Priority-Medium