Hi there, I was setting up luceneutil to benchmark some ideas I have been thinking about w.r.t. integer compression in postings lists.
As part of setting it up I wired up an A/A test where both competitors are exactly the same version from git.
I seem to get results that nearly always have the first competitor winning slightly.
```
                Task    QPS A   StdDev    QPS B   StdDev    Pct diff
          HighPhrase   741.01  (15.5%)   702.61  (14.4%)   -5.2% ( -30% -  29%)
              IntNRQ  1759.47  (13.8%)  1670.37  (18.8%)   -5.1% ( -33% -  31%)
         AndHighHigh  1082.02  (15.7%)  1040.52  (15.0%)   -3.8% ( -29% -  31%)
          AndHighLow  2048.39  (12.4%)  1974.62  (16.1%)   -3.6% ( -28% -  28%)
         LowSpanNear  1595.02  (11.9%)  1547.22  (19.6%)   -3.0% ( -30% -  32%)
             Respell   361.26  (15.9%)   351.29  (11.5%)   -2.8% ( -26% -  29%)
            PKLookup   270.92  (13.4%)   263.87  (18.1%)   -2.6% ( -30% -  33%)
             MedTerm  3937.97  (13.0%)  3862.01  (18.5%)   -1.9% ( -29% -  34%)
             LowTerm  3756.76   (9.4%)  3685.07  (18.2%)   -1.9% ( -26% -  28%)
           OrHighMed  1175.71  (14.4%)  1154.50  (14.9%)   -1.8% ( -27% -  32%)
    HighSloppyPhrase   534.14  (13.4%)   525.54  (15.4%)   -1.6% ( -26% -  31%)
           MedPhrase  1176.03   (9.7%)  1159.14  (15.7%)   -1.4% ( -24% -  26%)
             Prefix3   484.72  (11.9%)   478.04  (17.4%)   -1.4% ( -27% -  31%)
              Fuzzy2    43.52  (14.7%)    43.08  (18.3%)   -1.0% ( -29% -  37%)
        HighSpanNear   500.88  (20.4%)   502.61  (29.8%)    0.3% ( -41% -  63%)
          AndHighMed  1372.91  (16.2%)  1382.71  (20.6%)    0.7% ( -31% -  44%)
           LowPhrase   666.81  (15.4%)   673.52  (17.5%)    1.0% ( -27% -  40%)
            Wildcard   394.91  (13.3%)   398.97  (18.6%)    1.0% ( -27% -  37%)
     MedSloppyPhrase   761.00  (12.8%)   772.69  (19.2%)    1.5% ( -26% -  38%)
            HighTerm  3362.84  (10.5%)  3438.57  (14.2%)    2.3% ( -20% -  30%)
              Fuzzy1   324.88  (12.0%)   333.20  (15.9%)    2.6% ( -22% -  34%)
           OrHighLow  1786.50  (13.1%)  1844.04  (14.3%)    3.2% ( -21% -  35%)
     LowSloppyPhrase   874.38  (10.8%)   904.91  (21.0%)    3.5% ( -25% -  39%)
         MedSpanNear   415.60  (21.4%)   434.18  (20.1%)    4.5% ( -30% -  58%)
          OrHighHigh  1075.23  (15.3%)  1155.18  (21.9%)    7.4% ( -25% -  52%)
```
With the competitors reversed:
```
                Task    QPS B   StdDev    QPS A   StdDev    Pct diff
             MedTerm  3199.44  (10.6%)  2931.05  (13.2%)   -8.4% ( -29% -  17%)
     LowSloppyPhrase  1233.67   (9.2%)  1134.70   (7.8%)   -8.0% ( -22% -   9%)
           MedPhrase   932.71   (8.5%)   866.26   (9.9%)   -7.1% ( -23% -  12%)
            Wildcard   398.51   (8.6%)   375.41  (10.2%)   -5.8% ( -22% -  14%)
          AndHighLow  2292.24  (10.2%)  2201.04  (18.7%)   -4.0% ( -29% -  27%)
    HighSloppyPhrase   544.45  (16.8%)   528.25  (13.6%)   -3.0% ( -28% -  32%)
          AndHighMed  1340.34  (10.0%)  1301.05  (13.2%)   -2.9% ( -23% -  22%)
        HighSpanNear   519.63  (21.7%)   505.05  (20.9%)   -2.8% ( -37% -  50%)
           OrHighMed  1168.76  (11.6%)  1145.84  (14.6%)   -2.0% ( -25% -  27%)
             Respell   234.98  (10.9%)   230.39  (12.6%)   -2.0% ( -22% -  24%)
         AndHighHigh   945.69  (17.2%)   929.94  (11.4%)   -1.7% ( -25% -  32%)
              IntNRQ  1607.76  (14.0%)  1582.74  (11.7%)   -1.6% ( -23% -  28%)
          OrHighHigh  1026.79   (9.2%)  1012.06  (10.6%)   -1.4% ( -19% -  20%)
           OrHighLow  1573.45  (14.1%)  1556.98  (15.6%)   -1.0% ( -26% -  33%)
              Fuzzy2    53.03  (13.9%)    52.59  (13.4%)   -0.8% ( -24% -  30%)
             Prefix3   456.26   (9.1%)   453.52   (7.0%)   -0.6% ( -15% -  17%)
         LowSpanNear  1009.28  (10.5%)  1003.82  (12.2%)   -0.5% ( -21% -  24%)
            HighTerm  3539.41  (13.6%)  3531.60  (13.2%)   -0.2% ( -23% -  30%)
         MedSpanNear   345.51  (23.4%)   347.01  (18.8%)    0.4% ( -33% -  55%)
          HighPhrase   559.54  (12.7%)   562.42  (11.3%)    0.5% ( -20% -  28%)
           LowPhrase  1080.39  (12.6%)  1103.94  (13.5%)    2.2% ( -21% -  32%)
              Fuzzy1   212.41  (13.2%)   217.09  (13.7%)    2.2% ( -21% -  33%)
     MedSloppyPhrase   872.23  (13.6%)   892.19  (10.8%)    2.3% ( -19% -  30%)
            PKLookup   274.07  (10.3%)   286.41   (7.5%)    4.5% ( -12% -  24%)
             LowTerm  3817.47   (9.9%)  4074.73  (14.8%)    6.7% ( -16% -  34%)
```
These swings seem a bit wild. Any thoughts as to what could cause this?
Hmm this looks more like noise than a preference for whichever came first?
Because in each run, I see some queries look slower (top of the output), but others look faster (bottom of the output)?
I think one reason is that your QPS numbers are extremely high here: the tasks finish so quickly that per-run variance swamps any real signal, so we really can't conclude much from this. Maybe run on a larger set of documents?
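To see why diffs of several percent are expected here even with identical code, here is a small simulation (not part of luceneutil; the function name and parameters are made up for illustration). It draws both "competitors" from the same distribution with roughly the per-task relative stddev the tables above report, and computes the resulting pct diffs:

```python
import random

random.seed(42)

def aa_pct_diffs(n_tasks=25, n_iters=20, rel_std=0.15):
    """Simulate an A/A comparison: both competitors sample QPS from
    the SAME distribution, with ~15% relative stddev per measurement
    (roughly what the tables above report per task)."""
    diffs = []
    for _ in range(n_tasks):
        base = random.uniform(300.0, 3500.0)  # arbitrary task QPS scale
        a = [random.gauss(base, rel_std * base) for _ in range(n_iters)]
        b = [random.gauss(base, rel_std * base) for _ in range(n_iters)]
        mean_a = sum(a) / n_iters
        mean_b = sum(b) / n_iters
        diffs.append(100.0 * (mean_b - mean_a) / mean_a)
    return diffs

diffs = aa_pct_diffs()
print(f"largest |pct diff| across tasks: {max(abs(d) for d in diffs):.1f}%")
```

With 15% stddev and 20 iterations per side, the standard error of each mean is about 3.4%, so across 25 tasks you should routinely see apparent diffs of several percent in both directions, purely from noise; note that 0% lies inside every `( min% - max% )` interval in your tables.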
Running with the wikipedia10M document set did bring the numbers much more into alignment.