
performance problem in HashBasedTable #1085

Closed

gissuebot opened this issue Oct 31, 2014 · 12 comments

Comments

@gissuebot

Original issue created by adi175652q on 2012-07-25 at 10:56 PM


Hi,

I am encountering a performance problem in HashBasedTable. It appears
in Guava 12.0.1 and in the latest revision. I attached a test that
exposes this problem and a patch that fixes it. For this test, the
patch provides a 1397x speedup. Note that this test uses a very
small table (300x300), but for medium size tables (e.g., 1000x1000),
the test does not finish even after 30 minutes (I killed the process),
while the patched version finishes in 368 milliseconds.

To run the test, just do:

$ java Test

The output for the un-patched version is:
Time is 65679

The output for the patched version is:
Time is 47
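In outline, the attached Test does roughly the following (this is only
a sketch; the real attachment may differ in details such as the value
types used):

import com.google.common.collect.HashBasedTable;
import com.google.common.collect.Table;

import java.util.ArrayList;
import java.util.List;

// Sketch only: fill a 300x300 HashBasedTable, then time containsAll()
// on its values view.
public class Test {
  public static void main(String[] args) {
    Table<Integer, Integer, Integer> table = HashBasedTable.create();
    for (int row = 0; row < 300; row++) {
      for (int col = 0; col < 300; col++) {
        table.put(row, col, row * 300 + col);
      }
    }
    // Check a copy of the values against the values view itself.
    List<Integer> toContain = new ArrayList<Integer>(table.values());

    long start = System.currentTimeMillis();
    table.values().containsAll(toContain);  // quadratic without the patch
    System.out.println("Time is " + (System.currentTimeMillis() - start));
  }
}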

The attached test shows that, for a "HashBasedTable table" object, the
following operation is very slow:

table.values().containsAll(toContain);

"HashBasedTable.values()" returns a "StandardTable.Values" object,
which inherits containsAll() from "java.util.AbstractCollection",
which has quadratic complexity.
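Concretely, the inherited method has roughly this shape (illustration
only, not Guava source): one contains() call per element of the
argument, and each contains() call scans all of the table's values, so
the total cost is O(|table.values()| * |collection|).

import java.util.AbstractCollection;
import java.util.Collection;

// Illustration of the containsAll() that StandardTable.Values
// effectively inherits from java.util.AbstractCollection.
abstract class QuadraticContainsAll<E> extends AbstractCollection<E> {

  @Override
  public boolean containsAll(Collection<?> c) {
    for (Object e : c) {
      if (!contains(e)) {   // contains() scans all of the table's values
        return false;
      }
    }
    return true;
  }
}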

I attached two patches. Both patches override containsAll() and
implement a linear time algorithm. patchSmall.diff populates a
HashSet eagerly, and patchFull.diff populates a HashSet lazily.
patchFull.diff is faster than patchSmall.diff when containsAll()
returns false after inspecting only a few elements, though in most
cases they are equally fast. I am including patchSmall.diff just in
case you prefer smaller code.
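For reference, the lazy idea looks roughly like this (a sketch only,
not the attached patchFull.diff): the HashSet over the values is built
only as far as needed, so a containsAll() that fails on an early
element stays cheap, while the total cost is bounded by
O(|values| + |collection|).

import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Sketch of a lazily populated, linear-time containsAll().
final class LazyContainsAll {
  private LazyContainsAll() {}

  static boolean containsAll(Iterable<?> values, Collection<?> collection) {
    Set<Object> seen = new HashSet<Object>();
    Iterator<?> it = values.iterator();
    for (Object target : collection) {
      // Pull more values into the set only until this target shows up.
      while (!seen.contains(target)) {
        if (!it.hasNext()) {
          return false;  // target is not among the remaining values
        }
        seen.add(it.next());
      }
    }
    return true;
  }
}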

Note that this problem also exists for the other "Values" classes
(LocalCache.Values, ArrayTable.Values, ImmutableMultimap.Values,
MapMakerInternalMap.Values, AbstractFilteredMap.Values, Maps.Values,
FilteredMultimap.AsMap.Values, FilteredMultimap.Values,
Multimaps.Values, StandardTable.Column.Values).

Is this truly a performance problem, or am I misunderstanding the
intended behavior? If it is a problem, can you please confirm that
the patch is correct?

Thanks,

Adrian

@gissuebot

Original comment posted by kevinb@google.com on 2012-07-25 at 11:19 PM


(No comment entered for this change.)


Owner: wasserman.louis

@gissuebot

Original comment posted by wasserman.louis on 2012-07-26 at 09:35 AM


I honestly think this goes too far, and I think the original behavior is appropriate.

As a rule, I only expect containsAll to be linear time on a collection that is backed by a hash table of the elements being checked for containment -- e.g. HashSet or HashMultiset, not ArrayList or Map.values(). The JDK set that precedent, and I feel totally comfortable following it.

I also strongly suspect there exist classes used only as the values in a map or table that don't have a properly implemented hashCode() because they're never expected to be used as keys, which this change would break.

In any event, if any users really desire the linear-time behavior that constructs a HashSet, they can freely substitute the one-liner Sets.newHashSet(table.values()).containsAll(collection);

@gissuebot

Original comment posted by adi175652q on 2012-07-26 at 03:13 PM


Hi Louis,

> I think the original behavior is appropriate.

There are two options here. One, we can have a method that blocks
the program even for a 1000x1000 table, and tell programmers "I bet
you missed that, our method was supposed to block your program (and we
knew about it)". Two, we can solve the problem, and make a slow
method fast.

If we can make a slow method fast, why not do it? Especially when the
fix is 3 lines of code.

> they can freely substitute the one-liner
> Sets.newHashSet(table.values()).containsAll(collection);

Do you think a programmer thinks about the complexity of a method
every time he types a method call? Coding like that would be
practically impossible. We have bugs because programmers miss null
checks, let alone complexity. That's why we have libraries: to make
programmers' lives easy, not to test their powers of concentration.

Best,

Adrian

@gissuebot

Original comment posted by cpovirk@google.com on 2012-07-26 at 09:21 PM


We have a long-standing internal bug to "create a table of our collection implementations and their operations showing the big-O performance of each." That might help here (though the JDK precedent is probably still more important), but it doesn't answer the question "Why not make it faster?"

Well, there's always the potential for problems in weird edge cases. If anyone implements hashCode() wrong, as Louis points out, he would have problems. He is arguably getting what he deserves, but if there's a chance that we're going to break his app, we'd better be able to demonstrate a clear improvement over what users can do on their own.

Most affected users, as you point out, can call "newHashSet" themselves, and their problem is gone. If the performance impact is severe, they'll likely identify this problem and fix it quickly. For some other users, the performance impact will be negligible. For still others, the HashSet version may even perform worse. (For example, my understanding is that object allocation on Android is still expensive compared to server JVMs.)

Should programmers have to worry about time complexity? When we can hide it from them, we should, but I don't think this is one of those cases.


Status: WontFix

@gissuebot

Original comment posted by em...@soldal.org on 2012-07-27 at 06:21 AM


Random thought, but should we try to document certain methods and classes with their big-O performance? Like

/**
 * This is my class. It does class things
 *
 * @big-O O(n log n)
 */

At least in a few places this could work as a gentle warning to the developer.

@gissuebot

Original comment posted by wasserman.louis on 2012-07-27 at 03:30 PM


For "views" like the values() collections, it's much less obvious how to do this cleanly.

@gissuebot

Original comment posted by adi175652q on 2012-07-27 at 07:11 PM


Hi,

I agree with most of the discussion in Comment 5, and I would like to
add two points.

First, I agree it is easy to find performance problems after they
occur in production. But finding performance problems during
development and testing is hard, even if fixing them requires only
one line of code (as in the containsAll() example). To avoid this
problem and not have programmers worry about time complexity, as
Comment 5 says, we should hide the complexity in the library.

Second, we are missing many serious performance improvements because
we assume hashCode() might be implemented incorrectly. Most
performance improvements would require some sort of fast searching,
e.g., using a HashSet. Refusing to rely on hashCode() is a huge
blocker for such major improvements (sure, we can do minor tweaks
without a HashSet, and in some particular cases there is already a
HashSet in the code). If developers have a broken hashCode(), they
should fix their bugs. Right now, we are seriously limiting the
opportunities for improvement because some developers might have bugs
in their code.

Personally, I feel Guava is better than the JDK because Guava evolves
more quickly than the JDK. But blocking most (if not all) serious
performance improvements because of a possible corner case in
hashCode() looks like a bad trade-off. It's almost like having a
written contract to never make any major performance improvements.

Best,

Adrian

@gissuebot

Original comment posted by kevinb@google.com on 2012-07-27 at 07:24 PM


I'm not fussed about what happens to people who implement hashCode wrong, or whose hashCode changes while in the collection. I'd say let's set that argument aside.

So we have a quadratic operation that can be made linear by allocating a bunch of temporary objects, is that right? It's a huge win for your case, but it sounds like a loss for most usages. It seems better to me to document the complexity of our stuff clearly, and then let users do the newHashSet themselves when they need to.

@gissuebot

Original comment posted by wasserman.louis on 2012-07-27 at 08:34 PM


I guess I feel that it's just common sense that containsAll() should take time proportional to the cost of a single contains(), times the size of the target collection.

This also feels like an extreme edge case that people who care about asymptotics already expect to be a quadratic-time operation. I certainly can't remember the last time, in any context, when I called values().containsAll(). Did the OP encounter an actual, real-world case where this caused a performance bug?

I have to admit I don't consider this a "major performance improvement" at all, compared to e.g. a significant constant-factor improvement to add, remove, or contains in a "named" collection like HashMultiset.

@gissuebot

Original comment posted by adi175652q on 2012-07-31 at 02:23 PM


Hi Kevin,

> It's a huge win for your case, but it sounds like a loss for most
> usages.

I agree with Comment 9 that we should also consider other cases. The
quantitative data that compares the existing containsAll() code and
the proposed patch is below.

I wrote a test (attached TestWorstScenariosForProposedPatch.java) to
exercise the best cases for the existing containsAll(), which are
the worst cases for the proposed Full/Lazy patch. These cases happen
when containsAll() returns after inspecting only very small parts of
table.values() and/or "collection". I parametrized the test by the
number of elements inspected until containsAll() can return.

These are the results on my machine (ORIGINAL is the existing code,
PROPOSED is the Full/Lazy patch, CALLER_RESPONSIBILITY is the one-line
workaround proposed in Comment 3):

Milliseconds when table.values().containsAll(collection)... : ORIGINAL  PROPOSED  CALLER_RESPONSIBILITY
...returns FALSE for the 60th element of collection : 48.936 4.422 4.296
...returns FALSE for the 40th element of collection : 30.048 4.412 4.248
...returns FALSE for the 30th element of collection : 21.704 4.364 4.254
...returns FALSE for the 25th element of collection : 18.258 4.372 4.260
...returns FALSE for the 20th element of collection : 14.578 4.384 4.266
...returns FALSE for the 15th element of collection : 12.362 4.486 4.264
...returns FALSE for the 10th element of collection : 8.144 4.358 4.260
...returns FALSE for the 9th element of collection : 7.878 4.358 4.250
...returns FALSE for the 8th element of collection : 7.726 4.374 4.262
...returns FALSE for the 7th element of collection : 6.992 4.374 4.246
...returns FALSE for the 6th element of collection : 6.938 4.360 4.272
...returns FALSE for the 5th element of collection : 5.826 4.356 4.252
...returns FALSE for the 4th element of collection : 5.312 4.368 4.236
...returns FALSE for the 3th element of collection : 3.964 4.380 4.266
...returns FALSE for the 2th element of collection : 3.474 4.376 4.264
...returns FALSE for the 1th element of collection : 2.812 4.360 4.256

...returns FALSE for the 0th element of collection : 1.534 4.366 4.364

...returns TRUE for the 60th element of table.values() : 0.260 0.074 4.276
...returns TRUE for the 40th element of table.values() : 0.182 0.074 4.284
...returns TRUE for the 30th element of table.values() : 0.144 0.074 4.240
...returns TRUE for the 25th element of table.values() : 0.112 0.074 4.288
...returns TRUE for the 20th element of table.values() : 0.094 0.074 4.282
...returns TRUE for the 15th element of table.values() : 0.070 0.076 4.278
...returns TRUE for the 10th element of table.values() : 0.054 0.074 4.268
...returns TRUE for the 9th element of table.values() : 0.058 0.070 4.312
...returns TRUE for the 8th element of table.values() : 0.046 0.072 4.284
...returns TRUE for the 7th element of table.values() : 0.040 0.068 4.282
...returns TRUE for the 6th element of table.values() : 0.036 0.100 4.288
...returns TRUE for the 5th element of table.values() : 0.034 0.072 4.316
...returns TRUE for the 4th element of table.values() : 0.030 0.072 4.304
...returns TRUE for the 3th element of table.values() : 0.024 0.068 4.276
...returns TRUE for the 2th element of table.values() : 0.022 0.096 4.278
...returns TRUE for the 1th element of table.values() : 0.018 0.092 4.284

...returns TRUE for the 0th element of table.values() : 0.014 0.070 4.276

The above quantitative data shows the following:

(1) the proposed patch is faster than the existing containsAll() in
    the majority of cases where only small parts of the collections
    are inspected, and significantly faster in all other cases.

(1.1) when containsAll() returns false, the proposed patch is
    significantly faster than the existing code. In fact, the
    proposed patch is slower only when containsAll() returns after not
    finding an element with index between 0 and 3 from "collection".
    The very worst case is for index 0, when the patched code builds
    the set but never reuses it. Even for the element with index 3
    the difference is small: the existing code takes 3.964
    milliseconds, while the proposed patch takes 4.380 milliseconds.
    If we need to guard against the pathological case of very small
    collections, we can opt not to build the set for a very small
    "collection". I put this in the patch (but not in the experiment)
    for size 4; a sketch of such a guard follows after point (2) below.

(1.2) when containsAll() returns true, the absolute times are
    negligible, e.g., 0.074 milliseconds for the proposed patch. The
    proposed patch is slower only when containsAll() returns after
    finding all the "collection" elements in the first 15 elements of
    "table.values()". Typically, the difference is very small, e.g.,
    0.072 vs. 0.046 milliseconds. If we need to guard against the
    pathological case with only small "table.values()", we can opt not
    to build the set for very small "table.values()". I put this in
    the patch (but not for the experiment) for size 16.

(2) the proposed patch is significantly faster than the one-line
    workaround in Comment 3 for half of the cases (when containsAll()
    returns true) and equally fast for the other half (when
    containsAll() returns false). So, the one-line workaround not
    only requires the programmer to pay attention to time complexity
    but also results in code slower than necessary. (The code could
    be made faster by implementing some method for lazy creation of
    sets, say, com.google.common.collect.Sets.newLazyHashSet, though
    this would still require the programmer to pay attention when this
    method needs to be used.)
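
For illustration, here is a sketch of the size guard mentioned in
(1.1) and (1.2). The thresholds 4 and 16 are the ones quoted above;
the attached patchFullImproved.java may differ in detail (for
example, it populates the set lazily rather than eagerly).

import java.util.Collection;
import java.util.HashSet;

// Sketch of a threshold-guarded linear-time containsAll().
final class GuardedContainsAll {
  private GuardedContainsAll() {}

  static boolean containsAll(Collection<?> values, Collection<?> collection) {
    // For tiny inputs the plain quadratic scan beats building a HashSet.
    if (collection.size() < 4 || values.size() < 16) {
      for (Object target : collection) {
        if (!values.contains(target)) {
          return false;
        }
      }
      return true;
    }
    // Otherwise pay one linear pass to copy the values into a HashSet,
    // then probe each element of the collection in constant time.
    return new HashSet<Object>(values).containsAll(collection);
  }
}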

I am running JDK 1.6.0_24 on Ubuntu 11.10. If possible, I would like
to see the running times on your machines. To run the above test,
just do "java TestWorstScenariosForProposedPatch". Note that the
execution may take some time (3.5 minutes on my machine), because it
generates many data points.

I am also attaching patchFullImproved.java, which has several changes
from the original patchFull.java.

The above numbers are the worst cases for the patch and the best cases
for the existing code, so the common usage will be improved by the
patch, and certainly not slowed down.

Best,

Adrian

@gissuebot

Original comment posted by cpovirk@google.com on 2012-08-02 at 10:54 PM


I still fear for GWT and Android performance, which are less convenient to measure, and I have vague reservations, but that's not much reason to close this.


Status: Research

@gissuebot

Original comment posted by adi175652q on 2012-09-25 at 10:23 PM


I am attaching the benchmark from Comment 11, rewritten with Caliper.
The results are similar to those in Comment 11:

......................................................................
testReturnFalse benchmark index us linear runtime
           true Original 60 49950.0 ==============================
           true Original 40 30831.9 ==================
           true Original 30 19968.1 ===========
           true Original 25 16661.5 ==========
           true Original 20 14361.2 ========
           true Original 15 12968.3 =======
           true Original 10 7568.6 ====
           true Original 9 8066.8 ====
           true Original 8 7886.0 ====
           true Original 7 7359.7 ====
           true Original 6 7212.8 ====
           true Original 5 5966.7 ===
           true Original 4 5392.1 ===
           true Original 3 3984.6 ==
           true Original 2 3641.4 ==
           true Original 1 2990.8 =
           true Original 0 1600.5 =
           true ProposedLazy 60 4588.4 ==
           true ProposedLazy 40 4371.8 ==
           true ProposedLazy 30 4680.9 ==
           true ProposedLazy 25 4385.7 ==
           true ProposedLazy 20 4623.4 ==
           true ProposedLazy 15 4593.7 ==
           true ProposedLazy 10 4593.1 ==
           true ProposedLazy 9 4594.7 ==
           true ProposedLazy 8 4607.5 ==
           true ProposedLazy 7 4456.8 ==
           true ProposedLazy 6 4375.6 ==
           true ProposedLazy 5 4428.9 ==
           true ProposedLazy 4 4594.7 ==
           true ProposedLazy 3 4469.4 ==
           true ProposedLazy 2 4361.2 ==
           true ProposedLazy 1 4601.2 ==
           true ProposedLazy 0 4684.5 ==
           true CallerResponsibility 60 4341.3 ==
           true CallerResponsibility 40 4286.1 ==
           true CallerResponsibility 30 4295.8 ==
           true CallerResponsibility 25 4343.6 ==
           true CallerResponsibility 20 4340.1 ==
           true CallerResponsibility 15 4285.8 ==
           true CallerResponsibility 10 4316.3 ==
           true CallerResponsibility 9 4288.5 ==
           true CallerResponsibility 8 4324.6 ==
           true CallerResponsibility 7 4304.1 ==
           true CallerResponsibility 6 4283.4 ==
           true CallerResponsibility 5 4505.9 ==
           true CallerResponsibility 4 4278.6 ==
           true CallerResponsibility 3 4385.2 ==
           true CallerResponsibility 2 4386.2 ==
           true CallerResponsibility 1 4293.1 ==
           true CallerResponsibility 0 4303.8 ==
          false Original 60 246.4 =
          false Original 40 170.7 =
          false Original 30 113.2 =
          false Original 25 103.6 =
          false Original 20 79.5 =
          false Original 15 68.7 =
          false Original 10 46.0 =
          false Original 9 47.5 =
          false Original 8 44.9 =
          false Original 7 40.2 =
          false Original 6 38.0 =
          false Original 5 33.0 =
          false Original 4 27.0 =
          false Original 3 24.6 =
          false Original 2 20.9 =
          false Original 1 17.3 =
          false Original 0 14.1 =
          false ProposedLazy 60 70.6 =
          false ProposedLazy 40 70.4 =
          false ProposedLazy 30 70.2 =
          false ProposedLazy 25 69.4 =
          false ProposedLazy 20 69.4 =
          false ProposedLazy 15 69.1 =
          false ProposedLazy 10 68.8 =
          false ProposedLazy 9 69.4 =
          false ProposedLazy 8 68.1 =
          false ProposedLazy 7 68.7 =
          false ProposedLazy 6 69.3 =
          false ProposedLazy 5 69.5 =
          false ProposedLazy 4 69.2 =
          false ProposedLazy 3 69.5 =
          false ProposedLazy 2 68.8 =
          false ProposedLazy 1 68.7 =
          false ProposedLazy 0 69.6 =
          false CallerResponsibility 60 4539.6 ==
          false CallerResponsibility 40 4308.1 ==
          false CallerResponsibility 30 4332.4 ==
          false CallerResponsibility 25 4284.4 ==
          false CallerResponsibility 20 4352.9 ==
          false CallerResponsibility 15 4305.9 ==
          false CallerResponsibility 10 4344.9 ==
          false CallerResponsibility 9 4365.5 ==
          false CallerResponsibility 8 4266.1 ==
          false CallerResponsibility 7 4279.6 ==
          false CallerResponsibility 6 4282.6 ==
          false CallerResponsibility 5 4307.1 ==
          false CallerResponsibility 4 4309.7 ==
          false CallerResponsibility 3 4323.3 ==
          false CallerResponsibility 2 4284.7 ==
          false CallerResponsibility 1 4379.7 ==
          false CallerResponsibility 0 4337.0 ==
......................................................................
