Updated Apr 07, 2009 by kevinb9n
Labels: Featured
Faq  
Frequently anticipated questions

General

Is the release candidate safe to use?

Absolutely. It's been proven extensively in production in services like GMail, Reader, Blogger, Docs & Spreadsheets, AdWords, AdSense and dozens more. However, it is still possible that we will need to make a small number of API-level changes between now and when the release is finalized. If you're skittish about change, just wait for 1.0 to be finalized.

Should I use it in my own library which I release publicly?

Sure! But until 1.0 is final, be careful. Use it only in the internals of your implementation; don't expose it in your public API. Use Jar Jar Links or a similar tool to make sure your users won't have any conflicts.

Once 1.0 is released, you should be safe to do just about anything with it.

Should I serialize your collections persistently and expect to be able to deserialize them with future versions of your library?

NOT YET! Wait for 1.0 to go final, please!

Why did Google build all this, when it could have tried to improve the Apache Commons Collections instead?

The Apache Commons Collections very clearly did not meet our needs. It does not use generics, which is a problem for us as we hate to get compilation warnings from our code. It has also been in a "holding pattern" for a long time. We could see that it would require a pretty major investment from us to fix it up until we were happy to use it, and in the meantime, our own library was already growing organically.

An important difference between the Apache library and ours is that our collections very faithfully adhere to the contracts specified by the JDK interfaces they implement. If you review the Apache documentation, you'll find countless examples of violations. They deserve credit for pointing these out so clearly, but still, deviating from standard collection behavior is risky! You must be careful what you do with such a collection; bugs are always just waiting to happen.

Our collections are fully generified and never violate their contracts (with isolated exceptions, where JDK implementations have set a strong precedent for acceptable violations). This means you can pass one of our collections to any method that expects a Collection and feel pretty confident that things will work exactly as they should.

Why build on Java 5, instead of 6?

We had planned on moving to Java 6 in time for the 1.0 release, so that our library could build upon interfaces like NavigableSet. However, this was deprioritized. We can deliver 99% of the value while sticking to Java 5, so it would be senseless to cut off a large chunk of our user base.

Why build on Java 5, instead of 1.4?

Because we hate Java 1.4. Just kidding (but we do). Basically, at Google we simply don't use Java 1.4 anymore, and haven't for years (GWT excepted, but even that has worked with 1.5 for a while now). If you're using 1.4, please try feeding our library into Retrotranslator. Try out the results, tell us how it goes, and please kindly send us any patches you needed to make to our code to get it working.

Design

Why so much emphasis on Iterators and Iterables?

In general, our methods do not require a Collection to be passed in when an Iterable or Iterator would suffice. This distinction is important to us, as sometimes at Google we work with very large quantities of data, which may be too large to fit in memory, but which can be traversed from beginning to end in the course of some computation. Such data structures can be implemented as collections, but most of their methods would have to either throw an exception, return a wrong answer, or perform abysmally. For these situations, Collection is a very poor fit; a square peg in a round hole.

An Iterator represents a one-way scrollable "stream" of elements, and an Iterable is anything which can spawn independent iterators. A Collection is much, much more than this, so we only require it when we need to.
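As a rough illustration of that distinction, here is a minimal sketch using only the JDK. The names IterableSketch, countingTo and sum are made up for this example and are not part of the library. The method sum asks only for an Iterable, so it accepts an ordinary List as well as a lazy source that never holds its elements in memory.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

class IterableSketch {

  // Hypothetical helper: a lazy Iterable over 1..limit that never
  // materializes its elements in memory.
  static Iterable<Long> countingTo(final long limit) {
    return new Iterable<Long>() {
      @Override public Iterator<Long> iterator() {
        // Each call spawns a fresh, independent iterator.
        return new Iterator<Long>() {
          private long next = 1;

          @Override public boolean hasNext() {
            return next <= limit;
          }

          @Override public Long next() {
            if (!hasNext()) {
              throw new NoSuchElementException();
            }
            return next++;
          }

          @Override public void remove() {
            throw new UnsupportedOperationException();
          }
        };
      }
    };
  }

  // Requires only Iterable, so real collections and lazy sources both work.
  static long sum(Iterable<Long> values) {
    long total = 0;
    for (long value : values) {
      total += value;
    }
    return total;
  }

  public static void main(String[] args) {
    System.out.println(sum(Arrays.asList(1L, 2L, 3L)));  // a List is an Iterable
    System.out.println(sum(countingTo(1000000L)));       // so is the lazy source
  }
}
```

Had sum been declared to take a Collection, the lazy source would have needed size(), contains() and the rest, which is exactly the poor fit described above.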

Why do the non-Collection iterables you return implement toString() but not equals() or hashCode()?

It's hard to imagine any equals() implementation that would be useful, given that it must return false when given any List or Set in order to maintain the transitive property.

It's possible hashCode() would be safe to implement, but it seems pointless. If you have an iterable you want to store inside some other collection, you should really copy the contents into a real collection of their own first.

toString() is potentially harmful as it will not perform as anyone expects; it could take arbitrarily long to run if the data set is large.

Why are so many implementation classes marked final?

Designing and documenting classes of a public API is hard stuff, and designing and documenting them to be safely extended is a hundred times harder. We decided that in most cases when we're interested in extending a collection, we'd really be fine with just writing a decorator for it instead. So, we've provided the full complement of "forwarding collections", to make writing decorators easy, and we're advocating that approach.

If there are final classes that you have a very compelling reason to subclass, we'll hear your case.
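To make the decorator approach concrete, here is a minimal JDK-only sketch. SimpleForwardingCollection and NonNullCollection are hypothetical names invented for this example; the first is a simplified stand-in for a forwarding collection, not the library's own class. The decorator adds a null check without subclassing any concrete collection.

```java
import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;

// Simplified stand-in for a forwarding collection: the core operations are
// routed to delegate(). AbstractCollection supplies the rest (addAll,
// contains, toString, ...) in terms of add(), iterator() and size().
abstract class SimpleForwardingCollection<E> extends AbstractCollection<E> {
  protected abstract Collection<E> delegate();

  @Override public boolean add(E element) {
    return delegate().add(element);
  }

  @Override public Iterator<E> iterator() {
    return delegate().iterator();
  }

  @Override public int size() {
    return delegate().size();
  }
}

// A decorator that rejects nulls and otherwise behaves like its backing collection.
class NonNullCollection<E> extends SimpleForwardingCollection<E> {
  private final Collection<E> backing;

  NonNullCollection(Collection<E> backing) {
    this.backing = backing;
  }

  @Override protected Collection<E> delegate() {
    return backing;
  }

  @Override public boolean add(E element) {
    if (element == null) {
      throw new NullPointerException("null element");
    }
    return super.add(element);
  }
}

class DecoratorSketch {
  public static void main(String[] args) {
    Collection<String> names = new NonNullCollection<String>(new ArrayList<String>());
    names.add("a");
    System.out.println(names);
  }
}
```

Because the decorator only wraps a Collection, it works with any implementation, including final ones.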

Why are the names Multiset and Multimap, not MultiSet and MultiMap?

Because "multiset" is a single, unhyphenated word, and we don't capitalize random letters inside those unless we have a good reason to.

Aha, but "bimap" is also a single word! So why is it BiMap, not Bimap?

This case seems analogous but is slightly different. A BiMap is also a Map; it extends the Map interface, and we felt it was important to make that connection clear. This is not the case for Multimap, which emphatically does not extend Map, nor does Multiset extend Set.

Why does BiMap.put(newKey, existingValue) throw an exception instead of just remapping the value?

Because this method comes from the Map interface, and such behavior on a Map would violate the principle of least surprise. If this is the behavior you want, just use BiMap.forcePut() instead.
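The difference between the two methods can be sketched with a toy bidirectional map built from two HashMaps. TinyBiMap is a hypothetical class written for this example; it is not the library's implementation, it assumes non-null keys and values, and the choice of IllegalArgumentException here belongs to the sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Toy bidirectional map for illustration only. Assumes non-null keys and values.
class TinyBiMap<K, V> {
  private final Map<K, V> forward = new HashMap<K, V>();
  private final Map<V, K> backward = new HashMap<V, K>();

  // Refuses a value that is already bound to a different key.
  V put(K key, V value) {
    K owner = backward.get(value);
    if (owner != null && !owner.equals(key)) {
      throw new IllegalArgumentException("value already present: " + value);
    }
    return store(key, value);
  }

  // Silently evicts whichever other entry currently holds the value.
  V forcePut(K key, V value) {
    K owner = backward.get(value);
    if (owner != null && !owner.equals(key)) {
      forward.remove(owner);
    }
    return store(key, value);
  }

  // Writes the mapping in both directions and returns the key's previous value.
  private V store(K key, V value) {
    V old = forward.put(key, value);
    if (old != null) {
      backward.remove(old);
    }
    backward.put(value, key);
    return old;
  }

  V get(K key) {
    return forward.get(key);
  }

  boolean containsKey(K key) {
    return forward.containsKey(key);
  }

  public static void main(String[] args) {
    TinyBiMap<String, Integer> ids = new TinyBiMap<String, Integer>();
    ids.put("alice", 1);
    try {
      ids.put("bob", 1);       // same value, different key
    } catch (IllegalArgumentException e) {
      System.out.println("put refused: " + e.getMessage());
    }
    ids.forcePut("bob", 1);    // evicts the "alice" entry
    System.out.println(ids.containsKey("alice"));
  }
}
```

With put(), a caller who only knows the Map contract never loses an unrelated entry by surprise; forcePut() makes the eviction an explicit choice.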

Why is BiMap.putAll() allowed to leave the bimap in an indeterminate state if it throws an exception?

In general, a method that throws an exception ought to leave the instance with its state unchanged. However, this is not always feasible. For a typical bimap implementation, it would be downright ugly, and slow. Note that this is no different from the regular behavior of Map.putAll(), Collection.addAll(), etc.
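The same partial-update behavior can be observed with plain JDK maps. The sketch below (PartialPutAllSketch is a made-up name) copies a LinkedHashMap containing a null key into a TreeMap with natural ordering, which rejects null keys. Entries copied before the failure stay in the target.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

class PartialPutAllSketch {

  // Returns the target's size after a putAll that fails partway through.
  static int sizeAfterFailedPutAll() {
    Map<String, Integer> source = new LinkedHashMap<String, Integer>();
    source.put("a", 1);
    source.put(null, 2);  // a TreeMap with natural ordering rejects null keys
    source.put("c", 3);

    Map<String, Integer> target = new TreeMap<String, Integer>();
    try {
      target.putAll(source);
    } catch (NullPointerException expected) {
      // Entries copied before the failure are not rolled back.
    }
    return target.size();
  }

  public static void main(String[] args) {
    System.out.println("entries left behind: " + sizeAfterFailedPutAll());
  }
}
```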

Why does BiMap have no getKeyForValue() method?

We did think about it (Doug Lea even half-jokingly suggested naming it teg()!). But you don't really need it; just call inverse().get(). If this method did exist, every implementor of the interface would have to write it over again, and would probably do it exactly like that.

ClassToInstanceMap is interesting, but I need to map a type T to a Foo<T> / I need a ClassToInstanceMultimap / etc. How?

If your goal is to maintain type-safety and avoid casts when using the API "normally", but you don't care to go to great lengths to prevent the wrong types of objects being added, it's always been easy to do this yourself:

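  // The casts are unchecked; type safety relies on callers using only these two methods.
  @SuppressWarnings("unchecked")
  public static class ClassToFooMap extends HashMap<Class<?>, Foo<?>> {
    // Associates a Foo<T> with its type token and returns the previous one, if any.
    public <T> Foo<T> putInstance(Class<T> type, Foo<T> value) {
      return (Foo<T>) put(type, value);
    }
    // Returns the Foo<T> stored for the given type token, or null if absent.
    public <T> Foo<T> getInstance(Class<T> type) {
      return (Foo<T>) get(type);
    }
  }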

Why do you use the type <E extends Comparable> in various APIs, which is not "fully generified"? Shouldn't it be <E extends Comparable<?>>, <E extends Comparable<E>> or <E extends Comparable<? super E>>?

The last suggestion is the correct one, as explained in Effective Java. However, we will be using <E extends Comparable<E>> on parameterless methods in order to work around a hideous javac bug. This will cause you problems when you use a very unusual type like java.sql.Timestamp which is comparable to a supertype. (Needs more explanation.)
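A small sketch may help show where the two bounds differ. Base and Derived are hypothetical classes standing in for a type that is comparable to a supertype (as java.sql.Timestamp is); maxStrict and maxFlexible are made-up method names, not library APIs.

```java
import java.util.Arrays;
import java.util.List;

class ComparableBoundSketch {

  // Comparable to its own type.
  static class Base implements Comparable<Base> {
    final int rank;

    Base(int rank) {
      this.rank = rank;
    }

    @Override public int compareTo(Base other) {
      return Integer.compare(rank, other.rank);
    }
  }

  // Inherits Comparable<Base>; it is NOT Comparable<Derived>.
  static class Derived extends Base {
    Derived(int rank) {
      super(rank);
    }
  }

  // Strict bound: E must be comparable to exactly E.
  static <E extends Comparable<E>> E maxStrict(List<E> items) {
    E best = items.get(0);
    for (E item : items) {
      if (item.compareTo(best) > 0) {
        best = item;
      }
    }
    return best;
  }

  // Flexible bound: E may be comparable to E or to any supertype of E.
  static <E extends Comparable<? super E>> E maxFlexible(List<E> items) {
    E best = items.get(0);
    for (E item : items) {
      if (item.compareTo(best) > 0) {
        best = item;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    List<Base> bases = Arrays.asList(new Base(2), new Base(5));
    List<Derived> deriveds = Arrays.asList(new Derived(1), new Derived(3), new Derived(2));

    System.out.println(maxStrict(bases).rank);       // fine: Base is Comparable<Base>
    System.out.println(maxFlexible(deriveds).rank);  // fine: Derived is Comparable<Base>
    // maxStrict(deriveds) does not compile: Derived is not Comparable<Derived>.
  }
}
```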

Why does Multimap have no putAll(K, Iterable<V>) or putAll(Map<K,V>) methods?

These operations can be performed rather simply. First, it's important to realize that the get(key) method of Multimap returns a "live view" of the collection corresponding to that key. So, multimap.get(myKey).addAll(myValues) accomplishes the first task.

The second task is also simple to perform, because any Map can be viewed as a Multimap: multimap.putAll(Multimaps.forMap(map)).

These "workarounds" are simple enough that we did not want to add two additional methods to the Multimap interface, which is already quite large enough as it is.


Comment by koen.handekyn, Oct 30, 2007

i have been happily using the generic and functional rewrite of the apache commons collections framework @ http://larvalabs.com/collections/. specifically the functional extensions i like very much.

i support your efforts fully however, and specifically the plans to standardize it into the jdk.

don't hesitate to look for some inspiration at the mentioned project (http://larvalabs.com/collections/), again specifically to the functional parts.

Comment by olaf.ber...@gmx.de, Jan 29, 2008

Something I have always been wondering about since getting to know Apache Commons Collections: given the power and expressiveness of Predicate, Transformer/Function and the like, why does nobody extend the Collection interfaces with methods like

- filter(Predicate pred)
- transform(Transformer/Function func)
- collect(Predicate pred)

and so forth? Why do we have to resort to using static utility methods, which I find to be not very OO and, frankly, clumsy?

Is it the desire to leave the original Collection API untouched, avoiding the need to downcast to customized interfaces?

Comment by rick.beton, Jul 21, 2008

Why use the name Multiset when Bag is quite sufficiently self-descriptive? The vague conceptual similarity between Multisets and Multimaps is not important enough to justify introducing the term Multiset.

Comment by kevinb9n, Jul 21, 2008

Rick, in our experience Multiset is the more commonly-used term. See, as one example, the wikipedia disambiguation page for "Bag":

"In set theory and computer science, a multiset, sometimes called a "bag", is a data structure."

Comment by omkaar1, Jul 27, 2008

When are the tests coming?

Comment by nistem, Aug 19, 2008

Would be nice to have a concrete use case example.

Comment by akarnokd, Aug 19, 2008

It would be nice to have examples of the classes, probably with before-after code like in the presentations.

Comment by tovare, Sep 21, 2008

+1 for changing Multiset's name to Bag. I get the Computer Science/Wikipedia rationale, but a Bag is nicer, simpler and a lot more intuitive because it's a good and working metaphor.

Comment by pyrogx1133, Oct 02, 2008

If Bag was truly intuitive, then there should be a method called get() that returns a random element.

Intuitively a Bag contains a bunch of stuff that you pull out randomly.

Multiset is actually far more intuitive.

Comment by tovare, Oct 26, 2008

@pyrogx1133, I usually don't get() random() things from Bags. I see, for instance, some attribute (size, color, title) of the book, do a get() book operation, and voilà, I have the book I wanted. Now of course, there may be different scenarios, like getting a pencil at the bottom of the bag, where I must get a book before being able to reach the pencil.

Not all invented words are good words, multiset is one of the bad ones. Multisets means nothing, it could have meant multiple sets, but it doesn't ... it is just a badly invented word.

Comment by Alex.X.Zhang, Nov 06, 2008

I prefer the name multiset. I still cannot get why "bag" is intuitive. Also the name has to be google-able. Try google bag.

Comment by ericjswrycan, Nov 30, 2008

I agree Multiset is an awful coining. And in my own experience it's hardly in common use among Java users. Bag is much more familiar from other collections frameworks such as the Commons. Why use a longer, clunkier word when a short, simple one is available?

Comment by Ionized, Dec 02, 2008

Why does Maps.capacity aim for a table that is 25-50% full when it contains the expectedSize entries? Have there been performance tests on this?

Comment by private.biron.ran, Jan 04, 2009

How would you mix a reference map with a Multimap? Specifically, I'd like a weak reference key and a collection with weak references in it, all cleaned up automatically.

Comment by iwan.memruk, Feb 19, 2009

where did Sets.immutableSet() go?!

Comment by dolphin.wan, Apr 08, 2009

> where did Sets.immutableSet() go?!

You can use:

ImmutableSet.of(), ImmutableSet.of(E...), ImmutableSet.of(E)

See http://google-collections.googlecode.com/svn/trunk/javadoc/com/google/common/collect/ImmutableSet.html

Comment by da...@randombits.org, Apr 14, 2009

Hi guys,

The library looks great, and I noticed the '@GwtCompatible' attributes, but I can't figure out how to get them to actually work in GWT, since there is no Module XML file or source code in the library.

Do we have to roll our own?

Comment by kevinb9n, Apr 14, 2009

It will be part of the final 1.0 release. I'll try to get the remaining classes you need copied out here soon.

Comment by da...@randombits.org, Apr 17, 2009

Ok, thanks. I'll check back periodically.

Comment by johnsoneal, Apr 18, 2009

I’m a bit confused, in the presentation and in the "Coding in the small with Google Collections" there is mention of Constrained collections. In the rc1 these have been documented as removed, is this short term or permanent?

Comment by xmlking, May 05, 2009

How do I create an immutable List from an array? If I use ImmutableList.of(), it does the duplicate job of converting the argument into an array!

Comment by kevinb9n, May 06, 2009

ImmutableList.of(array) does copy the array once, but it has to, or it could not be immutable.

Comment by danrbr, May 13, 2009

Great project! We made this work with Hamcrest and Fork Join operators on our little project called Fluent Java, and were very happy with the results.

Comment by estebistec, May 22, 2009

Was there an estimated date for the 1.0 release?

Comment by jared.l.levy, May 22, 2009

We decided to remove the constrained collections permanently, since they were hardly ever used by Google's Java code.

Comment by suttons, Jun 02, 2009

Can someone point me to a Jared version of the Google Collections documentation?

Comment by bertvanbrakel, Jul 08, 2009

+1 on changing Multiset to MultiBag, as Set already means something in Java. If a Multiset is not a Set then find a different name. We can be academic about this and continue backing Set because it's the most correct usage of the word, however this is not the aim. The aim is to write a library which is easily usable with the minimum amount of learning and surprise. Using Set IMHO will just cause confusion and will be something which will be regretted down the track (when a set is not a set). An intuitive, easy-to-work-with API is better than one which gets the academics all hot and sweaty.

The API docs already state 'A multiset is also sometimes called a bag', so why not just use it and remove the confusion?

Comment by bertvanbrakel, Jul 08, 2009

When I use autocomplete and I see the word 'Set' I assume it's a set. Don't make me look up why it isn't.

MultiMap has the same issue.

Comment by david.sheldon, Aug 27, 2009

What happened to PrimitiveArrays? For example "PrimitiveArrays.asList(...)"

Comment by xyyzzzer, Oct 15, 2009

Why is there no Stack? Or have I just not found it?

