New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Aggregator Functions #546
Comments
Original comment posted by SeanPFloyd on 2011-02-10 at 06:41 PM Darn, forgot to delete a main method for testing. Here's a new Version of the class |
Original comment posted by jim.andreou on 2011-02-10 at 08:12 PM Sigh. I agree for the need, but I find the (usage) hassle huge. Especially if you compare this code and the equivalent scala code... it's painful. I do play around with such designs for experimental data analysis. Here is an old attempt, somewhere 4 years back: http://code.google.com/p/jbenchy/ The point is it made my life easier by various shortcuts, i.e. your "I" would be a tuple, "S" would be just the name of the component (and the selector function would be implied, instead of having to painfully spell it out as an inner class), and so on. But the whole thing, in its full generality, while fundamental and useful, does not feel right for Java to me; it's such an uphill battle. I would aggressively take shortcuts to make client code simpler, but then it wouln't be general enough for guava. |
Original comment posted by attilan3i on 2011-02-25 at 03:13 PM I don't like the idea of specifying the selector function either. There might be a way to add aggregation support to guava in some form, but I don't think this will be it. I see an aggregator more like a new interface in guava and not Functions that perform on Iterables. I think the most natural representation of an aggregator would be: public interface Aggregator<I, O> { There can be several utility methods gathered into a class called Aggregators, processing all elements of an iterable would be one of them: public class Aggregators { I like this approach better, but I'm still not sure if this has a place in guava. |
Original comment posted by SeanPFloyd on 2011-02-25 at 04:35 PM attilan3i: I agree, your approach is very nice. I would be very happy with that as well. |
Original comment posted by g.e.dejong on 2011-03-13 at 12:30 PM Hmmm, this is just an instance of a foldl or foldr method, possibly with extra state. However, the Java language is just too verbose to fully encompass it in a usable matter. Still, I see a possibility looming. Sketching it here: public interface AggregateState<T, K> { public class DoubleAverageState implements AggregateState<Double, Double> { public void add(Double d) { public Double state() { public class AggregateStateFunction<T, K> implements BinaryFunction<AggregateState<T, K>, AggregateState<T, K>, T> { Iterable<Double> d = Arrays.asList(1d, 2d, 3d, 4d); // Untested, but the meaning should be clear. |
Original comment posted by gscerbak on 2011-04-14 at 08:30 PM Folding is nice, selector function unnecessary, AggregatorState could be IMHO replaced with the name Accumulator. I don't know what are the reasons for not having indexed Iterators, but I guess they would be similar to those for not having Aggregator/Fold. Maybe one has to put up with "for" loop in Java... |
Original comment posted by g.e.dejong on 2011-04-20 at 09:44 AM Accumulator definitely sounds better. The function (or functor) is there because it allows different behaviour when using the AggregateState. Graphically, this looks like this: s0 --> s1 --> s2 --> s3 A is an iterator, f is the function (or functor), and the state is seperate of that (except for type parameterisation). By the way, in our implementation, the AggregateStateFunctor is a little bit more involved: public static class AggregateStateFunctor<T, K> (Functors eq. Google Functions here). The aggregate function then becomes: public static <T, K> K aggregate(Iterable< ? extends T> it, and as an examle a static method top (returns the top elements of a list): public static <T extends Comparable< ? super T>> Collection<T> top(Iterable< ? extends T> it) which, to me, looks rather elegant. It allows you to use ComparableTopAggregateState in another context. |
Original comment posted by em...@soldal.org on 2011-07-13 at 10:49 PM We use something similar to this quite regularly in our code. Collator<T>(Comparator<T>,Function<Iterable<T>,Void>).collate(Iterable<T>) We use the function basically as a simple closure which is called with sets of objects which are equivalent in the eyes of the comparator. We've found this to be invaluable when trying to optimise away a lot of needless sql joins. By working only with objects that are similar, we can reduce the complexity required elsewhere. |
Original comment posted by kevinb@google.com on 2011-07-16 at 08:21 PM Thanks for the suggestions. Unfortunately, Guava really does not want to go in this direction (higher-order functional programming). We have things like Function and Predicate because we need them for various reasons, but we don't think FP is a good fit for Java as it currently stands in general. If you disagree, perhaps try http://functionaljava.org/ (a library which I, of course, strongly dislike, but to each his own). Status: |
Original issue created by SeanPFloyd on 2011-02-10 at 06:39 PM
I would like to see a way to perform aggregation on collections / iterables.
Example a: Given a List of Integers, add them all up
Example b: Given a Collection of Person Objects, calculate their average age by querying their birthDate fields
My original idea was for Example b) to add an aggregate() method to Iterables:
But since this is basically a one-liner (the docs are much longer than the method :-)) and calling a generic method with many parameters would can be a real pain, I would instead suggest to create a class Aggregates that contains Aggregator functions only. I have added a sample implementation with some functions I could think of (see attached file):
public static Function<Iterable<String>,Integer> averageLength()
A Function that returns the average length of the Strings in an Iterable.
public static Function<Iterable<? extends Number>,BigDecimal> averageOfFloats()
A Function that returns a BigDecimal that corresponds to the average of all numeric values passed from the iterable.
public static Function<Iterable<? extends Number>,BigInteger> averageOfIntegers()
A Function that returns a BigInteger that corresponds to the average of all numeric values passed from the iterable.
public static Function<Iterable<String>,Integer> maxLength()
A Function that returns the length of the longest String in an Iterable.
public static Function<Iterable<String>,Integer> minLength()
A Function that returns the length of the shortest String in an Iterable.
public static Function<Iterable<? extends Number>,BigDecimal> sumOfFloats()
A Function that returns a BigDecimal that corresponds to the sum of all numeric values passed from the iterable.
public static Function<Iterable<? extends Number>,BigInteger> sumOfIntegers()
A Function that returns a BigInteger that corresponds to the integer sum of all numeric values passed from the iterable.
From the examples above, this solves scenario 1:
BigInteger sum = Aggregators.sumOfIntegers().apply(listOfIntegers);
For scenario 2, it would have to be something like this:
(or add Aggregators.averageDate() )
The text was updated successfully, but these errors were encountered: