Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a fold method for Iterables #218

Closed
gissuebot opened this issue Oct 31, 2014 · 45 comments
Closed

Add a fold method for Iterables #218

gissuebot opened this issue Oct 31, 2014 · 45 comments

Comments

@gissuebot
Copy link

Original issue created by rwallace1979 on 2009-08-12 at 11:37 PM


I find myself always wanting to iterate over a collection to aggregate the
data and wishing Google Collections had a fold operation.

An implementation can be found on StackOverflow,
<http://stackoverflow.com/questions/950751/how-to-implement-a-list-fold-in-java/951337#951337>.

@gissuebot
Copy link
Author

Original comment posted by rwallace1979 on 2009-08-13 at 12:07 AM


A possibly easier implementation is to create a new FoldFunction type. And implement
foldLeft as

static <A, B> A foldLeft(Iterable<B> xs, A z, FoldFunction<A, B> f)
{
    A p = z;
    for (B x : xs)
    {
        p = f.apply(p, x);
    }
    return p;
}

interface FoldFunction<A, B>
{
    A apply(A a, B b);
}

@gissuebot
Copy link
Author

Original comment posted by andre.ruediger on 2009-08-13 at 08:52 AM


What about a compress() function? (Maybe there's a better name.) compress() would
have to return null in the case of failure. Like this:

static <T> Iterable<T> compress(Iterable<T> iterable, CompressFunction<T> f) {
    Iterator<T> i = iterable.iterator();

    if (!i.hasNext())
        return Collections.emptyList();

    T first = i.next();
    if (!i.hasNext())
        return ImmutableList.of(first);

    Stack<T> compressed = new Stack<T>();
    compressed.push(first);

    while (i.hasNext()) {
        T t = i.next();
        T c = f.compress(compressed.peek(), t);

        if (c != null) {
            compressed.pop();
            compressed.push(c);
        } else {
            compressed.push(t);
        }
    }

    return ImmutableList.copyOf(compressed);
}

interface CompressFunction<T> {
    T compress(T t1, T t2);
}

class Range {
    int from, to;
}

class CompressRanges implements CompressFunction<Range> {

    public Range compress(Range r1, Range r2) {
        if (r1.from > r2.to || r1.to < r2.from)
            return null;

        Range r = new Range();
        r.from = Math.min(r1.from, r2.from);
        r.to = Math.min(r1.to, r2.to);
        return r;
    }

}

I'd like to hear your input on this. Do my random thoughts make sense to anybody?

@gissuebot
Copy link
Author

Original comment posted by jvdneste on 2009-08-14 at 09:42 AM


I prefer the term reduce, and i'd also like to see it in the library.

/** (a, b, c, d) -> f(f(f(a, b), c), d) */
public static <T> T reduce(final Iterable<? extends T> gen, final Function2<? super 
T, ? super T, ? extends T> f) {
    final Iterator<? extends T> it = gen.iterator();
    if (!it.hasNext()) {
        return null;
    }
    T result = it.next();
    while (it.hasNext()) {
        result = f.apply(result, it.next());
    }
    return result;
}

/** (a, b, c, d), initial -> f(f(f(f(initial,a), b), c), d) */
public static <X,Y> X reduce(final Iterable<? extends Y> gen, final X initial, final 
Function2<? super X, ? super Y, ? extends X> function) {
    final Iterator<? extends Y> it = gen.iterator();
    if (!it.hasNext()) {
        return initial;
    }
    X acc = initial;
    while (it.hasNext()) {
        acc = function.apply(acc, it.next());
    }
    return acc;
}

@gissuebot
Copy link
Author

Original comment posted by jvdneste on 2009-08-18 at 08:42 AM


This requires the inclusion of a two-argument functor interface. I've seen negative
reactions on previous requests so I'm thinking there's little chance of this one
getting accepted. The fact that it keeps coming back accounts for something though.

@gissuebot
Copy link
Author

Original comment posted by kevinb9n on 2009-08-18 at 09:51 PM


I am not specifically declaring this to be out of scope, but it's important to realize
that this kind of functional programming stuff has never been our core focus. The
main reason we have Predicate at all is just that it's hard to filter an iterator by
hand, and a filter() library method needs something like that, so there it is. The
main reason we have Function is that MapMaker needs it.

@gissuebot
Copy link
Author

Original comment posted by kevinb9n on 2009-09-17 at 04:59 PM


(remember that "Accepted" != "promise that we will do it")


Status: Accepted
Labels: post-1.0

@gissuebot
Copy link
Author

Original comment posted by kevinb9n on 2009-09-17 at 05:45 PM


(No comment entered for this change.)


Labels: -post-1.0, Milestone-Post1.0

@gissuebot
Copy link
Author

Original comment posted by kevinb9n on 2009-09-17 at 05:57 PM


(No comment entered for this change.)


Labels: Type-Enhancement

@gissuebot
Copy link
Author

Original comment posted by kevinb@google.com on 2010-07-30 at 03:53 AM


(No comment entered for this change.)


Labels: -Milestone-Post1.0

@gissuebot
Copy link
Author

Original comment posted by dan.rosen on 2010-10-09 at 12:17 AM


A two arg version of Function isn't necessary. Just some sort of tuple type, let's say Pair<A, B>. So Rich's implementation reduces to:

static <A, B> A foldLeft(Iterable<B> xs, A z, Function<Pair<A, B>, A> f)
    {
        A p = z;
        for (B x : xs)
        {
            p = f.apply(new Pair(p, x));
        }
        return p;
    }

@gissuebot
Copy link
Author

Original comment posted by cgdecker on 2011-04-08 at 03:30 AM


Issue #446 has been merged into this issue.

@gissuebot
Copy link
Author

Original comment posted by cgdecker on 2011-04-08 at 04:48 AM


Creating a Pair type and using it like that just makes the semantics of the function less clear and forces lots of Pair objects to be created when they needn't be.

@gissuebot
Copy link
Author

Original comment posted by gscerbak on 2011-05-10 at 11:39 AM


I have written Iterable around AbstractLinkedIterator with static factory method called unfold which tkes first element and a Function to compute the next element.
Then I added static factory methods first() and rest() for Functions decomposing Iterables using Iterables.getFirst() and Iterables.skip().
Finally fold look as simply as this:

public static final <E, A> A fold(final Iterable<E> iterable,
        final Function<E, A> accumulator) {
    final Function<Iterable<E>, Iterable<E>> rest = rest();
    final Function<Iterable<E>, E> first = first();
    final Iterable<E> generator = Iterables.transform(unfold(iterable, rest),
            first);
    return Iterables.getLast(Iterables.transform(generator, accumulator));
}

The accumulator is simply a function - guava javadocs allow for functions with side effects, so it is possible to store the state of folding in the function itself. This is a trade-off between purity and problems with tuples in Java, resp. creating custom classes just for the accumulated value.

Example:

@Test
public final void sum() {
    final Function<Integer, Integer> sum = new Function<Integer, Integer>() {
        private Integer sum = 0;

        @Override
        public Integer apply(final Integer input) {
            sum += input;
            return sum;
        }
    };
    assertThat(fold(limit(unfold(1, successor), 100), sum), is(5050));
}

If you find this approach useful, just let me know, I will polish the source and I will publish it.

@gissuebot
Copy link
Author

Original comment posted by kevinb@google.com on 2011-07-13 at 06:18 PM


(No comment entered for this change.)


Status: Triaged

@gissuebot
Copy link
Author

Original comment posted by fry@google.com on 2011-12-10 at 03:40 PM


(No comment entered for this change.)


Labels: Package-Collect

@gissuebot
Copy link
Author

Original comment posted by fry@google.com on 2012-02-16 at 07:17 PM


(No comment entered for this change.)


Status: Acknowledged

@gissuebot
Copy link
Author

Original comment posted by qualidafial on 2012-04-09 at 05:33 PM


In my own project I implemented folding as follows:

  interface Reducer<A, B> {
    A reduce(A a, B b);
  }

  class Iterables {
    static <A, B> A reduce(Iterable<B> xs, A z, Reducer<A, B> reducer);
  }

  class BigDecimals {
    static Reducer<BigDecimal, BigDecimal> sum();
  }

With these in place I could reduce a list to a sum as follows:

  Iterable<BigDecimal> lineTotals = ...
  BigDecimal grandTotal = Iterables.reduce(lineTotals, BigDecimals.sum());

I hope you guys will consider adding this feature to Guava; with this one exception, Guava has always seemed to have exactly what I need for data processing.

If there is interest from the project maintainers in adding this feature, I'd be delighted to do the work and submit a patchset.

@gissuebot
Copy link
Author

Original comment posted by wasserman.louis on 2012-04-09 at 07:17 PM


FYI, I think we're astronomically more concerned about the API design, the potential for misuse, and our already existing concern over the readability of the functional idioms we provide (pending Java 8), than the difficulty of adding the feature once an API is endorsed, but we'll definitely update this issue if we decide to pursue folds and the like.

@gissuebot
Copy link
Author

Original comment posted by kevinb@google.com on 2012-05-30 at 07:43 PM


(No comment entered for this change.)


Labels: -Type-Enhancement, Type-Addition

@gissuebot
Copy link
Author

Original comment posted by kevinb@google.com on 2012-06-22 at 06:16 PM


(No comment entered for this change.)


Status: Research

@gissuebot
Copy link
Author

Original comment posted by phidias51 on 2012-07-09 at 08:21 PM


I recently found myself needing the same type of functionality for calculating basic statistical functions on lists of numbers. Since the interest within the project seems limited, has anyone given any thought to making this a separate project?

@gissuebot
Copy link
Author

Original comment posted by wasserman.louis on 2012-07-09 at 08:46 PM


This certainly exists in other projects (functional-java is an example), but...yeah.

(I have to admit, I still don't see what advantage this sort of thing would gain over a traditional for-loop.)

@gissuebot
Copy link
Author

Original comment posted by phidias51 on 2012-07-09 at 09:39 PM


In my case I want to be able to do something like this:

Double sum = List.reduce(List<Double> values, SumReduction summation)

Additional classes for StandardDeviation, Mean, Median, Max, Min would also be useful. Having the interfaces and a couple of simple implementations would make it easier for people to be able to implement more complex mathematical functions.

@gissuebot
Copy link
Author

Original comment posted by wasserman.louis on 2012-07-10 at 01:04 PM


...I'm still not seeing what possible advantage this would have over methods named e.g. sum(List<Double> values), mean(List<Double> values), median(List<Double> values), etc. If you write those methods out, they're almost certainly shorter than "functional-style" reducers based on anonymous classes. If you want to be able to plug different ones in, you're still probably better off with just a Function<List<Double>, Double>.

Furthermore, some of those operations can't be implemented as simple folds.

@gissuebot
Copy link
Author

Original comment posted by phidias51 on 2012-07-10 at 03:53 PM


The problem with creating specific methods like sum, mean, etc is that it's always incomplete in someone's mind. By creating an interface and a couple of simple implementations, it demonstrates how implementations will be used. If someone wants to then spinoff an SPI project to implement a more complete set of functions, or to create libraries of different types of functions, they could do that.

Although the Function<List<Double>, Double> complies with the generic signature of the Function interface, it doesn't make sense from a practical sense. As someone mentioned earlier, the intent of the Function interface is to perform an operation on a each object in a collection with the intent being that you could end up with a transformed copy of the list, or a transformed instance of the original list.

If you accidentally tried to use one of these implementations (of Function<List<Double>, Double>) with Lists.transform, the code would compile, but you would have runtime errors when you ran it. Therefore, I think a new interface with a different method signature would better communicate the intent of these types of aggregate functions.

Could you give me some examples of "operations [that] can't be implemented as simple folds"? I'm not claiming that it's a panacea, nor doubting what you say, merely interested in what you mean.

@gissuebot
Copy link
Author

Original comment posted by qualidafial on 2012-07-10 at 08:46 PM


Standard deviation cannot be implemented as a simple fold, since you must first do one pass to determine the mean, then a second pass to accumulate the square of the difference from the mean.

@gissuebot
Copy link
Author

Original comment posted by wasserman.louis on 2012-07-10 at 10:44 PM


And median can't be implemented as a simple fold, since all the implementations I know of require either multiple passes, or at least O(n) extra storage, which kind of defeats the point.

I'm still not really getting the point, though. If you want to write a project to implement more functions...you write a library with a bunch of static methods: foo(List<Double>), or whatever. (I also vaguely recall that Guava's been working on some statistics utilities, which we might see sometime in the future...)

In any event, the only advantage I can think of to making these objects, instead of e.g. static methods, is being able to pass them around polymorphically and then apply them...which is a need satisfied by Function<List<Double>, Double>. I'm not suggesting anyone use Lists.transform with such a function, but I am saying that you can pass them around polymorphically and call their apply() method, which seems to me like it covers the one advantage objects would have over functions.

@gissuebot
Copy link
Author

Original comment posted by ian.b.robertson on 2012-07-24 at 05:09 PM


<pedantry>Standard deviation could be implemented with a fold, although the accumulator would need to be accumulating both sum(x_i) and sum(x_i)^2.</pedantry>
That said, I agree with Louis - for loops aren't that bad, and they perform well.

@gissuebot
Copy link
Author

Original comment posted by bytefu on 2013-08-30 at 08:29 AM


That said, I agree with Louis - for loops aren't that bad, and they perform well.
Assembler code isn't that bad, and it performs well. Does it mean that we don't need C and Java? 'fold' is a standard idiom for processing lists, after all.

@gissuebot
Copy link
Author

Original comment posted by ceefour666 on 2013-08-30 at 08:45 AM


+1 for #29 above.

The nicest thing about anything that takes closures is that they translate well to the function composition world, that we're already seeing partly with Guava, and will be even more when Java 8 Lambda comes along (syntax sugar to write the FoldFunction). Project Sumatra and proposed Java 9 parallelism also uses this.

@gissuebot
Copy link
Author

Original comment posted by orionllmain on 2013-08-30 at 08:57 AM


Guys, introducing fold without adding other high order functions like skipWhile, takeWhile, reduce, groupBy, flatMap, indexWhere and so on is not the best idea, I think. Guava is not an FP library. If you want fold, use totallylazy. Or Scala.

@gissuebot
Copy link
Author

Original comment posted by bytefu on 2013-08-31 at 08:45 AM


-> #31
So, you see these as FP only, despite their clear correlation to list processing? Guava has few tools for processing collections, why not add some more if people need them? Or it's like: "It's our library, we don't want this stuff. People use it and request some features? Fork 'em!"

Guava already has filter() and transform(). They're FP, they don't belong there. Or do they? I don't see any real reasons not to include some "FP" methods for collection processing. They reduce :) code, they are faster to write than beloved for-loops, and they look nicer in IntelliJ Idea even today, without Java 8 with it's nya-looking lambdas.

You suggest to use some other library or even Scala just for one or two missing functions? I don't want my pom.xml to become separate project, just because I use a library every time I need a method.

Finally, do Guava developers have any reasons to not to add fold() and stuff, other than ideological (because they think almost every list processing idiom is FP only)? "Open source" not only means that sources are open (pretty obvious, huh), but also means some cultural things - like sharing ideas, receiving requests and implementing some useful stuff, you know. Position like "I don't need that, so nobody does" seems kind of strange here.

@gissuebot
Copy link
Author

Original comment posted by orionllmain on 2013-08-31 at 04:24 PM


I want only say that fold is definitely not enough for me. Why fold? Why not flatMap, maxBy or zip? You either add all of these methods, or you add nothing. Adding only fold looks ridiculous.

@gissuebot
Copy link
Author

Original comment posted by xaerxess on 2013-08-31 at 09:17 PM


Why fold? Why not flatMap, maxBy or zip?
 - flatMap - there's FluentIterable#transformAndConcat,
 - maxBy - use Ordering#onResultOf + Ordering#max,
 - zip - not in Guava - requires Pair / Tuple + see this issue.

I don't see any real reasons not to include some "FP" methods for collection processing.
They would require adding many things to Guava API (two-arg functions, pair, etc.) - see new java.util.function package from Java 8 - but contrary to Java 8, functional idioms in Java 6/7 are horribly verbose and ugly.

Although Guava could adopt some names / API from Project Lambda to "backport" FP goodies to Java 6/7, it would be painful to use fold without language support (i.e. without lambdas). And bringing FP idioms to Java as such is separate discussion, for example my collegue believes that average "enterprise" Java programmer will have troubles with lambdas and they are not necessary in the language where for-loop is a common and well known idiom. (FYI: I know / have been using higher order functions in Lisp / Perl and it's not the same experience for both code writer and reader in Java, at least without lambdas in core for which I am waiting impatiently.)

@gissuebot
Copy link
Author

Original comment posted by lowasser@google.com on 2013-12-03 at 07:56 PM


Issue #1601 has been merged into this issue.

@gissuebot
Copy link
Author

Original comment posted by lowasser@google.com on 2013-12-03 at 07:56 PM


Issue #1601 has been merged into this issue.

@gissuebot
Copy link
Author

Original comment posted by jysjys1486 on 2014-02-04 at 08:24 AM


The following assumes sequential operation:

interface Collector<T,A,R> {
   A newContainer();
   void accumulate(A container, T element);
   R finish(A container);
}
Like Java 8 Collector, though functions of supplying and accumulating is pushed to the interface.

<T,A,R> R collect(Iterable<? extends T> elements, Collector<? super T, A, ? extends R> collector) {
   final A container = collector.newContainer();
   for (T each : elements)
      collector.accumulate(container,each);
   return collector.finish(container);
}
Typical mutable reduction implemention

class Collectors {

   public static <T, C extends Collection<T>> Collector<T,?,C> toCollection(Supplier<? extends C> factory) {
      // ...
   }

   // ...

}
Utility for collectors.

@gissuebot
Copy link
Author

Original comment posted by Lord.Laraby on 2014-03-26 at 11:42 PM


I just use this arrangement:
public interface Foldable<A, R> {
    R fold(A left, R right);
    /**
     * reduce or fold the {@link Iterable} with an initial value and folding function from left to right
     * @param list
     * @param acc
     * @param f the Foldable (interface override)
     * @return
     /
    static <A, R> R foldRight(final Iterable<A> list, final R acc, final Foldable<A, R> f) {
        if (Iterables.isEmpty(list))
            return acc;
        return foldRight(Iterables.skip(list, 1), f.fold(Iterables.getFirst(list, (A)null), acc), f);
    }
    /
*
     * reduce or fold the {@link Iterable} with an initial value and folding function from right to left
     *
     * @param list the iterable list or set etc.
     * @param acc initial value
     * @param f the Foldable (interface override)
     * @return the result of folding from right to left
     */
    static <A, R> R foldLeft(final Iterable<A> list, final R acc, final Foldable<A, R> f) {
        if (Iterables.isEmpty(list))
            return acc;
        return f.fold(Iterables.getFirst(list, (A)null), foldLeft(Iterables.skip(list, 1), acc, f));
    }
}

@gissuebot
Copy link
Author

Original comment posted by wasserman.louis on 2014-03-27 at 01:45 AM


The nested Iterables.skips will give extremely terrible performance, certainly quadratic if not worse. You should really just be folding over the Iterator instead.

@kluever
Copy link
Member

kluever commented Apr 10, 2015

At this point we're not adding any more functional features to Guava because of Java8.

@kluever kluever closed this as completed Apr 10, 2015
@jffiorillo
Copy link

But.... What happens with Android (Android still doesn't have Java 8) or others frameworks without Java 8?

@ekovacs
Copy link

ekovacs commented Apr 26, 2016

exactly... on Java 8 i am using the language but where i can't use java 8 i would want to rely on 3rd party libraries, like guava...

@jffiorillo
Copy link

jffiorillo commented Apr 26, 2016

@ekovacs : I guess they have stopped this because they are migrating to Java 8 in Android N.

@kevinb9n
Copy link
Contributor

No, it's not that, we'll be supporting a pre-Java-8 Guava fork for quite a long time. And we're committed to maintaining that as a high quality experience. However, that doesn't go so far as to cover investing in brand new features that can only at best hope to be poor substitutes for what's coming in 8.

@nmarasoiu
Copy link

I have not found a decent way to fold in Java 8 ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants