Java Streams

Everything about Java Streams put together¶

Boilerplate code¶

We’ll use the following code in the stream examples throughout this article:

// Imports needed by the snippets in this article
import java.util.*;
import java.util.stream.*;

class Employee {
    private int ID;
    private String name;
    private double salary;
    Employee (int ID, String name, double salary) {
        this.ID = ID;
        this.name = name;
        this.salary = salary;
    }

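    // Getters used by the stream examples later in the article
    public int getId() {
        return ID;
    }
    public String getName() {
        return name;
    }
    public double getSalary() {
        return salary;
    }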
    public void incrementSalary(double percentage) {
        this.salary += (this.salary * percentage) / 100;
    }
}

class Main {
    public static void main(String[] args) {
        Employee[] employees = {
                new Employee(1, "Mickey Mouse", 100000.0),
                new Employee(2, "Donald Duck", 200000.0),
                new Employee(3, "Goofy Goo", 300000.0)
        };
        Employee brotherBear = new Employee(4, "Brother Bear", 5000.0);
        Employee mufasa = new Employee(5, "Mufasa - The Lion King", 500000.0);
    }
}

Java Streams Creation¶

Following are some snippets to help create streams.

Stream of Employees from an existing array¶

Stream<Employee> employeeStream = Stream.of(employees);

Stream of employees from an existing list¶

List<Employee> empList = Arrays.asList(employees);
Stream<Employee> employeeStream = empList.stream();

Stream from a list of individual Employee objects¶

Stream<Employee> employeeStream = Stream.of(brotherBear, mufasa);

Using Stream.Builder to build a stream of Employee objects¶

Stream.Builder<Employee> builder = Stream.builder();
builder.accept(brotherBear);
builder.accept(mufasa);
builder.accept(new Employee(6, "SherKhan", 450000.0));
Stream<Employee> employeeStream = builder.build();

Java Streams Operations¶

There are two types of stream operations — intermediate & terminal.

Terminal¶

An operation that marks the stream as consumed and ends the pipeline. It produces a result or a side effect rather than another stream.

Intermediate¶

An operation that applies the supplied function to the input stream and returns a new stream, so that further operations can be chained onto it.

Following are some snippets of available operations on Java Streams and their usages.

forEach¶

Operation Type: Terminal

forEach() loops over the stream’s elements and calls the supplied function on each one. Because it is a terminal operation, the stream is considered consumed once forEach() has been called.

List<Employee> empList = Arrays.asList(employees);
empList.stream().forEach(e -> e.incrementSalary(10.0));

NOTE: A stream can be consumed only once. Any attempt to use it again after a terminal operation throws IllegalStateException: stream has already been operated upon or closed.
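The single-use rule is easy to reproduce. The following is a minimal sketch; the class and method names (StreamReuseDemo, reuseFails) are made up for this illustration and are not part of the article’s boilerplate.

```java
import java.util.stream.Stream;

class StreamReuseDemo {
    // Returns true if a second terminal operation on the same stream
    // object is rejected with an IllegalStateException.
    static boolean reuseFails() {
        Stream<String> names = Stream.of("Mickey", "Donald", "Goofy");
        names.forEach(n -> { });      // first terminal operation consumes the stream
        try {
            names.forEach(n -> { });  // second use of the same stream object
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(reuseFails()); // prints "true"
    }
}
```

To run a second pipeline over the same data, create a fresh stream from the source, for example by calling stream() on the list again.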

map¶

Operation Type: Intermediate

map() applies the supplied function to each element of the current stream and returns a new stream. The resultant stream can be of the same or different type.

List<Employee> empList = Arrays.asList(employees);
List<Double> salaries = empList.stream()
        .map(Employee::getSalary).collect(Collectors.toList());

// stream of integers returned as stream of squared versions of themselves.
Stream<Integer> squared = Stream.of(1, 2, 3, 4, 5).map(x -> x * x);

collect¶

Operation Type: Terminal

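Once all the stream processing is done, collect() gathers the resulting elements into a container such as a List, Set, or Map, depending on the Collector passed to it (Collectors.toList() in the map() example above).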

filter¶

Operation Type: Intermediate

filter() returns a new stream containing only the elements that satisfy the supplied Predicate.

List<Employee> employeesWithSalariesUnder200K = empList.stream()
                .filter(e -> e.getSalary() < 200000)
                .collect(Collectors.toList());

findFirst¶

Operation Type: Terminal

findFirst() returns an Optional for the first entry in the stream; the Optional can, of course, be empty.

Optional<Employee> employee = empList.stream().findFirst();

toArray¶

Operation Type: Terminal

collect() is used to collect the stream into a Collection. If we need to get an array out of the stream, we can simply use toArray().

Employee[] employees = empList.stream().toArray(Employee[]::new);

flatMap¶

Operation Type: Intermediate

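flatMap() maps each element to a stream and then flattens those streams into a single one. It is handy for nested structures: a Stream<List<String>>, for example, becomes a Stream<String>.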

List<List<String>> couples = Arrays.asList(
        Arrays.asList("Donald Duck", "Daisy Duck"),
        Arrays.asList("Mickey Mouse", "Minnie Mouse"),
        Arrays.asList("Nobita", "Shizuka"));

List<String> names = couples.stream()
                .flatMap(Collection::stream)
                .collect(Collectors.toList());
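flatMap() is not limited to Collection::stream; any function that returns a stream will do. As a small sketch (FlatMapDemo and words are illustrative names, and splitting on a single space is an assumption made to keep the example short), here each name is split into its individual words:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FlatMapDemo {
    // Turns a list of space-separated names into one flat list of words.
    static List<String> words(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split(" "))) // one small stream per line
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [Donald, Duck, Mickey, Mouse]
        System.out.println(words(Arrays.asList("Donald Duck", "Mickey Mouse")));
    }
}
```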

peek¶

Operation Type: Intermediate

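peek() is an intermediate operation that performs the supplied action on each element as it flows through the pipeline and passes the element on unchanged. Unlike forEach(), it does not end the stream, which makes it useful mainly for debugging and logging.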
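A short sketch of peek() in the middle of a pipeline follows. The names PeekDemo and squaresOfEvens, and the idea of recording elements in a caller-supplied list, are assumptions chosen for illustration; in practice the action is often just a log statement.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PeekDemo {
    // Records every element that survives the filter, then collects the squares.
    static List<Integer> squaresOfEvens(List<Integer> input, List<Integer> seen) {
        return input.stream()
                .filter(i -> i % 2 == 0)
                .peek(seen::add)          // observe the element without changing it
                .map(i -> i * i)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> seen = new ArrayList<>();
        List<Integer> squares = squaresOfEvens(Arrays.asList(1, 2, 3, 4), seen);
        System.out.println(seen);    // prints [2, 4]
        System.out.println(squares); // prints [4, 16]
    }
}
```

Because peek() is intermediate, its action runs only when a terminal operation (collect() here) pulls elements through the pipeline.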

Method Types and Pipelines¶

A stream pipeline consists of a stream source, followed by zero or more intermediate operations, and a terminal operation.

Example:

List<Employee> employeesWithSalariesUnder200K = empList.stream()
                .filter(e -> e.getSalary() < 200000)
                .collect(Collectors.toList());

Lazy Evaluation¶

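Streams are evaluated lazily: intermediate operations do no work until a terminal operation is invoked, and even then each element is processed only as far as the result requires. In the pipeline below, findFirst() stops as soon as one element has passed both filters, so the remaining elements are never examined.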

Employee employee = Stream.of(employees)
  .filter(e -> e != null)
  .filter(e -> e.getSalary() > 100000)
  .findFirst()
  .orElse(null);
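Laziness can be observed by counting how often a filter is actually invoked. This is a minimal sketch using plain salary values instead of Employee objects; LazyDemo and its method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

class LazyDemo {
    // Counts how many elements the filter inspects before findFirst() is satisfied.
    static int inspectedByFindFirst() {
        AtomicInteger inspected = new AtomicInteger();
        Stream.of(50000.0, 150000.0, 250000.0, 350000.0)
                .filter(salary -> {
                    inspected.incrementAndGet();
                    return salary > 100000;
                })
                .findFirst();             // short-circuits on the first match
        return inspected.get();
    }

    // Builds the same pipeline but never attaches a terminal operation.
    static int inspectedWithoutTerminalOperation() {
        AtomicInteger inspected = new AtomicInteger();
        Stream<Double> pending = Stream.of(50000.0, 150000.0, 250000.0, 350000.0)
                .filter(salary -> {
                    inspected.incrementAndGet();
                    return salary > 100000;
                });
        return inspected.get();           // the filter has not run at all
    }

    public static void main(String[] args) {
        System.out.println(inspectedByFindFirst());              // prints 2
        System.out.println(inspectedWithoutTerminalOperation()); // prints 0
    }
}
```

Only the first two values are inspected in the first method: 50000.0 fails the filter, 150000.0 passes, and the rest are skipped.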

Comparison Based Stream Operations¶

sorted¶

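sorted() sorts the elements of the stream using the comparator passed to it.

// A Comparator must return an int, so the salaries are compared with Double.compare()
List<Employee> sortedBySalary = empList.stream()
      .sorted((e1, e2) -> Double.compare(e1.getSalary(), e2.getSalary()))
      .collect(Collectors.toList());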

min and max¶

As the names suggest, min() and max() return the minimum or maximum element of a stream according to a comparator. Both return an Optional, because the stream may be empty.

Employee highestSalariedEmployee = empList.stream()
      .max(Comparator.comparing(Employee::getSalary))
      .orElseThrow(NoSuchElementException::new);

Employee lowestSalariedEmployee = empList.stream()
      .min(Comparator.comparing(Employee::getSalary))
      .orElseThrow(NoSuchElementException::new);

allMatch, anyMatch, and noneMatch¶

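All of these operations take a Predicate and return a boolean. They are short-circuiting, so processing stops as soon as the answer is determined.

// intList is not defined in the boilerplate; a list like this one is assumed
List<Integer> intList = Arrays.asList(2, 4, 5, 6, 8);
boolean allEven = intList.stream().allMatch(i -> i % 2 == 0);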
boolean oneEven = intList.stream().anyMatch(i -> i % 2 == 0);
boolean noneMultipleOfThree = intList.stream().noneMatch(i -> i % 3 == 0);

Java Stream Specialisations¶

So far we have dealt with object streams. There are also streams that work directly with primitive data types: IntStream, DoubleStream, and LongStream.

Creation¶

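The most common way to create a primitive stream is to call mapToInt(), mapToLong(), or mapToDouble() on an existing stream:

double highestSalary = empList.stream()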
      .mapToDouble(Employee::getSalary)
      .max()
      .orElseThrow(NoSuchElementException::new);
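Primitive streams can also be created directly, without an object stream as the source. The sketch below uses IntStream.of(), range(), and rangeClosed(); the class name PrimitiveStreamCreationDemo and its methods are illustrative only.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class PrimitiveStreamCreationDemo {
    // IntStream.of() wraps explicit int values.
    static int sumOfValues() {
        return IntStream.of(1, 2, 3).sum();
    }

    // range() excludes the upper bound: 1, 2, 3.
    static int sumOfRange() {
        return IntStream.range(1, 4).sum();
    }

    // rangeClosed() includes the upper bound: 1, 2, 3, 4.
    static int sumOfClosedRange() {
        return IntStream.rangeClosed(1, 4).sum();
    }

    // boxed() converts an IntStream back into a Stream<Integer>.
    static List<Integer> boxedRange() {
        return IntStream.rangeClosed(1, 3).boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sumOfValues());      // prints 6
        System.out.println(sumOfRange());       // prints 6
        System.out.println(sumOfClosedRange()); // prints 10
        System.out.println(boxedRange());       // prints [1, 2, 3]
    }
}
```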

Specialised Operations¶

Specialised streams provide some additional operations that make dealing with numbers quite effortless.

Double averageSalary = empList.stream()
        .mapToDouble(Employee::getSalary)
        .average()
        .orElseThrow(NoSuchElementException::new);

Reduction Operations¶

A reduction combines the elements of a stream into a single summarised result by repeatedly applying a combining operation. We already saw a few reduction operations like findFirst(), min(), and max().

reduce()¶

Let’s see the general-purpose reduce() operation. The most common form of reduce() is:

T reduce(T identity, BinaryOperator<T> accumulator)

where identity is the initial value (and also the result if the stream is empty) and accumulator is the binary operation applied repeatedly to the running result and the next element.

For example, sum of all salaries -

Double totalSalaries = empList.stream()
    .map(Employee::getSalary)
    .reduce(0.0, Double::sum);
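To make the roles of identity and accumulator concrete, here is a small sketch on plain Double values. ReduceDemo and its methods are hypothetical names; using Double.NEGATIVE_INFINITY as the identity for a maximum is one possible choice, not the only one.

```java
import java.util.Arrays;
import java.util.List;

class ReduceDemo {
    // 0.0 is the identity for addition, so an empty list yields 0.0.
    static double total(List<Double> salaries) {
        return salaries.stream().reduce(0.0, Double::sum);
    }

    // For a maximum, the identity must not exceed any element,
    // so negative infinity is used here.
    static double highest(List<Double> salaries) {
        return salaries.stream().reduce(Double.NEGATIVE_INFINITY, Double::max);
    }

    public static void main(String[] args) {
        List<Double> salaries = Arrays.asList(100000.0, 200000.0, 300000.0);
        System.out.println(total(salaries));   // prints 600000.0
        System.out.println(highest(salaries)); // prints 300000.0
    }
}
```

The accumulator is applied step by step: 0.0 + 100000.0, then that result + 200000.0, and so on until the stream is exhausted.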

Advanced collect¶

We already saw how we used Collectors.toList() to get the list out of the stream. Let’s now see a few more ways to collect elements from the stream.

joining¶

String empNames = empList.stream()
    .map(Employee::getName)
    .collect(Collectors.joining(", "));

Collectors.joining() concatenates the elements of the stream into a single String, inserting the delimiter between consecutive elements. Internally it uses java.util.StringJoiner.
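joining() also has an overload that takes a prefix and a suffix in addition to the delimiter. A minimal sketch, with JoiningDemo and bracketed as made-up names:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class JoiningDemo {
    // Joins the names with ", " and wraps the result in square brackets.
    static String bracketed(List<String> names) {
        return names.stream()
                .collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        // prints [Mickey Mouse, Donald Duck]
        System.out.println(bracketed(Arrays.asList("Mickey Mouse", "Donald Duck")));
    }
}
```

For an empty stream this overload still emits the prefix and suffix, so the result is "[]".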

toSet¶

We can also use toSet() to get a set out of stream elements:

@Test
public void whenCollectBySet_thenGetSet() {
    Set<String> empNames = empList.stream()
        .map(Employee::getName)
        .collect(Collectors.toSet());

    assertEquals(3, empNames.size()); // expected value first, then the actual value
}

toCollection¶

We can use Collectors.toCollection() to extract the elements into any other collection by passing in a Supplier<Collection>. We can also use a constructor reference for the Supplier:

Vector<String> empNames = empList.stream()
    .map(Employee::getName)
    .collect(Collectors.toCollection(Vector::new));

Here, an empty collection is created internally, and its add() method is called on each element of the stream.
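The choice of target collection changes the result’s behaviour. As a sketch, collecting into a TreeSet (an assumption chosen for illustration; ToCollectionDemo is a hypothetical name) removes duplicates and keeps the names sorted:

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collectors;

class ToCollectionDemo {
    // TreeSet::new is the Supplier; add() is then called for each element.
    static TreeSet<String> sortedUniqueNames(List<String> names) {
        return names.stream()
                .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        // prints [Donald, Goofy, Mickey]
        System.out.println(sortedUniqueNames(
                Arrays.asList("Goofy", "Donald", "Mickey", "Donald")));
    }
}
```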

summarizingDouble¶

When we need summary statistics for the elements of a stream, summarizingDouble() is the collector to use. It applies a double-producing mapping function to each input element and returns a special class containing statistical information for the resulting values -

DoubleSummaryStatistics stats = empList.stream()
    .collect(Collectors.summarizingDouble(Employee::getSalary));

long count = stats.getCount(); // getCount() returns a long
Double sum = stats.getSum();
Double max = stats.getMax();
Double min = stats.getMin();
Double avg = stats.getAverage();

The DoubleSummaryStatistics object gives us statistics such as count, sum, min, max, and average.

summaryStatistics() can be used to generate similar results when we’re using one of the specialised streams -

DoubleSummaryStatistics stats = empList.stream()
    .mapToDouble(Employee::getSalary)
    .summaryStatistics();
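Both routes produce the same statistics. The sketch below runs them side by side on plain Double values rather than Employee objects; SummaryStatsDemo and its method names are illustrative.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;

class SummaryStatsDemo {
    // Route 1: a collector applied to an object stream.
    static DoubleSummaryStatistics viaCollector(List<Double> salaries) {
        return salaries.stream()
                .collect(Collectors.summarizingDouble(Double::doubleValue));
    }

    // Route 2: summaryStatistics() on a primitive DoubleStream.
    static DoubleSummaryStatistics viaPrimitiveStream(List<Double> salaries) {
        return salaries.stream()
                .mapToDouble(Double::doubleValue)
                .summaryStatistics();
    }

    public static void main(String[] args) {
        List<Double> salaries = Arrays.asList(100000.0, 200000.0, 300000.0);
        DoubleSummaryStatistics stats = viaCollector(salaries);
        System.out.println(stats.getCount());   // prints 3
        System.out.println(stats.getMax());     // prints 300000.0
        System.out.println(stats.getAverage()); // prints 200000.0
    }
}
```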

partitioningBy¶

We can partition a stream into two — based on whether the elements satisfy certain criteria or not.

Let’s split a stream of numbers into evens and odds:

Map<Boolean, List<Integer>> mapOfEvenOdd = Stream.of(2, 4, 5, 6, 8).collect(
    Collectors.partitioningBy(i -> i % 2 == 0));

assertEquals(4, mapOfEvenOdd.get(true).size());  // 4 even numbers
assertEquals(1, mapOfEvenOdd.get(false).size()); // 1 odd number

Here, the stream is partitioned into a Map, with evens and odds stored as true and false keys.
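partitioningBy() also accepts a downstream collector, which decides what is stored under each key. As a sketch (PartitionDemo and countEvenOdd are hypothetical names), counting the elements in each partition instead of listing them looks like this:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitionDemo {
    // Maps true to the number of even values and false to the number of odd values.
    static Map<Boolean, Long> countEvenOdd(List<Integer> numbers) {
        return numbers.stream()
                .collect(Collectors.partitioningBy(i -> i % 2 == 0, Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<Boolean, Long> counts = countEvenOdd(Arrays.asList(2, 4, 5, 6, 8));
        System.out.println(counts.get(true));  // prints 4
        System.out.println(counts.get(false)); // prints 1
    }
}
```

The resulting map always contains both the true and the false key, even when one partition is empty.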

groupingBy¶

groupingBy() generalises partitioningBy(): it can split the stream into any number of groups instead of just two. It takes a classification function as its parameter, which is applied to each element of the stream.

The value returned by the function is used as a key to the map that we get from the groupingBy() collector -

Map<Character, List<Employee>> groupByAlphabet = empList.stream().collect(
    Collectors.groupingBy(e -> e.getName().charAt(0))));  // the char is autoboxed to Character

Here, we grouped the employees based on the initial character of their first name.
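Like partitioningBy(), groupingBy() can take a downstream collector. The sketch below groups plain name strings (used here in place of Employee objects to keep the example self-contained) by their first character and counts each group; GroupingDemo and countByInitial are made-up names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingDemo {
    // Key: first character of the name. Value: how many names start with it.
    static Map<Character, Long> countByInitial(List<String> names) {
        return names.stream()
                .collect(Collectors.groupingBy(n -> n.charAt(0), Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<Character, Long> counts = countByInitial(
                Arrays.asList("Mickey Mouse", "Minnie Mouse", "Donald Duck"));
        System.out.println(counts.get('M')); // prints 2
        System.out.println(counts.get('D')); // prints 1
    }
}
```

Unlike partitioningBy(), only keys that actually occur in the stream appear in the map.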

mapping¶

In the above example, groupingBy() grouped the Employee objects themselves under the first character of their names. What if we wanted each group to hold something other than Employee objects, such as the employee IDs? That’s what we can achieve with mapping(), which transforms the elements before they reach the downstream collector -

Map<Character, List<Integer>> idsGroupedByFirstChar = empList.stream().collect(
    Collectors.groupingBy(e -> e.getName().charAt(0),
        Collectors.mapping(Employee::getId, Collectors.toList())));

reducing¶

reducing() is similar to reduce() – which we explored before. It simply returns a collector which performs a reduction of its input elements -

Double percentage = 10.0;
Double salIncrOverhead = empList.stream().collect(Collectors.reducing(
    0.0, e -> e.getSalary() * percentage / 100, (s1, s2) -> s1 + s2));

Here, reducing() computes the 10% increment for each Employee’s salary and adds all the increments together; the salaries themselves are not modified. The overall operation is broken down into three pieces: identity, mapper, and BinaryOperator. Here, 0.0 is the identity (initial value), e -> e.getSalary() * percentage / 100 is the mapper, and the BinaryOperator is the addition expression (s1, s2) -> s1 + s2.
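The same figure can be computed with a primitive stream, which is a useful cross-check. This sketch works on plain Double salaries; ReducingDemo and its method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ReducingDemo {
    // Collector form: identity 0.0, mapper computes the increment, Double::sum combines.
    static double overheadWithReducing(List<Double> salaries, double percentage) {
        return salaries.stream()
                .collect(Collectors.reducing(0.0, s -> s * percentage / 100, Double::sum));
    }

    // Equivalent pipeline using mapToDouble() and sum().
    static double overheadWithSum(List<Double> salaries, double percentage) {
        return salaries.stream()
                .mapToDouble(s -> s * percentage / 100)
                .sum();
    }

    public static void main(String[] args) {
        List<Double> salaries = Arrays.asList(100000.0, 200000.0, 300000.0);
        System.out.println(overheadWithReducing(salaries, 10.0)); // prints 60000.0
        System.out.println(overheadWithSum(salaries, 10.0));      // prints 60000.0
    }
}
```

reducing() is most useful as a downstream collector inside groupingBy() or partitioningBy(); for a top-level reduction, the plain stream version is usually simpler.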

Parallel Streams¶

Parallel streams help us execute code in parallel on separate processor cores. The final result is the combination of each individual outcome.

empList.stream().parallel().forEach(e -> e.incrementSalary(10.0));

Here, incrementSalary() may be executed on multiple elements in parallel. As with any multi-threaded code, we need to be aware of a couple of things when using parallel():

The code must be thread-safe. Take special care if the operations access shared data.
If the order in which elements are processed matters, avoid operations such as forEach() on a parallel stream: the processing order is not guaranteed and can differ from run to run. forEachOrdered() preserves the encounter order at the cost of some parallelism.
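Both points can be addressed without giving up parallel streams entirely. The following sketch (ParallelDemo and its methods are illustrative names) avoids shared mutable state by letting collect() merge the partial results, and uses forEachOrdered() where order matters:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelDemo {
    // collect() combines per-thread partial results safely and,
    // for an ordered source, keeps the encounter order.
    static List<Integer> squares(int n) {
        return IntStream.rangeClosed(1, n)
                .parallel()
                .map(i -> i * i)
                .boxed()
                .collect(Collectors.toList());
    }

    // forEachOrdered() runs the action in encounter order, one element after another.
    static String concatInOrder(List<String> parts) {
        StringBuilder sb = new StringBuilder();
        parts.parallelStream().forEachOrdered(p -> sb.append(p));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(squares(5));                                 // prints [1, 4, 9, 16, 25]
        System.out.println(concatInOrder(Arrays.asList("a", "b", "c"))); // prints abc
    }
}
```

Adding to a plain ArrayList from inside a parallel forEach(), by contrast, is a data race and should be avoided.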

Infinite Streams¶

Sometimes we need to operate on elements while they are still being produced, and the full range of elements cannot be known beforehand, unlike a List or Map whose elements are already populated. Infinite streams, also known as unbounded streams, cover such cases.

There are two ways to generate infinite streams -

generate¶

generate() takes a Supplier, which is called whenever a new stream element is needed.

Stream.generate(Math::random)
    .limit(5)
    .forEach(System.out::println);

With infinite streams, we need to provide an eventual termination condition. Here, limit() restricts the stream to 5 random numbers produced by the Supplier Math::random.

iterate¶

iterate() takes two parameters: an initial value, called the seed element, and a function that generates the next element from the previous one. Because each element depends on its predecessor, iterate() is inherently sequential and gains little from parallel streams -

List<Integer> firstFivePowersOfTwo = Stream.iterate(2, i -> i * 2)
    .limit(5)
    .collect(Collectors.toList());

Here, 2 is the seed value and i -> i * 2 is the function applied to produce each subsequent element. The seed 2 yields 4, which yields 8, and so on until the stream holds 5 elements including the seed: 2, 4, 8, 16, 32.
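The seed does not have to be a single number. As a sketch, a two-element int array can carry a pair of values from one step to the next, which is enough to produce the Fibonacci sequence; IterateDemo and firstFibonacci are hypothetical names.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class IterateDemo {
    // Each step carries the pair (current, next) and maps it to (next, current + next).
    static List<Integer> firstFibonacci(int count) {
        return Stream.iterate(new int[] {0, 1}, pair -> new int[] {pair[1], pair[0] + pair[1]})
                .limit(count)             // terminate the otherwise infinite stream
                .map(pair -> pair[0])     // keep only the current value of each pair
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstFibonacci(7)); // prints [0, 1, 1, 2, 3, 5, 8]
    }
}
```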