Java Streams
Everything about Java Streams put together
Boilerplate code
We'll use the following code later to perform some of our stream operations:
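The listing itself is not reproduced here, so what follows is only a minimal sketch of what such boilerplate might look like. The Employee fields (id, name, salary), the salaryIncrement() helper, the SampleData class, and the sample values are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the Employee class; the fields are assumptions.
class Employee {
    private final int id;
    private final String name;
    private double salary;

    Employee(int id, String name, double salary) {
        this.id = id;
        this.name = name;
        this.salary = salary;
    }

    int getId() { return id; }
    String getName() { return name; }
    double getSalary() { return salary; }

    // Raises the salary by the given percentage, e.g. 10.0 for +10%.
    void salaryIncrement(double percentage) {
        salary = salary * (100 + percentage) / 100;
    }

    @Override
    public String toString() {
        return "Employee{id=" + id + ", name=" + name + ", salary=" + salary + "}";
    }
}

// Illustrative sample data, made up for these sketches.
class SampleData {
    static Employee[] employees() {
        return new Employee[] {
            new Employee(1, "Alice", 100000.0),
            new Employee(2, "Bob", 200000.0),
            new Employee(3, "Carol", 300000.0)
        };
    }

    static List<Employee> employeeList() {
        return Arrays.asList(employees());
    }
}
```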
Java Streams Creation
Following are some snippets to help create streams.
Stream of Employees from an existing array
Stream of employees from an existing list
Stream from a list of individual Employee objects
Using Stream.Builder to build a stream of Employee objects
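The four creation routes named in this section can be sketched together as follows. This is an illustration rather than the original snippets; the nested Employee class and the sample names are hypothetical and are repeated only so that the sketch compiles on its own.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class StreamCreation {
    // Minimal hypothetical Employee, assumed for illustration.
    static class Employee {
        final int id;
        final String name;
        Employee(int id, String name) { this.id = id; this.name = name; }
    }

    static final Employee[] EMPLOYEES = {
        new Employee(1, "Alice"), new Employee(2, "Bob"), new Employee(3, "Carol")
    };

    // From an existing array.
    static Stream<Employee> fromArray() {
        return Stream.of(EMPLOYEES); // Arrays.stream(EMPLOYEES) works as well
    }

    // From an existing list.
    static Stream<Employee> fromList() {
        List<Employee> list = Arrays.asList(EMPLOYEES);
        return list.stream();
    }

    // From individual objects passed as varargs.
    static Stream<Employee> fromIndividualObjects() {
        return Stream.of(EMPLOYEES[0], EMPLOYEES[1], EMPLOYEES[2]);
    }

    // With Stream.Builder: add the elements one by one, then call build().
    static Stream<Employee> fromBuilder() {
        Stream.Builder<Employee> builder = Stream.builder();
        builder.accept(EMPLOYEES[0]);
        builder.accept(EMPLOYEES[1]);
        builder.accept(EMPLOYEES[2]);
        return builder.build();
    }
}
```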
Java Streams Operations
There are two types of stream operations: intermediate and terminal.
Terminal
An operation that produces a result or a side effect and marks the stream as consumed, which ends the pipeline.
Intermediate
An operation that returns a new stream after applying the supplied operation to the input stream.
Following are some snippets of the available operations on Java Streams and their usage.
forEach
Operation Type: Terminal
forEach() loops over the stream elements and calls the supplied function on each one. Because it is a terminal operation, the stream is considered consumed once forEach() has been called.
NOTE: A stream can be consumed only once. Any attempt to operate on it again throws IllegalStateException: stream has already been operated upon or closed.
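A small sketch of both points, using plain strings instead of Employee objects to keep it short. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class ForEachDemo {
    // Visits every element with forEach and returns what was visited.
    static List<String> visitAll() {
        List<String> visited = new ArrayList<>();
        Stream.of("Alice", "Bob", "Carol").forEach(visited::add);
        return visited;
    }

    // Returns true if operating on a consumed stream throws IllegalStateException.
    static boolean reuseFails() {
        Stream<String> names = Stream.of("Alice", "Bob");
        names.forEach(n -> { });     // the first terminal operation consumes the stream
        try {
            names.forEach(n -> { }); // the second use is not allowed
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }
}
```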
map
Operation Type: Intermediate
map() applies the supplied function to each element of the current stream and returns a new stream of the results. The resulting stream can have the same element type or a different one.
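As a rough illustration, the sketch below maps strings to strings (same type) and strings to integers (different type), collecting each result into a list. MapDemo and its methods are hypothetical names.

```java
import java.util.List;
import java.util.stream.Collectors;

class MapDemo {
    // Same element type in and out: String -> String.
    static List<String> upperCase(List<String> names) {
        return names.stream()
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    // Different element type: String -> Integer.
    static List<Integer> lengths(List<String> names) {
        return names.stream()
                .map(String::length)
                .collect(Collectors.toList());
    }
}
```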
collect
Operation Type: Terminal
Once all the stream processing is done, collect() gathers the elements into a result container using a suitable Collector, such as Collectors.toList().
filter
Operation Type: Intermediate
filter() keeps only the elements that match the supplied Predicate and returns them as a new stream.
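A minimal sketch, assuming a cut-down Employee with only a name and a salary. The class, its fields, and the threshold parameter are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

class FilterDemo {
    // Hypothetical Employee, reduced to the fields this sketch needs.
    static class Employee {
        final String name;
        final double salary;
        Employee(String name, double salary) { this.name = name; this.salary = salary; }
    }

    // Keeps only employees earning more than the threshold and returns their names.
    static List<String> namesEarningMoreThan(List<Employee> employees, double threshold) {
        return employees.stream()
                .filter(e -> e.salary > threshold)
                .map(e -> e.name)
                .collect(Collectors.toList());
    }
}
```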
findFirst
Operation Type: Terminal
findFirst() returns an Optional describing the first element of the stream; the Optional is empty if the stream has no elements.
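A short sketch of both outcomes, combined with filter(). The names are hypothetical.

```java
import java.util.List;
import java.util.Optional;

class FindFirstDemo {
    // Returns the first even number, or an empty Optional if there is none.
    static Optional<Integer> firstEven(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)
                .findFirst();
    }
}
```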
toArray
Operation Type: Terminal
collect() gathers the stream into a Collection. If we need an array instead, we can simply use toArray().
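A minimal sketch with hypothetical names. Passing an array constructor reference yields a typed array, whereas the no-argument toArray() returns Object[].

```java
import java.util.List;

class ToArrayDemo {
    // String[]::new tells toArray() which array type to allocate.
    static String[] asArray(List<String> names) {
        return names.stream().toArray(String[]::new);
    }
}
```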
flatMap
Operation Type: Intermediate
flatMap() maps each element to a stream and flattens the results into a single stream. For example, it can turn a Stream<List<String>> into a Stream<String>.
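A sketch of that flattening step, with hypothetical names: a list of lists becomes one flat list.

```java
import java.util.List;
import java.util.stream.Collectors;

class FlatMapDemo {
    // Each inner list becomes a stream; flatMap concatenates them in encounter order.
    static List<String> flatten(List<List<String>> nested) {
        return nested.stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }
}
```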
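peek
Operation Type: Intermediate
peek() is an intermediate operation that performs an action on each element as it flows through the pipeline and returns a stream of the same elements. It is mainly useful for debugging and logging.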
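A small sketch, assuming we want to record each element before and after a map() step. The names and the log list are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

class PeekDemo {
    // Doubles each number and records what passes through the pipeline.
    static List<Integer> doubledWithLog(List<Integer> numbers, List<String> log) {
        return numbers.stream()
                .peek(n -> log.add("before: " + n))
                .map(n -> n * 2)
                .peek(n -> log.add("after: " + n))
                .collect(Collectors.toList());
    }
}
```

In a sequential pipeline like this one, each element travels through all the steps before the next element starts, so the log entries for one element stay together.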
Method Types and Pipelines
A stream pipeline consists of a stream source, followed by zero or more intermediate operations, and a terminal operation.
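Lazy Evaluation
One of the most important characteristics of Java streams is lazy evaluation: intermediate operations are not executed until a terminal operation is invoked, and short-circuiting operations such as findFirst() let the pipeline stop as soon as the result is known.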
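Laziness can be observed by counting how often a filter predicate actually runs. The sketch below is only an illustration; the class, the methods, and the counter are hypothetical.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class LazyDemo {
    // Counts how many elements the filter inspects before findFirst() is satisfied.
    static int inspectedUntilFirstMatch(List<Integer> numbers) {
        AtomicInteger inspected = new AtomicInteger();
        numbers.stream()
                .filter(n -> { inspected.incrementAndGet(); return n > 2; })
                .findFirst(); // short-circuits at the first match
        return inspected.get();
    }

    // Without a terminal operation the filter never runs at all.
    static int inspectedWithoutTerminalOperation(List<Integer> numbers) {
        AtomicInteger inspected = new AtomicInteger();
        numbers.stream()
                .filter(n -> { inspected.incrementAndGet(); return true; });
        return inspected.get();
    }
}
```

For the input 1, 2, 3, 4, 5 the first method inspects only three elements, because the pipeline stops once 3 has matched.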
Comparison Based Stream Operations
sorted
sorted() sorts the elements of the stream using the Comparator passed to it. Called without arguments, it sorts the elements in their natural order.
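Both variants in a minimal sketch with hypothetical names:

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class SortedDemo {
    // Natural (alphabetical) order.
    static List<String> naturalOrder(List<String> names) {
        return names.stream().sorted().collect(Collectors.toList());
    }

    // Custom order: shortest string first.
    static List<String> byLength(List<String> names) {
        return names.stream()
                .sorted(Comparator.comparingInt(String::length))
                .collect(Collectors.toList());
    }
}
```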
min and max
As the names suggest, min() and max() return the minimum or maximum element of a stream according to a Comparator. The result is an Optional, which is empty if the stream has no elements.
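A short sketch over integers, with hypothetical names:

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class MinMaxDemo {
    static Optional<Integer> smallest(List<Integer> numbers) {
        return numbers.stream().min(Comparator.naturalOrder());
    }

    static Optional<Integer> largest(List<Integer> numbers) {
        return numbers.stream().max(Comparator.naturalOrder());
    }
}
```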
allMatch, anyMatch, and noneMatch
All of these operations take a Predicate and return a boolean. They are short-circuiting: processing stops as soon as the answer is determined.
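The three checks side by side, as a sketch with hypothetical names:

```java
import java.util.List;

class MatchDemo {
    // True only if every element is even.
    static boolean allEven(List<Integer> numbers) {
        return numbers.stream().allMatch(n -> n % 2 == 0);
    }

    // True if at least one element is even.
    static boolean anyEven(List<Integer> numbers) {
        return numbers.stream().anyMatch(n -> n % 2 == 0);
    }

    // True if no element is divisible by 3.
    static boolean noneDivisibleByThree(List<Integer> numbers) {
        return numbers.stream().noneMatch(n -> n % 3 == 0);
    }
}
```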
Java Stream Specialisations
So far we have dealt with object streams. There are also streams for working with primitive data types: IntStream, DoubleStream, and LongStream.
Creation
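A few common ways to obtain an IntStream, as a sketch with hypothetical names:

```java
import java.util.stream.IntStream;
import java.util.stream.Stream;

class PrimitiveStreamCreation {
    // From explicit values.
    static IntStream ofValues() {
        return IntStream.of(1, 2, 3);
    }

    // From a range: 10, 11, 12 (the end is exclusive).
    static IntStream ofRange() {
        return IntStream.range(10, 13);
    }

    // From an object stream, by mapping each element to an int.
    static IntStream fromObjects() {
        return Stream.of("a", "bb", "ccc").mapToInt(String::length);
    }
}
```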
Specialised Operations
Specialised streams provide additional operations, such as sum(), average(), and max(), that make working with numbers straightforward.
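A minimal sketch of these numeric operations on an IntStream. The names are hypothetical.

```java
import java.util.OptionalDouble;
import java.util.OptionalInt;
import java.util.stream.IntStream;

class PrimitiveStreamOps {
    static int sum(int... values) {
        return IntStream.of(values).sum();
    }

    // average() and max() return optionals because the stream may be empty.
    static OptionalDouble average(int... values) {
        return IntStream.of(values).average();
    }

    static OptionalInt max(int... values) {
        return IntStream.of(values).max();
    }
}
```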
Reduction Operations
A reduction is the process of combining the elements of a stream into a summarised result by repeatedly applying a combining operation. We already saw a few reduction operations such as findFirst(), min(), and max().
reduce()
Let's look at the general-purpose reduce() operation. Its most common form is T reduce(T identity, BinaryOperator<T> accumulator), where identity is the initial value and accumulator is the binary operation that is applied repeatedly.
For example, reduce() can compute the sum of all salaries.
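A sketch of the salary sum, assuming a cut-down Employee with only a salary. The class and method names are hypothetical.

```java
import java.util.List;

class ReduceDemo {
    // Hypothetical Employee, reduced to the field this sketch needs.
    static class Employee {
        private final double salary;
        Employee(double salary) { this.salary = salary; }
        double getSalary() { return salary; }
    }

    // 0.0 is the identity; Double::sum is the accumulator applied pairwise.
    static double totalSalary(List<Employee> employees) {
        return employees.stream()
                .map(Employee::getSalary)
                .reduce(0.0, Double::sum);
    }
}
```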
Advanced collect
We already saw how Collectors.toList() gets a list out of the stream. Let's now look at a few more ways to collect the elements of a stream.
joining
Collectors.joining() concatenates the elements of a stream of strings into a single String, inserting the given delimiter between them. Internally it uses java.util.StringJoiner.
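A minimal sketch with hypothetical names:

```java
import java.util.List;
import java.util.stream.Collectors;

class JoiningDemo {
    // Joins all names into one string, separated by ", ".
    static String commaSeparated(List<String> names) {
        return names.stream().collect(Collectors.joining(", "));
    }
}
```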
toSet
We can also use toSet() to get a set out of the stream elements.
toCollection
We can use Collectors.toCollection() to collect the elements into any other collection by passing in a Supplier<Collection>. A constructor reference can serve as the Supplier.
The supplier creates an empty collection, and each element of the stream is then added to it with add().
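The sketch below shows toSet() next to toCollection(), the latter with TreeSet::new as the supplier. The names are hypothetical, and TreeSet is only one possible target collection.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

class CollectionCollectorsDemo {
    // Duplicates collapse because the result is a Set.
    static Set<String> distinctNames(List<String> names) {
        return names.stream().collect(Collectors.toSet());
    }

    // The constructor reference TreeSet::new supplies the empty target collection.
    static TreeSet<String> sortedNames(List<String> names) {
        return names.stream().collect(Collectors.toCollection(TreeSet::new));
    }
}
```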
summarizingDouble
summarizingDouble() is the collector to use when summary statistics are needed from a stream. It applies a double-producing mapping function to each input element and returns a DoubleSummaryStatistics object describing the resulting values.
The DoubleSummaryStatistics object gives us statistics such as count, sum, min, max, and average.
summaryStatistics() produces similar results when we are using one of the specialised streams.
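Both routes in one sketch, assuming a cut-down Employee with only a salary. The class and method names are hypothetical.

```java
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;

class SummaryDemo {
    // Hypothetical Employee, reduced to the field this sketch needs.
    static class Employee {
        private final double salary;
        Employee(double salary) { this.salary = salary; }
        double getSalary() { return salary; }
    }

    // Via the collector on an object stream.
    static DoubleSummaryStatistics viaCollector(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.summarizingDouble(Employee::getSalary));
    }

    // Via a specialised DoubleStream.
    static DoubleSummaryStatistics viaDoubleStream(List<Employee> employees) {
        return employees.stream()
                .mapToDouble(Employee::getSalary)
                .summaryStatistics();
    }
}
```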
partitioningBy
We can partition a stream into two groups, based on whether the elements satisfy a given criterion.
Let's split a List of numbers into evens and odds.
The stream is partitioned into a Map in which the evens are stored under the key true and the odds under the key false.
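A sketch of the even/odd split, with hypothetical names:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitionDemo {
    // Key true holds the evens, key false holds the odds.
    static Map<Boolean, List<Integer>> evensAndOdds(List<Integer> numbers) {
        return numbers.stream()
                .collect(Collectors.partitioningBy(n -> n % 2 == 0));
    }
}
```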
groupingBy
groupingBy() generalises partitioningBy(): it can split the stream into more than two groups. It takes a classification function as its parameter, and this function is applied to each element of the stream.
The value returned by the function is used as the key in the Map that the groupingBy() collector produces.
For example, we can group the employees by the initial character of their first name.
mapping
groupingBy() on its own groups the Employee objects themselves, for example under the first character of their first names. What if we wanted each key to map to something other than Employee objects, such as the Employee IDs? That is what mapping() lets us achieve.
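The grouping by initial and the variant that keeps only IDs can be sketched as follows. The nested Employee class and the method names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingDemo {
    // Hypothetical Employee, reduced to the fields this sketch needs.
    static class Employee {
        private final int id;
        private final String name;
        Employee(int id, String name) { this.id = id; this.name = name; }
        int getId() { return id; }
        String getName() { return name; }
    }

    // groupingBy alone: initial -> employees whose name starts with it.
    static Map<Character, List<Employee>> byInitial(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.groupingBy(e -> e.getName().charAt(0)));
    }

    // groupingBy plus mapping: initial -> IDs of those employees.
    static Map<Character, List<Integer>> idsByInitial(List<Employee> employees) {
        return employees.stream()
                .collect(Collectors.groupingBy(
                        e -> e.getName().charAt(0),
                        Collectors.mapping(Employee::getId, Collectors.toList())));
    }
}
```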
reducing
reducing() is similar to reduce(), which we explored before. It returns a collector that performs a reduction of its input elements.
For example, with reducing() we can compute a 10% increment for each Employee's salary and add up all the increments. The overall operation is broken down into three pieces: identity, mapper, and BinaryOperator. Here, 0.0 is the identity (initial value), e -> e.getSalary() * percentage / 100 is the mapper, and the BinaryOperator is the addition expression (s1, s2) -> s1 + s2.
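The three pieces named above fit together roughly like this. It is a sketch that assumes a cut-down Employee with only a salary; the class and method names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

class ReducingDemo {
    // Hypothetical Employee, reduced to the field this sketch needs.
    static class Employee {
        private final double salary;
        Employee(double salary) { this.salary = salary; }
        double getSalary() { return salary; }
    }

    // Sums the per-employee increments for the given percentage.
    static double totalIncrement(List<Employee> employees, double percentage) {
        Double total = employees.stream().collect(Collectors.reducing(
                0.0,                                    // identity
                e -> e.getSalary() * percentage / 100,  // mapper
                (s1, s2) -> s1 + s2));                  // BinaryOperator
        return total;
    }
}
```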
Parallel Streams
Parallel streams help us execute code in parallel on separate processor cores. The final result is the combination of the individual outcomes.
With a parallel stream, an operation such as incrementSalary is executed on multiple elements at the same time. As with any multi-threaded code, we need to be aware of a couple of things when using parallel():
- The code must be thread-safe. Special care is needed if the operations access shared data.
- If order matters, parallel streams need care. Operations such as forEach() make no ordering guarantee, so the order in which elements are processed may differ from run to run.
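A sketch of both points, with hypothetical names. The first method mutates each element independently, so no shared data is involved; the second relies on collect(), which keeps the encounter order of an ordered source even when the stream is parallel.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelDemo {
    // Hypothetical Employee with a mutable salary.
    static class Employee {
        private double salary;
        Employee(double salary) { this.salary = salary; }
        double getSalary() { return salary; }
        void salaryIncrement(double percentage) {
            salary = salary * (100 + percentage) / 100;
        }
    }

    // Each employee is touched by exactly one task, so there is no shared state.
    static void raiseAll(List<Employee> employees, double percentage) {
        employees.stream().parallel().forEach(e -> e.salaryIncrement(percentage));
    }

    // collect() preserves encounter order; forEach() would not guarantee it.
    static List<Integer> squaresInOrder(int n) {
        return IntStream.rangeClosed(1, n)
                .parallel()
                .map(i -> i * i)
                .boxed()
                .collect(Collectors.toList());
    }
}
```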
Infinite Streams
At times, we might need a continuous stream of elements while still performing operations on it. Unlike a List or Map, whose elements are populated in advance, the range of elements might not be known beforehand. Infinite streams, also known as unbounded streams, cover such cases.
There are two ways to generate infinite streams.
generate
We provide a Supplier, which is called whenever a new stream element needs to be generated.
With infinite streams, we need to provide an eventual termination condition. A common choice is limit(): for example, limit(5) restricts the stream to 5 random numbers generated with the Supplier Math::random.
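A minimal sketch of that combination, with hypothetical names:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class GenerateDemo {
    // Math::random is the Supplier; limit(5) ends the otherwise infinite stream.
    static List<Double> fiveRandoms() {
        return Stream.generate(Math::random)
                .limit(5)
                .collect(Collectors.toList());
    }
}
```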
iterate
iterate() takes two parameters: an initial value, called the seed element, and a function that generates the next element from the previous one. Because each element depends on the one before it, iterate() is inherently sequential and may gain little from parallel streams.
Here, 2 is the seed value, and the second argument is the lambda that produces each subsequent element. The value 2 is passed to it, which generates 4, and this continues until the stream, including the seed value, contains 5 elements.
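A sketch consistent with that description. The doubling lambda is an assumption: the text only tells us that 2 produces 4. The names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class IterateDemo {
    // Seed 2, then each element is twice the previous one (assumed), capped at 5 elements.
    static List<Integer> firstFive() {
        return Stream.iterate(2, i -> i * 2)
                .limit(5)
                .collect(Collectors.toList());
    }
}
```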