Scala has several data structures to support lazy operations: Stream, Iterator, and View. These collections are non-strict because all computations on them are deferred. In this tutorial, we’ll discuss the use cases and features of these collections.
2. Lazy and Strict Collections
When we’re dealing with strict collections (such as List, Set, and Map), every transformation of the elements is computed immediately. We can verify this with an assertion:
val list = List(1, 2, 3)
list.map(_ * 2) shouldBe List(2, 4, 6)
On the other hand, there are lazy collections (such as Stream) that allow transformations of the elements without immediate computations:
val stream = Stream(1, 2, 3)
stream.map(_ * 2) // scala.collection.immutable.Stream[Int] = Stream(2, <not computed>)
In this case, the tail of the stream isn’t computed until we reference it.
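To make the difference observable, here’s a minimal sketch that counts how often the mapping function runs. The counters strictCalls and lazyCalls and the other value names are made up for this illustration, and the sketch assumes a Scala version where Stream is still available (it compiles with a deprecation warning on 2.13 and later):

```scala
// Count how many times each mapping function actually runs.
var strictCalls = 0
var lazyCalls = 0

// Strict: map visits every element immediately.
val doubledList = List(1, 2, 3).map { x => strictCalls += 1; x * 2 }

// Lazy: Stream.map computes only the head; the tail is deferred.
val doubledStream = Stream(1, 2, 3).map { x => lazyCalls += 1; x * 2 }
val callsBeforeTraversal = lazyCalls

// Forcing the stream (here via toList) evaluates the remaining elements.
val forced = doubledStream.toList
val callsAfterTraversal = lazyCalls
```

The list evaluates all three elements up front, while the stream evaluates only its head until toList forces the rest.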
3. Iterator
Iterator is a Scala trait that gives access to the elements of a sequence one by one. We can obtain an iterator for any collection, since all Scala collections extend the IterableOnce trait, which defines an abstract method iterator: Iterator[A]:
val list = List(1, 2, 3)
val it: Iterator[Int] = list.iterator
The main methods of Iterator are hasNext, which tells us whether any elements remain, and next, which returns the next element and advances the iterator. Several familiar collection methods are also available on Iterator, such as foreach, map, flatMap, and filter. However, it’s important to remember that some of them behave differently on an Iterator. For example, calling foreach consumes the Iterator, so a subsequent call to itr.next throws a NoSuchElementException because the iterator is now empty:
val itr = Iterator(1, 2, 3)
itr.foreach(println)
itr.next() // throws java.util.NoSuchElementException
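As a rough sketch of how hasNext and next work together, and of what an exhausted iterator looks like, consider the following. The helper drain and the names collected and afterExhaustion are illustrative only, and Try is used here just to capture the exception instead of letting it propagate:

```scala
import scala.collection.mutable.ListBuffer
import scala.util.Try

// Hypothetical helper: manual traversal with hasNext and next().
def drain(it: Iterator[Int]): List[Int] = {
  val buffer = ListBuffer.empty[Int]
  // hasNext checks for remaining elements; next() returns one and advances.
  while (it.hasNext) buffer += it.next()
  buffer.toList
}

val numbers = Iterator(1, 2, 3)
val collected = drain(numbers)

// The iterator is now exhausted, just as it would be after foreach.
val hasMore = numbers.hasNext
// Try captures the NoSuchElementException thrown by next() on an empty iterator.
val afterExhaustion = Try(numbers.next())
```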
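Meanwhile, transforming an Iterator with map doesn’t consume any elements. The function is applied only as we pull values from the resulting iterator, which is the one we should use from then on:
val itr = Iterator(1, 2, 3)
val itrUpdated = itr.map(_ * 5)
itrUpdated.next() shouldBe 5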
The elements of the mapped Iterator are computed only when we request them (lazy execution); until then, the REPL shows just an unevaluated iterator:
val itrUpdated: Iterator[Int] = <iterator>
We can access only the subsequent values of the Iterator via next; we can’t go back to values that the Iterator has already traversed.
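The lazy, single-pass behavior can be sketched with a counter. The names calls, source, and mapped are hypothetical and exist only for this example:

```scala
var calls = 0

val source = Iterator(1, 2, 3)

// map returns a new iterator right away; the function has not run yet.
val mapped = source.map { x => calls += 1; x * 5 }
val callsAfterMap = calls

// Each next() call evaluates exactly one element.
val first = mapped.next()
val callsAfterFirst = calls

// Only the elements that have not been traversed yet remain.
val rest = mapped.toList
```

If we need to revisit elements, we can first copy the iterator into a strict collection, for example with toList.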
4. Stream (LazyList)
Unlike Iterator, Stream (deprecated since 2.13.0 in favor of LazyList) is a collection. Conceptually, a Stream is like a List whose tail is a lazy val:
val stream: Stream[Int] = Stream(1, 2, 3)
stream.head shouldBe 1
In the example above, only the head of the Stream is computed.
Unlike Iterator, Stream lets us access both previously computed values and upcoming ones by index:
stream(0) shouldBe 1
stream(1) shouldBe 2
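One way to see that a Stream remembers the values it has already computed is to count evaluations. This is only a sketch: the evaluations counter and the compute helper are assumptions introduced for the illustration, not part of the Stream API:

```scala
var evaluations = 0

// Hypothetical helper that records every time an element is actually computed.
def compute(n: Int): Int = {
  evaluations += 1
  n * 10
}

// Stream.map computes the head immediately and defers the tail.
val tens = Stream(1, 2, 3).map(compute)

// Accessing index 1 forces the second element (the head is already computed).
val second = tens(1)
val afterFirstAccess = evaluations

// Repeated access is served from the already computed cells.
val secondAgain = tens(1)
val firstValue = tens(0)
val afterRepeatedAccess = evaluations
```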
We can also construct a Stream via the #:: operator in a way similar to the construction of a List via the :: operator:
val stream = 1 #:: 2 #:: 3 #:: Stream.empty
Unless we iterate through the Stream, its elements are never computed. This lets us safely write recursive definitions that have no terminating condition, which would cause a stack overflow with a strict collection, as in this factorial example:
def factorial(a: Int, b: Int): Stream[Int] = a #:: factorial(a * (b + 1), b + 1)
val factorials7: Stream[Int] = factorial(1, 1).take(7)
val factorialsList = factorials7.toList // List(1, 2, 6, 24, 120, 720, 5040)
If we implemented the same factorial method with a strict collection such as List, the unbounded recursion would end in a stack overflow error. It’s important to note that a Stream’s laziness doesn’t help when an operation needs all the elements at once. Unlike map, flatMap, or filter, a method such as size has to evaluate the entire Stream:
stream.size shouldBe 3
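Since Stream is deprecated, the factorial example can also be written with LazyList. The following is a sketch that assumes Scala 2.13 or later; the name factorials and the switch to BigInt (to avoid Int overflow for larger inputs) are choices made for this illustration rather than part of the original example:

```scala
// Assumes Scala 2.13+, where LazyList is available.
// The recursive call is deferred by #::, so there is no unbounded recursion.
def factorials(acc: BigInt, n: Int): LazyList[BigInt] =
  acc #:: factorials(acc * (n + 1), n + 1)

// take(7) is itself lazy; toList forces exactly seven elements.
val firstSeven: List[BigInt] = factorials(1, 1).take(7).toList
```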
5. View
A View is a special kind of collection in Scala that wraps a base collection and applies transformer methods to it lazily. We can turn any Scala collection into its lazy representation via the view method, and turn a view back into a strict collection by forcing it.
Here’s how we apply view on the List to get scala.collection.SeqView[Int]:
val list = List(1, 2, 3)
val listView = list.view
In this instance, calling list.view creates a new collection of type SeqView[Int]. We usually use views to avoid the overhead of building intermediate collections, which improves performance. This matters most when a chain of operations can’t easily be fused into a single pass. In the example below, we’re not interested in the intermediate result of mapping the collection with (_ * 2); we need only the final collection of strings:
(list.view.map(_ * 2).map(_.toString)).force
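Calling force evaluates the View and builds a strict collection from it. Note that since Scala 2.13, force is deprecated and always returns an IndexedSeq, so a conversion such as toList is the preferred way to materialize a view.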
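To illustrate how a view avoids intermediate collections, here’s a small sketch that counts how many elements the first map actually processes. The names numbers and doubled are illustrative, and toList is used instead of force so that the result type is explicit:

```scala
var doubled = 0
val numbers = List(1, 2, 3, 4, 5)

// Each element flows through both map steps on demand, so no intermediate
// List is built, and elements beyond the first two are never touched.
val firstTwo = numbers.view
  .map { x => doubled += 1; x * 2 }
  .map(_.toString)
  .take(2)
  .toList
```

With a strict List, the first map alone would have processed all five elements.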
When we’re dealing with some large set of data, we can consider using the collection view. For example, we want to check if the data set of strings contains some word:
val listOfWords = loadListOfWords("myList.txt")
val occurrence = "Scala"
val hasOccurrence = (words: Iterable[String]) => words.exists(_ == occurrence)
Then, if we call hasOccurrence on listOfWords.view, elements are evaluated on demand, and exists stops as soon as the occurrence is found, so any transformations applied to the view run only for the elements inspected up to that point:
val res = hasOccurrence(listOfWords.view)
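The benefit becomes visible when the view carries a transformation. In this sketch, the word list, the normalize helper, and the normalized counter are all made up for the example (they stand in for loadListOfWords and any real preprocessing):

```scala
var normalized = 0

// Hypothetical preprocessing step that we would like to run as rarely as possible.
def normalize(word: String): String = {
  normalized += 1
  word.trim.toLowerCase
}

val words = List(" Java", "Scala ", "Kotlin", "Haskell")

// exists pulls elements through the view one at a time and stops at the first match.
val found = words.view.map(normalize).exists(_ == "scala")
```

Here only the first two words are normalized; the strict equivalent, words.map(normalize).exists(_ == "scala"), would normalize all four before searching.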
It’s also important to note that we should be careful when using views with side effects, because the side effects won’t run until we force the view:
def printer = println(System.currentTimeMillis())
// Scala 2.13 type; in 2.12 this is SeqView[Unit, Seq[_]]
val printView: SeqView[Unit] = List.range(0, 10).view.map(_ => printer) // nothing is printed yet
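A counter makes the deferred side effects easier to observe than timestamps. The following sketch uses a hypothetical effects counter; it also shows that, unlike a Stream, a view doesn’t cache results, so every traversal runs the side effects again:

```scala
var effects = 0

// Building the view runs nothing.
val effectView = List.range(0, 3).view.map { _ => effects += 1 }
val beforeForcing = effects

// Each traversal re-evaluates the mapping function for every element.
val firstRun = effectView.toList
val afterFirstRun = effects

val secondRun = effectView.toList
val afterSecondRun = effects
```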
6. Conclusion
In this article, we explored Iterator, Stream, and View, which let us work with lazy computations in Scala.
As usual, these examples are available over on GitHub.