1. Overview
In this article, we'll look at PCollections, a Java library providing persistent, immutable collections.
Persistent data structures (collections) can't be modified in place: an update operation returns a new object holding the result instead. They are not only immutable but also persistent, which means that after a modification is performed, previous versions of the collection remain unchanged.
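To make the idea concrete before turning to the library, here is a minimal sketch in plain Java. The class PersistentUpdateSketch and its plus() helper are hypothetical names invented for this illustration and are not part of PCollections. The helper copies the whole list on every update, which is the simplest way to keep old versions intact but not an efficient one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration class; not part of PCollections.
final class PersistentUpdateSketch {

    // Returns a NEW unmodifiable list holding the elements of 'original'
    // followed by 'element'. The original list is never modified.
    static <T> List<T> plus(List<T> original, T element) {
        List<T> copy = new ArrayList<>(original);
        copy.add(element);
        return Collections.unmodifiableList(copy);
    }

    public static void main(String[] args) {
        List<String> v1 = Collections.emptyList();
        List<String> v2 = plus(v1, "a");
        List<String> v3 = plus(v2, "b");

        // Every version is still available after the later updates.
        System.out.println(v1); // []
        System.out.println(v2); // [a]
        System.out.println(v3); // [a, b]
    }
}
```

Each call produces a new version while the earlier ones stay readable, which is the behavior the word "persistent" describes here.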
PCollections is analogous to and compatible with the Java Collections framework.
2. Dependencies
To use PCollections in our project, let's add the following dependency to our pom.xml:
<dependency>
    <groupId>org.pcollections</groupId>
    <artifactId>pcollections</artifactId>
    <version>2.1.2</version>
</dependency>
If our project is Gradle-based, we can add the same artifact (org.pcollections:pcollections:2.1.2) to the dependencies block of our build.gradle file.
The latest version can be found on Maven Central.
3. Map Structure (HashPMap)
HashPMap is a persistent map data structure. It is the analog of java.util.HashMap and is used for storing non-null key-value data.
We can instantiate HashPMap by using convenient static methods in HashTreePMap. These static methods return a HashPMap instance that is backed by an IntTreePMap.
The static empty() method of the HashTreePMap class creates an empty HashPMap that has no elements – just like using the default constructor of java.util.HashMap:
HashPMap<String, String> pmap = HashTreePMap.empty();
There are two other static methods that we can use to create HashPMap. The singleton() method creates a HashPMap with only one entry:
HashPMap<String, String> pmap1 = HashTreePMap.singleton("key1", "value1");
assertEquals(1, pmap1.size());
The from() method creates a HashPMap from an existing java.util.HashMap instance (and other java.util.Map implementations):
Map<String, String> map = new HashMap<>();
map.put("mkey1", "mval1");
map.put("mkey2", "mval2");

HashPMap<String, String> pmap2 = HashTreePMap.from(map);
assertEquals(2, pmap2.size());
Although HashPMap inherits some of the methods from java.util.AbstractMap and java.util.Map, it has methods that are unique to it.
The minus() method removes a single entry from the map, while the minusAll() method removes multiple entries. There are also the plus() and plusAll() methods, which add single and multiple entries respectively:
HashPMap<String, String> pmap = HashTreePMap.empty();

// Each call returns a new map; the receiver is left untouched.
HashPMap<String, String> pmap0 = pmap.plus("key1", "value1");

Map<String, String> map = new HashMap<>();
map.put("key2", "val2");
map.put("key3", "val3");
HashPMap<String, String> pmap1 = pmap0.plusAll(map);

HashPMap<String, String> pmap2 = pmap1.minus("key1");
HashPMap<String, String> pmap3 = pmap2.minusAll(map.keySet());

assertEquals(1, pmap0.size());
assertEquals(3, pmap1.size());
assertFalse(pmap2.containsKey("key1"));
assertEquals(0, pmap3.size());
It's important to note that calling put() on pmap will throw an UnsupportedOperationException. Since PCollections objects are persistent and immutable, every modifying operation returns a new instance of an object (HashPMap).
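The contract described here, where put() is rejected and plus() returns a new instance, can be sketched in plain Java. CopyOnWriteMapSketch is a hypothetical class written only for this illustration. It copies the whole map on every change, which is an assumption made for brevity and not how the tree-backed HashPMap works internally. The sketch relies on the documented behavior of java.util.AbstractMap, whose default put() throws UnsupportedOperationException.

```java
import java.util.AbstractMap;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical, naive copy-on-write map; NOT the PCollections implementation.
final class CopyOnWriteMapSketch<K, V> extends AbstractMap<K, V> {

    private final Map<K, V> entries;

    private CopyOnWriteMapSketch(Map<K, V> entries) {
        // Wrap the backing map so it cannot be changed through entrySet().
        this.entries = Collections.unmodifiableMap(entries);
    }

    static <K, V> CopyOnWriteMapSketch<K, V> empty() {
        return new CopyOnWriteMapSketch<K, V>(new HashMap<K, V>());
    }

    // Returns a new map with the extra entry; 'this' stays as it was.
    CopyOnWriteMapSketch<K, V> plus(K key, V value) {
        Map<K, V> copy = new HashMap<K, V>(entries);
        copy.put(key, value);
        return new CopyOnWriteMapSketch<K, V>(copy);
    }

    // Returns a new map without the given key; 'this' stays as it was.
    CopyOnWriteMapSketch<K, V> minus(Object key) {
        Map<K, V> copy = new HashMap<K, V>(entries);
        copy.remove(key);
        return new CopyOnWriteMapSketch<K, V>(copy);
    }

    // The only method AbstractMap requires. put() is deliberately not
    // overridden, so AbstractMap's default implementation throws
    // UnsupportedOperationException.
    @Override
    public Set<Map.Entry<K, V>> entrySet() {
        return entries.entrySet();
    }
}
```

Because the class still implements java.util.Map, it can be passed to code that only reads from a map, while any attempt to mutate it through the standard interface fails fast.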
Let's move on to look at other data structures.
4. List Structure (TreePVector and ConsPStack)
TreePVector is a persistent analog of java.util.ArrayList while ConsPStack is the analog of java.util.LinkedList. TreePVector and ConsPStack have convenient static methods for creating new instances – just like HashPMap.
The empty() method creates an empty TreePVector, while the singleton() method creates a TreePVector with only one element. There's also the from() method that can be used to create an instance of TreePVector from any java.util.Collection.
ConsPStack has static methods with the same name that achieve the same goal.
TreePVector has its own methods for manipulating its contents: minus() and minusAll() remove one or more elements, while plus() and plusAll() add one or more elements.
The with() method replaces the element at a specified index, and the subList() method gets a range of elements from the collection.
These methods are available in ConsPStack as well.
Let's consider the following code snippet that exemplifies the methods mentioned above:
TreePVector<String> pVector = TreePVector.empty();

TreePVector<String> pV1 = pVector.plus("e1");
TreePVector<String> pV2 = pV1.plusAll(Arrays.asList("e2", "e3", "e4"));
assertEquals(1, pV1.size());
assertEquals(4, pV2.size());

TreePVector<String> pV3 = pV2.minus("e1");
TreePVector<String> pV4 = pV3.minusAll(Arrays.asList("e2", "e3", "e4"));
assertEquals(3, pV3.size());
assertEquals(0, pV4.size());

// subList(0, 2) covers indices 0 and 1; the end index is exclusive.
TreePVector<String> pSub = pV2.subList(0, 2);
assertTrue(pSub.contains("e1") && pSub.contains("e2"));

TreePVector<String> pVW = (TreePVector<String>) pV2.with(0, "e10");
assertEquals("e10", pVW.get(0));
In the code snippet above, pSub is another TreePVector object and is independent of pV2. As can be observed, pV2 was not changed by the subList() operation; rather, a new TreePVector object was created containing the elements of pV2 from index 0 (inclusive) to index 2 (exclusive).
This is what is meant by persistence and immutability, and it is what happens with all modifying methods of PCollections.
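Returning a new object on every change does not have to mean copying every element. The sketch below shows the general idea of structural sharing with a tiny persistent stack built from linked cells, the kind of structure the name ConsPStack alludes to. ConsStackSketch and all of its members are hypothetical names for this illustration only; this is a simplified model under that assumption, not the library's actual implementation.

```java
import java.util.NoSuchElementException;

// Hypothetical, minimal persistent stack; an illustration of structural
// sharing, not the PCollections ConsPStack implementation.
final class ConsStackSketch<E> {

    private final E head;
    private final ConsStackSketch<E> tail;
    private final int size;

    private ConsStackSketch(E head, ConsStackSketch<E> tail, int size) {
        this.head = head;
        this.tail = tail;
        this.size = size;
    }

    static <E> ConsStackSketch<E> empty() {
        return new ConsStackSketch<E>(null, null, 0);
    }

    // O(1): allocates one new cell and reuses 'this' as the tail, so the
    // old version and the new version share all existing cells.
    ConsStackSketch<E> plus(E element) {
        return new ConsStackSketch<E>(element, this, size + 1);
    }

    E head() {
        if (size == 0) throw new NoSuchElementException("empty stack");
        return head;
    }

    ConsStackSketch<E> tail() {
        if (size == 0) throw new NoSuchElementException("empty stack");
        return tail;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        ConsStackSketch<String> s1 = ConsStackSketch.<String>empty().plus("e1");
        ConsStackSketch<String> s2 = s1.plus("e2");

        System.out.println(s1.size());       // 1
        System.out.println(s2.size());       // 2
        System.out.println(s2.head());       // e2
        System.out.println(s2.tail() == s1); // true
    }
}
```

Since no cell is ever modified after construction, sharing the tail between versions is safe, and the older stack s1 keeps working exactly as before.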
5. Set Structure (MapPSet)
MapPSet is a persistent, map-backed analog of java.util.HashSet. It can be conveniently instantiated by the static methods of HashTreePSet: empty(), from() and singleton(). They function in the same way as explained in previous examples.
MapPSet has plus(), plusAll(), minus() and minusAll() methods for manipulating set data. Furthermore, it inherits methods from java.util.Set, java.util.AbstractCollection and java.util.AbstractSet:
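// The explicit <String> is needed because empty() is called mid-chain,
// where the element type cannot be inferred from the assignment target.
MapPSet<String> pSet = HashTreePSet.<String>empty()
  .plusAll(Arrays.asList("e1", "e2", "e3", "e4"));
assertEquals(4, pSet.size());

MapPSet<String> pSet1 = pSet.minus("e4");
assertFalse(pSet1.contains("e4"));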
Finally, there's also OrderedPSet – which maintains the insertion order of elements just like java.util.LinkedHashSet.
6. Conclusion
In this quick tutorial, we explored PCollections, a library of persistent data structures that are analogous to the core collections available in Java. The PCollections Javadoc provides more insight into the intricacies of the library.
As always, the complete code can be found over on GitHub.