**I just announced the new *** Learn Spring * course, focused on the fundamentals of Spring 5 and Spring Boot 2:

*Learn Spring*course, focused on the fundamentals of Spring 5 and Spring Boot 2:

Last modified: July 19, 2019

Data structures represent a crucial asset in computer programming, and knowing when and why to use them is very important.

This article is a brief introduction to trie (pronounced “try”) data structure, its implementation and complexity analysis.

A trie is a discrete data structure that’s not quite well-known or widely-mentioned in typical algorithm courses, but nevertheless an important one.

A trie (also known as a digital tree) and sometimes even radix tree or prefix tree (as they can be searched by prefixes), is an ordered tree structure, which takes advantage of the keys that it stores – usually strings.

A node’s position in the tree defines the key with which that node is associated, which makes tries different in comparison to binary search trees, in which a node stores a key that corresponds only to that node.

All descendants of a node have a common prefix of a *String* associated with that node, whereas the root is associated with an empty *String.*

Here we have a preview of *TrieNode *that we will be using in our implementation of the *Trie:*

public class TrieNode { private HashMap<Character, TrieNode> children; private String content; private boolean isWord; // ... }

There may be cases when a trie is a binary search tree, but in general, these are different. Both binary search trees and tries are trees, but each node in binary search trees always has two children, whereas tries’ nodes, on the other hand, can have more.

In a trie, every node (except the root node) stores one character or a digit. By traversing the trie down from the root node to a particular node *n*, a common prefix of characters or digits can be formed which is shared by other branches of the trie as well.

By traversing up the trie from a leaf node to the root node, a *String* or a sequence of digits can be formed.

Here is the *Trie *class, which represents an implementation of the trie data structure:

public class Trie { private TrieNode root; //... }

Now, let’s see how to implement basic operations.

The first operation that we’ll describe is the insertion of new nodes.

Before we start the implementation, it’s important to understand the algorithm:

- Set a current node as a root node
- Set the current letter as the first letter of the word
- If the current node has already an existing reference to the current letter (through one of the elements in the “children” field), then set current node to that referenced node. Otherwise, create a new node, set the letter equal to the current letter, and also initialize current node to this new node
- Repeat step 3 until the key is traversed

**The complexity of this operation is O(n), where n represents the key size.**

Here is the implementation of this algorithm:

public void insert(String word) { TrieNode current = root; for (int i = 0; i < word.length(); i++) { current = current.getChildren() .computeIfAbsent(word.charAt(i), c -> new TrieNode()); } current.setEndOfWord(true); }

Now let’s see how we can use this method to insert new elements in a trie:

private Trie createExampleTrie() { Trie trie = new Trie(); trie.insert("Programming"); trie.insert("is"); trie.insert("a"); trie.insert("way"); trie.insert("of"); trie.insert("life"); return trie; }

We can test that trie has already been populated with new nodes from the following test:

@Test public void givenATrie_WhenAddingElements_ThenTrieNotEmpty() { Trie trie = createTrie(); assertFalse(trie.isEmpty()); }

Let’s now add a method to check whether a particular element is already present in a trie:

- Get children of the root
- Iterate through each character of the
*String* - Check whether that character is already a part of a sub-trie. If it isn’t present anywhere in the trie, then stop the search and return
*false* - Repeat the second and the third step until there isn’t any character left in the
*String.*If the end of the*String*is reached, return*true*

**The complexity of this algorithm is O(n), where n represents the length of the key. **

Java implementation can look like:

public boolean find(String word) { TrieNode current = root; for (int i = 0; i < word.length(); i++) { char ch = word.charAt(i); TrieNode node = current.getChildren().get(ch); if (node == null) { return false; } current = node; } return current.isEndOfWord(); }

And in action:

@Test public void givenATrie_WhenAddingElements_ThenTrieContainsThoseElements() { Trie trie = createExampleTrie(); assertFalse(trie.containsNode("3")); assertFalse(trie.containsNode("vida")); assertTrue(trie.containsNode("life")); }

Aside from inserting and finding an element, it’s obvious that we also need to be able to delete elements.

For the deletion process, we need to follow the steps:

- Check whether this element is already part of the trie
- If the element is found, then remove it from the trie

The complexity of this algorithm is *O(n)*, where n represents the length of the key.

Let’s have a quick look at the implementation:

public void delete(String word) { delete(root, word, 0); } private boolean delete(TrieNode current, String word, int index) { if (index == word.length()) { if (!current.isEndOfWord()) { return false; } current.setEndOfWord(false); return current.getChildren().isEmpty(); } char ch = word.charAt(index); TrieNode node = current.getChildren().get(ch); if (node == null) { return false; } boolean shouldDeleteCurrentNode = delete(node, word, index + 1) && !node.isEndOfWord(); if (shouldDeleteCurrentNode) { current.getChildren().remove(ch); return current.getChildren().isEmpty(); } return false; }

And in action:

@Test void whenDeletingElements_ThenTreeDoesNotContainThoseElements() { Trie trie = createTrie(); assertTrue(trie.containsNode("Programming")); trie.delete("Programming"); assertFalse(trie.containsNode("Programming")); }

In this article, we’ve seen a brief introduction to trie data structure and its most common operations and their implementation.

The full source code for the examples shown in this article can be found over on GitHub.

In the (common) special case where you know the char will fit in latin-1 (or some other managably small data set), defining children to be an array of TriNode to be indexed by the char is MUCH MUCH faster.

Hey Micha – that’s an interesting note, thanks. Yes, I’m very much aware there are optimizations that can be made here, that will improve the time and space complexity of the algorithms.

However, keep in mind the main focus of this article is to introduce and explain the core concept, not necessarily to reach the most efficient implementation possible.

So, I do agree, we’ll leave that here as a note in case someone needs to optimize our base implementation here.

Thanks,

Eugen.