
1. Overview

We know a Map holds key-value pairs in Java. Sometimes, we may want to load a text file’s content and convert it into a Java Map.

In this quick tutorial, let’s explore how we can achieve it.

2. Introduction to the Problem

Since a Map stores key-value entries, the file should follow a specific format if we want to import its content into a Java Map object.

An example file may explain it quickly:

$ cat theLordOfRings.txt
title:The Lord of the Rings: The Return of the King
director:Peter Jackson
actor:Sean Astin
actor:Ian McKellen
Gandalf and Aragorn lead the World of Men against Sauron's
army to draw his gaze from Frodo and Sam as they approach Mount Doom with the One Ring.

As we can see in the theLordOfRings.txt file, if we consider the colon character as the delimiter, most lines follow the pattern “KEY:VALUE“, such as “director:Peter Jackson“.

Therefore, we can read each line, parse the key and value, and put them in a Map object.

However, there are some special cases we need to take care of:

  • Values containing the delimiter – The value shouldn't be truncated. For example, in the first line, “title:The Lord of the Rings: The Return of the King“, the value itself contains a colon.
  • Duplicated keys – Depending on the requirement, we have three strategies: overwriting the existing entry, discarding later ones, and aggregating the values into a List. For example, there are two “actor” keys in the file.
  • Lines that don’t follow the “KEY:VALUE” pattern – Such lines should be skipped. For instance, see the last two lines in the file.

Next, let’s read this file and store it in a Java Map object.

3. The DupKeyOption Enum

As we’ve discussed, we’ll have three options for the duplicated keys case: overwriting, discarding, and aggregating.

Moreover, if we use the overwriting or discarding option, the returned Map will be of type Map<String, String>. However, if we aggregate values for duplicate keys, we’ll get the result as Map<String, List<String>>.

So, let’s first explore the overwriting and discarding scenarios. In the end, we’ll discuss the aggregating option in a standalone section.

To make our solution flexible, let’s create an enum class so that we can pass the option as a parameter to our solution methods:

enum DupKeyOption {
    OVERWRITE, DISCARD
}

4. Using the BufferedReader and FileReader Classes

We can combine BufferedReader and FileReader to read content from a file line by line.

4.1. Creating the byBufferedReader Method

Let’s create a method based on BufferedReader and FileReader:

public static Map<String, String> byBufferedReader(String filePath, DupKeyOption dupKeyOption) {
    HashMap<String, String> map = new HashMap<>();
    String line;
    try (BufferedReader reader = new BufferedReader(new FileReader(filePath))) {
        while ((line = reader.readLine()) != null) {
            String[] keyValuePair = line.split(":", 2);
            if (keyValuePair.length > 1) {
                String key = keyValuePair[0];
                String value = keyValuePair[1];
                if (DupKeyOption.OVERWRITE == dupKeyOption) {
                    map.put(key, value);
                } else if (DupKeyOption.DISCARD == dupKeyOption) {
                    map.putIfAbsent(key, value);
                }
            } else {
                System.out.println("No Key:Value found in line, ignoring: " + line);
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    return map;
}

The byBufferedReader method accepts two parameters: the input file path and the dupKeyOption value that decides how to handle entries with duplicate keys.

As the code above shows, we’ve defined a BufferedReader object to read lines from the given input file. Then, we parse and handle each line in a while loop. Let’s walk through and understand how it works:

  • We create a BufferedReader object and use try-with-resources to ensure the reader object gets closed automatically
  • We use the split method with its limit parameter set to 2, so that the value part stays intact even if it contains colon characters (see the snippet after this list)
  • Then an if check filters out the line that doesn’t match the “KEY:VALUE” pattern
  • In case there are duplicate keys, if we would like to take the “overwrite” strategy, we can simply call map.put(key, value)
  • Otherwise, calling the putIfAbsent method lets us ignore subsequent entries with duplicate keys
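
To make the limit parameter’s behavior concrete, here’s a quick standalone sketch using the first line of our example file:

String line = "title:The Lord of the Rings: The Return of the King";
String[] keyValuePair = line.split(":", 2);
// keyValuePair[0] -> "title"
// keyValuePair[1] -> "The Lord of the Rings: The Return of the King"
// without the limit, split(":") would also break the value at its inner colons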

Next, let’s test if the method works as expected.

4.2. Testing the Solution

Before we write the corresponding test method, let’s initialize two map objects containing the expected entries:

private static final Map<String, String> EXPECTED_MAP_DISCARD = Stream.of(new String[][]{
    {"title", "The Lord of the Rings: The Return of the King"},
    {"director", "Peter Jackson"},
    {"actor", "Sean Astin"}
  }).collect(Collectors.toMap(data -> data[0], data -> data[1]));

private static final Map<String, String> EXPECTED_MAP_OVERWRITE = Stream.of(new String[][]{
    {"title", "The Lord of the Rings: The Return of the King"},
    {"director", "Peter Jackson"},
    {"actor", "Ian McKellen"}
  }).collect(Collectors.toMap(data -> data[0], data -> data[1]));

As we can see, we’ve initialized two Map objects to help with test assertions. One is for the case where we discard duplicate keys, and the other is for when we overwrite them.

Next, let’s test our method to see if we can get the expected Map objects:

@Test
public void givenInputFile_whenInvokeByBufferedReader_shouldGetExpectedMap() {
    Map<String, String> mapOverwrite = FileToHashMap.byBufferedReader(filePath, FileToHashMap.DupKeyOption.OVERWRITE);
    assertThat(mapOverwrite).isEqualTo(EXPECTED_MAP_OVERWRITE);

    Map<String, String> mapDiscard = FileToHashMap.byBufferedReader(filePath, FileToHashMap.DupKeyOption.DISCARD);
    assertThat(mapDiscard).isEqualTo(EXPECTED_MAP_DISCARD);
}

If we give it a run, the test passes. So, we’ve solved the problem.

5. Using Java Stream

The Stream API has been around since Java 8. Also, the Files.lines method conveniently returns a Stream containing all the lines in a file.

Now, let’s create a method using Stream to solve the problem:

public static Map<String, String> byStream(String filePath, DupKeyOption dupKeyOption) {
    Map<String, String> map = new HashMap<>();
    try (Stream<String> lines = Files.lines(Paths.get(filePath))) {
        lines.filter(line -> line.contains(":"))
            .forEach(line -> {
                String[] keyValuePair = line.split(":", 2);
                String key = keyValuePair[0];
                String value = keyValuePair[1];
                if (DupKeyOption.OVERWRITE == dupKeyOption) {
                    map.put(key, value);
                } else if (DupKeyOption.DISCARD == dupKeyOption) {
                    map.putIfAbsent(key, value);
                }
            });
    } catch (IOException e) {
        e.printStackTrace();
    }
    return map;
}

As the code above shows, the main logic is quite similar to that of our byBufferedReader method. Let’s walk through it quickly:

  • We’re still using try-with-resources on the Stream object, since it holds a reference to the open file; closing the stream closes the file.
  • The filter method skips all lines that don’t follow the “KEY:VALUE” pattern.
  • The forEach method does pretty much the same as the while block in the byBufferedReader solution.

Finally, let’s test the byStream solution:

@Test
public void givenInputFile_whenInvokeByStream_shouldGetExpectedMap() {
    Map<String, String> mapOverwrite = FileToHashMap.byStream(filePath, FileToHashMap.DupKeyOption.OVERWRITE);
    assertThat(mapOverwrite).isEqualTo(EXPECTED_MAP_OVERWRITE);

    Map<String, String> mapDiscard = FileToHashMap.byStream(filePath, FileToHashMap.DupKeyOption.DISCARD);
    assertThat(mapDiscard).isEqualTo(EXPECTED_MAP_DISCARD);
}

When we execute the test, it passes as well.

6. Aggregating Values by Keys

So far, we’ve seen the solutions to the overwriting and discarding scenarios. But, as we’ve discussed, we can also aggregate values by keys if required. Thus, in the end, we’ll have a Map object of the type Map<String, List<String>>. Now, let’s build a method to implement this requirement:

public static Map<String, List<String>> aggregateByKeys(String filePath) {
    Map<String, List<String>> map = new HashMap<>();
    try (Stream<String> lines = Files.lines(Paths.get(filePath))) {
        lines.filter(line -> line.contains(":"))
          .forEach(line -> {
              String[] keyValuePair = line.split(":", 2);
              String key = keyValuePair[0];
              String value = keyValuePair[1];
              if (map.containsKey(key)) {
                  map.get(key).add(value);
              } else {
                  map.put(key, Stream.of(value).collect(Collectors.toList()));
              }
          });
    } catch (IOException e) {
        e.printStackTrace();
    }
    return map;
}

We’ve used the Stream approach to read all lines in the input file. The implementation is pretty straightforward. Once we’ve parsed the key and value from an input line, we check if the key already exists in the result map object. If it does exist, we append the value to the existing list. Otherwise, we initialize a List containing the current value as the single element: Stream.of(value).collect(Collectors.toList()). 
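
As a side note, we could express the same check-then-append logic more compactly with the computeIfAbsent method. Here’s a hypothetical variant of the forEach body, not the implementation shown above:

lines.filter(line -> line.contains(":"))
    .forEach(line -> {
        String[] keyValuePair = line.split(":", 2);
        // computeIfAbsent creates the List on a key's first occurrence;
        // afterward, we simply append the value to it
        map.computeIfAbsent(keyValuePair[0], k -> new ArrayList<>())
          .add(keyValuePair[1]);
    });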

It’s worth mentioning that we shouldn’t initialize the List using Collections.singletonList(value) or List.of(value). This is because both Collections.singletonList and List.of (Java 9+) methods return an immutable List. That is to say, if the same key comes again, we cannot append the value to the list.
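
A quick sketch illustrates the pitfall, as both methods return a List that rejects mutation at runtime:

List<String> values = Collections.singletonList("Sean Astin");
// the next line throws UnsupportedOperationException,
// since singletonList returns an immutable List
values.add("Ian McKellen");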

Next, let’s test our method to see if it does the job. As usual, we create the expected result first:

private static final Map<String, List<String>> EXPECTED_MAP_AGGREGATE = Stream.of(new String[][]{
      {"title", "The Lord of the Rings: The Return of the King"},
      {"director", "Peter Jackson"},
      {"actor", "Sean Astin", "Ian McKellen"}
  }).collect(Collectors.toMap(arr -> arr[0], arr -> Arrays.asList(Arrays.copyOfRange(arr, 1, arr.length))));

Then, the test method itself is pretty simple:

@Test
public void givenInputFile_whenInvokeAggregateByKeys_shouldGetExpectedMap() {
    Map<String, List<String>> mapAgg = FileToHashMap.aggregateByKeys(filePath);
    assertThat(mapAgg).isEqualTo(EXPECTED_MAP_AGGREGATE);
}

The test passes if we give it a run. It means our solution works as expected.

7. Conclusion

In this article, we’ve learned two approaches to read content from a text file and save it in a Java Map object: using the BufferedReader class and using the Stream API.

Further, we’ve addressed implementing three strategies to handle duplicate keys: overwriting, discarding, and aggregating.

As always, the full version of the code is available over on GitHub.
