In this article, we’re going to cover two ways we can convert an Array of bytes into a human-readable string in Scala.
2. What Is a Byte Array
A Byte Array is a type in Scala that represents a series of bytes. We can define one in Scala by creating an Array of type Byte and listing out the literal values of the bytes:
val bytes = Array[Byte](104, 101, 108, 108, 111)
3. toString() on a Byte Array
Naturally, if we want to get the string value from something in Scala, we assume we can call toString():
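val bytes = Array[Byte](104, 101, 108, 108, 111)
bytes.toString()
However, this doesn’t return the text. Scala arrays are JVM arrays and inherit the default toString() implementation, so we get a value that looks something like [B@1b6d3586: the JVM type name for a byte array followed by the object’s identity hash code in hexadecimal. That isn’t useful to us here.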
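As a minimal sketch of this behavior, the snippet below checks only the stable part of the toString() output, since the hash portion differs between runs. The object name ToStringDemo is hypothetical and exists purely for illustration; mkString is included because it is a handy way to inspect the raw byte values while debugging.

```scala
object ToStringDemo {
  val bytes = Array[Byte](104, 101, 108, 108, 111)

  // Default JVM toString: "[B" (type name for byte[]) + "@" + hex identity hash code
  val viaToString: String = bytes.toString

  // mkString lists the numeric byte values; readable, but still not the decoded text
  val viaMkString: String = bytes.mkString("[", ", ", "]")

  def main(args: Array[String]): Unit = {
    println(viaToString.startsWith("[B@")) // true
    println(viaMkString)                   // [104, 101, 108, 108, 111]
  }
}
```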
4. The Actual String Value of a Byte Array
The problem is that the Byte Array holds the binary representation of a String, encoded using a character set. To be able to get the value of the string held in our Byte Array, we need to know the character set it’s encoded with, such as UTF-8, UTF-16, or ASCII, among others.
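To illustrate why the character set matters, here is a small sketch that encodes the same text with three standard charsets and compares the resulting bytes. The object name EncodingDemo is a hypothetical name chosen for this example; getBytes and StandardCharsets are part of the Java standard library.

```scala
import java.nio.charset.StandardCharsets

object EncodingDemo {
  val text = "hello"

  // The same five characters produce different bytes under different charsets
  val utf8: Array[Byte] = text.getBytes(StandardCharsets.UTF_8)       // one byte per ASCII character
  val utf16le: Array[Byte] = text.getBytes(StandardCharsets.UTF_16LE) // two bytes each, low byte first
  val utf16be: Array[Byte] = text.getBytes(StandardCharsets.UTF_16BE) // two bytes each, high byte first

  def main(args: Array[String]): Unit = {
    println(utf8.mkString(", "))    // 104, 101, 108, 108, 111
    println(utf16le.mkString(", ")) // 104, 0, 101, 0, 108, 0, 108, 0, 111, 0
    println(utf16be.mkString(", ")) // 0, 104, 0, 101, 0, 108, 0, 108, 0, 111
  }
}
```

Decoding only succeeds when we read the bytes back with the same charset that produced them.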
Let’s try getting the String value again, this time providing the character set of UTF-8:
import java.nio.charset.StandardCharsets
val bytes = Array[Byte](104, 101, 108, 108, 111)
val bytesString = new String(bytes, StandardCharsets.UTF_8)
When we run this code, we decode our Byte Array using the correct character set. We can see that it holds a value of “hello”.
5. Using toChar
We can also call toChar on each byte to get the same result. However, we can’t tell the method which character set we’re using; it simply converts each byte to a single character. Therefore, this only works when every character is stored in exactly one byte with a value from 0 to 127, which is the case for ASCII text and for the ASCII subset of UTF-8:
val bytes = Array[Byte](104, 101, 108, 108, 111)
val bytesString = bytes.map(_.toChar).mkString
Since this Byte Array contains only ASCII characters, each of which UTF-8 encodes as a single byte, we can map over each Byte in the Array and call toChar on it, returning us the expected result of “hello”.
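The limits of the per-byte approach show up as soon as the text contains a non-ASCII character. The sketch below is an illustration under that assumption: it encodes “héllo” as UTF-8, where “é” takes two bytes, and compares the two conversion styles. The object name ToCharLimitsDemo is hypothetical.

```scala
import java.nio.charset.StandardCharsets

object ToCharLimitsDemo {
  // "é" (U+00E9) is two bytes in UTF-8: 0xC3 0xA9, i.e. -61 and -87 as signed Bytes
  val bytes: Array[Byte] = "h\u00e9llo".getBytes(StandardCharsets.UTF_8)

  // One Char per byte: the negative bytes are sign-extended into unrelated characters
  val viaToChar: String = bytes.map(_.toChar).mkString

  // Charset-aware decoding combines the two bytes back into a single "é"
  val viaNewString: String = new String(bytes, StandardCharsets.UTF_8)

  def main(args: Array[String]): Unit = {
    println(bytes.length)                  // 6
    println(viaToChar == "h\u00e9llo")     // false
    println(viaNewString == "h\u00e9llo")  // true
  }
}
```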
Be cautious when using this. If we were to use it with a different character set, such as UTF-16LE, we’d get unexpected results:
val bytes = Array[Byte](104, 0, 101, 0, 108, 0, 108, 0, 111, 0)
val bytesString = bytes.map(_.toChar).mkString
Running this code doesn’t produce “hello”. Because each byte is converted separately, every zero byte becomes a null character, and the result is a 10-character string with \u0000 after each letter. For this input, we should be using the method from the previous section:
import java.nio.charset.StandardCharsets
val bytes = Array[Byte](104, 0, 101, 0, 108, 0, 108, 0, 111, 0)
val bytesString = new String(bytes, StandardCharsets.UTF_16LE)
This returns the expected result of “hello”.
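To make the failure modes concrete, the following sketch decodes the same ten bytes three ways. It relies on the documented behavior of Java’s UTF-16 charset, which falls back to big-endian byte order when the input has no byte-order mark. The object name Utf16Demo is hypothetical.

```scala
import java.nio.charset.StandardCharsets

object Utf16Demo {
  val bytes = Array[Byte](104, 0, 101, 0, 108, 0, 108, 0, 111, 0)

  // One Char per byte: the zero bytes become NUL characters between the letters
  val viaToChar: String = bytes.map(_.toChar).mkString

  // Wrong byte order: with no byte-order mark, UTF_16 reads big-endian,
  // so the pair (0x68, 0x00) becomes U+6800, a CJK ideograph, and so on
  val wrongOrder: String = new String(bytes, StandardCharsets.UTF_16)

  // Matching charset: every two-byte pair decodes to the intended letter
  val correct: String = new String(bytes, StandardCharsets.UTF_16LE)

  def main(args: Array[String]): Unit = {
    println(viaToChar.length)  // 10
    println(wrongOrder.length) // 5
    println(correct)           // hello
  }
}
```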
In this article, we’ve explored a couple of ways to solve the problem of converting a Byte Array to a String. We can map toChar over byte arrays that contain only ASCII characters, or we can instantiate a new String, supplying whatever character set our Byte Array is encoded with.
The code for the article is available over on GitHub.