1. Overview

Serialization is the process of converting an object into a stream of bytes. That object can then be saved to a database or transferred over a network. The opposite operation, extracting an object from a series of bytes, is deserialization. Their main purpose is to save the state of an object so that we can recreate it when needed.

In this tutorial, we'll explore different serialization approaches for Java objects.

First, we'll discuss Java's native APIs for serialization. Next, we'll explore libraries that support the JSON and YAML formats to do the same. Finally, we'll take a look at some cross-language protocols.

2. Sample Entity Class

Let's start by introducing a simple entity that we're going to use throughout this tutorial:

public class User {
    private int id;
    private String name;
    
    //getters and setters
}

In the next sections, we'll go through the most widely used serialization protocols. Through examples, we'll learn the basic usage of each of them.

3. Java's Native Serialization

Serialization lets Java applications exchange object state with other systems or persist it for later use. Java specifies a default way to serialize objects. A Java class can override this default serialization and define its own way of serializing objects.

The advantages of Java native serialization are:

  • It's a simple yet extensible mechanism
  • It maintains the object type and safety properties in the serialized form
  • It can be extended to support marshaling and unmarshaling as needed for remote objects
  • This is a native Java solution, so it doesn't require any external libraries

3.1. The Default Mechanism

As per the Java Object Serialization Specification, we can use the writeObject() method from the ObjectOutputStream class to serialize the object. On the other hand, we can use the readObject() method, which belongs to the ObjectInputStream class, to perform the deserialization.

We'll illustrate the basic process with our User class.

First, our class needs to implement the Serializable interface:

public class User implements Serializable {
    //fields and methods
}

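Next, we should declare the serialVersionUID attribute. If we leave it out, the serialization runtime computes a default value from the class details, which can vary between compilers: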

private static final long serialVersionUID = 1L;

Now, let's create a User object:

User user = new User();
user.setId(1);
user.setName("Mark");

We need to provide a file path to save our data:

String filePath = "src/test/resources/protocols/user.txt";

Now, it's time to serialize our User object to a file:

try (FileOutputStream fileOutputStream = new FileOutputStream(filePath);
     ObjectOutputStream objectOutputStream = new ObjectOutputStream(fileOutputStream)) {
    // closing the streams flushes any buffered bytes to the file
    objectOutputStream.writeObject(user);
}

Here, we used ObjectOutputStream for saving the state of the User object to a “user.txt” file.

On the other hand, we can read the User object from the same file and deserialize it:

User deserializedUser;
try (FileInputStream fileInputStream = new FileInputStream(filePath);
     ObjectInputStream objectInputStream = new ObjectInputStream(fileInputStream)) {
    deserializedUser = (User) objectInputStream.readObject();
}

Finally, we can test the state of the loaded object:

assertEquals(1, deserializedUser.getId());
assertEquals("Mark", deserializedUser.getName());
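
The same steps can also be run entirely in memory, which is handy for quick experiments. The following is a minimal sketch, not part of the tutorial's code base: RoundTripUser and NativeRoundTrip are hypothetical names, and RoundTripUser is only a stand-in for our User class so that the example compiles on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for the tutorial's User class
class RoundTripUser implements Serializable {
    private static final long serialVersionUID = 1L;

    private final int id;
    private final String name;

    RoundTripUser(int id, String name) {
        this.id = id;
        this.name = name;
    }

    int getId() { return id; }
    String getName() { return name; }
}

class NativeRoundTrip {

    // Serializes an object into a byte array instead of a file
    static byte[] serialize(Serializable object) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // try-with-resources closes (and flushes) the stream even if writeObject() throws
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        }
        return bytes.toByteArray();
    }

    // Rebuilds an object from the bytes produced by serialize()
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = serialize(new RoundTripUser(1, "Mark"));
        RoundTripUser copy = (RoundTripUser) deserialize(data);
        System.out.println(copy.getId() + " " + copy.getName()); // prints "1 Mark"
    }
}
```

Working with byte arrays keeps the example free of file-system state, while the writeObject() and readObject() calls stay exactly the same as in the file-based version.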

This is the default way to serialize Java objects. In the next section, we'll see the custom way to do the same.

3.2. Custom Serialization Using the Externalizable Interface

Custom serialization can be particularly useful when trying to serialize an object that has some unserializable attributes. This can be done by implementing the Externalizable interface, which has two methods:

public void writeExternal(ObjectOutput out) throws IOException;

public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException;

We can implement these two methods inside the class that we want to serialize. A detailed example can be found in our article on the Externalizable Interface.
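
As a rough illustration of these two methods, here is a minimal sketch. ExternalizableUser and ExternalizableDemo are hypothetical names introduced only for this example; the class mirrors the fields of our User entity and writes them by hand.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical class mirroring the fields of the tutorial's User entity
class ExternalizableUser implements Externalizable {
    private int id;
    private String name;

    // Externalizable classes need a public no-arg constructor for deserialization
    public ExternalizableUser() {
    }

    ExternalizableUser(int id, String name) {
        this.id = id;
        this.name = name;
    }

    int getId() { return id; }
    String getName() { return name; }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        // We decide which fields are written and in which order
        out.writeInt(id);
        out.writeUTF(name); // assumes name is not null in this sketch
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        // Fields must be read back in the same order they were written
        id = in.readInt();
        name = in.readUTF();
    }
}

class ExternalizableDemo {

    // Writes the user to an in-memory stream and reads it back
    static ExternalizableUser roundTrip(ExternalizableUser user)
      throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(user);
        }
        try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (ExternalizableUser) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ExternalizableUser copy = roundTrip(new ExternalizableUser(1, "Mark"));
        System.out.println(copy.getId() + " " + copy.getName()); // prints "1 Mark"
    }
}
```

Note that we still use writeObject() and readObject() on the streams; the serialization runtime calls writeExternal() and readExternal() for us.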

3.3. Java Serialization Caveats

There are some caveats that concern native serialization in Java:

  • Only objects marked Serializable can be persisted. The Object class does not implement Serializable, and hence, not all objects in Java can be persisted automatically
  • When a class implements the Serializable interface, all its subclasses are serializable as well. However, when an object holds a reference to another object, the referenced object's class must implement Serializable too, or else a NotSerializableException will be thrown
  • If we want to control the versioning, we need to provide the serialVersionUID attribute. This attribute is used to verify that the saved and loaded objects are compatible. Therefore, we need to keep it the same across compatible versions of the class, or else an InvalidClassException will be thrown
  • Java serialization heavily uses I/O streams. We need to close a stream as soon as we finish reading or writing; otherwise, we'll end up with a resource leak. To prevent such leaks, we can use the try-with-resources idiom
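
The caveat about referenced objects can be reproduced with a short sketch. All class names below (Address, UserWithAddress, UserWithTransientAddress, SerializationCaveats) are hypothetical and exist only for this illustration. The transient keyword, which isn't covered above, is one common way to exclude a non-serializable field from the default mechanism.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical helper type that deliberately does NOT implement Serializable
class Address {
    String city = "Berlin";
}

// Serializable class holding a reference to a non-serializable object
class UserWithAddress implements Serializable {
    private static final long serialVersionUID = 1L;
    String name = "Mark";
    Address address = new Address();
}

// Same shape, but the problematic field is skipped by default serialization
class UserWithTransientAddress implements Serializable {
    private static final long serialVersionUID = 1L;
    String name = "Mark";
    transient Address address = new Address();
}

class SerializationCaveats {

    // Returns true if the object graph can be written, false on NotSerializableException
    static boolean canSerialize(Object object) throws IOException {
        // try-with-resources closes the stream whether or not writeObject() succeeds
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(object);
            return true;
        } catch (NotSerializableException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(canSerialize(new UserWithAddress()));          // prints "false"
        System.out.println(canSerialize(new UserWithTransientAddress())); // prints "true"
    }
}
```

After deserialization, a transient field comes back as null, so code that reads it needs to handle that case.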

4. Gson Library

Google's Gson is a Java library that is used to serialize and deserialize Java objects to and from JSON representation.

Gson is an open-source project hosted on GitHub. In general, it provides toJson() and fromJson() methods to convert Java objects to JSON and vice versa.

4.1. Maven Dependency

Let's add the dependency for the Gson library:

<dependency>
    <groupId>com.google.code.gson</groupId>
    <artifactId>gson</artifactId>
    <version>2.8.7</version>
</dependency>

4.2. Gson Serialization

First, let's create a User object:

User user = new User();
user.setId(1);
user.setName("Mark");

Next, we need to provide a file path to save our JSON data:

String filePath = "src/test/resources/protocols/gson_user.json";

Now, let's use the toJson() method from the Gson class to serialize the User object into the “gson_user.json” file:

Gson gson = new GsonBuilder().setPrettyPrinting().create();
try (Writer writer = new FileWriter(filePath)) {
    // closing the writer flushes the buffered JSON to the file
    gson.toJson(user, writer);
}

4.3. Gson Deserialization

We can use the fromJson() method from the Gson class to deserialize the JSON data.

Let's read the JSON file and deserialize the data into a User object:

User deserializedUser;
try (Reader reader = new FileReader(filePath)) {
    deserializedUser = new Gson().fromJson(reader, User.class);
}

Finally, we can test the deserialized data:

assertEquals(1, deserializedUser.getId());
assertEquals("Mark", deserializedUser.getName());

4.4. Gson Features

Gson has many important features, including:

  • It can handle collections, generic types, and nested classes
  • With Gson, we can also write a custom serializer and/or deserializer so that we can control the whole process
  • Most importantly, it allows deserializing instances of classes for which the source code is not accessible
  • In addition, we can use a versioning feature in case our class file has been modified in different versions. We can use the @Since annotation on newly added fields, and then we can use the setVersion() method from GsonBuilder

For more examples, please check our cookbooks for Gson Serialization and Gson Deserialization.

In this section, we serialized data in the JSON format using the Gson API. In the next section, we'll use the Jackson API to do the same.

5. Jackson API

Jackson is also known as “the Java JSON library” or “the best JSON parser for Java”. It provides multiple approaches to work with JSON data.

To understand the Jackson library in general, our Jackson Tutorial is a good place to start.

5.1. Maven Dependency

Let's add the dependency for the Jackson libraries:

<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-core</artifactId>
    <version>2.12.3</version>
</dependency>
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-annotations</artifactId>
    <version>2.12.3</version>
</dependency>
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
    <version>2.12.3</version>
</dependency>

5.2. Java Object to JSON

We can use the writeValue() method, which belongs to the ObjectMapper class, to serialize any Java object as JSON output.

Let's start by creating a User object:

User user = new User();
user.setId(1);
user.setName("Mark Jonson");

After that, let's provide a file path to store our JSON data:

String filePath = "src/test/resources/protocols/jackson_user.json";

Now, we can store a User object into a JSON file using the ObjectMapper class:

File file = new File(filePath);
ObjectMapper mapper = new ObjectMapper();
mapper.writeValue(file, user);

This code will write our data to the “jackson_user.json” file.

5.3. JSON to Java Object

The simple readValue() method of the ObjectMapper is a good entry point. We can use it to deserialize JSON content into a Java object.

Let's read the User object from the JSON file:

User deserializedUser = mapper.readValue(new File(filePath), User.class);

We can always test the loaded data:

assertEquals(1, deserializedUser.getId());
assertEquals("Mark Jonson", deserializedUser.getName());

5.4. Jackson Features

  • Jackson is a solid and mature JSON serialization library for Java
  • The ObjectMapper class is the entry point of the serialization process and provides a straightforward way to parse and generate JSON objects with a lot of flexibility
  • One of the greatest strengths of the Jackson library is the highly customizable serialization and deserialization process

So far, we've seen data serialization in the JSON format. In the next section, we'll explore serialization using YAML.

6. YAML

YAML stands for “YAML Ain't Markup Language”. It is a human-readable data serialization language. We can use YAML for configuration files, as well as in the applications where we want to store or transmit data.

In the previous section, we saw the Jackson API process JSON files. We can also use Jackson APIs to process YAML files. A detailed example can be found in our article on parsing YAML with Jackson.

Now, let's take a look at other libraries.

6.1. YAML Beans

YAML Beans makes it easy to serialize and deserialize Java object graphs to and from YAML.

The YamlWriter class is used to serialize Java objects to YAML. The write() method automatically handles this by recognizing public fields and the bean's getter methods.

Conversely, we can use the YamlReader class to deserialize YAML to Java objects. The read() method reads the YAML document and deserializes it into the required object.

First of all, let's add the dependency for YAML Beans:

<dependency>
    <groupId>com.esotericsoftware.yamlbeans</groupId>
    <artifactId>yamlbeans</artifactId>
    <version>1.15</version>
</dependency>

Now, let's create a map of User objects:

private Map<String, User> populateUserMap() {
    User user1 = new User();
    user1.setId(1);
    user1.setName("Mark Jonson");
    //.. more user objects
    
    Map<String, User> users = new LinkedHashMap<>();
    users.put("User1", user1);
    // add more user objects to map
    
    return users;
}

After that, we need to provide a file path to store our data:

String filePath = "src/test/resources/protocols/yamlbeans_users.yaml";

Now, we can use the YamlWriter class to serialize the map into a YAML file:

YamlWriter writer = new YamlWriter(new FileWriter(filePath));
writer.write(populateUserMap());
writer.close();

On the opposite side, we can use the YamlReader class to deserialize the map:

YamlReader reader = new YamlReader(new FileReader(filePath));
Object object = reader.read();
assertTrue(object instanceof Map); 

Finally, we can test the loaded map:

Map<String, User> deserializedUsers = (Map<String, User>) object;
assertEquals(4, deserializedUsers.size());
assertEquals("Mark Jonson", deserializedUsers.get("User1").getName());
assertEquals(1, deserializedUsers.get("User1").getId());

6.2. SnakeYAML

SnakeYAML provides a high-level API to serialize Java objects to YAML documents and vice versa. The latest version, 1.2, can be used with JDK 1.8 or higher Java versions. It can parse Java structures such as String, List, and Map.

The entry point for SnakeYAML is the Yaml class, which contains several methods that help in serialization and deserialization.

To deserialize YAML input into Java objects, we can load a single document with the load() method and multiple documents with the loadAll() method. These methods accept an InputStream, as well as String objects.

Going the other direction, we can use the dump() method to serialize Java objects into YAML documents.

A detailed example can be found in our article on parsing YAML with SnakeYAML.

Naturally, SnakeYAML works well with Java Maps; however, it can work with custom Java objects as well.

In this section, we saw different libraries to serialize data into YAML format. In the next sections, we'll discuss cross-platform protocols.

7. Apache Thrift

Apache Thrift was originally developed by Facebook and is currently maintained by Apache.

The main benefit of using Thrift is that it supports cross-language serialization with low overhead. Also, while many serialization frameworks support only one serialization format, Apache Thrift allows us to choose from several.

7.1. Thrift Features

Thrift provides pluggable serializers that are known as protocols. These protocols provide flexibility to use any one of several serialization formats for data exchange. Some examples of supported protocols include:

  • TBinaryProtocol uses a binary format and is hence faster to process than a text protocol
  • TCompactProtocol uses a more compact binary format and is, therefore, efficient to process as well
  • TJSONProtocol uses JSON for encoding data

Thrift also supports the serialization of container types – lists, sets, and maps.

7.2. Maven Dependency

To use the Apache Thrift framework in our application, let's add the Thrift libraries:

<dependency>
    <groupId>org.apache.thrift</groupId>
    <artifactId>libthrift</artifactId>
    <version>0.14.2</version>
</dependency>

7.3. Thrift Data Serialization

Apache Thrift protocols and transports are designed to work together as a layered stack. The protocols serialize data into a byte stream, and the transports read and write the bytes.

As stated earlier, Thrift provides a number of protocols. We'll illustrate Thrift serialization using a binary protocol.

First of all, we need a User object:

User user = new User();
user.setId(2);
user.setName("Greg");

The next step is to create a binary protocol:

TMemoryBuffer trans = new TMemoryBuffer(4096);
TProtocol proto = new TBinaryProtocol(trans);

Now, let's serialize our data. We can do so using the write APIs:

proto.writeI32(user.getId());
proto.writeString(user.getName());

7.4. Thrift Data Deserialization

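Since the TMemoryBuffer transport keeps the written bytes in memory, we can use the read APIs on the same protocol instance to deserialize the data, reading the fields in the order we wrote them: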

int userId = proto.readI32();
String userName = proto.readString();

Finally, we can test the loaded data:

assertEquals(2, userId);
assertEquals("Greg", userName);
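
To make the protocol and transport layering more tangible, here is a rough standard-library analogy. It is not Thrift code: LayeredBinarySketch is a hypothetical class in which a DataOutputStream plays the protocol role and a ByteArrayOutputStream plays the transport role. The layout (a 4-byte big-endian integer, then a length-prefixed UTF-8 string) is assumed to resemble what the binary protocol writes for these two calls.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical analogy built on java.io only; no Thrift classes are involved
class LayeredBinarySketch {

    // "Protocol" layer turns values into bytes, "transport" layer stores them
    static byte[] write(int id, String name) throws IOException {
        ByteArrayOutputStream transport = new ByteArrayOutputStream();
        try (DataOutputStream protocol = new DataOutputStream(transport)) {
            protocol.writeInt(id);            // 4 bytes, big-endian
            byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
            protocol.writeInt(utf8.length);   // length prefix
            protocol.write(utf8);             // string payload
        }
        return transport.toByteArray();
    }

    // Reads the values back in the same order they were written
    static String read(byte[] data) throws IOException {
        try (DataInputStream protocol = new DataInputStream(new ByteArrayInputStream(data))) {
            int id = protocol.readInt();
            byte[] utf8 = new byte[protocol.readInt()];
            protocol.readFully(utf8);
            return id + ":" + new String(utf8, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = write(2, "Greg");
        System.out.println(data.length); // prints "12" (4 + 4 + 4 bytes)
        System.out.println(read(data));  // prints "2:Greg"
    }
}
```

The point of the layering is that either part can be swapped independently, for example a different encoding over the same in-memory buffer, or the same encoding over a socket.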

More examples can be found in our article on Apache Thrift.

8. Google Protocol Buffers

The last approach that we'll cover in this tutorial is Google Protocol Buffers (protobuf). It is a well-known binary data format.

8.1. Benefits of Protocol Buffers

Protocol buffers provide several benefits, including:

  • They're language- and platform-neutral
  • They use a binary transfer format, meaning the data is transmitted as binary. This improves the speed of transmission because it takes less space and bandwidth
  • They support both backward and forward compatibility so that new versions can read old data and vice versa

8.2. Maven Dependency

Let's start by adding the dependency for the Google protocol buffer libraries:

<dependency>
    <groupId>com.google.protobuf</groupId>
    <artifactId>protobuf-java</artifactId>
    <version>3.17.3</version>
</dependency>

8.3. Defining a Protocol

With our dependencies squared away, we can now define a message format:

syntax = "proto3";
package protobuf;
option java_package = "com.baeldung.serialization.protocols";
option java_outer_classname = "UserProtos";
message User {
    int32 id = 1;
    string name = 2;
}

This is a protocol of a simple message of User type that has two fields – id and name, of type integer and string, respectively. Note that we're saving it as the “user.proto” file.

8.4. Generating Java Code From a Protobuf File

Once we have a protobuf file, we can use the protoc compiler to generate code from it:

protoc -I=. --java_out=. user.proto

As a result, this command will generate a UserProtos.java file.

After that, we can create an instance of the generated UserProtos.User class:

UserProtos.User user = UserProtos.User.newBuilder().setId(1234).setName("John Doe").build();

8.5. Serializing and Deserializing Protobuf

First, we need to provide a file path to store our data:

String filePath = "src/test/resources/protocols/usersproto";

Now, let's save the data into a file. We can use the writeTo() method of the UserProtos.User message, which belongs to the class we generated from the protobuf file:

try (FileOutputStream fos = new FileOutputStream(filePath)) {
    user.writeTo(fos);
}

After executing this code, our object will be serialized to binary format and saved to the “usersproto” file.

Conversely, we can use the mergeFrom() method to load that data from a file and deserialize it back to a User object:

UserProtos.User deserializedUser = UserProtos.User.newBuilder().mergeFrom(new FileInputStream(filePath)).build();

Finally, we can test the loaded data:

assertEquals(1234, deserializedUser.getId());
assertEquals("John Doe", deserializedUser.getName());
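
To get a feel for why the binary output is compact, we can encode the same message by hand. The sketch below uses no protobuf classes: ProtoWireSketch is a hypothetical helper that follows the protobuf wire format for the two fields of our User message, assuming both fields hold non-default values (proto3 omits defaults such as 0 and the empty string, which this sketch does not handle).

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical hand-rolled encoder for the User message defined in user.proto
class ProtoWireSketch {

    // Writes a value as a base-128 varint, least significant 7-bit group first
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80)); // set the continuation bit
            value >>>= 7;
        }
        out.write((int) value);
    }

    static byte[] encodeUser(int id, String name) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        // Field 1 (id): tag = (field number << 3) | wire type 0 (varint)
        writeVarint(out, (1 << 3) | 0);
        writeVarint(out, id);

        // Field 2 (name): tag = (field number << 3) | wire type 2 (length-delimited)
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        writeVarint(out, (2 << 3) | 2);
        writeVarint(out, utf8.length);
        out.write(utf8, 0, utf8.length);

        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = encodeUser(1234, "John Doe");
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02x", b & 0xFF));
        }
        System.out.println(bytes.length); // prints "13"
        System.out.println(hex);          // prints "08d20912084a6f686e20446f65"
    }
}
```

The whole message fits in 13 bytes because field names never appear in the output; only the field numbers from the .proto file do.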

9. Summary

In this tutorial, we explored some widely used protocols for the serialization of Java objects. The choice of data serialization format for an application depends on various factors such as data complexity, need for human readability, and speed.

Java supports built-in serialization that is easy to use.

JSON is a popular choice because it is human-readable and schema-less. Hence, both Gson and Jackson are good options for serializing JSON data. They are simple to use and well documented. For data that people edit by hand, YAML is a good fit.

On the other hand, binary formats are generally faster to process than textual formats. When speed is important for our application, Apache Thrift and Google Protocol Buffers are great candidates for serializing data. Both typically produce more compact output and are quicker to process than XML or JSON.

To sum up, there is often a trade-off between convenience and performance, and serialization proves no different. There are, of course, many other formats available for data serialization.

As always, the full example code is over on GitHub.
