Guide to MicroStream

1. Overview

MicroStream is an object graph persistence engine built for the JVM. We can use it for storing Java object graphs and restoring them in memory. Using a custom serialization concept, MicroStream enables us to store any Java type and to load the entire object graph, partial subgraphs, or single objects.

In this tutorial, we’ll first look at the reasons for developing such an object graph persistence engine. Then, we’ll compare this approach to traditional relational databases and standard Java serialization. We’ll see how to create an object graph storage and use it to persist, load, and delete data.

Finally, we’ll query the data using our local system memory and plain Java APIs.

2. Object-Relational Mismatch

Let’s start by looking at the motivation for developing MicroStream. In most Java projects, we need some kind of database storage.

However, Java and popular relational or NoSQL databases use different data structures. Therefore, we need a way to map Java objects to the database structure and vice versa. This mapping costs both programming effort and execution time. For example, we can use entities that map to tables and properties that map to columns in a relational database.

To load data from a database, we would often need to execute complex multi-table SQL queries. Although object-relational mapping frameworks such as Hibernate help developers bridge this gap, in many complex scenarios, the framework-generated queries are not fully optimized.

MicroStream looks to solve this data structure mismatch by using the same structure for in-memory operations as for persisting data.

3. Using JVM as Storage

MicroStream uses the JVM as its storage to achieve fast, in-memory data processing with pure Java. Instead of using storage separated from the JVM, it provides us with a modern, native data storage library.

3.1. Database Management Systems

MicroStream is a persistence engine, not a database management system (DBMS). Some standard DBMS features like user management, connection management, and session handling have been left out by design.

Instead, MicroStream focuses on providing us with an easy way to store and restore our application data.

3.2. Java Serialization

MicroStream uses a custom serialization concept, purposely built to provide a more performant alternative to legacy DBMS.

It doesn’t use Java’s built-in serialization due to several limitations:

Only complete object graphs can be stored and restored
Inefficiency in terms of storage size and performance
The manual effort required when changing class structures

On the other hand, the custom MicroStream data store can:

Persist, load or update object graphs partially and on-demand
Efficiently handle storage size and performance
Handle changing class structures by mapping data via internal heuristics or a user-defined mapping strategy
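
To make the first limitation of built-in serialization concrete, here is a minimal sketch that uses only the JDK (no MicroStream API is involved). The Book and Library classes and the serialize() and deserialize() helpers are hypothetical names invented for this illustration. Writing the root always writes everything reachable from it, so adding a single book means serializing the whole graph again:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Illustration only: hypothetical demo types, plain JDK serialization.
class JdkSerializationDemo {

    static class Book implements Serializable {
        private static final long serialVersionUID = 1L;
        final String title;

        Book(String title) {
            this.title = title;
        }
    }

    static class Library implements Serializable {
        private static final long serialVersionUID = 1L;
        final List<Book> books = new ArrayList<>();
    }

    // Serializes the root together with every object reachable from it.
    static byte[] serialize(Serializable root) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Restores the complete graph; there is no way to ask for only a part of it.
    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Library library = new Library();
        library.books.add(new Book("First"));
        int sizeWithOneBook = serialize(library).length;

        // Adding one book forces us to write the whole library again.
        library.books.add(new Book("Second"));
        byte[] fullGraph = serialize(library);
        System.out.println(fullGraph.length > sizeWithOneBook); // prints true

        Library restored = (Library) deserialize(fullGraph);
        System.out.println(restored.books.size()); // prints 2
    }
}
```

With this approach, the cost of every write and read grows with the size of the entire graph, which is the behavior partial, on-demand storing is meant to avoid.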

4. Object Graph Storage

MicroStream tries to simplify software development by using only one data structure with one data model.

Object instances are stored as a byte stream and references between them are mapped with unique identifiers. Therefore, an object graph can be stored in a simple and quick way. In addition, it can be loaded either wholly or partially.
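
The idea of replacing references with identifiers can be sketched in plain Java. This is a toy illustration under our own assumptions, not MicroStream's actual storage format: the Node class, the flatten() method, and the textual record layout are all hypothetical. Each object gets an id on first discovery, and every reference is written as the id of its target, so shared objects and cycles are handled naturally:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Toy illustration only; not how MicroStream encodes its data.
class ObjectIdSketch {

    static class Node {
        final String name;
        final List<Node> references = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    // Walks the graph breadth-first from the root, assigns ids in discovery
    // order, and returns one flat record per object: "id:name->[referenced ids]".
    static List<String> flatten(Node root) {
        // Identity-based map: two equal-looking objects still get distinct ids.
        Map<Node, Integer> ids = new IdentityHashMap<>();
        List<Node> ordered = new ArrayList<>();
        Deque<Node> pending = new ArrayDeque<>();

        ids.put(root, 1);
        pending.add(root);
        while (!pending.isEmpty()) {
            Node current = pending.poll();
            ordered.add(current);
            for (Node ref : current.references) {
                if (!ids.containsKey(ref)) {
                    ids.put(ref, ids.size() + 1);
                    pending.add(ref);
                }
            }
        }

        List<String> records = new ArrayList<>();
        for (Node node : ordered) {
            List<Integer> refIds = new ArrayList<>();
            for (Node ref : node.references) {
                refIds.add(ids.get(ref));
            }
            records.add(ids.get(node) + ":" + node.name + "->" + refIds);
        }
        return records;
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        Node a = new Node("a");
        Node b = new Node("b");
        root.references.add(a);
        root.references.add(b);
        a.references.add(b);    // b is shared by root and a
        b.references.add(root); // cycle back to the root

        // prints 1:root->[2, 3], then 2:a->[3], then 3:b->[1]
        flatten(root).forEach(System.out::println);
    }
}
```

Because each record refers to others only by id, a single record can be written or read without touching the rest of the graph.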

4.1. Dependencies

Before we can start storing object graphs using MicroStream, we’ll need to add two dependencies:

<dependency>
    <groupId>one.microstream</groupId>
    <artifactId>microstream-storage-embedded</artifactId>
    <version>07.00.00-MS-GA</version>
</dependency>
<dependency>
    <groupId>one.microstream</groupId>
    <artifactId>microstream-storage-embedded-configuration</artifactId>
    <version>07.00.00-MS-GA</version>
</dependency>

4.2. Root Instance

When using object graph storage, our entire database is accessed starting at a root instance. This instance is called the root object of an object graph that gets persisted by MicroStream.

Object graph instances, including the root instance, can be of any Java type. Therefore, even a simple String instance can be registered as the object graph’s root:

EmbeddedStorageManager storageManager = EmbeddedStorage.start(directory);
storageManager.setRoot("baeldung-demo");
storageManager.storeRoot();

However, as this root instance contains no children, our String instance comprises our entire database. Therefore, we would usually need to define a custom root type specific to our application:

import java.util.ArrayList;
import java.util.List;

public class RootInstance {

    private final String name;
    private final List<Book> books;

    public RootInstance(String name) {
        this.name = name;
        books = new ArrayList<>();
    }

    // standard getters, hashCode() and equals()
}

We can register a root instance using a custom type in a similar way, by calling the setRoot() and storeRoot() methods:

EmbeddedStorageManager storageManager = EmbeddedStorage.start(directory);
storageManager.setRoot(new RootInstance("baeldung-demo"));
storageManager.storeRoot();

For now, our books list will be empty, but with our custom root, we’ll be able to store book instances later on:

RootInstance rootInstance = (RootInstance) storageManager.root();
assertThat(rootInstance.getName()).isEqualTo("baeldung-demo");
assertThat(rootInstance.getBooks()).isEmpty();
storageManager.shutdown();

Once our application has finished working with the storage, we should call the shutdown() method so that the storage is closed cleanly.

5. Manipulating Data

Let’s check how we can perform standard CRUD operations via our object graph persisted by MicroStream.

5.1. Storing

When storing new instances, we need to make sure to call the store() method on the correct object. The correct object is the owner of the newly created instances — in our example, a list:

RootInstance rootInstance = (RootInstance) storageManager.root();
List<Book> books = rootInstance.getBooks();
books.addAll(booksToStore);
storageManager.store(books);
assertThat(books).hasSize(2);

Storing a new object also stores all instances referenced by this object. In addition, once the store() method returns, the data has been physically written to the underlying storage layer, usually a file system.

5.2. Eager Loading

Loading data with MicroStream can be done in two ways, eager and lazy. Eager loading is the default way of loading objects from a stored object graph. If an already existing database is found during startup, then all objects of a stored object graph are loaded into memory.

After starting an EmbeddedStorageManager instance, we can load the data by getting the root instance of our object graph:

EmbeddedStorageManager storageManager = EmbeddedStorage.start(directory);
if (storageManager.root() == null) {
    RootInstance rootInstance = new RootInstance("baeldung-demo");
    storageManager.setRoot(rootInstance);
    storageManager.storeRoot();
} else {
    RootInstance rootInstance = (RootInstance) storageManager.root();
    // Use existing root loaded from storage
}

A null root instance indicates that no database exists yet in the underlying storage.

5.3. Lazy Loading

When we’re dealing with large amounts of data, loading everything into memory at startup might not be a viable option. Therefore, MicroStream also supports lazy loading by wrapping an instance in a Lazy reference.

Lazy is a simple wrapper class, similar to the JDK’s WeakReference. Its instances internally hold an identifier and a reference to the actual instance:

private final Lazy<List<Book>> books;

A new ArrayList wrapped in a Lazy can be instantiated using the static Lazy.Reference() method:

books = Lazy.Reference(new ArrayList<>());

Just as with WeakReference, we need to call get() to obtain the actual instance. Here, we go through the static Lazy.get() helper:

public List<Book> getBooks() {
    return Lazy.get(books);
}

The get() method call will reload the data when it’s needed, without developers having to deal with any low-level database identifiers.
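
The load-on-demand pattern behind such a wrapper can be sketched with JDK types only. The LazyRef class below is a hypothetical stand-in written for this illustration; it does not reproduce MicroStream's Lazy implementation, and the loader function simply simulates fetching an object from storage by its identifier:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.LongFunction;

// Toy illustration of a lazy reference; not MicroStream's Lazy class.
class LazyRefSketch {

    static final class LazyRef<T> {
        private final long objectId;          // identifier of the stored object
        private final LongFunction<T> loader; // simulates loading from storage
        private T instance;                   // null while the object is unloaded

        LazyRef(long objectId, LongFunction<T> loader) {
            this.objectId = objectId;
            this.loader = loader;
        }

        // Loads the instance on first access and caches it afterwards.
        synchronized T get() {
            if (instance == null) {
                instance = loader.apply(objectId);
            }
            return instance;
        }

        // Drops the cached instance so memory can be reclaimed;
        // the next get() call loads it again.
        synchronized void clear() {
            instance = null;
        }

        synchronized boolean isLoaded() {
            return instance != null;
        }
    }

    public static void main(String[] args) {
        int[] loads = {0};
        LazyRef<List<String>> books = new LazyRef<>(42L, id -> {
            loads[0]++;
            return Arrays.asList("Book A", "Book B");
        });

        System.out.println(books.isLoaded());   // prints false
        System.out.println(books.get().size()); // prints 2
        books.get();
        System.out.println(loads[0]);           // prints 1, the second get() hit the cache

        books.clear();
        books.get();
        System.out.println(loads[0]);           // prints 2, reloaded after clear()
    }
}
```

Callers only ever see get(); the identifier stays an internal detail of the wrapper.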

5.4. Deleting

Deleting data with MicroStream does not require performing explicit deletion actions. Instead, we just need to clear any references to the object in our object graph and store those changes:

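List<Book> books = rootInstance.getBooks();
// remove(int) drops the element at index 1, clearing the list's reference to that book
books.remove(1);
// persist the modified list so the removal is reflected in the storage
storageManager.store(books);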

We should note that the deleted data is not immediately erased from the storage. Rather, a background housekeeping process runs a scheduled cleanup.
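
Conceptually, this kind of cleanup resembles garbage collection: a stored record becomes eligible for removal once it can no longer be reached from the root. The following sketch shows that reachability check on a toy model. It is only an assumption-laden illustration, not MicroStream's actual housekeeping algorithm; the HousekeepingSketch class, the findUnreachable() method, and the map of record ids to referenced ids are all hypothetical:

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model: each stored record is an id mapped to the ids it references.
class HousekeepingSketch {

    // Returns the ids of all records that cannot be reached from the root.
    static Set<Long> findUnreachable(Map<Long, List<Long>> records, long rootId) {
        Set<Long> reachable = new HashSet<>();
        Deque<Long> pending = new ArrayDeque<>();
        pending.add(rootId);
        while (!pending.isEmpty()) {
            Long id = pending.poll();
            // add() returns false for ids we have already visited
            if (reachable.add(id)) {
                pending.addAll(records.getOrDefault(id, Collections.<Long>emptyList()));
            }
        }
        Set<Long> unreachable = new TreeSet<>(records.keySet());
        unreachable.removeAll(reachable);
        return unreachable;
    }

    public static void main(String[] args) {
        Map<Long, List<Long>> records = new HashMap<>();
        records.put(1L, Arrays.asList(2L));           // root -> books list
        records.put(2L, Arrays.asList(3L));           // list now references only book 3
        records.put(3L, Collections.<Long>emptyList());
        records.put(4L, Collections.<Long>emptyList()); // book 4 was removed from the list

        // Record 4 still exists in storage but is no longer reachable.
        System.out.println(findUnreachable(records, 1L)); // prints [4]
    }
}
```

In this model, removing the reference and storing the list is cheap, while reclaiming the space of record 4 can happen later.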

6. Query System

Unlike a standard DBMS, MicroStream does not run queries against the storage directly. Instead, queries operate on the data held in our local system memory. Therefore, there’s no need to learn any special query language, as all operations are done with plain Java.

A common approach may be to use Streams with standard Java collections:

List<Book> booksFrom1998 = rootInstance.getBooks().stream()
    .filter(book -> book.getYear() == 1998)
    .collect(Collectors.toList());

Because queries run in memory, they can execute quickly, at the cost of potentially high memory consumption.
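
Since the data is just a Java collection, filtering, sorting, and aggregation all come from the standard Stream API. The sketch below is self-contained and does not touch MicroStream; the Book class with its title and year fields, and the two query methods, are hypothetical and exist only for this example:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Plain-Java queries over an in-memory list; Book is a hypothetical demo type.
class InMemoryQuerySketch {

    static class Book {
        private final String title;
        private final int year;

        Book(String title, int year) {
            this.title = title;
            this.year = year;
        }

        String getTitle() {
            return title;
        }

        int getYear() {
            return year;
        }
    }

    // Roughly: SELECT title FROM books WHERE year = ? ORDER BY title
    static List<String> titlesFromYear(List<Book> books, int year) {
        return books.stream()
            .filter(book -> book.getYear() == year)
            .map(Book::getTitle)
            .sorted()
            .collect(Collectors.toList());
    }

    // Roughly: SELECT year, COUNT(*) FROM books GROUP BY year ORDER BY year
    static Map<Integer, Long> countByYear(List<Book> books) {
        return books.stream()
            .collect(Collectors.groupingBy(Book::getYear, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Book> books = Arrays.asList(
            new Book("Second Title", 1998),
            new Book("First Title", 1998),
            new Book("Third Title", 2005));

        System.out.println(titlesFromYear(books, 1998)); // prints [First Title, Second Title]
        System.out.println(countByYear(books));          // prints {1998=2, 2005=1}
    }
}
```

The same methods would work unchanged on a list obtained from a root instance, because the query code never needs to know where the objects came from.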

The data storing and loading process can be parallelized by using multiple threads. At the moment, horizontal scaling is not possible, but the MicroStream team has announced that it is developing an object-graph replication approach. This would enable clustering and data replication over multiple nodes in the future.

7. Conclusion

In this article, we explored MicroStream, an object graph persistence engine for the JVM. We learned how MicroStream solves the object-relational data structure mismatch by applying the same structure for in-memory operations and data persistence.

We explored how to create object graphs using custom root instances. Also, we saw how to store, delete, and load data using the eager and lazy loading approaches. Finally, we looked at MicroStream’s query system based on in-memory operations with plain Java.

The code backing this article is available on GitHub.