Getting Started With RethinkDB

Last updated: January 8, 2024

Written by: Graham Cox

Reviewed by: Grzegorz Piwowarek

1. Introduction

In this article, we’re going to have a look at RethinkDB. This is an open-source, NoSQL database that is designed for use in real-time applications. We’ll see what features it brings to our applications, what we can do with it, and how to interact with it.

2. What Is RethinkDB?

RethinkDB is an open-source NoSQL database emphasizing scalability and high availability. It allows us to store JSON documents that we can then later query. We also have the ability to perform joins across multiple tables within our database and to perform map-reduce functions on our data.

However, what makes RethinkDB stand apart is its real-time streaming capabilities. We can execute queries against our database so that changes to the resultset are constantly streamed back to the client, allowing us to get real-time updates to our data. This means that our applications can give immediate updates to our users whenever anything changes.

3. Running and Using RethinkDB

RethinkDB is a native application written in C++. Pre-built packages are available for most platforms. There’s also an official Docker image.

Once installed, we can start the database by simply running the executable. If necessary, we can tell it where to store the data files; otherwise, it uses a sensible default. We can also configure the ports it listens on and even run multiple servers in a cluster configuration for scaling and availability. All of this, and more, is covered in the official documentation.

Next, we need to use the database from our application. This requires a client library, and drivers are available for many languages. For this article, we’ll use the Java client.

Adding the client to our application is as simple as adding a single dependency:

<dependency>
    <groupId>com.rethinkdb</groupId>
    <artifactId>rethinkdb-driver</artifactId>
    <version>2.4.4</version>
</dependency>

Next, we need to connect to the database. For this, we need a Connection, which we build from the RethinkDB.r entry point. The later examples refer to this entry point simply as r:

Connection conn = RethinkDB.r.connection()
  .hostname("localhost")
  .port(28015)
  .connect();

4. Interacting With RethinkDB

Now that we have a connection to RethinkDB, we need to know how to use it. At the basic level for a database, this means we need to be able to create, manipulate, and retrieve data.

All interaction with RethinkDB is done with a programmatic interface. Rather than write queries in a custom query language, we build them in standard Java through the driver's fluent API. This lets the compiler catch malformed query structure before the query ever runs, although mistakes such as a misspelled field name will still only surface at runtime.

4.1. Working With Tables

A RethinkDB instance exposes several databases, each storing data in tables. These are conceptually similar to tables in a SQL database. However, RethinkDB doesn’t enforce schemas on our tables, instead leaving this to the application.

We can create a new table by using our connection:

r.db("test").tableCreate(tableName).run(conn);

Equally, we can drop tables using Db.tableDrop(). We can also list all known tables using Db.tableList():

r.db(dbName).tableCreate(tableName).run(conn);
List<String> tables = r.db(dbName).tableList().run(conn, List.class).first();
assertTrue(tables.contains(tableName));
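
Because RethinkDB leaves schema enforcement to the application, we may want to check a document ourselves before inserting it. The following is only a minimal sketch of that idea using plain Java collections. The DocumentValidator class and its missingFields() method are hypothetical names invented for this illustration and aren't part of the RethinkDB driver:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical application-side helper; not part of the RethinkDB driver.
final class DocumentValidator {

    // Returns the required field names that are absent (or null) in the document,
    // sorted alphabetically so the result is deterministic.
    static List<String> missingFields(Map<String, ?> document, Set<String> requiredFields) {
        return requiredFields.stream()
          .filter(field -> document.get(field) == null)
          .sorted()
          .collect(Collectors.toList());
    }
}
```

An application could call missingFields() before building its insert query and reject the document if the returned list isn't empty.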

4.2. Inserting Data

Once we have tables to work with, we need to be able to populate them. We can do this by using Table.insert() and providing it with the data.

Let’s insert some data into our table by providing objects constructed by the RethinkDB API itself:

r.db(DB_NAME).table(tableName)
  .insert(r.hashMap().with("name", "Baeldung"))
  .run(conn);

Or alternatively, we can provide standard Java collections:

r.db(DB_NAME).table(tableName)
  .insert(Map.of("name", "Baeldung"))
  .run(conn);

The data we insert can be as simple as a single key/value pair or as complicated as it needs to be, including nested structures and arrays:

r.db(DB_NAME).table(tableName)
  .insert(
    r.hashMap()
      .with("name", "Baeldung")
      .with("articles", r.array(
        r.hashMap()
          .with("id", "article1")
          .with("name", "String Interpolation in Java")
          .with("url", "https://www.baeldung.com/java-string-interpolation"),
        r.hashMap()
          .with("id", "article2")
          .with("name", "Access HTTPS REST Service Using Spring RestTemplate")
          .with("url", "https://www.baeldung.com/spring-resttemplate-secure-https-service"))
      )
).run(conn);

Every record that is inserted will have a unique ID — either one that we provided as the “id” field on the record or else a randomly generated one from the database.

4.3. Retrieving Data

Now that we have a database that contains some data, we need to be able to get it out again. As with all databases, we do this by querying the database.

The simplest thing that we can do is to query a table without anything extra:

Result<Map> results = r.db(DB_NAME).table(tableName).run(conn, Map.class);

Our result object gives us several ways to access the results, including being able to treat it directly as an iterator:

for (Map result : results) {
    // Process result
}
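
When we ask for Map results, each record is an ordinary Java map that we can navigate with standard collection code. The sketch below pulls the article names out of a record shaped like the nested document inserted earlier. It assumes that nested arrays arrive as List instances and nested objects as Map instances. The ArticleNames class is a hypothetical helper for this illustration, not something the driver provides:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for reading a record shaped like the nested example above.
final class ArticleNames {

    // Collects the "name" of every entry in the record's "articles" array.
    // Returns an empty list if the field is missing or isn't a list.
    static List<String> articleNames(Map<String, Object> record) {
        Object articles = record.get("articles");
        if (!(articles instanceof List)) {
            return List.of();
        }
        List<String> names = new ArrayList<>();
        for (Object article : (List<?>) articles) {
            if (article instanceof Map) {
                Object name = ((Map<?, ?>) article).get("name");
                if (name != null) {
                    names.add(name.toString());
                }
            }
        }
        return names;
    }
}
```

The defensive instanceof checks matter here because the tables have no enforced schema, so a record may not have the shape we expect.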

We also have the ability to convert the results to a List or a Stream – including a parallel stream – if we then want to treat the results as a normal Java collection.

If we want to retrieve only a subset of the results, we can apply a filter while running the query. This is done by providing a Java lambda that describes the condition each row must satisfy:

Result<Map> results = r.db(DB_NAME)
  .table(tableName)
  .filter(r -> r.g("name").eq("String Interpolation in Java"))
  .run(conn, Map.class);

Our filter is evaluated against the rows in the table, and only those that match will get returned in our resultset.

We can also go directly to a single row by the ID value if we know it:

Result<Map> results = r.db(DB_NAME).table(tableName).get(id).run(conn, Map.class);

4.4. Updating and Deleting Data

A database that can’t change data once it’s been inserted only has limited use, so how do we update our data? The RethinkDB API gives us an update() method that we can chain onto the end of our query statement in order to apply those updates to every record that is matched by the query.

These updates are patches, not complete replacements, so we specify only the changes we want to make:

r.db(DB_NAME).table(tableName).update(r.hashMap().with("site", "Baeldung")).run(conn);
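
To make the patch behaviour concrete, here is a minimal in-memory sketch of what happens to the top-level fields of a single record. It only imitates the semantics with plain Java maps and doesn't talk to the database. PatchSketch and applyPatch() are hypothetical names used for illustration:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// In-memory illustration of patch semantics; not driver code.
final class PatchSketch {

    // Returns a copy of the record where fields named in the patch are added or
    // overwritten, and every other existing field is left untouched.
    static Map<String, Object> applyPatch(Map<String, Object> record, Map<String, Object> patch) {
        Map<String, Object> merged = new LinkedHashMap<>(record);
        merged.putAll(patch);
        return merged;
    }
}
```

Applying a patch containing only "site" to a record that has "id" and "name" yields a record with all three fields, which is why the update above doesn't need to repeat the existing data.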

As with querying, we can select exactly which records we want to update by using filters. These need to be done before the update is specified. This is because the filters are actually applied to the query that is selecting the records to update, and the update is then applied to everything that matches:

r.db(DB_NAME).table(tableName)
  .filter(r -> r.g("name").eq("String Interpolation in Java"))
  .update(r.hashMap().with("category", "java"))
  .run(conn);

We can also delete records in a similar way by using the delete() call instead of update() at the end of our query:

r.db(DB_NAME).table(tableName)
  .filter(r -> r.g("name").eq("String Interpolation in Java"))
  .delete()
  .run(conn);

5. Live Updates

So far, we’ve seen some examples of how to interact with RethinkDB, but none of this is anything special. Everything we’ve seen is also achievable with most other database systems.

What makes RethinkDB special is the ability to get live updates to our data without our application needing to poll it. Instead, we can execute a query in such a way that the cursor will remain open, and the database will push any changes out to us whenever they happen:

Result<Map> cursor = r.db(DB_NAME).table(tableName).changes().run(conn, Map.class);
cursor.stream().forEach(record -> System.out.println("Record: " + record));

This is hugely powerful when writing real-time applications where we want to get immediate updates — for example, to show live stock prices, game scores, or many other things.

When we execute a query like this, we get a cursor back exactly as before. However, adding changes() will mean that we don’t query the records that are already there. Instead, the cursor will give us an unbounded collection of changes that happen to the records that match the query. This includes inserts, updates, and deletes.

The fact that the cursor is unbounded means that any iteration we perform on it, whether using a normal for-loop or a stream, will keep running for as long as the cursor stays open, waiting whenever there is no new change to deliver. What we can’t safely do is collect to a list because there is no end to the list.
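
If we do need a finite collection, one option is to cap the stream before collecting it. The sketch below shows the idea with an infinite standard Java stream standing in for the changefeed, so it runs without a database. BoundedConsumption and takeFirst() are hypothetical names for this illustration:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustration of safely collecting from an unbounded stream; not driver code.
final class BoundedConsumption {

    // Takes at most 'count' elements and then stops pulling from the source,
    // so collecting terminates even though the source never ends.
    static <T> List<T> takeFirst(Stream<T> unbounded, int count) {
        return unbounded.limit(count).collect(Collectors.toList());
    }
}
```

For example, takeFirst(Stream.iterate(1, n -> n + 1), 3) completes even though the source stream is infinite. We'd expect the same pattern applied to a stream of changes to wait until that many changes have actually arrived.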

Our records returned in the cursor include the new and old values for the changed records. We can then determine if the change was an insert because there’s no old value, a delete because there’s no new value, or an update because there are both old and new values. We can also see the difference between old and new values in an update and react accordingly.
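
The rule above can be sketched as a small helper. In a changefeed record, the old and new values are carried in the old_val and new_val fields. The ChangeClassifier class and its ChangeType enum are hypothetical names for this illustration, and the sketch assumes each change reaches us as a Map:

```java
import java.util.Map;

// Hypothetical helper that classifies a changefeed record; not part of the driver.
final class ChangeClassifier {

    enum ChangeType { INSERT, UPDATE, DELETE }

    // No old value means an insert, no new value means a delete,
    // and both present means an update.
    static ChangeType classify(Map<String, ?> change) {
        Object oldVal = change.get("old_val");
        Object newVal = change.get("new_val");
        if (oldVal == null && newVal == null) {
            throw new IllegalArgumentException("change has neither old_val nor new_val");
        }
        if (oldVal == null) {
            return ChangeType.INSERT;
        }
        if (newVal == null) {
            return ChangeType.DELETE;
        }
        return ChangeType.UPDATE;
    }
}
```

We could call classify() on each record as it comes off the cursor and dispatch to different handling for each kind of change.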

As with all queries, we can also apply filters when we are getting the changes to records. Our cursor will then include only changes to records that match the filter. This works even for inserts, where the record didn’t exist when we executed the query:

Result<Map> cursor = r.db(DB_NAME).table(tableName)
  .filter(r -> r.g("index").eq(5))
  .changes()
  .run(conn, Map.class);

6. Conclusion

We’ve seen here a very brief introduction to the RethinkDB database engine, showing how we can use it for all our traditional database tasks, as well as how to leverage its unique feature of pushing out changes to our clients automatically. This is only a quick tour, and there is much more to this system, so why not try it out yourself?

The code backing this article is available on GitHub.