
1. Introduction

In this introduction to the Couchbase SDK for Java, we demonstrate how to interact with a Couchbase document database, covering basic concepts such as creating a Couchbase environment, connecting to a cluster, opening data buckets, using the basic persistence operations, and working with document replicas.

2. Maven Dependencies

If you are using Maven, add the following to your pom.xml file:

<dependency>
    <groupId>com.couchbase.client</groupId>
    <artifactId>java-client</artifactId>
    <version>2.2.6</version>
</dependency>

3. Getting Started

The SDK provides the CouchbaseEnvironment interface and an implementation class DefaultCouchbaseEnvironment containing default settings for managing access to clusters and buckets. The default environment settings can be overridden if necessary, as we will see in section 3.2.

Important: The official Couchbase SDK documentation cautions users to ensure that only one CouchbaseEnvironment is active in the JVM, since the use of two or more may result in unpredictable behavior.

3.1. Connecting to a Cluster with a Default Environment

To have the SDK automatically create a CouchbaseEnvironment with default settings and associate it with our cluster, we can connect to the cluster simply by providing the IP address or hostname of one or more nodes in the cluster.

In this example, we connect to a single-node cluster on our local workstation:

Cluster cluster = CouchbaseCluster.create("localhost");

To connect to a multi-node cluster, we would specify at least two nodes in case one of them is unavailable when the application attempts to establish the connection:

Cluster cluster = CouchbaseCluster.create("192.168.4.1", "192.168.4.2");

Note: It is not necessary to specify every node in the cluster when creating the initial connection. The CouchbaseEnvironment will query the cluster once the connection is established in order to discover the remaining nodes (if any).

3.2. Using a Custom Environment

If your application requires fine tuning of any of the settings provided by DefaultCouchbaseEnvironment, you can create a custom environment and then use that environment when connecting to your cluster.

Here’s an example that connects to a single-node cluster using a custom CouchbaseEnvironment with a ten-second connection timeout and a three-second key-value lookup timeout:

CouchbaseEnvironment env = DefaultCouchbaseEnvironment.builder()
  .connectTimeout(10000)
  .kvTimeout(3000)
  .build();
Cluster cluster = CouchbaseCluster.create(env, "localhost");

And to connect to a multi-node cluster with the custom environment:

Cluster cluster = CouchbaseCluster.create(env,
  "192.168.4.1", "192.168.4.2");

3.3. Opening a Bucket

Once you have connected to the Couchbase cluster, you can open one or more buckets.

When you first set up a Couchbase cluster, the installation package automatically creates a bucket named “default” with a blank password.

Here’s one way to open the “default” bucket when it has a blank password:

Bucket bucket = cluster.openBucket();

You can also specify the bucket name when opening it:

Bucket bucket = cluster.openBucket("default");

For any other bucket with a blank password, you must supply the bucket name:

Bucket myBucket = cluster.openBucket("myBucket");

To open a bucket that has a non-blank password, you must supply the bucket name and password:

Bucket bucket = cluster.openBucket("bucketName", "bucketPassword");

4. Persistence Operations

In this section, we show how to perform CRUD operations in Couchbase. In our examples, we will be working with simple JSON documents representing a person, as in this sample document:

{
  "name": "John Doe",
  "type": "Person",
  "email": "[email protected]",
  "homeTown": "Chicago"
}

The “type” attribute is not required; however, it is common practice to include an attribute specifying the document type in case you decide to store multiple types in the same bucket.

4.1. Document IDs

Each document stored in Couchbase is associated with an id that is unique to the bucket in which the document is being stored. The document id is analogous to the primary key column in a traditional relational database row.

Document id values must be UTF-8 strings of 250 or fewer bytes.

Since Couchbase does not provide a mechanism for automatically generating the id on insertion, we must provide our own.

Common strategies for generating ids include key-derivation using a natural key, such as the “email” attribute shown in our sample document, and the use of UUID strings.

For our examples, we will generate random UUID strings.
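The two id strategies described above can be sketched with the standard library alone. In this illustrative helper, the class name DocumentIds, its method names, and the "person::" key prefix are our own assumptions rather than anything prescribed by Couchbase; the length check simply mirrors the 250-byte limit stated above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.UUID;

// Hypothetical helper; not part of the Couchbase SDK.
final class DocumentIds {

    // Maximum id length in bytes, per the limit described in this section.
    static final int MAX_ID_BYTES = 250;

    private DocumentIds() {
    }

    // Key derivation: builds an id from a natural key (here, an email address).
    // The "person::" prefix is an assumed naming convention.
    static String fromEmail(String email) {
        return validate("person::" + email.trim().toLowerCase(Locale.ROOT));
    }

    // Random strategy: a UUID string, as used in the examples that follow.
    static String random() {
        return validate(UUID.randomUUID().toString());
    }

    // Rejects ids whose UTF-8 encoding is longer than 250 bytes.
    static String validate(String id) {
        int length = id.getBytes(StandardCharsets.UTF_8).length;
        if (length > MAX_ID_BYTES) {
            throw new IllegalArgumentException("Id is " + length + " bytes; limit is " + MAX_ID_BYTES);
        }
        return id;
    }
}
```

A derived id lets the application look up a person directly from a known email address, while a random UUID is the simpler choice when no natural key is available.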

4.2. Inserting a Document

Before we can insert a new document into our bucket, we must first create an instance of JsonObject containing the document’s contents:

JsonObject content = JsonObject.empty()
  .put("name", "John Doe")
  .put("type", "Person")
  .put("email", "[email protected]")
  .put("homeTown", "Chicago");

Next, we create a JsonDocument object consisting of an id value and the JsonObject:

String id = UUID.randomUUID().toString();
JsonDocument document = JsonDocument.create(id, content);

To add a new document to the bucket, we use the insert method:

JsonDocument inserted = bucket.insert(document);

The JsonDocument returned contains all of the properties of the original document, plus a value known as the “CAS” (compare-and-swap) value that Couchbase uses for version tracking.

If a document with the supplied id already exists in the bucket, Couchbase throws a DocumentAlreadyExistsException.

We can also use the upsert method, which will either insert the document (if the id is not found) or update the document (if the id is found):

JsonDocument upserted = bucket.upsert(document);

4.3. Retrieving a Document

To retrieve a document by its id, we use the get method:

JsonDocument retrieved = bucket.get(id);

If no document exists with the given id, the method returns null.

4.4. Updating or Replacing a Document

We can update an existing document using the upsert method:

JsonObject content = document.content();
content.put("homeTown", "Kansas City");
JsonDocument upserted = bucket.upsert(document);

As we mentioned in section 4.2, upsert will succeed whether a document with the given id was found or not.

If enough time has passed between when we originally retrieved the document and our attempt to upsert the revised document, there is a possibility that the original document will have been deleted from the bucket by another process or user.

If we need to guard against this scenario in our application, we can instead use the replace method, which fails with a DocumentDoesNotExistException if a document with the given id is not found in Couchbase:

JsonDocument replaced = bucket.replace(document);
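To illustrate the idea behind the CAS value mentioned in section 4.2, here is a small in-memory sketch of a compare-and-swap check. It does not use the SDK: the CasSketch class, its Versioned type, and the counter-based CAS values are all hypothetical stand-ins, and the exceptions are plain Java ones rather than the SDK's. The point is only to show how a version stamp lets a store reject a write that is based on stale data.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory store; it only imitates the idea of a CAS check.
final class CasSketch {

    // A stored value together with the version stamp assigned on write.
    static final class Versioned {
        final String content;
        final long cas;

        Versioned(String content, long cas) {
            this.content = content;
            this.cas = cas;
        }
    }

    private final Map<String, Versioned> store = new HashMap<>();
    private long nextCas = 1;

    // Writes unconditionally and assigns a fresh CAS value.
    synchronized Versioned upsert(String id, String content) {
        Versioned written = new Versioned(content, nextCas++);
        store.put(id, written);
        return written;
    }

    // Returns the stored value, or null if the id is unknown.
    synchronized Versioned get(String id) {
        return store.get(id);
    }

    // Writes only if the document exists and its CAS still matches the caller's copy.
    synchronized Versioned replace(String id, String content, long expectedCas) {
        Versioned current = store.get(id);
        if (current == null) {
            throw new IllegalStateException("Document not found: " + id);
        }
        if (current.cas != expectedCas) {
            throw new IllegalStateException("CAS mismatch for: " + id);
        }
        return upsert(id, content);
    }
}
```

In this sketch, a caller that read the document before someone else modified it holds an outdated CAS value, so its replace call fails instead of silently overwriting the newer version.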

4.5. Deleting a Document

To delete a Couchbase document, we use the remove method:

JsonDocument removed = bucket.remove(document);

You may also remove by id:

JsonDocument removed = bucket.remove(id);

The JsonDocument object returned has only the id and CAS properties set; all other properties (including the JSON content) are removed from the returned object.

If no document exists with the given id, Couchbase throws a DocumentDoesNotExistException.

5. Working with Replicas

This section discusses Couchbase’s virtual bucket and replica architecture and introduces a mechanism for retrieving a replica of a document in the event that a document’s primary node is unavailable.

5.1. Virtual Buckets and Replicas

Couchbase distributes a bucket’s documents across a collection of 1024 virtual buckets, or vbuckets, using a hashing algorithm on the document id to determine the vbucket in which to store each document.

Each Couchbase bucket can also be configured to maintain one or more replicas of each vbucket. Whenever a document is inserted or updated and written to its vbucket, Couchbase initiates a process to replicate the new or updated document to its replica vbucket.

In a multi-node cluster, Couchbase distributes vbuckets and replica vbuckets among all the data nodes in the cluster. A vbucket and its replica vbucket are kept on separate data nodes in order to achieve a measure of high availability.
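The mapping from document id to vbucket can be pictured with a short sketch. The SDK does this internally, so application code never needs it; the version below assumes a CRC32-based hash reduced to the range of 1024 vbuckets, which is how the scheme is commonly described, and should be read as an illustration rather than the authoritative algorithm. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Illustrative only; the SDK performs this mapping internally.
final class VBucketSketch {

    // Number of vbuckets per bucket, as stated in this section.
    static final int NUM_VBUCKETS = 1024;

    private VBucketSketch() {
    }

    // Assumed scheme: CRC32 of the id's UTF-8 bytes, reduced to [0, 1023].
    static int vbucketFor(String documentId) {
        CRC32 crc32 = new CRC32();
        crc32.update(documentId.getBytes(StandardCharsets.UTF_8));
        long hash = (crc32.getValue() >> 16) & 0x7fff;
        return (int) (hash % NUM_VBUCKETS);
    }
}
```

Because the result depends only on the id, every client computes the same vbucket for the same document, which is what lets a request be routed to the node that currently holds that vbucket.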

5.2. Retrieving a Document From a Replica

When retrieving a document by its id, if the document’s primary node is down or otherwise unreachable due to a network error, Couchbase throws an exception.

You can have your application catch the exception and attempt to retrieve one or more replicas of the document using the getFromReplica method.

The following code would use the first replica found:

JsonDocument doc = null;
try {
    doc = bucket.get(id);
} catch (CouchbaseException e) {
    // Primary node unreachable: fall back to the first replica that responds
    List<JsonDocument> list = bucket.getFromReplica(id, ReplicaMode.FIRST);
    if (!list.isEmpty()) {
        doc = list.get(0);
    }
}

Note that it is possible, when writing your application, to have write operations block until persistence and replication are complete. However, the more common practice, for reasons of performance, is to have the application return from writes immediately after writing to the memory of a document’s primary node, because disk writes are inherently slower than memory writes.

When using the latter approach, if a recently updated document’s primary node should fail or go offline before the updates have been fully replicated, replica reads may or may not return the latest version of the document.

It is also worth noting that Couchbase retrieves replicas (if any are found) asynchronously. Therefore, if your bucket is configured for multiple replicas, there is no guarantee as to the order in which the SDK returns them, and you may want to loop through all the replicas found in order to ensure that your application has the latest replica version available:

long maxCasValue = -1;
for(JsonDocument replica : bucket.getFromReplica(id, ReplicaMode.ALL)) {
    if(replica.cas() > maxCasValue) {
        doc = replica;
        maxCasValue = replica.cas();
    }
}

6. Conclusion

We have introduced some basic usage scenarios that you will need in order to get started with the Couchbase SDK.

Code snippets presented in this tutorial can be found in the GitHub project.

You can learn more about the SDK at the official Couchbase SDK developer documentation site.



  • Note the discussion on consistency here is wrong. Couchbase is consistent, but replica reads can be inconsistent. The application can choose to block until the replica is updated, but that is not as common. It’s a mistake to say that Couchbase trades off consistency for performance.

    • Hey Matt, that’s a good point – I talked to the author and updated the article.
      Cheers,
      Eugen.