1. Introduction
Sometimes, we need the ID of a document we just inserted into a MongoDB database. For instance, we may want to send back the ID as a response to a caller or log the created object for debugging.
In this tutorial, we'll see how IDs are implemented in MongoDB and how to retrieve the ID of a document we just inserted in a collection via a Java program.
2. What Is the ID of a MongoDB Document?
As in every data storage system, MongoDB needs a unique identifier for each document stored in a collection. This identifier is equivalent to the primary key in relational databases.
In MongoDB, the default ID is an ObjectId composed of 12 bytes:
- a 4-byte timestamp representing the seconds since the Unix epoch
- a 5-byte random value generated once per process, unique to the machine and the process
- a 3-byte incrementing counter
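To make the layout above concrete, here is a minimal sketch that splits the 24-character hexadecimal form of an ID into its three parts using only the Java standard library. The class name ObjectIdLayout, its methods, and the sample value are assumptions made for illustration; they are not part of the MongoDB driver.

```java
import java.nio.ByteBuffer;
import java.time.Instant;

// Hypothetical helper (not part of the MongoDB driver) that splits the
// 24-character hexadecimal form of an ID into its three parts.
final class ObjectIdLayout {

    // Converts the 24 hex characters into the 12 raw bytes of the ID.
    static byte[] toBytes(String hex) {
        if (hex.length() != 24) {
            throw new IllegalArgumentException("expected 24 hex characters");
        }
        byte[] bytes = new byte[12];
        for (int i = 0; i < 12; i++) {
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return bytes;
    }

    // Bytes 0-3: big-endian seconds since the Unix epoch.
    static long timestampSeconds(String hex) {
        return ByteBuffer.wrap(toBytes(hex), 0, 4).getInt() & 0xFFFFFFFFL;
    }

    // Bytes 4-8: the per-process random value, returned as 10 hex characters.
    static String randomValueHex(String hex) {
        toBytes(hex); // validates the input
        return hex.substring(8, 18);
    }

    // Bytes 9-11: big-endian 3-byte counter.
    static int counter(String hex) {
        byte[] b = toBytes(hex);
        return ((b[9] & 0xFF) << 16) | ((b[10] & 0xFF) << 8) | (b[11] & 0xFF);
    }

    public static void main(String[] args) {
        String hex = "507f1f77bcf86cd799439011"; // sample value, chosen for illustration
        System.out.println(Instant.ofEpochSecond(timestampSeconds(hex)));
        System.out.println(randomValueHex(hex));
        System.out.println(counter(hex));
    }
}
```

In a real application, we'd rely on the driver's own ObjectId class rather than parsing the bytes ourselves; the sketch only shows where each part lives.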
The ID is stored in a field named _id and is normally generated by the client. With the Java driver, this means that the ID is generated before the document is sent to the database. On the client side, we can either use a driver-generated ID or generate a custom ID.
We can see that documents created by the same client process in the same second will have the first 9 bytes in common. Therefore, the uniqueness of the ID relies on the counter in this case. A 3-byte counter can take 2^24 = 16,777,216 distinct values, so it lets a client create over 16 million documents in the same second.
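The shared 9-byte prefix can be illustrated with a simplified generator. This is only a sketch that mimics the 12-byte layout; the class SimpleIdGenerator and the random starting value of its counter are assumptions, and the code is not the driver's actual implementation.

```java
import java.security.SecureRandom;

// Hypothetical, simplified generator that mimics the 12-byte layout.
// It is not the driver's actual implementation.
final class SimpleIdGenerator {
    // Number of distinct values that fit in a 3-byte counter: 16,777,216.
    static final int COUNTER_VALUES = 1 << 24;

    private final byte[] processRandom = new byte[5]; // fixed for this "process"
    private int counter;

    SimpleIdGenerator() {
        SecureRandom random = new SecureRandom();
        random.nextBytes(processRandom);
        counter = random.nextInt(COUNTER_VALUES); // assumed random starting point
    }

    // Builds the 24-character hex form of an ID for the given second.
    String next(int epochSeconds) {
        StringBuilder hex = new StringBuilder(24);
        hex.append(String.format("%08x", epochSeconds));
        for (byte b : processRandom) {
            hex.append(String.format("%02x", b & 0xFF));
        }
        hex.append(String.format("%06x", counter));
        counter = (counter + 1) % COUNTER_VALUES; // wraps around after 2^24 IDs
        return hex.toString();
    }

    public static void main(String[] args) {
        SimpleIdGenerator generator = new SimpleIdGenerator();
        int now = (int) (System.currentTimeMillis() / 1000);
        String first = generator.next(now);
        String second = generator.next(now);
        System.out.println(first);
        System.out.println(second);
        // The first 9 bytes correspond to the first 18 hex characters.
        System.out.println(first.substring(0, 18).equals(second.substring(0, 18)));
    }
}
```

Two IDs produced in the same second by the same generator differ only in their last 6 hex characters, which is why the counter alone guarantees uniqueness in that case.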
Although the ID starts with a timestamp, we should be careful not to use it as a sorting criterion. Documents created in the same second aren't guaranteed to be sorted by creation date, as the counter isn't guaranteed to be monotonic. Also, different clients may have different system clocks.
In the Java driver, the counter starts from a random value and wraps around once it reaches its 3-byte limit, so it isn't monotonic. That's why we shouldn't rely on the driver-generated ID for sorting by creation date.
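As a small illustration of the ordering problem, the sketch below builds three IDs by hand for the same second and the same process, with a counter that wraps around from its maximum 3-byte value to zero. The class IdOrderingDemo and all the values are hypothetical; these are plain hex strings, not driver-generated ObjectIds.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration: the IDs are hex strings built by hand,
// not real driver-generated ObjectIds.
final class IdOrderingDemo {

    // Formats the timestamp (4 bytes), random value (5 bytes), and counter (3 bytes).
    static String id(int epochSeconds, long processRandom, int counter) {
        return String.format("%08x%010x%06x", epochSeconds, processRandom, counter);
    }

    // Sorts lexicographically, which matches byte order for fixed-length
    // lowercase hex strings.
    static List<String> sortedById(List<String> creationOrder) {
        List<String> sorted = new ArrayList<>(creationOrder);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        int second = 1_700_000_000;
        long random = 0xA1B2C3D4E5L;
        List<String> creationOrder = Arrays.asList(
            id(second, random, 0xFFFFFE),
            id(second, random, 0xFFFFFF),
            id(second, random, 0x000000)); // counter wrapped around
        // Sorting by ID moves the last-created ID to the front.
        System.out.println(creationOrder.equals(sortedById(creationOrder))); // prints false
    }
}
```

If creation order matters, a dedicated field such as an explicit creation date is a safer sorting criterion than the ID.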
3. The ObjectId Class
The unique identifier is represented by the ObjectId class, which provides convenient methods to get the data stored in the ID without parsing it manually.
For example, here's how we can get the creation date of the ID:
Date creationDate = objectId.getDate();
Likewise, we can retrieve the timestamp of the ID in seconds:
int timestamp = objectId.getTimestamp();
The ObjectId class also provides methods to get the counter, the machine identifier, or the process identifier, but they're all deprecated.
4. Retrieving the ID
The main thing to remember is that, in MongoDB, the client generates the unique identifier of a Document before sending it to the cluster. This is in contrast to sequences in relational databases. This makes the retrieval of this ID quite easy.
4.1. Driver-generated ID
The standard and easy way to generate the unique ID of a Document is by letting the driver do the job. When we insert a new Document into a Collection, if no _id field exists in the Document, the driver generates a new ObjectId before sending the insert command to the cluster.
Our code to insert a new Document into our Collection may look like this:
Document document = new Document();
document.put("name", "Shubham");
document.put("company", "Baeldung");
collection.insertOne(document);
We can see that we never indicate how the ID must be generated.
When the insertOne() method returns, we can get the generated ObjectId from the Document:
ObjectId objectId = document.getObjectId("_id");
We can also retrieve the ObjectId like a standard field of the Document and then cast it to ObjectId:
ObjectId oId = (ObjectId) document.get("_id");
4.2. Custom ID
The other way to retrieve the ID is to generate it in our code and put it in the Document like any other field. If we send a Document with an _id field to the driver, it will not generate a new one.
We might require this in some cases where we need the ID of the MongoDB Document before inserting the Document in the Collection.
We can generate a new ObjectId by creating a new instance of the class:
ObjectId generatedId = new ObjectId();
We can also invoke the static get() method of the ObjectId class:
ObjectId generatedId = ObjectId.get();
Then, we just have to create our Document and use the generated ID. To do so, we can provide it in the Document constructor:
Document document = new Document("_id", generatedId);
Alternatively, we can use the put() method:
document.put("_id", generatedId);
When using a user-generated ID, we must be cautious to generate a new ObjectId before each insertion, as duplicated IDs are forbidden. Duplicate IDs will result in a MongoWriteException with a duplicate key message.
The ObjectId class provides several other constructors which allow us to set some parts of the identifier:
public ObjectId(final Date date)
public ObjectId(final Date date, final int counter)
public ObjectId(final int timestamp, final int counter)
public ObjectId(final String hexString)
public ObjectId(final byte[] bytes)
public ObjectId(final ByteBuffer buffer)
However, we should be very careful when we use those constructors, as the uniqueness of the ID provided to the driver relies entirely on our code. We can get a duplicate key error in these particular cases:
- if we use the same date (or timestamp) and counter combination several times
- if we use the same hexadecimal String, byte array, or ByteBuffer several times
5. Conclusion
In this article, we learned what the MongoDB unique identifier for documents is and how it works. Then, we saw how to retrieve it both after inserting a Document in a Collection and even before inserting it.
As always, the code for these examples is available over on GitHub.