Persistence top

I just announced the new Learn Spring course, focused on the fundamentals of Spring 5 and Spring Boot 2:

>> CHECK OUT THE COURSE

1. Overview

In this quick tutorial, we'll learn about the performance difference between save() and saveAll() methods in Spring Data.

2. Application

In order to test the performance, we'll need a Spring application with an entity and a repository.

Let's create a book entity:

@Entity
public class Book {

    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private long id;

    private String title;
    private String author;

    // constructors, standard getters and setters
}

In addition, let's create a repository for it:

public interface BookRepository extends JpaRepository<Book, Long> {
}

3. Performance

To test the performance, we'll save 10,000 books using both methods.

First, we'll use the save() method:

for(int i = 0; i < bookCount; i++) {
    bookRepository.save(new Book("Book " + i, "Author " + i));
}

Then, we'll create a list of books and use the saveAll() method to save all of them at once:

List<Book> bookList = new ArrayList<>();
for (int i = 0; i < bookCount; i++) {
    bookList.add(new Book("Book " + i, "Author " + i));
}

bookRepository.saveAll(bookList);

In our tests, we noticed that the first method took around 2 seconds, and the second one took approximately 0.3 seconds.

Furthermore, when we enabled JPA Batch Inserts, we observed a decrease of up to 10% in the performance of the save() method, and an increase of up to 60% on the saveAll() method.

4. Differences

Looking into the implementation of the two methods, we can see that saveAll() iterates over each element and uses the save() method in each iteration. This implies that it shouldn't be such a big performance difference.

Looking more closely, we observe that both methods are annotated with @Transactional.

Furthermore, the default transaction propagation type is REQUIRED, which means that, if not provided, a new transaction is created each time the methods are called.

In our case, each time we call the save() method, a new transaction is created, whereas when we call saveAll(), only one transaction is created, and it's reused later by save().

This overhead translates into the performance difference that we noticed earlier.

Finally, the overhead is bigger when batching is enabled due to the fact that it's done at the transaction level.

5. Conclusion

In this article, we've learned about the performance difference between the save() and saveAll() methods in Spring Data.

Ultimately, choosing whether to use one method over another can have a big performance impact on the application.

As always, the code for these examples is available over on GitHub.

Persistence bottom

I just announced the new Learn Spring course, focused on the fundamentals of Spring 5 and Spring Boot 2:

>> CHECK OUT THE COURSE
guest
0 Comments
Inline Feedbacks
View all comments