Partner – DBSchema – NPI (tag = SQL)
announcement - icon

DbSchema is a super-flexible database designer, which can take you from designing the DB with your team all the way to safely deploying the schema.

The way it does all of that is by using a design model, a database-independent image of the schema, which can be shared in a team using GIT and compared or deployed on to any database.

And, of course, it can be heavily visual, allowing you to interact with the database using diagrams, visually compose queries, explore the data, generate random data, import data or build HTML5 database reports.

>> Take a look at DBSchema

1. Introduction

Java Database Connectivity (JDBC) is a Java API used for interacting with databases. Batch processing groups multiple queries into one unit and passes it in a single network trip to a database.

In this article, we’ll discover how JDBC can be used for batch processing of SQL queries.

For more on JDBC, you can check out our introduction article here.

2. Why Batch Processing?

Performance and data consistency are the primary motives to do batch processing.

2.1. Improved Performance

Some use cases require a large amount of data to be inserted into a database table. While using JDBC, one of the ways to achieve this without batch processing, is to execute multiple queries sequentially.

Let’s see an example of sequential queries sent to database:

statement.execute("INSERT INTO EMPLOYEE(ID, NAME, DESIGNATION) "
 + "VALUES ('1','EmployeeName1','Designation1')"); 
statement.execute("INSERT INTO EMPLOYEE(ID, NAME, DESIGNATION) "
 + "VALUES ('2','EmployeeName2','Designation2')");

These sequential calls will increase the number of network trips to database resulting in poor performance.

By using batch processing, these queries can be sent to the database in one call, thus improving performance.

2.2. Data Consistency

In certain circumstances, data needs to be pushed into multiple tables. This leads to an interrelated transaction where the sequence of queries being pushed is important.

Any errors occurring during execution should result in a rollback of the data pushed by previous queries if any.

Let’s see an example of adding data to multiple tables:

statement.execute("INSERT INTO EMPLOYEE(ID, NAME, DESIGNATION) "
 + "VALUES ('1','EmployeeName1','Designation1')"); 
statement.execute("INSERT INTO EMP_ADDRESS(ID, EMP_ID, ADDRESS) "
 + "VALUES ('10','1','Address')");

A typical problem in the above approach arises when the first statement succeeds and the second statement fails. In this situation there is no rollback of the data inserted by the first statement, leading to data inconsistency.

We can achieve data consistency by spanning a transaction across multiple insert/updates and then committing the transaction at the end or performing a rollback in case of exceptions, but in this case, we’re still hitting the database repeatedly for each statement.

3. How To Do Batch Processing

JDBC provides two classes, Statement and PreparedStatement to execute queries on the database. Both classes have their own implementation of the addBatch() and executeBatch() methods which provide us with the batch processing functionality.

3.1. Batch Processing Using Statement

With JDBC, the simplest way to execute queries on a database is via the Statement object.

First, using addBatch() we can add all SQL queries to a batch and then execute those SQL queries using executeBatch().

The return type of executeBatch() is an int array indicating how many records were affected by the execution of each SQL statement.

Let’s see an example of creating and executing a batch using Statement:

Statement statement = connection.createStatement();
statement.addBatch("INSERT INTO EMPLOYEE(ID, NAME, DESIGNATION) "
 + "VALUES ('1','EmployeeName','Designation')");
statement.addBatch("INSERT INTO EMP_ADDRESS(ID, EMP_ID, ADDRESS) "
 + "VALUES ('10','1','Address')");
statement.executeBatch();

In the above example, we are trying to insert records into the EMPLOYEE and EMP_ADDRESS tables using Statement. We can see how SQL queries are being added in the batch to be executed.

3.2. Batch Processing Using PreparedStatement

PreparedStatement is another class used to execute SQL queries. It enables reuse of SQL statements and requires us to set new parameters for each update/insert.

Let’s see an example using PreparedStatement. First, we set up the statement using an SQL query encoded as a String:

String[] EMPLOYEES = new String[]{"Zuck","Mike","Larry","Musk","Steve"};
String[] DESIGNATIONS = new String[]{"CFO","CSO","CTO","CEO","CMO"};

String insertEmployeeSQL = "INSERT INTO EMPLOYEE(ID, NAME, DESIGNATION) "
 + "VALUES (?,?,?)";
PreparedStatement employeeStmt = connection.prepareStatement(insertEmployeeSQL);

Next, we loop through an array of String values and add a newly configured query to the batch.

Once the loop is finished, we execute the batch:

for(int i = 0; i < EMPLOYEES.length; i++){
    String employeeId = UUID.randomUUID().toString();
    employeeStmt.setString(1,employeeId);
    employeeStmt.setString(2,EMPLOYEES[i]);
    employeeStmt.setString(3,DESIGNATIONS[i]);
    employeeStmt.addBatch();
}
employeeStmt.executeBatch();

In the example shown above, we are inserting records into the EMPLOYEE table using PreparedStatement. We can see how values to be inserted are set in the query and then added to the batch to be executed.

4. Conclusion

In this article, we saw how batch processing of SQL queries are important while interacting with databases using JDBC.

As always, the code related to this article can be found over on Github.

Course – LSD (cat=Persistence)

Get started with Spring Data JPA through the reference Learn Spring Data JPA course:

>> CHECK OUT THE COURSE
res – Persistence (eBook) (cat=Persistence)
Comments are open for 30 days after publishing a post. For any issues past this date, use the Contact form on the site.