1. Overview
It's often difficult to decide whether a calculation should be performed in the database (RDBMS) or in application code to get both good performance and convenience.
In this article, we'll explore the advantages and disadvantages of performing calculations in the database and application code.
We'll consider a few factors that can influence this decision, and we'll discuss which layer (database or application) is better suited to handle them.
2. Calculation in the Database
2.1. Data Selection and Aggregation
Relational databases are highly optimized for the handling, selection, and aggregation of data. We can easily group, order, filter, and aggregate data using SQL.
For example, we can easily combine or exclude rows from multiple tables using LEFT and RIGHT JOIN.
Similarly, aggregate functions like MIN, MAX, SUM, and AVG are quite handy and usually faster than an equivalent Java implementation, since they run next to the data and the rows never have to leave the database.
Also, we can fine-tune disk I/O performance by leveraging indexes while aggregating data.
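To make the comparison concrete, here's a minimal sketch of the same aggregation expressed both ways. The orders table, its customer_id and amount columns, and the Order class are hypothetical names introduced only for illustration, and the SQL is kept in a string rather than executed:

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class AggregationComparison {

    // In the database: one statement, and only the grouped totals leave the server.
    // The table and column names are assumptions made for this example.
    static final String TOTAL_PER_CUSTOMER_SQL =
        "SELECT customer_id, SUM(amount) AS total "
            + "FROM orders GROUP BY customer_id ORDER BY customer_id";

    // Hypothetical row type mirroring the assumed orders table
    static final class Order {
        final int customerId;
        final long amountInCents;

        Order(int customerId, long amountInCents) {
            this.customerId = customerId;
            this.amountInCents = amountInCents;
        }
    }

    // In the application: every order must already be loaded into memory
    static Map<Integer, Long> totalPerCustomer(List<Order> orders) {
        return orders.stream()
            .collect(Collectors.groupingBy(
                order -> order.customerId,
                TreeMap::new,
                Collectors.summingLong(order -> order.amountInCents)));
    }
}
```

Both versions produce one total per customer, but the Java version only works after all the detail rows have been fetched from the database.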
2.2. Volume of Data
Popular RDBMS are built to process large volumes of table data efficiently when performing a calculation.
In contrast, processing a similar volume of data in the application requires considerably more memory and CPU, because every row first has to be fetched and materialized as objects.
Also, to save bandwidth, it's advisable to perform data-centric calculations in the database, thereby avoiding the transfer of large volumes of data over the network.
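A rough back-of-the-envelope estimate shows why the bandwidth argument matters. The row counts and the row size below are illustrative assumptions, not measurements, and the estimate ignores protocol and driver overhead:

```java
class TransferEstimate {

    // Rough payload size in bytes, ignoring protocol and driver overhead
    static long payloadBytes(long rows, long bytesPerRow) {
        return Math.multiplyExact(rows, bytesPerRow);
    }

    public static void main(String[] args) {
        // Illustrative assumptions, not measurements
        long detailRows = 10_000_000L; // rows the application would have to fetch
        long groupedRows = 1_000L;     // rows left after aggregating in the database
        long bytesPerRow = 100L;

        System.out.println("detail rows:  " + payloadBytes(detailRows, bytesPerRow) + " bytes");
        System.out.println("grouped rows: " + payloadBytes(groupedRows, bytesPerRow) + " bytes");
    }
}
```

With these assumed numbers, fetching the detail rows moves 1,000,000,000 bytes, while fetching the aggregated result moves 100,000 bytes.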
3. Calculation in the Application
3.1. Complexity
Unlike the database, higher-level languages like Java are well equipped to deal with complex calculations.
For example, we can leverage asynchronous programming, parallel execution, and multithreading in Java to solve a complex problem.
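As a minimal sketch of this idea, the following splits a calculation into two independent halves and runs them asynchronously with CompletableFuture. The class and method names are hypothetical, and the sum of squares is only a stand-in for a genuinely expensive computation:

```java
import java.util.concurrent.CompletableFuture;
import java.util.stream.LongStream;

class ParallelCalculation {

    // Stand-in for an expensive, independent piece of work
    static long sumOfSquares(long fromInclusive, long toInclusive) {
        return LongStream.rangeClosed(fromInclusive, toInclusive)
            .map(n -> n * n)
            .sum();
    }

    // Runs both halves asynchronously on the common pool, then combines the results
    static long sumOfSquaresInParallel(long n) {
        long middle = n / 2;
        CompletableFuture<Long> lower =
            CompletableFuture.supplyAsync(() -> sumOfSquares(1, middle));
        CompletableFuture<Long> upper =
            CompletableFuture.supplyAsync(() -> sumOfSquares(middle + 1, n));
        return lower.thenCombine(upper, Long::sum).join();
    }
}
```

For a calculation this small, the threading overhead outweighs the benefit; the pattern pays off only when each part does substantial work.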
Similarly, the database provides minimal support for logging and debugging. However, today's higher-level languages have excellent support for such critical features, which are often handy in implementing a complex calculation.
For instance, we can easily add logging to a Java application by using SLF4J and debug it with popular IDEs like Eclipse and IntelliJ IDEA. Therefore, performing a calculation in the application is more convenient for a developer than performing it in the database.
Likewise, another argument is that we can easily unit test our calculations in the application code, which is fairly complex to perform in the database.
Unit testing proves quite handy for catching regressions when an implementation changes. So, when performing a calculation in the Java application, we can use JUnit to add unit tests.
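To illustrate why application-side calculations are easy to test, here's a sketch of a calculation written as a pure function. DiscountCalculator and its rounding rule are hypothetical choices made for this example:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class DiscountCalculator {

    // Pure function: no database or I/O involved, so a unit test can call it directly
    static BigDecimal applyDiscount(BigDecimal price, int percent) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be between 0 and 100");
        }
        // Dividing a whole number by 100 always yields an exact decimal
        BigDecimal factor = BigDecimal.valueOf(100 - percent)
            .divide(BigDecimal.valueOf(100));
        // Assumed rounding rule: two decimal places, half up
        return price.multiply(factor).setScale(2, RoundingMode.HALF_UP);
    }
}
```

In a JUnit 5 test, a check for this method might read assertEquals(new BigDecimal("170.00"), DiscountCalculator.applyDiscount(new BigDecimal("200.00"), 15)), with no database setup required.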
3.2. Advanced Data Analysis and Transformation
The database provides limited support for advanced data analysis and transformation. However, it's simple to perform complex computations using the application code.
For instance, a variety of libraries like Deeplearning4J, Weka, and TensorFlow are available for advanced statistics and machine learning support.
Another common use case is mapping the data to objects using ORM technologies like Hibernate, processing it with APIs like Java Streams, and producing results in various formats through XML or JSON libraries.
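As a small sketch of stream-based processing, the following computes a median, a statistic that isn't among the basic aggregates (MIN, MAX, SUM, AVG) mentioned earlier. MedianCalculator is a hypothetical name, and the input list stands in for a numeric field of entities that an ORM would have loaded:

```java
import java.util.List;

class MedianCalculator {

    // Median of values that are already in memory
    static double median(List<Double> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("values must not be empty");
        }
        double[] sorted = values.stream()
            .mapToDouble(Double::doubleValue)
            .sorted()
            .toArray();
        int middle = sorted.length / 2;
        // Odd count: the middle element; even count: mean of the two middle elements
        return sorted.length % 2 == 1
            ? sorted[middle]
            : (sorted[middle - 1] + sorted[middle]) / 2.0;
    }
}
```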
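3.3. Scalability
Achieving database scalability can be a daunting task, as an RDBMS is typically scaled up (given more CPU, memory, or storage on a single server), and scaling it out is considerably harder. However, the application code offers a more scalable solution.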
We can easily scale out the app-servers and handle a large number of requests using a load balancer.
4. Database vs. Application
Now that we've seen the advantages of performing a calculation at each layer with respect to certain factors, let's summarize their differences:
- The database is the preferred choice for data selection, aggregation, and handling large volumes of data
- However, performing a calculation in the application code looks like a better candidate when considering factors like complexity, advanced data transformation, third-party integrations, and scalability
- Also, higher-level languages provide extra benefits like logging, debugging, error handling, and unit testing capabilities
It's always a good idea to mix and leverage the best of both layers to solve a complex problem.
In other words, use the database for selection and aggregation of data, then transmit useful lean data to the application and perform complex operations over it using an efficient higher-level language.
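The division of labor described above can be sketched as follows. The sales table, its columns, and the MonthlyGrowthReport class are hypothetical; the query is shown only as a string, and the map passed to growthPercent stands in for its already-aggregated result:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MonthlyGrowthReport {

    // Step 1 (database): aggregate the detail rows into one lean row per month.
    // The table and column names are assumptions made for this example.
    static final String MONTHLY_TOTALS_SQL =
        "SELECT month, SUM(amount) AS total FROM sales GROUP BY month ORDER BY month";

    // Step 2 (application): month-over-month growth in percent, computed on the
    // lean result; months without a usable previous total are skipped
    static Map<String, Double> growthPercent(Map<String, Long> monthlyTotals) {
        Map<String, Double> growth = new LinkedHashMap<>();
        Long previous = null;
        for (Map.Entry<String, Long> entry : monthlyTotals.entrySet()) {
            if (previous != null && previous != 0) {
                growth.put(entry.getKey(), (entry.getValue() - previous) * 100.0 / previous);
            }
            previous = entry.getValue();
        }
        return growth;
    }
}
```

The map must iterate in month order (for example, a LinkedHashMap filled from the ordered query result) for the growth figures to be meaningful.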
5. Conclusion
In this article, we explored the pros and cons of performing calculations in the application and database.
First, we discussed the advantages of performing calculations in both the database and application layers. Then, we summarized our conclusions about performing a calculation based on all the factors we discussed.