1. Introduction

Adapting the availability of computing resources to demand is a long-standing need in computing.

In the recent past, adding or removing resources from a computer system was a great challenge. These processes typically involved stopping services to modify software configurations and replace the hardware of local servers.

With the emergence of the Internet, cloud computing, and virtualization, adapting the available resources to demand became simpler and even automatic. In particular, the X-as-a-Service paradigm brings multiple new features that support these processes.

In this way, the concepts of scalability and elasticity appeared and evolved over time. From simple local software and hardware adjustments, increasing and decreasing resources became a research field in its own right, with different strategies and protocols.

In this tutorial, we’ll study the concepts of scalability and elasticity. First, we’ll learn about scalability. Next, we’ll see what exactly elasticity means. Finally, we’ll compare both concepts in a systematic summary.

2. Scalability

In short, scalability is the ability of a system to remain responsive as the demand (load) increases over time. Furthermore, scalable systems must handle the increasing workload without interrupting the provided service.

Thus, we can understand scalability as a computing system’s ability to meet future demands, based on its patterns of increasing workload.

Although scalability handles increasing demand by definition, the system’s workload may also decrease in the near future. For this reason, scaling also includes processes that reduce the resources available in the system.

However, it is relevant to highlight that scalability always looks toward the future. Based on demand predictions, it aims to prevent the system from suffering a lack of resources.
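As an illustration of planning from demand predictions, the following Python sketch extrapolates a workload history and derives a capacity target. It is only one possible approach under stated assumptions: the linear trend, the 20% headroom, the sample history, and the names forecast_load and plan_capacity are all hypothetical and are not prescribed by any particular platform.

```python
from typing import List


def forecast_load(history: List[float], months_ahead: int) -> float:
    """Extrapolate the average month-over-month growth linearly (assumed model)."""
    if len(history) < 2:
        raise ValueError("need at least two samples to estimate a trend")
    avg_growth = (history[-1] - history[0]) / (len(history) - 1)
    return history[-1] + avg_growth * months_ahead


def plan_capacity(predicted_load: float, headroom: float = 0.2) -> float:
    """Provision above the prediction so that future demand is exceeded."""
    return predicted_load * (1 + headroom)


# Hypothetical monthly peak load, in requests per second.
history = [800, 900, 1000, 1100]
predicted = forecast_load(history, months_ahead=6)
target_capacity = plan_capacity(predicted)
```

Real capacity planning usually relies on richer forecasting models, but the idea is the same: the scaling decision is made ahead of time, from a prediction rather than from the current load.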

With that in mind, we’ll explore the characteristics and processes related to system scalability in the following subsections.

2.1. Vertical and Horizontal Scaling

There are two ways to add (or remove) resources from a computing system: modifying the total amount of resources of a computing node or modifying the number of computing nodes.

When we modify the available resources of a specific computing node, we perform vertical scaling. Vertical scaling includes two particular processes:

  • Scale Up: add computing resources, such as memory, storage, network cards, and processing cores, to a given node of a computing system
  • Scale Down: remove computing resources from a given node of a computing system

The following image shows an example of the scale up and scale down processes on a single computing node:

[Figure: scale up and scale down on a single computing node]

On the other hand, modifying the number of available computing nodes is horizontal scaling. It also includes two processes:

  • Scale Out: add new computing nodes to a computing system to handle the increasing demand
  • Scale In: remove computing nodes from a computing system to save or redirect resources

The following image depicts simple scaling out and in processes in a multi-node computing system:

[Figure: scale out and scale in on a multi-node computing system]
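The four processes above can be sketched as operations on a small in-memory model. This is a minimal illustration, not the API of any cloud provider: the Node and Cluster classes, their fields, and the rule of keeping at least one core, 1 GB of memory, and one node are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A single computing node with some resources (hypothetical model)."""
    cpu_cores: int
    memory_gb: int


@dataclass
class Cluster:
    """A computing system made of one or more nodes."""
    nodes: List[Node] = field(default_factory=list)

    # Vertical scaling: change the resources of one existing node.
    def scale_up(self, index: int, cpu_cores: int = 0, memory_gb: int = 0) -> None:
        node = self.nodes[index]
        node.cpu_cores += cpu_cores
        node.memory_gb += memory_gb

    def scale_down(self, index: int, cpu_cores: int = 0, memory_gb: int = 0) -> None:
        node = self.nodes[index]
        # Assumption: a node keeps at least 1 core and 1 GB to stay usable.
        node.cpu_cores = max(1, node.cpu_cores - cpu_cores)
        node.memory_gb = max(1, node.memory_gb - memory_gb)

    # Horizontal scaling: change the number of nodes.
    def scale_out(self, node: Node) -> None:
        self.nodes.append(node)

    def scale_in(self) -> Node:
        # Assumption: the system never drops below one node.
        if len(self.nodes) <= 1:
            raise ValueError("cannot remove the last node")
        return self.nodes.pop()

    def total_cores(self) -> int:
        return sum(node.cpu_cores for node in self.nodes)


cluster = Cluster([Node(cpu_cores=4, memory_gb=16)])
cluster.scale_up(0, cpu_cores=4)                    # same node, more cores
cluster.scale_out(Node(cpu_cores=4, memory_gb=16))  # one more node
```

Note that both kinds of scaling raise the total capacity, but vertical scaling leaves the node count unchanged while horizontal scaling leaves each node unchanged.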

2.2. Scalability Domains

Although we mainly focus on the scalability of computing resources, which we call load scalability, there are other contexts in which scalability processes fit well. Examples are presented below:

  • Heterogeneous Scalability: represents the ability of a computing system to adopt nodes and components from multiple different vendors
  • Generation Scalability: represents the possibility of replacing old components in a computing system with ones from the latest generation
  • Administrative Scalability: deals with the increasing number of customers using a given computing system
  • Functional Scalability: consists of the ability of a computing system to handle requests for, and the implementation of, an increasing number of new functionalities

3. Elasticity

In a few words, elasticity is the ability of a computing system to handle changes in its workload.

The central idea behind elasticity is to provide a computing system with sufficient resources to deal with momentary demand. If the workload increases, more resources are allocated to the system; conversely, resources are promptly removed from the system when the workload decreases.

Technically, elastic systems execute the same processes shown in the scalability context: vertical scaling (scale up and scale down) and horizontal scaling (scale out and scale in).

However, executing the same processes does not mean having the same purpose. Scalability focuses on the general behavior and average workload of a system, trying to predict demands in the medium-term future.

Elasticity, in turn, works with the current workload of a system, executing several scaling processes to deal with, for example, one-off or unexpected events. These events are outliers with respect to the system’s average workload and typically last for a short period.

For instance, let’s consider an online shop with an average number of accesses of X. This shop runs a big sale on a product. Before starting the sale, the managers predict traffic two times greater than the average and scale the system accordingly.

However, the big sale turns out to be a huge success, and the people accessing the shop generate traffic four times greater than the average. As the shop system is elastic, several scaling processes are triggered to accommodate this unexpected traffic, automatically increasing and decreasing resources according to the traffic fluctuations.

The following figure depicts the previously presented example:

[Figure: elasticity in the online shop example]
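The shop example can also be sketched in a few lines of Python. All numbers are assumptions chosen only for illustration: X is taken as 1000 requests per second, each node is assumed to serve 500 requests per second, and the traffic samples are hypothetical. The function nodes_needed is a helper written for this sketch, not part of any library.

```python
import math
from typing import List

NODE_CAPACITY = 500  # requests per second one node can serve (assumed)


def nodes_needed(load: int, node_capacity: int = NODE_CAPACITY, min_nodes: int = 1) -> int:
    """Smallest number of nodes able to serve the given load."""
    return max(min_nodes, math.ceil(load / node_capacity))


average_load = 1000  # the "X" of the example (assumed value)

# Scalability decision taken before the sale: plan for twice the average.
planned_nodes = nodes_needed(2 * average_load)

# Hypothetical traffic samples during the sale, peaking at four times the average.
traffic: List[int] = [1000, 2000, 4000, 3000, 1000]

# Elastic behavior: resize to match the load observed at each sample.
elastic_nodes = [nodes_needed(load) for load in traffic]

# With only the static plan, these samples would overload the system.
overloaded = [load > planned_nodes * NODE_CAPACITY for load in traffic]
```

In this sketch, the planned capacity covers the predicted peak, while the elastic sizing follows the observed traffic in both directions, growing at the peak and shrinking again afterward.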

3.1. Over-provisioning and Under-provisioning

Elasticity aims to avoid both a lack and a waste of resources by matching the needs of a system in real time or in the very near future.

However, dealing with unexpected demands is a great challenge. Two relevant problems may occur with elastic systems: over-provisioning and under-provisioning.

If a system gets more resources than necessary to deal with the current workload, it is in an over-provisioning scenario. If these resources are obtained in a pay-as-you-go model, wasting them may result in substantial economic losses.

Conversely, an under-provisioning scenario happens when the system gets fewer resources than necessary. The system then becomes overloaded, which reduces the quality of service and may even cause it to refuse new customers. This can result in economic losses too.

So, it is crucial to tune elastic systems so that they make the best possible decisions and avoid over-provisioning and under-provisioning situations.
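As a rough illustration, both situations can be detected by comparing provisioned and required resources. The sketch below is hypothetical: the 10% tolerance band, the unit price, and the function names provisioning_state and wasted_cost are assumptions made for this example rather than values taken from any real pricing model.

```python
def provisioning_state(provisioned: float, required: float, tolerance: float = 0.1) -> str:
    """Classify the current allocation against the current need."""
    if required <= 0:
        raise ValueError("required must be positive")
    ratio = provisioned / required
    if ratio < 1:
        return "under-provisioned"  # fewer resources than necessary
    if ratio > 1 + tolerance:
        return "over-provisioned"   # more than the need plus the tolerated headroom
    return "balanced"


def wasted_cost(provisioned: float, required: float, unit_price: float) -> float:
    """Cost of idle resources in a pay-as-you-go model (assumed linear pricing)."""
    return max(0.0, provisioned - required) * unit_price


state = provisioning_state(provisioned=10, required=5)
waste = wasted_cost(provisioned=10, required=5, unit_price=0.25)
```

Tuning an elastic system then amounts to choosing thresholds such as the tolerance so that the system spends as little time as possible in either extreme state.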

4. Systematic Summary

The emergence of cloud computing and virtualization technologies opened new horizons for keeping a computing system running with good quality of service and a good experience for its customers.

The flexibility of these paradigms and technologies enabled managers and developers to create strategies to meet systems’ present and future workload demands.

These strategies, in turn, are intrinsically related to the system’s scalability and elasticity.

In summary, scalability represents strategies to maintain the services’ expected operation and quality under a continuously increasing workload. So, scalability is typically associated with medium- and long-term maintenance strategies for the system.

Elasticity also intends to keep service quality over time. However, it tackles momentary scenarios where the workload unexpectedly increases or decreases, immediately adjusting the system’s resources to meet these transitory requirements.

The following table compares some relevant aspects of scalability and elasticity:

|                         | Scalability | Elasticity |
|-------------------------|-------------|------------|
| Resources Availability  | To exceed future demands | To meet present demands |
| Execution Scenario      | Considering medium and long-term predictions of workload | Considering short-term scenarios of much or little workload |
| Execution Manner        | Typically scheduled by a system manager | Typically triggered by an automatic system |
| Most Relevant Processes | Scale up, scale out | Scale up, scale down, scale out, scale in |

5. Conclusion

In this tutorial, we studied the scalability and elasticity of a computing system. First, we explored scalability, its characteristics, and its most relevant processes. Next, we investigated the concept of elasticity. Finally, we reviewed and compared scalability and elasticity in a summarized way.

We can conclude that both scalability and elasticity are undeniable improvements for computing systems. Scalable and elastic systems can successfully operate in different scenarios, providing good quality of service and a good experience for end users.