Graphs: Sparse vs Dense | Baeldung on Computer Science

1. Overview

In this tutorial, we’ll study the difference between sparse and dense graphs in graph theory.

We’ll first start by discussing the concepts of size and order in a graph, from which we’ll derive a definition of graph density.

In relation to the density of a graph, we’ll then define the two categories of sparse and dense graphs.

At the end of this tutorial, we’ll be familiar with the concept of density in a graph and also know its implication in terms of memory storage.

2. Density in Graphs

2.1. The General Idea of Graph Density

In our introductory article on graph theory, we discussed the foundational concepts of the discipline. We can refer to it while reading this article if we need a refreshment on the meaning of ideas such as vertices or edges.

In this article, instead, we address the concept of graph density in relation to the graph’s size and order. To understand intuitively what the concept of graph density means, we can use a sample undirected graph containing five vertices and four edges:

We can see that in this graph, many vertices aren’t connected, such as $(2, 5)$ or $(3,4)$ . This may lead us to imagine that, if we add some more edges without changing the number of vertices, then the graph would become somewhat “denser”:

We can also imagine that two extreme cases exist, in which either the graph is maximally dense:

Or, on the contrary, minimally dense:

By comparing these graphs, we could derive the following general notion about graph density. A density is some metric that tells us how “full” a graph is, in terms of the number of edges that it possesses, and in relation to the number of its vertices.

This lets us formulate some expectations on what the density of a graph should be. We want a density metric that’s continuous in nature, and that describes the relationship between the number of edges and the maximum number of edges, as a function of the number of vertices. Lastly, we also want this metric to be somewhat standardized, and allow comparison between graphs with a different number of vertices.

This means that we’re looking for a measure $D$ for the density of a graph $G(V, E)$ that has the form of $D(V, E)$ . This measure should vary between $D(V, E) = 0$ , which describes perfectly sparse graphs, and $D(V, E) = 1$ , which describes perfectly dense graphs. We’ll see shortly how to compute it. But first, to do so, we have to define size and order in graphs.

2.2. Size and Order of a Graph

We mentioned that to determine the density, we’ll need to know the maximum number of edges in a graph. To calculate it, we have to use two additional measures: size and orders of a graph.

The size of a graph is simply the number $|E|$ of edges contained in it. If $|E|=0$ , then the set $E = \O$ of edges is empty, and we can thus say that the graph $G(V, \O)$ is itself also empty:

The order of the graph is, instead, the number $|V|$ of vertices $V$ contained in it. Since a graph of the form $G(\O, \O)$ isn’t a graph, we can say that $|V| \in \mathbb{N}^{+}$ . If two graphs $G_1 (V_1, E_1)$ and $G_2 (V_2, E_2)$ are such that $|V_1| = |V_2| \wedge |E_1| = |E_2| = \O$ , then the two graphs are empty and equivalent. As a consequence, we expect for them the density to be the same, thus $D = 0$ .

We can extend this consideration to cases in which the order of two empty graphs isn’t identical. To this regard, we can say that two empty graphs $G_1 (V_1, \O)$ and $G_2 (V_2, \O)$ of different orders, and therefore with $|V_1| \neq |V_2|$ , are also equally dense:

This lets us state that the density of any empty graphs has to be zero. But if the size of the graph is non-null, then graphs of the same size don’t necessarily have the same density:

This means that the density is proportional to the size $|E|$ of a graph, but also inversely proportional to some function of its order $|V|$ .

2.3. Maximum Number of Edges

The function of the order $|V|$ which is inversely proportional to the density is the maximum number of edges, which depends on the order but not on the size of the graph. We can define the maximum number of edges $\text{Max}_U(V)$ in relation to the order $|V|$ of an undirected graph, as:

$\text{Max}_U (V) = \binom{|V|}{2} = \frac{|V| \cdot ( |V| - 1 )}{2}$

Notice that, so far, we have assumed that the graph is undirected. However, we can extend this formula to directed graphs.

As we discussed in our article on the comparison between directed and undirected graphs, the latter always possesses a symmetric adjacency matrix, whereas the former does not necessarily do. Directed graphs, since they don’t satisfy this condition, have twice as many possible edges. As a consequence, we can calculate their maximum number of edges as:

$\text{Max}_D (V) = 2 \cdot \binom{|V|}{2} = |V| \cdot ( |V| - 1 ) = 2 \cdot \text{Max}_U (V)$

And with this, we now have all the elements we need to define the density of a graph formally.

3. The Density of a Graph

3.1. Formal Definition of Density

The equation for the density $D$ of a graph $G(V, E)$ that we use leverages the definitions of size, order, and maximum number of edges that we provided above. For an undirected graph, the density $D_U(V, E)$ is:

$D_U(V, E) = \frac {|E|} {\text{Max}_U (V)} = \frac {|E|} {\frac{|V| \cdot (|V|-1)}{2}} = \frac {2 \cdot |E|} {|V| \cdot (|V|-1)}$

Similarly, we can also define a density measure $D_D$ for directed graphs. As mentioned above, if the graph is directed, then the maximum number of edges $\text{Max}_D(V)$ is twice the one we calculated for the corresponding undirected graph. As a consequence, for directed graphs, we can calculate their density $D_D$ as half that of the corresponding undirected graph, or:

$D_D(V, E) = \frac {|E|} {\text{Max}_D (V)} = \frac {|E|} {|V| \cdot (|V|-1)} = \frac{1}{2} \cdot D_U(V, E)$

Notice also how both densities are comprised in the interval $[0, 1]$ , as expected, because $0 \leq |E| \leq \text{Max}(V)$ . Additionally, notice how $D = 0$ indicates an empty graph and $D = 1$ indicates a fully connected graph. After defining density in this manner, we can now give a definition for sparse and dense graphs in relation to their density.

3.2. Sparse Versus Dense Graphs

The sparse graph is a graph whose density $D$ is in the lower range of the density’s codomain, or $0 \leq D < \frac {1} {2}$ . Analogously, a dense graph is a graph whose density $D$ is in the higher range of its codomain, or $\frac {1} {2} < D \leq 1$ . The graph for which $D = \frac{1}{2}$ can be treated indifferently as a sparse or a dense graph, but we suggest to consider them as neither.

We can now express some general characteristics of the density of a graph, in relation to its type:

the density of a graph of order $|V| = 1$ is undefined, both for algebraic reasons and, intuitively, because it can either be seen as perfectly sparse or perfectly dense
all empty graphs have a density of 0 and are therefore sparse
all complete graphs have a density of 1 and are therefore dense
an undirected traceable graph has a density of at least $\frac{2}{|V|}$ , so it’s guaranteed to be dense for $|V| < 4$
a directed traceable graph is never guaranteed to be dense
a tournament has a density of $\frac{1}{2}$ , regardless of its order

3.3. Examples of Density in Graphs

Now that we know how to compute the density of a graph, we can apply it in a practical exercise. We’ll take as an example the first graph we encountered in this tutorial:

This graph has a form $G(V, E)$ , where $V = \{1, 2, 3, 4, 5\}$ and $E = \{(1, 2), (1, 3), (1, 4), (1, 5)\}$ . Therefore, its first two characteristics are $|V| = 5$ and $|E| = 4$ . Because the graph is undirected, we can calculate its maximum number of edges as:

$\text{Max}_U (V) = \binom{|V|}{2} = \frac{5 \cdot ( 5 - 1 )}{2} = 10$

From this, we can then calculate the density of the graph as:

$D_U(V, E) = \frac {|E|} {\text{Max}_U (V)} = \frac {4} {10} = \frac {2} {5}$

Because $D_U(V, E) < \frac {1} {2}$ , we can conclude that $G(V, E)$ is a sparse graph.

If, instead, the graph had just two extra edges; say, $(2, 3)$ and $(3, 4)$ , then it would look like this:

And the related calculations would change as follows:

$|E| = 6 \to D_U(V, E) = \frac {|E|} {\text{Max}_U (V)} = \frac {6} {10} = \frac {3} {5}$

This, in turn, makes the extended graph a dense graph, because $D_U > \frac {1} {2}$ .

4. Graph Density and Memory Storage

In conclusion to this article, we can point at a practical reason why the density of graphs in programming matters. This has to do with the storage of the graph in memory. Graphs tend to be very large data structures, and for some applications such as knowledge representation, they may end up being untreatable unless we take precautions.

One such precaution consists in storing the graph in the format that’s more efficient, in relation to its density. Any graph can be expressed as a data structure in two manners:

As a list of edges $E = [e_1, e_2, ... , e_{|E|}]$
Or, as an adjacency matrix $A_{|V|, |V|}$

These objects perform in a significantly different manner in terms of memory usage and access. As a general rule, however, the density of the graph on which we’re working dictates the preferable method for storage:

If the graph is sparse, we should store it as a list of edges
Alternatively, if the graph is dense, we should store it as an adjacency matrix

5. Conclusion

In this article, we studied the definition of density in a graph in relation to its size, order, and the maximum number of edges.

Then, we discussed the characteristics of some special types of graphs in terms of their density.

In conclusion, we also considered the role that the density of a graph has in the decision on the data structure that we should use to represent it.

Learn Java Collections

Learn Spring

Learn Maven

View All Courses

Core Concepts

Operating Systems

Neural Networks

Graph Theory

Latex

Full Archive

About Baeldung