1. Overview

In this tutorial, we’ll study the foundation of first-order logic and become accustomed to its theoretical and conceptual bases.

We’ll first start by studying the relationship between natural and formal languages. Subsequently, we’ll compare first-order logic with propositional logic. In doing so, we’ll learn the specific characteristics that the former has and when it’s advantageous to use it over the latter.

We’ll finally study the notation and syntax of first-order structures. This will allow us to translate expression in natural language into valid expressions in first-order logic.

At the end of this tutorial, we’ll know what first-order logic is and how it differs from propositional logic. We’ll also know its syntactic rules, and will be able to translate natural language expressions into first-order expressions.

2. First-Order Logic and Natural Languages

2.1. Natural Languages and Logical Computing

First-order logic, like all other systems of formal logic, is a method for formalizing natural languages into a computable format. This, in turn, allows us to treat problems expressed through linguistic sentences in a formal manner. From the formalization of natural language, we then derive the capacity to formulate and prove theorems, which in turn makes it possible to conduct inferential reasoning in disciplines such as mathematics, computer science, and philosophy.

The problem of formalization of natural languages is as old as philosophy itself. In the field of computer science, though, it finds its bases in the 1950s when the first systems for the processing of logical expressions were developed. The first of such systems had severe limitations though, and in particular, they could only work with propositional but not predicate logic.

Propositional logic doesn’t allow the conduct of reasoning over variables and functions having general and mutable content, which means that its capacity for abstraction is limited. This means also that the first logical computing systems couldn’t solve problems whose solution lies in vector spaces of which the propositional space is a subspace. This problem was overcome thanks to the development of a formal logical system, first-order logic, which includes variables and thus allows for abstraction.

2.2. Automatic Reasoning and Inference

Today, first-order reasoning is a fundamental component of symbolic reasoning for machine learning systems. Modern expert systems all use first- or higher-order logic, which allows the conduct of abstract reasoning and inference in an automated manner.

There are also specialized programming languages for first-order logic. The most famous of these is Prolog, where the acronym stands for “propositional logic” in a rather confusing manner, but whose syntax corresponds to a major extent to that of first-order logic. Prolog is characterized by the rapidity with which knowledge bases can be built, and requires little training by human analysts in order for them to be able to encode their knowledge.

Other solutions for knowledge representation and reasoning that use first-order logic are Drools and jBPM, and we can read more about them in their respective tutorials.

3. Differences Between Propositional and First-Order Logic

In our previous tutorial on propositional logic, we discussed the peculiar characteristics that this logical system possesses. It’s therefore useful to now identify what similarities and differences exist between propositional and first-order logic since the latter derives from the former. This, in turn, will let us understand what are the advantages and disadvantages of the two approaches to the logical formalization of language.

The first difference between the two relates to the fact that first-order logic includes propositional logic, but the opposite isn’t true. This means that all problems expressed in propositional logic can be treated under first-order logic, but not all problems in first-order logic can be treated in propositional logic. This concerns, in particular, problems that can only be expressed by making use of quantifiers, such as “all”, “any”, “some”, “none”, as we’ll see later in more detail.

The second difference relates to the nature of the elementary unit which constitutes the formulas of the two formal systems. Propositional logic uses propositions and logical operators in order to constitute its Well-Formed Formulas. First-order logic, in addition to those, also uses variables, quantifiers, and relationships. We’re going to see in the next sections what these new concepts mean.

The third difference concerns the capacity for the abstraction of formulas expressed by the two systems. In propositional logic, the system doesn’t allow to handle problems that involve mutating or undetermined parts. This means that the validity of a solution to a formula found in propositional logic is restricted to that formula. In first-order logic, in contrast, it’s possible to create formulas that possess a higher capacity for generalization.

The table below sums up the primary differences between the two systems:

Characteristics Propositional Logic First-Order Logic
Includes the other No Yes
Uses variables No Yes
Units of analysis Propositions and operators Predicates, objects, relationships, quantifiers

4. Syntax of First-Order Logic

4.1. Logical Symbols

First-order logic has a syntax that’s comprised of symbols belonging to one of two classes: logical and non-logical symbols.

Logical symbols are those that correspond to logical operators or connectives. Examples of such symbols include the logical operators \wedge, \vee, \neg, which serve functions identical to their corresponding operators in propositional logic. Logical symbols of this type are always and only interpreted in the sense of the logical operation that they represent, and their meaning is never conditioned by the domain of discussion in which we use first-order logic.

In other words, whether we discuss theories of chemistry or physics or computer science, the meaning of a formula of the form x \wedge y may vary in its x and y components, but the symbol \wedge always means and. This means that logical symbols always have meanings that are unambiguous.

4.2. Non-Logical Symbols

Non-logical symbols comprise predicates and relationships, but also constants and functions. The meaning associated with non-logical symbols is domain-specific, and their conversion to sentences in a natural language requires conversion rules and interpretation.

Let’s take as an example the formula P(x), which indicates that x is P. Let’s now see how this formula can intuitively be interpreted in different contexts, by means of appropriate conversion rules:

  • If we talk about fruits, if x means apple and P is green, then P(x) means apple is green
  • In chemistry, if x means hydrogen and P is atom, then P(x) means hydrogen is (an) atom
  • In physics, if x means electron and P is lepton, then P(x) means electron is (a) lepton

We’re going to study formally how to interpret formulas later in this article.

4.3. Arity in First-Order Logic

Predicates and functions also have arity, which indicates the number of arguments or parameters that predicates hold. The most common arities are nullary, unary, binary, and ternary, and we can refer to this table for their definitions and examples:

N.\;arguments Arity Examples
0 Nullary 5,\;False,\;constants
1 Unary P(x),\;\neg{x}
2 Binary x\;\vee\;y,\;x\;\wedge\;y
3 Ternary if\;p\;then\;q\;else\;r,\;(p\;\rightarrow\;q)\;\wedge\;(\neg{p}\;\rightarrow\;r)

The arity of a function means exactly the same thing as the arity of a predicate; i.e., the number of its variables.

4.4. Predicates

Predicates are the fundamental components of formulas and indicate relations between objects. These relationships can be any of those that are allowed in a discursive domain.

For example, in the domain of family relationships, predicates can be brother of, father of, mother of. In the context of business relationships, predicates can be employee of, subsidiary of, controlled by. More complex predicates, such as son of this father and this mother can also be defined.

In general, we indicate a relationship between a finite set of variables \{x_1, x_2, ... , x_n\} with an uppercase letter and, between brackets, the ordered elements of that set, such as P(x_1, x_2, ... , x_n). We can then say that the predicate P(x_1, x_2, ... , x_n) has arity n, or, equivalently, that it’s n-ary, because it refers to n terms.

Predicates can have arity higher than unary. A predicate P can refer not only to one but also to two or more terms. In this case, we indicate the extra variables by separating them with additional commas, as in P(x,y) or P(x,y,z) and so forth.

The ternary relationship son of this father and this mother, for example, can be indicated as the predicate P(x, y, z) between three variables x, y, z. If we then assign the values x = \text{John}, y=\text{Jack}, z = \text{Susan} to x, y, and z, we can then translate the predicate P(x, y, z) as:

P(\text{John}, \text{Jack}, \text{Susan}) = \text{John is the son of Jack and Susan}

4.5. Translation of Predicates to Natural Language

As per the convention of many programming languages that use first-order logic, the first argument of a predicate normally translates to the subject of a sentence, while the second usually translates to its genitive or possessive:

  • \text{brother(Paul, Richard)} may translate to Paul is the brother of Richard
  • \text{employee(Mr. Smith, Mr. Doe)} may translate to Mr. Smith is the employee of Mr. Doe
  • \text{fruit(apple, tree)} may translate to apple is a fruit of a tree

For predicates of arity higher than 1, the convention to assign the first argument to the subject derives from the attempt to standardize the interpretation of first-order formulas. If this convention weren’t followed, predicates such as the second one in the examples above could ambiguously be translated as Mr. Smith is the employee of Mr. Doe, or Mr. Doe is the employee of Mr. Smith.

Predicates can also, however, have arity equal to 0, in what case we simply indicate them with an uppercase letter without brackets. The nullary predicate P, for example, may stand for a logical proposition that we treat as a constant in the context of a given formula. The proposition P = \text{Socrates is a mortal} can be treated as a predicate of arity zero if we’re not interested in generalizing it to persons other than Socrates.

Predicates with arity zero have to be included in first-order logic so that we can treat it as a generalization over propositional logic. An expression in first-order logic comprised exclusively of nullary predicates and logical operators is, in fact, a well-formed formula in propositional logic.

We can alternatively consider nullary predicates as terms in first-order logic, which in turn allows us to treat them as well-formed formulas in that context. As we’ll see shortly, terms are in fact the foundational elements of well-formed formulas for first-order logic.

4.6. Terms, Variables, and Functions

We studied how in propositional logic the fundamental units of well-formed formulas were atomic propositions. In first-order logic, these fundamental units are terms, which are comprised of variables and functions.

Variables are mutable parts of the formula and are all individually considered to be terms. If, for instance, the symbol x indicates odd natural numbers smaller than 10, then x is a variable that can assume the value of any element from the set \{1, 3, 5, 7, 9\}. We would then call the symbol x a term in a formula that contains it.

Functions are also terms but, in contrast to variables, refer to other terms such as variables or other functions. If, for example, we define the function f(x, y) = x^{y}, then f(x, y) is a term referring to the two variables x and y.

Functions are characterized by arity, in the same manner as predicates, such that a function with n terms is said to be n-ary. In the edge case, a function can also have arity zero, in which case it’s said to be a constant. The function c = 5, for example, is a nullary function that can be replaced by the symbol 5 wherever it occurs. This, in turn, allows us to treat constants as terms for the purpose of constructing well-formed formulas.

4.7. Quantifiers in Natural Language

Quantifiers are a special component of first-order logic, that allows the definition of formulas that consider numbers or quantities in relation to some predicates. They are, in the English language, also called indefinite adjectives. These include adjectives such as any, some, all, or none.

Examples of sentences that include quantifiers are:

  • Some apples are green
  • All apples are green
  • No apples are green
apples quantifiers

Notice how, in propositional logic, these three sentences correspond to as many distinct atomic propositions. In first-order logic, though, they all correspond to the same predicate-object relationship, short of a different quantifier. We can preliminarily express this idea, before we see its formal definition, in an informal manner as:

  • (some \vee all \vee no) green(apples)

4.8. Quantifiers in Logic

We can more formally define quantifiers through the usage of two symbols: \forall and \exists. They translate, respectively, all and there exists in the English language. They both precede variables in the expressions that contain them, and we can write them as \forall{x} and \exists{x}. If multiple variables use quantifiers, each quantifier is normally repeated for each variable: \forall{x}\forall{y}, \exists{x}\forall{y}.

If the variable refers to a predicate P(x), then the quantifier as a general rule preceeds the predicate: \forall{x}P(x), \exists{x}P(x). Simple expressions containing quantifiers are immediate to translate:

  • \forall{x}P(x)\ \text{and}\ (P(x) = \text{x is green}) \rightarrow \text{all x are green}
  • \exists{x}P(x)\ \text{and}\ (P(x) = \text{x is green}) \rightarrow \text{there is a x that's green}

If a sentence in natural language contains quantifiers that first-order logic doesn’t explicitly define, such as some or none, then we can use the following table to realize a conversion:

English First-Order
At\;least\;one\;x\;is\;P \exists{x}P(x)
All\;x\;are\;P \forall{x}P(x)
Some\;x\;are\;P \exists{x}P(x)
Not\;all\;x\;are\;P \exists{x}\neg{P(x)}
No\;x\;are\;P \forall{x}\neg{P(x)}

5. Well-Formed Formulas

We can finally define well-formed formulas for first-order logic, by compounding terms, predicates, and quantifiers, according to the following rules.

The first rule says that predicates consisting of terms are, on their own, valid formulas. This means that a predicate of the form P(x_1, x_2, ... , x_n) constitutes by itself a valid formula.

The second rule says that the equivalence between terms is a valid formula. This means that x = y, c = 5, f(x) = x, and f(x,y) = g(a,b) are all valid formulas, because they comprise exclusively of terms and the equal sign.

The third rule says that unary logical operators applied to formulas also constitute formulas. That means that if \varphi is a formula, then \neg{\varphi} is also a formula. If, for example, \varphi corresponds to the formula P(x, y), then \neg{\varphi} corresponding to \neg{P(x,y)} is also a formula.

The fourth rule concerns logical operators of arity higher than one. Formulas connected by those logical operators are also formulas. If \varphi and \psi are two formulas, then \varphi \vee \psi, \varphi \wedge \psi, and \varphi \rightarrow \psi are also all valid formulas.

The last rule relates to quantifiers, and it states that if \varphi is a formula that contains a term x, then \forall{x}\varphi and \exists{x}\varphi are also formulas. We say that a variable x to which a quantifier is assigned for a given formula is called a bound term for that formula, whereas in the opposite case we call it free term. A formula that contains only bound terms is special because it can have only one and one truth value that doesn’t depend on the specific value of its bound terms.

6. Examples of First-Order Formulas

6.1. Some Apples Are Green

We can now see some examples of first-order formulas and their interpretation for a specific discursive domain. This will allow us to understand how to extract first-order formulas from natural language, and conversely how to convert from formulas to it.

The first formula corresponds to the sentence “Some apples are green” that we studied earlier:

some apples green

To translate it into a first-order formula we need to define:

  • A variable x
  • A unary predicate A(x) corresponding to “x is an apple”, and a unary predicate G(x) corresponding to “x is green”

We can then convert that sentence into the formula \exists{x}(A(x) \wedge G(x)).

6.2. Restaurants, Cinemas, and Popcorns

The second sentence that we’ll convert is “There isn’t any restaurant that sells popcorns, but cinemas do”:

restaurants cinemas

We can first rewrite this sentence as “If x is a restaurant then x doesn’t sell popcorns, and if x is a cinema then it sells popcorns and if x is a restaurant then x it’s not a cinema and if x is a cinema then it’s not a restaurant”. We can then define the following:

  • A variable x
  • Three unary predicates R(x), C(x), and P(x), referring respectively to “x is a restaurant”, “x is a cinema”, and “x sells popcorns”

This lets us convert the sentence in natural language into:

\forall{x}((R(x)\rightarrow \neg{P(x)}) \wedge (C(x) \rightarrow P(x)) \wedge (R(x) \rightarrow \neg{C(x)}) \wedge (C(x) \rightarrow \neg{R(x)}))

6.3. Paternity and Maternity

The third formula corresponds to the sentence in natural language “All humans have a father and a mother”:

father mother

To convert this sentence to a first-order formula we need to define the following:

  • Three variables x, y, z
  • A unary predicate H(x) “x is a human”
  • Two binary predicates P(y, x) and Q(z, x) corresponding, respectively, to the relationships “is a father of” and “is a mother of”

If we do that, we can then convert the sentence “All human have a father and a mother” first into “if x is a human, there is a y and a z such that y is a human and z is a human and y is the father of x and z is the mother of x and x is not y and x is not z and y is not z”. We can then translate this very long sentence into the more manageable \exists{y}\exists{z}\forall{x}(H(x) \rightarrow (H(y) \wedge H(z) \wedge P(y,x) \wedge Q(z, x) \wedge x \neq y \wedge x \neq z \wedge y \neq z)

7. Conclusions

In this tutorial, we studied the conceptual bases of first-order logic and learned how to derive it as a generalization from propositional logic. We’ve also studied well-formed formulas and their elementary components, plus the rules according to which we can connect them into valid formulas.