1. Introduction

“Like a moth to a flame” is an old expression that compares a strong attraction to something with the way moths are drawn to flames. The flying behavior of moths makes it seem as if they converge on the flame.

Since, in optimization problems, we look for algorithms that can discover and converge towards optimal solutions, perhaps moths can help us design one such algorithm. Like many solutions to optimization problems, Moth Flame Optimization (MFO) is inspired by the natural world. The algorithm we explore exploits the spiraling behavior of moths around flames.

In this tutorial, we’ll cover the MFO algorithm in detail.

2. The Optimization Problem

Moths often appear to cluster around bright lights at night. It very much seems as if they are attracted to these lights. In reality, this is a consequence of their natural navigation strategy, known as transverse orientation. Moths rely on a distant light source to fly in a straight line, keeping that source at a fixed angle to themselves. Without artificial light, this source would usually be the moon. However, artificial light sources much closer to the moths distort this behavior: in order to keep a constant angle to such a nearby source, the moths end up spiraling around it.

We frame our optimization algorithm by considering a population of moths. MFO is a population-based algorithm: we represent our moths using an N \times D matrix called M, where N is the number of moths and D is the dimensionality of the solution vector that each moth represents.

As we do not know where the optimal solutions are, we need to approximate them. We do this using another matrix of moths, called the flame matrix F. F is also an N \times D matrix, the same size as M, and it represents the best solutions found so far. We think of F as the flame matrix since these are the beacons that our search agents in M will orbit. Each of M and F has an associated vector, OM and OF respectively, holding the fitness values of the proposed solutions.
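To make this setup concrete, here is a minimal sketch in Python (using NumPy) of how the moth and flame matrices and their fitness vectors might be represented. The names M, F, OM, and OF mirror the notation above, while the objective function and the search bounds are placeholders chosen purely for illustration:

```python
import numpy as np

N, D = 5, 2                     # number of moths, dimensionality of a solution
LOWER, UPPER = -10.0, 10.0      # illustrative search-space bounds

def fitness(x):
    # placeholder objective: smaller values are better (distance to the origin)
    return np.linalg.norm(x)

# M holds the moth population, one candidate solution per row
M = np.random.uniform(LOWER, UPPER, size=(N, D))
OM = np.array([fitness(m) for m in M])

# F holds the best solutions found so far: initially, the moths sorted by fitness
order = np.argsort(OM)
F, OF = M[order].copy(), OM[order].copy()
```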

2.1. Algorithm Structure

We visualize the algorithm in the flowchart below. The algorithm is relatively compact. From the flowchart, we can see that we first need to initialize the moth population. We then initialize or update the flames. After updating the flames, the moths move towards their respective flames. This process continues until convergence or until we reach the iteration limit. What we have left to do is to define the update behavior for moths and flames:

MFO Flowchart

2.2. Spiral Behavior

Moths need to be able to fly around and approach light sources. We achieve this by defining a logarithmic spiral. Other options for spirals are also possible. Let’s now define the formula for the logarithmic spiral:

(1)    \begin{equation*} S(M,F) = D \cdot e^{bt}\cdot \cos(2\pi t) + F \end{equation*}

(2)    \begin{equation*} D = \left| M - F \right| \end{equation*}

The spiral depends on the distance D between a moth and its corresponding flame F, taken element-wise. We also include b, a user-defined constant that controls the shape of the spiral, and t, a random variable whose sampling range shrinks over time so that the moths spiral ever more tightly around their flames.
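As a rough Python sketch of equations 1 and 2, the spiral move could look like the function below. The name spiral_move is ours, and the handling of t follows the original formulation, in which t is drawn from [r, 1] with r shrinking from -1 towards -2 over the run, which is what narrows the spiral over time:

```python
import numpy as np

def spiral_move(moth, flame, b=1.0, r=-1.0):
    """Move a moth towards its flame along a logarithmic spiral (equation 1)."""
    t = np.random.uniform(r, 1.0)        # random exploration variable
    distance = np.abs(moth - flame)      # D = |M - F|, element-wise (equation 2)
    return distance * np.exp(b * t) * np.cos(2 * np.pi * t) + flame
```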

2.3. Exploration vs. Exploitation

As in many agent-based optimization paradigms, we must balance exploration against exploitation. In our case, each moth updates only with respect to its own corresponding flame. This prevents the whole population from collapsing onto a single flame early on and promotes exploration. On the other hand, if we kept the number of flames constant over time, we might end up under-exploiting the best solutions discovered so far. So, to avoid this, we also decrease the number of flames over time according to equation 3:

(3)    \begin{equation*} No.Flames = \mathrm{round}\left(N - l \times \frac{N - 1}{T}\right) \end{equation*}

l is the current iteration, N is the maximum number of flames, and T is the maximum number of iterations.

We update each moth in relation to its corresponding flame. Since we progressively decrease the number of flames, the moths left without a flame of their own are all updated with respect to the last remaining flame.
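As a small sketch, equation 3 and the moth-to-flame assignment it implies might be coded as follows; the rounding step and the helper names are assumptions on our part:

```python
def flame_count(l, N, T):
    """Number of flames at iteration l (equation 3), rounded to a whole number."""
    return int(round(N - l * (N - 1) / T))

def flame_for_moth(i, n_flames):
    """Moth i follows flame i; moths beyond the current flame count
    all follow the last remaining flame."""
    return min(i, n_flames - 1)
```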

2.4. Features and Applications

We have seen how we can model moth behavior and use it to solve optimization problems. We can now reiterate some of the features of the algorithm and discuss why they work.

Local optima can be a problem for optimization algorithms. Because MFO is population-based, many potential solutions are found and iterated upon, which helps the search avoid many local optima. Further, assigning each moth to its own flame and updating the sequence of flames over time also helps the algorithm escape local optima.

Solutions can be found by searching near our existing good solutions, represented by flames. The adaptive decrease in the number of flames encourages the exploitation of good solutions. The algorithm, therefore, converges over time while also being able to explore alternative areas of the search space.

3. MFO Pseudocode

Let’s see the pseudocode for MFO:

MFO Pseudocode

We can see the outline of the algorithm in the pseudocode above. First, we initialize a population of moths. We then initialize the flames and perform an update step on the moths. After that, we combine the previous flames and the newly updated moths to generate a new list of flames. Finally, we repeat these update steps for a total of MaxIter time-steps.
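As a complementary sketch, here is how that outline might look in Python. The parameter defaults, the search bounds, and the assumption that we minimize the fitness function are choices of ours for illustration rather than part of the algorithm's specification:

```python
import numpy as np

def mfo(fitness, dim, n_moths=30, max_iter=100, lower=-10.0, upper=10.0, b=1.0):
    """A minimal MFO sketch: minimize `fitness` over a `dim`-dimensional box."""
    # initialize the moth population and score it
    M = np.random.uniform(lower, upper, size=(n_moths, dim))
    OM = np.array([fitness(m) for m in M])

    # the first flames are simply the initial moths sorted by fitness
    order = np.argsort(OM)
    F, OF = M[order], OM[order]

    for l in range(1, max_iter + 1):
        # adaptively shrink the number of flames (equation 3)
        n_flames = int(round(n_moths - l * (n_moths - 1) / max_iter))
        # r shrinks from -1 towards -2, tightening the spiral over the run
        r = -1.0 - l / max_iter

        for i in range(n_moths):
            j = min(i, n_flames - 1)           # extra moths share the last flame
            t = np.random.uniform(r, 1.0)
            distance = np.abs(M[i] - F[j])     # equation 2
            M[i] = distance * np.exp(b * t) * np.cos(2 * np.pi * t) + F[j]  # equation 1

        M = np.clip(M, lower, upper)
        OM = np.array([fitness(m) for m in M])

        # combine the previous flames with the new moths and keep the best
        pool = np.vstack([F, M])
        pool_fit = np.concatenate([OF, OM])
        best = np.argsort(pool_fit)[:n_moths]
        F, OF = pool[best], pool_fit[best]

    return F[0], OF[0]   # best solution found and its fitness
```

For example, calling mfo(lambda x: np.linalg.norm(x - 5), dim=2) should return a point close to \langle 5,5 \rangle.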

4. Example

Suppose there is a campfire somewhere in the wilderness, and we want to locate it using the MFO algorithm. Further, suppose that moths benefit from the campfire’s heat and that this benefit is inversely proportional to their distance from the campfire. This gives us an easy-to-compute fitness function. Our wilderness is a 2-dimensional plane, and the campfire is located at coordinates \langle 5,5 \rangle on our grid. We set the maximum number of iterations to 100.
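One way we might encode this fitness in Python is to take the heat benefit as the inverse of the Euclidean distance to the campfire; the small epsilon is our own addition to avoid division by zero right on top of the fire:

```python
import numpy as np

CAMPFIRE = np.array([5.0, 5.0])

def heat_benefit(moth):
    """Fitness for the example: inverse distance to the campfire; higher is better."""
    return 1.0 / (np.linalg.norm(moth - CAMPFIRE) + 1e-9)
```

Since this benefit grows as a moth gets closer to the fire, larger fitness values are better in this example, and we sort the flames in descending order.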

4.1. Generate the Moths and Flames

We begin by generating N moths. For this example we set N = 3. We see the moth positions in the table below:

Initial moth positions

4.2. Move the Moths

We have our moths, and we have scored them. We now need flames to optimize towards. These are the moths that perform best, obtained by sorting the initial set of moths by their fitness values. We show the flames in the table below:

Flames: the moths sorted by fitness
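As a sketch, generating and scoring the moths and then sorting them into flames might look like the snippet below. The 10 x 10 bounds are an assumption on our part (the example only states that the wilderness is a 2-dimensional plane), so the random positions will differ from those in the tables:

```python
import numpy as np

N = 3
campfire = np.array([5.0, 5.0])

# scatter the moths over a 10 x 10 patch of wilderness and score them
moths = np.random.uniform(0.0, 10.0, size=(N, 2))
scores = 1.0 / (np.linalg.norm(moths - campfire, axis=1) + 1e-9)

# the flames are the same moths sorted from best (warmest) to worst
order = np.argsort(scores)[::-1]
flames, flame_scores = moths[order], scores[order]
```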

We now want to calculate the movement of each moth using equation 1. This first requires calculating the distance between the searching moth and its flame; that is to say, between moth 1 and flame 1. The distance is simply the element-wise absolute value of the difference between the two vectors. We show this in equation 4:

(4)   \begin{equation*} D = \left| \langle 1,2\rangle - \langle 3,3\rangle \right| = \langle 2,1\rangle \end{equation*}

We then multiply this distance by our logarithmic spiral term and finally add the position of our flame. For our purposes, we set b = 1, and our random exploration variable t samples as 0.3. We work this out in equation 5. The new position is closer to the moth representing the flame, and it is also closer to our target campfire:

(5)    \begin{equation*} S(M,F) = \langle 2,1\rangle \cdot e^{0.3}\cdot \cos(2\pi \cdot 0.3) + \langle 3,3\rangle \approx \langle 2.2,2.6\rangle \end{equation*}
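We can double-check this arithmetic with a couple of lines of Python, fixing t at 0.3 to match the example:

```python
import numpy as np

moth, flame = np.array([1.0, 2.0]), np.array([3.0, 3.0])
b, t = 1.0, 0.3

distance = np.abs(moth - flame)                                    # ⟨2, 1⟩
new_position = distance * np.exp(b * t) * np.cos(2 * np.pi * t) + flame
print(new_position)   # approximately [2.17, 2.58], i.e. roughly ⟨2.2, 2.6⟩
```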

4.3. Update and Repeat

We repeat this calculation for each moth. After that, we update the number of flames, and then we update our list of flames using our old flames and the new moth positions:

Updated moth positions

In this case, we do not decrease the number of flames this iteration: with N = 3, l = 1, and T = 100, equation 3 gives 3 - 1 \times 2/100 = 2.98, which rounds back up to 3. So, when we combine our old flames and new moths, we are still looking for 3 flames. In this example, the new flame matrix contains 2 flames from the previous table and 1 flame from the new moths matrix:

Updated flame matrix
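The flame update itself can be sketched as a small helper; here we sort by the heat benefit in descending order, since higher values are better in this example, and keep the best n_flames rows:

```python
import numpy as np

def update_flames(old_flames, old_scores, new_moths, new_scores, n_flames):
    """Merge the previous flames with the freshly moved moths and keep the best."""
    pool = np.vstack([old_flames, new_moths])
    pool_scores = np.concatenate([old_scores, new_scores])
    best = np.argsort(pool_scores)[::-1][:n_flames]   # descending: warmest first
    return pool[best], pool_scores[best]
```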

Finally, we repeat these calculations until convergence or until we reach the maximum number of iterations.

5. MFO Complexity

MFO relies on sorting the moths. The original paper suggests quicksort, which has a worst-case run time of O(N^2). Other sorting algorithms with better worst-case run times are possible. However, we will use quicksort. For more on choosing a sorting algorithm, see our article on choosing a sorting algorithm.

We compose the overall run-time in equation 6. We see from this that we have to run a sort at each iteration and then run the MFO update loop. Consequently, this gives us the big O complexity shown in equation 7:

(6)    \begin{equation*} O(MFO) = O(t(O(QuickSort)+O(MFO update))) \end{equation*}

(7)    \begin{equation*} O(MFO) = O(t(n^2 + n)) = O(tn^2 + tn) = O(tn^2) \end{equation*}

6. Conclusion

In this article, we have covered the Moth Flame Optimization algorithm. We further described the inspiration for the algorithm and then illustrated how to formalize that inspiration.

MFO is a numerical optimization algorithm with many potential uses, from hyper-parameter selection to optimizing multi-layer perceptrons. There are many similar nature-inspired population-based optimization algorithms. For further reading on this topic, we can see some of our other articles on Gray Wolf Optimization, Grasshopper Optimization, and Ant Colony Optimization.

To conclude, we summarize the benefits of MFO. As a population-based algorithm, there is some robustness to getting stuck in local minima. The approach maintains the best solutions discovered so far as flames and progressively exploits areas with good solutions. Adaptively updating the number of flames allows for balancing the exploration/exploitation trade-off. So, on the whole, MFO provides an algorithm that balances exploration against exploitation. It is also easy to understand and implement.
