## 1. Introduction

When working with video streams, we often need to characterize and quantify the motion of the objects in the video. This can be done by estimating the optical flow between two frames.

In this tutorial, we will explore what optical flow is and how to calculate it.

The concept of optical flow was actually first introduced in the field of psychology. In the 1940s, an American psychologist named James J. Gibson used the term to describe the visual stimulus provided to animals moving through the world.

Like animals, we’re often most interested in the parts of a scene that are moving. But in a video, a computer sees the world in numbers. That is why, in computer vision, the term optical flow refers to a vector field between two images that describes how the pixels of an object in the first image have moved in the second image.

Since video streams are widely used, optical flow is very important for applications like camera image stabilization, traffic control, and autonomous navigation.

## 2. Example

Let’s look at an example. Here we have two video frames of an arrow approaching an apple. Consider the pixel inside the arrow object that contains the very tip of the arrow. Between the two frames, it has traveled some pixels along the $x$-axis and some along the $y$-axis. If we label the distances as $\Delta x$ and $\Delta y$ accordingly, the so-called optical flow corresponding to the arrow tip will be $(\Delta x, \Delta y)$. This is what we want to measure.
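As a rough illustration of the quantity we are after, the following Python sketch builds two tiny synthetic frames in which a single bright pixel stands in for the arrow tip, and reads off its displacement. The frame size, the pixel positions, and the helper `blob_centroid` are all hypothetical choices made for this sketch; in real footage the tip cannot be located this easily, which is exactly the difficulty discussed in the next section.

```python
import numpy as np

def blob_centroid(frame):
    """Return the intensity-weighted centroid (x, y) of a frame."""
    ys, xs = np.indices(frame.shape)   # row (y) and column (x) index grids
    total = frame.sum()
    return (xs * frame).sum() / total, (ys * frame).sum() / total

# Two hypothetical 8x8 frames; one bright pixel plays the role of the arrow tip.
frame1 = np.zeros((8, 8))
frame2 = np.zeros((8, 8))
frame1[2, 1] = 1.0   # tip at (x=1, y=2) in the first frame
frame2[3, 4] = 1.0   # tip at (x=4, y=3) in the second frame

x1, y1 = blob_centroid(frame1)
x2, y2 = blob_centroid(frame2)

# Displacement of the tip between the frames, in pixels.
flow = (float(x2 - x1), float(y2 - y1))
print(flow)  # (3.0, 1.0)
```

Here the tip moved 3 pixels along the x-axis and 1 pixel along the y-axis, so the flow vector for that point is (3, 1).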

## 3. The Optical Flow Equation

Estimating the optical flow, however, turns out to be a difficult task. One obvious problem is that we have no direct way of telling which pixels in the second image correspond to the object’s pixels in the first, meaning we’re not sure where exactly the arrow tip has moved to.

For a solution to be possible, we have to assume that the intensity, or brightness, of every image pixel stays the same as it moves between the frames. If an object changed its brightness as it moved, the problem would become extremely complex. This assumption allows us to associate the pixels and put forth an equation:

A point at location $(x, y)$ with intensity $I(x, y, t)$ will have moved by $\Delta x$, $\Delta y$, and $\Delta t$ between the two image frames:

$$I(x, y, t) = I(x + \Delta x, y + \Delta y, t + \Delta t)$$

Another assumption that has to be made is that the pixel displacement and the difference in time between the two frames are sufficiently small. We need this to arrive at an approximate equation that can hopefully be solved.

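Based on this assumption, we can expand the intensity at the displaced point using the Taylor series:

$$I(x + \Delta x, y + \Delta y, t + \Delta t) = I(x, y, t) + \frac{\partial I}{\partial x}\Delta x + \frac{\partial I}{\partial y}\Delta y + \frac{\partial I}{\partial t}\Delta t + \dots$$

Because we assumed the intensity stays the same, the left-hand side equals $I(x, y, t)$, which cancels the first term on the right. If we consider just the linear terms and discard the higher-order terms, we get:

$$\frac{\partial I}{\partial x}\Delta x + \frac{\partial I}{\partial y}\Delta y + \frac{\partial I}{\partial t}\Delta t = 0$$

which can then be divided by $\Delta t$:

$$\frac{\partial I}{\partial x}u + \frac{\partial I}{\partial y}v + \frac{\partial I}{\partial t} = 0$$

where $u = \Delta x / \Delta t$ and $v = \Delta y / \Delta t$ are the $x$ and $y$ components of the optical flow $(u, v)$.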
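To make the final equation more concrete, here is a minimal Python sketch that checks it numerically on synthetic data. Everything in it is an assumption made for illustration: the frames are a smooth Gaussian blob generated by the hypothetical helper `gaussian_frame`, the flow `(u_true, v_true)` is a sub-pixel shift chosen by hand, the time step is one frame, and the partial derivatives are approximated with simple finite differences via `np.gradient`. It only verifies that a known flow satisfies the equation; it does not estimate the flow.

```python
import numpy as np

def gaussian_frame(cx, cy, size=64, sigma=8.0):
    """Smooth synthetic frame: a Gaussian blob centred at (cx, cy)."""
    ys, xs = np.mgrid[0:size, 0:size].astype(float)
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))

# Assumed small flow in pixels per frame (time step of 1 between frames).
u_true, v_true = 0.3, -0.2

frame1 = gaussian_frame(32.0, 32.0)
frame2 = gaussian_frame(32.0 + u_true, 32.0 + v_true)  # blob shifted by the flow

# Finite-difference approximations of the partial derivatives of I.
avg = 0.5 * (frame1 + frame2)
I_y, I_x = np.gradient(avg)   # np.gradient returns the row (y) derivative first
I_t = frame2 - frame1         # temporal derivative for a time step of 1

# Left-hand side of the optical flow equation; it should be close to zero.
residual = I_x * u_true + I_y * v_true + I_t

max_residual = float(np.abs(residual).max())
max_I_t = float(np.abs(I_t).max())
print(f"max |residual| = {max_residual:.2e}, max |I_t| = {max_I_t:.2e}")
```

The residual is not exactly zero, because the higher-order Taylor terms were discarded and the derivatives are only approximated, but for a displacement this small it should be far smaller than the temporal difference itself. Increasing `u_true` and `v_true` to several pixels makes the residual grow, which is why the small-displacement assumption matters.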