Convolutional Networks

Some patterns contain objects within them which may be located in different positions in different objects, but nevertheless represent the same image. An obvious example is an image containing an object to be recognised. A standard feedforward network would see two images of say the digit zero quite differently if the zeros were in different positions within the image. Convolutional networks provide a solution to this problem.

The inputs to a convolutional neuron consist of a subset of colocated elements in the input pattern. You can think of this as a window which scans over the input generating a single output for each window position. The process is illustrated in the above animation. The output of a convolutional neuron is called a feature map which is nearly as large as the input image. The feature map is usually compressed by pooling adjacent outputs and replacing each pool by its maximum value. As in standard feedforward networks, there will usually be many convolutional neurons in each layer and layers will be stacked together. The final output is typically input to a feedforward network in order to classify the object of interest.

Scroll to Top