
Understanding the perceptron
First, we need to understand the basics of neural networks. A neural network consists of one or more layers of neurons, named after the biological neurons in the human brain. We will demonstrate the mechanics of a single neuron by implementing a perceptron. In a perceptron, a single unit (neuron) performs all the computations. Later, we will scale up the number of units to create deep neural networks.

A perceptron can have multiple inputs. On these inputs, the unit performs some computations and outputs a single value, for example a binary value to distinguish between two classes. The computation performed by the unit is a simple dot product of the inputs and the weights: each input is multiplied by its weight, the resulting values are summed up, and a bias is added:

$$z = \sum_{i=1}^{n} w_i x_i + b$$

Here $x_i$ are the $n$ inputs, $w_i$ the corresponding weights, and $b$ the bias.

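These computations can easily be scaled to high-dimensional input. An activation function (φ) determines the final output of the perceptron in the forward pass:

$$\hat{y} = \phi\left(\sum_{i=1}^{n} w_i x_i + b\right)$$

Here $\hat{y}$ is the output of the unit, computed from the inputs $x_i$, the weights $w_i$, and the bias $b$.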
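The forward pass described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the recipe's implementation: the function names `unit_step` and `forward`, the threshold of 0, and the input, weight, and bias values are all assumptions chosen for the example.

```python
import numpy as np

def unit_step(z, threshold=0.0):
    """Return 1 where z exceeds the threshold, otherwise 0 (threshold is an assumed default)."""
    return np.where(z > threshold, 1, 0)

def forward(X, weights, bias):
    """Weighted sum of the inputs plus the bias, passed through the step activation."""
    z = np.dot(X, weights) + bias
    return unit_step(z)

# Hypothetical values, chosen only for illustration
X = np.array([[1.0, 2.0],
              [3.0, 0.5]])        # two samples with two inputs each
weights = np.array([0.5, -1.0])
bias = 0.25

z = np.dot(X, weights) + bias     # weighted sums: [-1.25, 1.25]
outputs = forward(X, weights, bias)  # step activation: [0, 1]
```

Because `np.dot` handles a whole matrix of samples at once, the same two lines work unchanged whether each sample has two inputs or two thousand.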

The weights and bias are randomly initialized. In each epoch (one pass over the training data), the weights are updated in proportion to the error, which is the difference between the desired output and the actual output, multiplied by the learning rate. As a consequence, the weights move in the direction that reduces the error on the training data (backward pass), and the accuracy of the output improves. In essence, the perceptron is a linear combination of its inputs, optimized on the training data. As an activation function we will use a unit step function: if the weighted sum is above a certain threshold, the unit is activated and outputs 1; otherwise it outputs 0 (hence a 0 versus 1 binary classifier). A perceptron is able to classify the training data with 100% accuracy if the classes are linearly separable and it is trained for enough epochs. In the next recipe, we will show you how to implement a perceptron with NumPy.
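Ahead of that recipe, the update rule described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation from the next recipe: the function name `train_perceptron`, the hyperparameter defaults, the threshold of 0, and the logical AND toy data are all hypothetical, and the sketch applies the update after every sample within an epoch, which is one common variant of the perceptron rule.

```python
import numpy as np

def unit_step(z):
    """Step activation with an assumed threshold of 0."""
    return np.where(z > 0.0, 1, 0)

def train_perceptron(X, y, learning_rate=0.1, n_epochs=100, seed=0):
    """Train a single unit with the perceptron rule (illustrative defaults)."""
    rng = np.random.default_rng(seed)
    # Random initialization of weights and bias with small values
    weights = rng.normal(scale=0.01, size=X.shape[1])
    bias = rng.normal(scale=0.01)

    for _ in range(n_epochs):              # one epoch = one pass over the data
        for x_i, target in zip(X, y):
            output = unit_step(np.dot(x_i, weights) + bias)  # forward pass
            error = target - output                          # desired minus actual
            # Backward pass: shift weights and bias in proportion to the error
            weights += learning_rate * error * x_i
            bias += learning_rate * error
    return weights, bias

# Toy data (assumed for illustration): logical AND, which is linearly separable
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])

weights, bias = train_perceptron(X, y)
predictions = unit_step(np.dot(X, weights) + bias)  # matches y on this separable set
```

When a sample is already classified correctly the error is 0 and nothing changes, so training settles once every sample in a separable set is on the right side of the decision boundary.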