## Info

Note that the fixed input x0 = 1 supplies the bias, weighted by w0. The complete set of weights, w0, w1, w2, ..., w400, is called the weight vector, w. Similarly, the sensor outputs, x1, x2, x3, ..., x400, constitute the input vector, x. Thus, the net input to the output "neuron" is y = wT x = w0 + w1 x1 + w2 x2 + ... + w400 x400, and the thresholded output is y' = 1 if y > 0, else y' = 0. A simple pattern is "shown" to the sensor array, generating x. The perceptron operates as an iterative machine: it adjusts w repetitively so that after a finite number of iterations, y' = 0 if the pattern belongs to class 0, and y' = 1 if it belongs to class 1. The basic perceptron training law (TL) determines the next, (k + 1)th, set of values for the weights. C is the correct class number (0 or 1) of the object presented, x; y' is the perceptron output (0 or 1); and a is a positive constant that sets the learning rate. The TL can be written:

w(k + 1) = w(k) + a (C − y') x
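As a sketch, the net input, the threshold, and one application of the TL can be written in a few lines of Python (the function names and the sample numbers are illustrative, not from the text; the bias is handled by prepending a fixed input of 1 to x):

```python
def perceptron_output(w, x):
    """Threshold the net input w.x to give the output y' in {0, 1}."""
    net = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if net > 0 else 0

def tl_update(w, x, C, a):
    """One application of the TL: w(k+1) = w(k) + a*(C - y')*x."""
    y_out = perceptron_output(w, x)
    return [wi + a * (C - y_out) * xi for wi, xi in zip(w, x)]

# One step on a single pattern; the leading 1 is the bias input x0.
w = [0.0, 0.0, 0.0]
x = [1, 1, 0]                      # bias plus two sensor outputs
w = tl_update(w, x, C=1, a=0.5)    # output was 0, class is 1, so w grows
```

Here the output was 0 while C = 1, so each weight moves by +a times its input, tilting w toward classifying this x correctly next time.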

In this TL, if the perceptron makes an output error, (C − y') = ±1, this error indicates a need to reorient the decision boundary defined by w in the input space so that the perceptron will be less likely to err on this particular x vector again. Note that (C − y') = 0 when the perceptron output is correct, and then w is not changed; otherwise w is modified to improve performance.
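A minimal sketch of this iterative behavior, assuming the logical AND function as the (linearly separable) classification task: the TL is swept over the patterns until (C − y') = 0 for every one, at which point w stops changing.

```python
def output(w, x):
    """Thresholded perceptron output y' for weight vector w and input x."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0

# Patterns as (x with bias input x0 = 1, correct class C): the AND function.
data = [([1, 0, 0], 0), ([1, 0, 1], 0), ([1, 1, 0], 0), ([1, 1, 1], 1)]
w, a = [0.0, 0.0, 0.0], 1.0

for _ in range(25):                        # finite iteration budget
    errors = 0
    for x, C in data:
        e = C - output(w, x)               # (C - y') is 0 or +/-1
        if e:
            errors += 1
            w = [wi + a * e * xi for wi, xi in zip(w, x)]
    if errors == 0:                        # correct on every pattern: w fixed
        break
```

Because AND is linearly separable, the loop terminates after a finite number of passes with a w that classifies all four patterns correctly, illustrating the convergence claimed above.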

It should be noted that perceptrons can have more than one layer, and more than one output element. For example, x is the input vector, connected by weights w1 to a first layer of K "neurons," y1. y1 is in turn connected by weights w2 to an output layer of M "neurons," yo. The y1 layer is called a hidden layer. Lippmann (1987) argues that no more than three layers (excluding the receptors) are required in perceptron-like, feed-forward ANNs, because a three-layer net can generate arbitrarily complex decision regions when the ANN is trained for binary discrimination. (Recall that the simple Mark I perceptron, a one-layer ANN, is capable only of a straight-line decision boundary in the input plane to separate (classify) class 1 and class 0 members.) Multilayer (two or more layers) perceptrons are made to converge more swiftly on trained weight vectors (w1, w2, w3) by use of more sophisticated learning algorithms, a topic that is beyond the scope of this chapter.
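The layered structure described here can be sketched as a forward pass through two weight matrices. The weights below are hand-picked placeholders (not from the text) chosen so that the net computes XOR, a function no single-layer perceptron's straight-line boundary can realize; training such weights requires the more sophisticated algorithms noted above.

```python
def step(u):
    """Hard threshold used by each "neuron"."""
    return 1 if u > 0 else 0

def layer(W, x):
    """Each row of W holds one neuron's weights; x includes the bias input 1."""
    return [step(sum(wij * xj for wij, xj in zip(row, x))) for row in W]

x  = [1, 0, 1]                       # bias input plus two sensor outputs
w1 = [[-0.5, 1, 1], [-1.5, 1, 1]]    # K = 2 hidden "neurons" (y1 layer)
w2 = [[-0.5, 1, -1]]                 # M = 1 output "neuron" (yo layer)

y1 = layer(w1, x)                    # hidden layer outputs
yo = layer(w2, [1] + y1)             # prepend bias input for the next layer
```

With these placeholder weights the net outputs 1 exactly when the two sensor inputs differ, demonstrating a decision region that one layer cannot form.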