
Artificial Neural Networks

by Tim Dorney

Neural Processing Element

The brain handles many complex tasks with relative ease, yet its functionality actually comes from a large collection of small, interconnected processing elements. ANNs are built on the same observation: research relies on the sound operation of a single processing element so that interconnecting those elements into networks provides adequate performance. The processing element of an ANN is designed to take inputs, either from external signals or from other processing elements, and produce an output.

Figure 1. Generic Neural Processing Element

In reference to Figure 1, the inputs, (o_1, o_2, o_3,..., o_j,..., o_J), are multiplied by the corresponding weights, w_kj, where w_kj denotes the connection from the jth input to the kth processing element. The sum of all these weighted inputs to the processing element

net_k = o_1*w_k1 + o_2*w_k2 + ... + o_j*w_kj + ... + o_J*w_kJ           (1)

is then fed through an activation function, which determines the output of the element, o_k. One of the more common activation functions is the sigmoid, shown here in both mathematical and graphical form (Figure 2).

o_k = 1/(1 + exp(-(net_k + theta_k)/theta_o))                           (2)

Figure 2. Sigmoid Activation Function with Slope and Threshold Adjustments
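To make the element concrete, the following Python fragment is a minimal sketch of equations (1) and (2); the function name, default parameter values, and sample inputs are illustrative assumptions rather than part of the original text.

import math

def processing_element(inputs, weights, theta_k=0.0, theta_o=1.0):
    # Equation (1): net_k is the weighted sum of the J inputs.
    net_k = sum(o_j * w_kj for o_j, w_kj in zip(inputs, weights))
    # Equation (2): sigmoid with threshold theta_k and slope theta_o
    # (a smaller theta_o gives a steeper transition).
    return 1.0 / (1.0 + math.exp(-(net_k + theta_k) / theta_o))

# Example call with three inputs feeding the kth element.
print(processing_element([0.5, -1.0, 0.25], [0.8, 0.2, -0.5]))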

Two parameters of the sigmoid are of interest: theta_o, which sets the rate of transition from a low to a high state, and theta_k, the threshold. The slope of the sigmoid activation function is extremely important because of the possibility of saturation; if the slope is too steep, the output, o_k, may come so close to one or zero that a learning process terminates in an undesirable state. Also, a negative theta_k shifts the activation function to the right along the horizontal axis. Starting from expression (2), an algebraic manipulation allows the threshold of the activation function to be learned as if it were another weight. Dividing the scalar theta_o into the other terms gives

netp_k = (o_1*w_k1 + o_2*w_k2 + ... + o_j*w_kj + ... + o_J*w_kJ)/theta_o (3)

and

theta = theta_k/theta_o                                                  (4)

The result is shown in the following expression.

o_k = 1/(1 + exp(-(netp_k + theta)))                                     (5)

In equation (5), the weights and threshold of the network are modified to account for the removal of theta_o. The J+1 input link shown in Figure 1 is provided so that theta becomes the (J+1)th weight with a constant input of one. The weights are learned during training to account for the scalar multiplication by 1/theta_o.
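
As a sketch of this bias trick, the Python fragment below appends a constant input of one so that the last weight plays the role of theta in equation (5); the names and sample values are only illustrative.

import math

def processing_element_with_bias(inputs, weights):
    # The J+1 input link of Figure 1: append a constant input of one.
    augmented = list(inputs) + [1.0]
    # The weighted sum over the augmented inputs equals netp_k + theta.
    net = sum(o_j * w_kj for o_j, w_kj in zip(augmented, weights))
    # Equation (5): sigmoid with no explicit threshold term.
    return 1.0 / (1.0 + math.exp(-net))

# Three ordinary inputs; the fourth weight acts as the learned threshold theta.
print(processing_element_with_bias([0.5, -1.0, 0.25], [0.8, 0.2, -0.5, 0.1]))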


jchen@micro.ti.com
tdorney@ti.com
sparr@owlnet.rice.edu

Last updated on May 3, 1997
Copyright © 1997 All rights reserved.