# Neural Network Technology in Neuralyst

Last modified August 20, 1996 by Ross Berteig.

- Input Scaling
- Forward Calculation
- Backpropagation Calculation
- Output Scaling
- References
- Release History
- Legal Notices

## Input Scaling

Each input value (from both Input and Target columns) is scaled
from the user's coordinate system into an internal coordinate system
with values ranging from 0 to 1. The internal coordinates leave
headroom for the scale margin and noise as requested by the
**Neural | Set Network Parameters...** and **Neural | Set Enhanced
Parameters...** dialogs.

User-defined symbols are assigned sequential integer values. In the MIN row, the values begin at 0 for the first symbol. In the MAX row, the values begin at 1 for the first symbol. In the TEST and TRAIN rows, the values begin at 0.5.

That is, if the SYMBOL row contains "A,B,C", then A is translated to 0.5, B to 1.5, and C to 2.5 on all TEST and TRAIN rows. The special treatment of the MIN and MAX rows causes each symbol to be represented as the center point of a range of values.
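As an illustration, the symbol translation described above can be sketched in Python (the function name and row-kind labels are ours for illustration, not part of Neuralyst):

```python
def symbol_value(symbols, symbol, row_kind):
    """Map a symbol to its numeric value for a given row type.

    MIN-row values begin at 0, MAX-row values at 1, and TEST/TRAIN
    values at 0.5, so each symbol is the center of a unit-wide range.
    """
    base = {"MIN": 0.0, "MAX": 1.0, "TEST": 0.5, "TRAIN": 0.5}[row_kind]
    return base + symbols.index(symbol)

symbols = ["A", "B", "C"]
print(symbol_value(symbols, "B", "TRAIN"))  # 1.5
print(symbol_value(symbols, "A", "MIN"))    # 0.0
print(symbol_value(symbols, "C", "MAX"))    # 3.0
```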

The Scale Margin reserves some headroom in the scaling calculation for inputs which exceed the MIN and MAX row values. It represents the percentage of the range to reserve. The reserved headroom is split between the low and high ends of the range.

For example, if the MIN is 0 and the MAX is 10, then a Scale Margin of 0.1 (10%) causes the scale calculation to permit inputs ranging from -0.5 to 10.5 (5% of the range is added above and below).
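The arithmetic of that example can be checked directly (variable names follow the formulae given later in this section):

```python
# Scale Margin example: MIN = 0, MAX = 10, Margin = 0.1 (10%).
# Half the margin is added below the range and half above it.
InMin, InMax, Margin = 0.0, 10.0, 0.1
dMin = InMin - (Margin / 2) * (InMax - InMin)
dMax = InMax + (Margin / 2) * (InMax - InMin)
print(dMin, dMax)  # -0.5 10.5
```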

For some data sets, it is important to inject some noise during training. When noise injection is enabled, the input scaling process must leave room for the noise to be added.

The Noise parameter is specified as ranging from 0 (no noise) to 1 (lots of noise). Internally, this parameter is used to derive a noise scale which represents the headroom reserved for noise addition.

The internal noise scale is calculated as follows:

`dNoise = (2 * Noise) / (1 + 2 * Noise);`

This results in an internal value dNoise which ranges from 0 to 2/3 as the Noise parameter ranges from 0 to 1.
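A direct transcription of the noise-scale formula confirms those endpoints:

```python
def noise_scale(noise):
    # dNoise = (2 * Noise) / (1 + 2 * Noise)
    return (2 * noise) / (1 + 2 * noise)

print(noise_scale(0.0))  # 0.0
print(noise_scale(1.0))  # 0.6666666666666666
```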

The input scale calculation is shown below as performed for a single input value. In practice, each value found in an Input column must be scaled as described here individually.

Values in Target columns are scaled similarly, with the small difference that the Noise Parameter does not apply to target values.

In the following formulae:

- `Input` or `Target` is the user's input or target value,
- `InMin` is the user's MIN row value,
- `InMax` is the user's MAX row value,
- `Margin` is the Scale Margin parameter,
- `Noise` is the Noise parameter,
- and `X` or `T` is the final scaled input or target value.

In addition, the following intermediate values make the calculation
much easier to represent:

- `dNoise` is the noise scale value,
- `dMin` is the MIN value, compensated by the scale margin,
- `dMax` is the MAX value, compensated by the scale margin,
- and `dRange` is the internal range value.

For Input Columns:

```
dNoise = (2 * Noise) / (1 + 2 * Noise);
dMin   = InMin - (Margin / 2) * (InMax - InMin);
dMax   = InMax + (Margin / 2) * (InMax - InMin);
dRange = dMax - dMin;
X = (1 - dNoise) * (Input - dMin) / dRange + dNoise / 2;
```
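The input-scaling statements can be collected into a single sketch (Python rather than Neuralyst's internal implementation; the names follow the formulae):

```python
def scale_input(value, in_min, in_max, margin, noise):
    """Scale a user-coordinate input into the internal 0..1 range,
    reserving headroom for the scale margin and for added noise."""
    d_noise = (2 * noise) / (1 + 2 * noise)
    d_min = in_min - (margin / 2) * (in_max - in_min)
    d_max = in_max + (margin / 2) * (in_max - in_min)
    d_range = d_max - d_min
    return (1 - d_noise) * (value - d_min) / d_range + d_noise / 2

# With no margin and no noise, the mapping is a plain linear rescale:
print(scale_input(5.0, 0.0, 10.0, 0.0, 0.0))  # 0.5
print(scale_input(0.0, 0.0, 10.0, 0.0, 0.0))  # 0.0
```

Note that the midpoint of the range always maps to 0.5, regardless of the noise setting, because the noise headroom is split evenly between both ends.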

For Target Columns:

```
dMin   = InMin - (Margin / 2) * (InMax - InMin);
dMax   = InMax + (Margin / 2) * (InMax - InMin);
dRange = dMax - dMin;
T = (Target - dMin) / dRange;
```
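The target scaling admits the same kind of sketch; the only difference from input scaling is the absence of the noise term:

```python
def scale_target(value, in_min, in_max, margin):
    """Scale a user-coordinate target value into the internal range."""
    d_min = in_min - (margin / 2) * (in_max - in_min)
    d_max = in_max + (margin / 2) * (in_max - in_min)
    return (value - d_min) / (d_max - d_min)

print(scale_target(5.0, 0.0, 10.0, 0.0))             # 0.5
print(round(scale_target(10.0, 0.0, 10.0, 0.1), 4))  # 0.9545
```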

For a clearer description of the mathematics, please refer to
Chapter 3 of the **Neuralyst User's Guide**.

## Forward Calculation

Each neuron of a layer of the network has a vector of weighting
values to be multiplied by the outputs of the previous layer. These
weight values are stored in the working area of a Neuralyst
spreadsheet with eight values per row, on as many rows as are
required. The **Neural | Unpack Weights** command will unpack that
weight array into a representation showing the vector for each neuron
individually. Notice that there is one more weight for each neuron
than there are neurons in the preceding layer: the last weight for
each neuron is the *threshold value*.
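A sketch of that storage layout may help: weights are stored flat, eight per row, and unpacking regroups them into one vector per neuron, with the threshold as the extra final entry. The helper name is ours, not a Neuralyst command:

```python
def unpack_weights(rows, inputs_per_neuron):
    """Regroup a weight array stored eight-per-row into per-neuron vectors."""
    flat = [w for row in rows for w in row]
    stride = inputs_per_neuron + 1  # +1 for the threshold value
    return [flat[i:i + stride] for i in range(0, len(flat), stride)]

# Two neurons with three inputs each (four weights per neuron),
# packed into a single eight-value row:
packed = [[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]]
print(unpack_weights(packed, 3))
# [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]
```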

In the formulae that follow, `W_{i,j}` represents
the weight applied to input `i` of neuron `j`.

Each neuron is implemented as a combination of a dot product of the input and weight vectors and an activation function. The activation function serves to introduce a non-linear response characteristic to the neuron. It also forces the output of the neuron to be restricted to the range 0 to 1.

The usual activation function is the sigmoid, which has the following form:

`y = (1.0 / (1.0 + exp(-dScale * x)));`

Where `x` is the input value, `dScale` is the value
of the Gain parameter set in the **Neural | Set Enhanced
Parameters...** dialog box, `exp()` represents the
exponential function, and `y` is the output value.

The rest of the available activation functions have the following representations:

- Augmented Ratio of Squares:
  `tmp = (dScale * x) * (dScale * x);`
  `y = tmp / (1.0 + tmp);`
- Gaussian:
  `tmp = 0.5 * dScale * x;`
  `y = exp(-0.5 * tmp * tmp);`
- Linear:
  `If x < -10 / dScale Then y = 0;`
  `If x > 10 / dScale Then y = 1;`
  `Otherwise y = 0.5 + x * dScale / 20;`
- Sigmoid:
  `y = (1.0 / (1.0 + exp(-dScale * x)));`
- Step:
  `If x < 0 Then y = 0`
  `If x = 0 Then y = 0.5`
  `If x > 0 Then y = 1`
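The activation functions above transcribe directly into Python (a sketch assuming `d_scale > 0`; Neuralyst's internal implementation may differ in detail):

```python
import math

def augmented_ratio_of_squares(x, d_scale):
    tmp = (d_scale * x) * (d_scale * x)
    return tmp / (1.0 + tmp)

def gaussian(x, d_scale):
    tmp = 0.5 * d_scale * x
    return math.exp(-0.5 * tmp * tmp)

def linear(x, d_scale):
    # Ramps from 0 to 1 over the interval -10/dScale .. 10/dScale.
    if x < -10 / d_scale:
        return 0.0
    if x > 10 / d_scale:
        return 1.0
    return 0.5 + x * d_scale / 20

def sigmoid(x, d_scale):
    return 1.0 / (1.0 + math.exp(-d_scale * x))

def step(x, d_scale):
    return 0.0 if x < 0 else (0.5 if x == 0 else 1.0)

print(sigmoid(0.0, 1.0))   # 0.5
print(linear(0.0, 1.0))    # 0.5
print(gaussian(0.0, 1.0))  # 1.0
```

All of them map onto the 0 to 1 output range (the Gaussian reaching 1 only at `x = 0`), consistent with the 0-to-1 internal coordinate system described under Input Scaling.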

In the formulae that follow, `Activation()` will represent
a call to the activation function selected in the **Neural | Set
Enhanced Parameters...** dialog box.

The following formulae are written in terms of the following values:

- `X_{i}` - either the `i`-th input for the first hidden layer, or the `i`-th output of the previous layer for all other layers.
- `W_{i,j}` - the weight associated with input `i` of neuron `j`.
- `th_{j}` - the threshold value associated with neuron `j`.
- `Y_{j}` - the output of neuron `j` in the current layer.

The formula for calculation of the output value of a single neuron is then:

`Y_{j} = Activation( th_{j} + Sum over i ( W_{i,j} * X_{i} ) )`
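That single-neuron formula can be sketched as a dot product followed by the activation call (illustrative code, not Neuralyst's implementation):

```python
import math

def neuron_output(inputs, weights, threshold, activation):
    # Y_j = Activation( th_j + Sum over i ( W_ij * X_i ) )
    return activation(threshold + sum(w * x for w, x in zip(weights, inputs)))

sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
y = neuron_output([1.0, 0.0], [0.5, -0.25], 0.0, sigmoid)
print(round(y, 3))  # 0.622
```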

Beginning with the first neuron of the first hidden layer, the calculation for each neuron is carried out in turn.

The first hidden layer takes the scaled user input values as the outputs of the input layer.

The outputs of the output layer are subsequently scaled and presented to the user as the computed outputs of the network.

The **Neural | Run/Predict with Network** command performs the
complete forward calculation, scales the results, and updates the
Output columns for each TEST and TRAIN row of the sheet.
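The complete forward calculation then reduces to applying the per-neuron formula layer by layer. A minimal sketch, using the sigmoid activation and made-up weights:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(scaled_inputs, layers):
    """layers: list of layers; each neuron is a (weights, threshold) pair."""
    outputs = scaled_inputs
    for layer in layers:
        outputs = [sigmoid(th + sum(w * x for w, x in zip(weights, outputs)))
                   for weights, th in layer]
    return outputs

# One hidden layer of two neurons feeding one output neuron:
net = [
    [([1.0, -1.0], 0.0), ([0.5, 0.5], -0.5)],  # hidden layer
    [([1.0, 1.0], 0.0)],                       # output layer
]
print(forward([0.2, 0.8], net))
```

The scaled user input values stand in as the outputs of the input layer, and the final list holds the output-layer values that are subsequently rescaled for the user.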

## Backpropagation Calculation

Training is performed by propagating errors backwards from the output layer through to the first hidden layer, as modifications to the weights for each layer.

For the output layer, the error signal `e_{j}` for the `j`-th neuron
is computed from its output `Y_{j}` and its scaled target `T_{j}`:

`e_{j} = Y_{j} * (1 - Y_{j}) * (T_{j} - Y_{j})`

For the hidden layers, the error signal `e_{j}` for the `j`-th neuron
sums the error signals `e_{k}` of the following layer, weighted by the
previous weight values `W'_{j,k}` connecting neuron `j` to neuron `k`:

`e_{j} = Y_{j} * (1 - Y_{j}) * Sum over k (e_{k} * W'_{j,k})`

The error signals are applied to the weight `W_{i,j}` for the
`i`-th input of neuron `j`, where `LR` is the Learning Rate, `M` is
the Momentum, `W'_{i,j}` is the previous weight value, and
`W''_{i,j}` is the weight value before that:

`W_{i,j} = W'_{i,j} + (1 - M) * LR * e_{j} * X_{i} + M * (W'_{i,j} -
W''_{i,j})`

Note that the Momentum `M` may never have the value 1, or
all learning will halt.

In the special case where the Momentum is 0, this formula may be simplified:

`W_{i,j} = W'_{i,j} + LR * e_{j} * X_{i}`

In practice, this backpropagation calculation is done for each
TRAIN row in the sheet, updating the weights with each row. A single
complete pass through all rows is called an **epoch**. For typical
network applications, many hundreds of epochs will be required for
complete training.
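A minimal sketch of one weight update following the formulae above, taking `W'` as the previous weight value and `W''` as the value before that (illustrative code only):

```python
def output_error(y, t):
    # e_j = Y_j * (1 - Y_j) * (T_j - Y_j) for an output-layer neuron.
    return y * (1 - y) * (t - y)

def update_weight(w_prev, w_prev2, lr, momentum, e_j, x_i):
    # W = W' + (1 - M) * LR * e_j * X_i + M * (W' - W'')
    return w_prev + (1 - momentum) * lr * e_j * x_i + momentum * (w_prev - w_prev2)

e = output_error(0.6, 1.0)                      # 0.6 * 0.4 * 0.4 = 0.096
w = update_weight(0.5, 0.5, 0.1, 0.0, e, 1.0)   # momentum of 0 simplifies
print(round(e, 3), round(w, 4))  # 0.096 0.5096
```

With `momentum` set to 1, the `(1 - M)` factor zeroes the learning term entirely, which is why the Momentum may never have the value 1.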

## Output Scaling

If the current activation function is any other than the
Hyperbolic Tangent, then the scaled output value `Output` is
computed as follows:

`Output = Y * dRange + dMin;`

For the case of the Hyperbolic Tangent activation function, the
scaled output value `Output` is computed as follows:

`Output = ((Y + 1.0) / 2.0) * dRange + dMin;`
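Both cases can be sketched in one function, assuming `dMin` and `dRange` are the values computed during scaling of the corresponding Target column:

```python
def scale_output(y, d_min, d_range, hyperbolic_tangent=False):
    """Map an internal output value back to the user's coordinate system."""
    if hyperbolic_tangent:
        # tanh outputs range over -1..1; remap to 0..1 first.
        y = (y + 1.0) / 2.0
    return y * d_range + d_min

# With dMin = -0.5 and dRange = 11 (the earlier Scale Margin example),
# the internal midpoint maps back to the middle of the user's range:
print(scale_output(0.5, -0.5, 11.0))        # 5.0
print(scale_output(0.0, -0.5, 11.0, True))  # 5.0
```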

The scaled output is then translated into a symbol for those columns which have symbols defined in the corresponding Target column by looking up the symbol which is closest to the scaled output.
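Since the data-row values for symbols sit at 0.5, 1.5, 2.5, and so on, the lookup amounts to picking the symbol whose center is nearest the scaled output. A sketch (the function name is ours):

```python
def closest_symbol(symbols, scaled_output):
    """Pick the symbol whose center value is nearest the scaled output."""
    centers = [i + 0.5 for i in range(len(symbols))]
    best = min(range(len(symbols)), key=lambda i: abs(scaled_output - centers[i]))
    return symbols[best]

print(closest_symbol(["A", "B", "C"], 1.3))  # B
```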

## References

Unfortunately, much of what is documented here is only described
in development documentation and the source code itself. However, the
**Neuralyst User's Guide** does have a good overview of the
forward and backpropagation calculations in Chapter 3, and of input
and output scaling in Chapter 6.

## Release History

- August 1996
- Initial Release

## Legal Notices

Copyright © 1996-1999, Cheshire Engineering Corporation. All Rights Reserved.

The information in this document is subject to change without notice and should not be construed as a commitment by Cheshire Engineering Corporation. Cheshire Engineering Corporation assumes no responsibility for errors that may appear in this document.

The EPIC logo and Neuralyst™ are trademarks of EPIC Systems
Corporation licensed to Cheshire Engineering Corporation.

120 West Olive Avenue

Monrovia, California 91016

+1 626 303 1602 Neuralyst Sales

+1 626 303 1602 Customer Service and Support

+1 626 303 1590 FAX