ABSTRACT

A review of artificial neural network (ANN) techniques is presented. An application of a neural network in an embedded real-time feed-forward control system is described. Performance of the example system is quantified with respect to accuracy and execution time. The resulting 11 node network was converted to an embedded 8 bit micro-controller application and executed in 0.07 seconds per prediction.

INTRODUCTION

Artificial neural networks (ANN) evolved from research focused on modeling the function of nerve cells. The potential to encapsulate in a simulation the recognition and remembering capabilities that are embodied in a human brain drove development of ANNs. ANN modeling has developed to the point that commercial tools are available that allow functional models to be developed (NeuralWare 1991; Hertz et al. 1991). Useful ANN models have been developed for many purposes, ranging from recognition of patterns in images or in sound signals to modeling chemical processes. The focus of this paper is to provide the reader with a brief introduction to artificial neural network modeling and to present the development of an ANN applied in a feed-forward control system.

Neural network based models derive their structure from living neurons. Figure 1 depicts a neuron and Figure 2 shows a typical neural network element that emulates the neuron.

Figure 1. Diagram of a Neuron

Figure 2. Block Diagram of a Neural Network Element

A neural network model is normally composed of a network of elements as shown in Figure 3. The network shown is known as a fully connected network and every possible connection from a node in one layer is made to nodes in the next layer. More complex networks can be constructed that contain feedback and have the ability to better model dynamic responses.

Figure 3. Typical Neural Network Element Arrangement

The output from a given node, j, in a network is given in terms of inputs to that node by equation 1.

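$$x_j^{[s]} = f\left( \sum_i W_{ji}^{[s]} \, x_i^{[s-1]} \right) \qquad (1)$$

Where:

$x_j^{[s]}$ is the current output state of the j-th processing element in layer s.

$W_{ji}^{[s]}$ is the weight on the connection joining the i-th element in layer s-1 to the j-th element in layer s.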

f is traditionally a hyperbolic tangent, sigmoid, or sine function.

The function f, known as the activation function, must be continuously differentiable and may be selected to optimize the model for a particular application. The hyperbolic tangent, used in the application described later, varies between -1 and 1 as the input to the function varies between -∞ and +∞. The weights, W, in the equation can be thought of as modeling memory. They are the parameters of the equation system formed by the network and are fitted so that the neural network produces the proper outputs. This fitting process, known as training, is similar to performing a regression analysis. A neural network formulated in the above fashion is a set of non-linear equations combined in a way that allows the parameters to be readily found. In this sense, a neural network model is a form of multi-variable non-linear regression analysis. With this view, a process that can be modeled with a simple linear regression analysis should not require the more complex neural network approach.

Several techniques are available for determining weights in neural networks. The most common method is to use back propagation. In this technique, a set of correct input and output data are presented to the network. The inputs are used to calculate the output states. An error at the output is then calculated. A correction for the output error is then computed and propagated backwards through the network distributing the error based on the current weight parameters. The error at each node is then used to adjust the weight parameters. This process is repeated many times for all of the data sets that are available for training. The technique minimizes global error in the prediction by manipulating local error in the neuron elements in the model. A rigorous discussion of the technique can be found in Rumelhart et al. (1986).

Neural network models may be used as process models and are suited for handling non-linear systems. Where an analytical model is not available, neural network techniques allow a general non-linear model to be fit. Dynamic processes may also be modeled by including inputs sampled at different times. This allows the model to emulate a difference equation. The output or outputs from a neural network are analog values. A discrete digital value may be obtained by comparing the output with a threshold value or by comparing two output levels. The latter technique is demonstrated in the application described below. The network will accept discrete digital inputs in the same way they might be used in a regression analysis.

Neural network models share many of the limitations of regression analysis. Models generated with neural network techniques cannot be used reliably beyond the range of the training conditions. It is therefore important to acquire training data that span the potential input conditions. Although neural networks use continuous functions as a part of the model, it is important to have the test conditions well distributed about the span of the expected input conditions. Neural network models will be biased if the input data are concentrated within certain intervals of the test conditions. In this case, error will be minimized primarily for the intervals where the training data are concentrated. A companion problem is the potential inadequacy of training data in representing the imprecision in the application conditions. If the model must perform well with imprecision in the input data, then it must be trained with imprecision in the training data. If this is not done, the model may transmit input imprecision to the output in unexpected ways. The necessity to represent imprecision in the training data and to completely span the input conditions with well distributed data requires careful collection of the input data. These same conditions exist for regression based models.

An additional undesirable characteristic of neural network based models is the computationally intensive training process, which translates into very long training times. We have experienced training times on the order of days for 12 input variables and several hundred sets of training data. This problem also exists with non-linear regression analysis where complex functions are fitted.

AN ANN BASED PROCESS MODEL APPLICATION

An ANN was developed to allow color patterns to be recognized in an agricultural weed sprayer application. Figure 4 presents a schematic of one sensor and spray nozzle element of the sprayer. The complete sprayer consists of many of the sensor-nozzle elements placed in parallel on a single spray boom.

Figure 4. Sprayer Diagram

A sensor was fabricated to detect color on the surface of the ground in a 7.5 by 50 cm image. Three color bands, green, red, and near infrared, were sensed. The signals from the sensor were digitized with a 68HC11 based controller using the on-chip 8 bit A/D converter. The 68HC11 based computer was also used to activate a solid state switch that energized a solenoid valve in the spray nozzle. The intent of control in the system was to sense the presence of a weed by color and to activate the nozzle to spray the plant at the point in time that the plant was under the nozzle. A time budget is shown in Figure 4. If computing time and the time required for the fluid to reach the ground once it emerges from the nozzle were insignificant, the sensor and nozzle could be located together. The 0.25 second time period between when the fluid emerges from the nozzle and when it reaches the ground cannot be changed. The configuration of the system therefore places a practical limit on computing time. For a sprayer traveling in the field at 3 m/s, a typical ground speed, the separation between sensor and spray nozzle must be 1.1 meters. This physical separation is near the maximum limit practical for the machine.
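The relationship between ground speed, delay, and sensor-to-nozzle separation can be sketched as follows. The function names are hypothetical, and the sketch assumes that the total delay is simply the fluid flight time plus all remaining delays (sensing, computing, and valve actuation) lumped into one figure.

```c
/* Separation needed so the spray lands on the sensed spot:
 * distance = ground speed * total delay between sensing and spray impact.
 * Both function names are hypothetical and used only for this sketch. */
double required_separation_m(double speed_m_s, double processing_delay_s,
                             double fluid_flight_s)
{
    return speed_m_s * (processing_delay_s + fluid_flight_s);
}

/* Delay left for everything other than fluid flight at a given separation. */
double processing_budget_s(double separation_m, double speed_m_s,
                           double fluid_flight_s)
{
    return separation_m / speed_m_s - fluid_flight_s;
}
```

With the values quoted above (1.1 m, 3 m/s, and 0.25 s), processing_budget_s() gives about 0.12 s for everything other than fluid flight.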

Agricultural sprayers based on optical sensing and control of spray nozzle activation currently exist on the market. The current designs rely on look-up table based models. This approach limits the number of inputs that can be practically used in the controller. A look-up table with three or more variables and with 8 bit precision will not fit conveniently in a low-cost micro-controller memory: three 8 bit inputs alone address 2^24, or nearly 16.8 million, entries. An alternative to a look-up table is to encode the necessary response into an equation. Finding a simple equation that models the problem is not an easy task.

Many potential interferences exist in detecting the plant, including the amount of target plant in the image, light level, dead plant matter, the many soil colors, and variation in the color of the target plant. The interferences result in an unusual map of sensor response based on color inputs. A non-linear model of some type is necessary. An ANN appeared to be a suitable model for the problem.

Training data were created by exposing the sensor to many different conditions intended to span the possible conditions that would be seen in actual application of the system. Soils of different colors were collected from six locations in Oklahoma. The soils were exposed to the sensor dry and wet. Various percentages of plant cover, including 0%, 10%, and 100%, were placed on the soils. In addition, the system was tested under various natural lighting conditions from heavy overcast to bright. Early testing revealed that indoor conditions could not easily be made to model outdoor lighting. All combinations of the input conditions were tested, resulting in nearly 80 sets of training conditions. The tests were repeated with similar conditions, resulting in nearly 80 sets of test conditions that could be used to determine the performance of the system. Lighting and plant placement could not be repeated exactly, resulting in significant variations between the test and training data.

Neural networks with one and two hidden layers were tested with different numbers of nodes in each layer. Table 1 presents results of training different configurations. The table presents the performance of each model after training on the training data, evaluated by comparing model predictions against the test data. Two measures were computed to evaluate performance of the model: the percentage of tests where the plant was present and detected (Plant % Correct) and the percentage of tests where the plant was not present and not detected (No Plant % Correct). Many more training iterations were performed and are not shown. The table presents only tests where the Plant % Correct measure was maximized. For the current application of the sprayer it is much more important to assure a plant has been sprayed than to avoid spraying when unnecessary. Current sprayer designs operate continuously and would have values of 0 and 100% for the "No Plant % Correct" and the "Plant % Correct" performance measures, respectively.
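The two performance measures can be computed from a set of test results as in the following sketch. The structure, function name, and the encoding of each test as a pair of 0/1 flags are assumptions made for this example.

```c
#include <stddef.h>

/* The two performance measures, in percent. */
typedef struct {
    double plant_pct_correct;    /* plant present and detected */
    double no_plant_pct_correct; /* plant not present and not detected */
} DetectionScore;

/* present[i] and detected[i] are 1 or 0 for test i (hypothetical encoding). */
DetectionScore score_detections(const int *present, const int *detected, size_t n)
{
    size_t plant = 0, plant_ok = 0, empty = 0, empty_ok = 0;

    for (size_t i = 0; i < n; ++i) {
        if (present[i]) {
            ++plant;
            if (detected[i]) ++plant_ok;
        } else {
            ++empty;
            if (!detected[i]) ++empty_ok;
        }
    }

    DetectionScore s;
    s.plant_pct_correct = plant ? 100.0 * (double)plant_ok / (double)plant : 0.0;
    s.no_plant_pct_correct = empty ? 100.0 * (double)empty_ok / (double)empty : 0.0;
    return s;
}
```

A detector that always reports a plant, which is equivalent to a continuously operating sprayer, scores 100 for the first measure and 0 for the second.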

Table 1. Performance of Network Configurations

Nodes in         Nodes in         No Plant    Plant       Epochs
Hidden Layer 1   Hidden Layer 2   % Correct   % Correct
3                2                75          90          85000
3                3                80          92.5        60000
4                4                70          92.5        55000
5                5                70          92.5        55000
6                6                70          92.5        50000
7                7                65          92.5        70000
8                8                75          90          55000
3                0                45          100         20000
4                0                45          97.5        20000
5                0                50          100         20000
6                0                50          100         30000
7                0                40          100         25000
10               0                92.5        70          20000

Models with a single hidden layer in general were able to detect plants but were not as effective at rejecting situations where no plant was present. The training "epochs", the number of cases for which the network was optimized, ranged between 20,000 and 85,000. As expected, the single hidden layer models required fewer epochs to optimize than the two layer models. Training was done on a SUN IPX workstation using NeuralWare's NeuralWorks Professional II/Plus neural network development package. Training required 2 to 5 minutes for each configuration.

The model described in the second entry in Table 1, with three nodes in each of two hidden layers, was selected for use in the prototype sprayer. More complex models did not improve the accuracy, and the single layer models were judged to have too large an error when no plant was present. The less complex model also allowed faster executing and smaller code to be used in the embedded application.

The resulting model was coded in C, compiled, and placed in the micro-controller. Two approaches were used to develop the code for the application. NeuralWare's developers package, DPACK, was used to automatically convert the network description into C. Additionally, the network was hand coded in C using equation 1. Table 2 presents code size and execution times for different optimizations of the embedded code.

Several techniques were tested to reduce the execution time of the code. The hand coded version of the model was converted to use a look-up table rather than the built-in C function htan(). The look-up table increased code size but more than halved execution time. An alternative activation function, f(x) = 1/(1-x), was also tested and compiled in a floating point form. The model had to be retrained using the alternative activation function and gave the same results as with the originally selected activation function, f(x) = htan(x). The look-up table based model performed better than the model using f(x) = 1/(1-x) coded in floating point. Finally, the whole implementation of the model in C was coded in integer arithmetic. Some components of the calculation required double precision integers to retain accuracy. The resulting code produced an output 0.07 seconds after the inputs were supplied. This computational speed allows the time budget presented in Figure 4 to be met and allows a feasible geometry for the physical components of the system.

Table 2. Execution time and code size for the production model

Model Description    Arithmetic       Activation Function             Code Length (Bytes)   Execution Time (s)
DPACK(1) Generated   Floating Point   Floating Point, f(x)=htan(x)    36K                   -
Hand Coded           Floating Point   Floating Point, f(x)=htan(x)    3.5K                  0.3
Hand Coded           Floating Point   Floating Point, f(x)=x/(1-x)    3.5K                  0.15
Hand Coded           Floating Point   Look-Up Table, f(x)=htan(x)     3.8K                  0.13
Hand Coded           Integer          Look-Up Table, f(x)=htan(x)     3.5K                  0.07

1. Proprietary NeuralWare neural network code generator.

Some degradation in the accuracy in detection of plants was expected when the model was converted to look-up tables and integerized. Testing of the resulting models on the original test data revealed no significant difference between the integerized model and the original floating point model.

A prototype sprayer using nozzle elements based on the design described above was tested during the fall of 1993. Though performance of the prototype was consistent with initial testing of the model, field tests revealed several unexpected operating limitations. The sensitivity of the prototype detectors was greatly diminished under low light conditions. Training of the model was not done under light conditions as low as those experienced in the field. In addition, during dawn and dusk periods, the color of the natural light is shifted toward red. Both conditions resulted in reduced sensitivity.

SUMMARY

Artificial neural network techniques offer a method for creating computational models of complex systems. The models that are produced may be converted to standard languages and embedded into control systems. An application of a neural network in feed-forward control of an agricultural sprayer was presented. The neural network model was developed to detect a color pattern. The model was expressed in a C language program and then compiled and embedded into a micro-controller based control system. The model worked acceptably under the conditions for which it was trained and was simple enough to be computed in the time available. Field testing of the model revealed inadequacies in the selection of training data and reinforces the caution that the span of training of neural network based models must cover all of the potential applications of the model.

REFERENCES

Rumelhart, D. E. and J. L. McClelland, editors. 1986. Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Vol. I. Foundations. MIT Press, Cambridge, Mass.

NeuralWare. 1991. Neural Computing. NeuralWare, Inc., Pittsburgh, PA.

Hertz, J., A. Krogh, and R. G. Palmer. 1991. Introduction to the Theory of Neural Computation. Addison-Wesley Publishing Co., New York.