Method of consecutive properties elimination (Del-Method) [1].
Imagine the following problem: the most informative subsystem of 25 properties must be selected from a set of 50 properties. Let us use the number of recognition errors v as the criterion of informativeness.
Let us estimate the recognition error obtained with all 50 properties (v0). Then let us exclude the first property and find the error (v1) given by the remaining 49 properties. Next, let us return the first property, exclude the second, and find the error (v2) in the new 49-dimensional space. This operation of excluding one property at a time is repeated 50 times.
The smallest value in the resulting series v1, ..., vj, ..., v50 points to the property whose exclusion was least perceptible for the system. Let us exclude this property and investigate the remaining 49. Excluding each of them in turn reveals the next least informative property and reduces the dimension of the space to 48. These procedures are repeated (g - n) times, until the number of properties falls to the given value n.
The number L of tested property subsystems is significantly smaller than the number of all possible variants. In our case L = 900, which is smaller than the volume of exhaustive search by about 12 orders of magnitude. Examples of sufficiently good solutions obtained by this method are given in the literature [1].
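To make the procedure concrete, here is a minimal Python sketch of the Del-method. The criterion count_errors is a hypothetical stand-in for running the recognizer on a control sequence, which the text does not specify; the function and parameter names are ours.

```python
def count_errors(subset):
    """Toy stand-in for the informativeness criterion: properties
    0..4 are 'useful'; each missing useful property costs 10 errors,
    each superfluous one costs 1."""
    target = {0, 1, 2, 3, 4}
    s = set(subset)
    return 10 * len(target - s) + len(s - target)

def del_method(g, n):
    """Del-method: starting from all g properties, repeatedly exclude
    the property whose removal raises the error count the least."""
    current = set(range(g))
    while len(current) > n:
        # try excluding each property in turn and keep the exclusion
        # that yields the smallest error v_j
        least_useful = min(current, key=lambda j: count_errors(current - {j}))
        current.remove(least_useful)
    return current

print(del_method(g=10, n=5))  # -> {0, 1, 2, 3, 4} for the toy criterion
```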
Method of consecutive properties addition (Ad-Method) [2].
This algorithm differs from the previous one only in the order in which property subsystems are tested: testing starts not from the g-dimensional space but from the one-dimensional one. At the beginning all g properties are tested for informativeness: recognition of a control sequence is performed with each of the g properties separately, and the property giving the minimum number of errors is included in the informative subsystem. Then each of the remaining (g-1) properties is added in turn to the first one. The two-dimensional spaces obtained are evaluated by the number of recognition errors, and the most informative pair of properties is selected. In the same way the best third property is selected from the remaining (g-2), and the process is repeated until a system with n properties is created.
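The complementary Ad-method, sketched under the same assumptions (the toy count_errors again stands in for the real recognizer):

```python
def count_errors(subset):
    # same toy stand-in for the criterion as in the Del sketch above
    target = {0, 1, 2, 3, 4}
    s = set(subset)
    return 10 * len(target - s) + len(s - target)

def ad_method(g, n):
    """Ad-method: grow the subsystem from one property to n, at each
    step adding the property that gives the fewest recognition errors."""
    current, remaining = set(), set(range(g))
    while len(current) < n:
        best = min(remaining, key=lambda j: count_errors(current | {j}))
        current.add(best)
        remaining.remove(best)
    return current

print(ad_method(g=10, n=5))  # -> {0, 1, 2, 3, 4} for the toy criterion
```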
The computational cost of this algorithm is approximately the same as for Del, but the Ad results are usually better. This is explained by the low representativeness of the learning set: for the same sample size, the higher the dimension of the property space, the lower the validity of the statistical conclusions obtained (in our case, the informativeness estimates). The average dimension of the search space is (g+n)/2 for the Del algorithm and n/2 for Ad (37.5 against 12.5 in our example with g = 50 and n = 25), so the risk of wrongly recognising an informative property as non-informative is higher in Del than in Ad.
Both algorithms described give the optimal solution at every step, but this does not guarantee that the global optimum is reached. The reason for this phenomenon may be illustrated by an example from the psychology of small groups: it is known that the two best friends are not always members of the best trio of friends.
To decrease the influence of errors made at the first steps of the solution, a relaxation method is applied. The Ad algorithm selects some number (n1) of informative properties, and then a part of them (n2 < n1) is excluded by Del. Afterwards the Ad algorithm increases the dimension of the informative subsystem by n1, up to the value (2n1 - n2). At that moment the Del algorithm is switched on again to exclude the n2 least valuable properties from the system. Such alternation of the Ad and Del algorithms, called the AdDel algorithm, is repeated until the given number of properties n is reached.
The reverse strategy is also possible: the solution starts with the Del algorithm, and after the initial system has been reduced by n1 properties the Ad algorithm switches on and returns n2 of the wrongly excluded properties to the system. Alternation of these procedures (the DelAd algorithm) is repeated until a system with the n most informative properties is created.
Our experiments with these algorithms (with n2 taken equal to n1/3) showed that the AdDel algorithm leads to better results than Ad, Del and DelAd.
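A sketch of the AdDel alternation under the same assumptions (the parameter names n1 and n2 follow the text; DelAd would be obtained by starting with the Del phase instead):

```python
def count_errors(subset):
    # same toy stand-in for the criterion as in the sketches above
    target = {0, 1, 2, 3, 4}
    s = set(subset)
    return 10 * len(target - s) + len(s - target)

def ad_step(current, remaining, k):
    """Ad phase: add the k best properties, one at a time."""
    for _ in range(k):
        best = min(remaining, key=lambda j: count_errors(current | {j}))
        current.add(best)
        remaining.remove(best)

def del_step(current, remaining, k):
    """Del phase: remove the k least valuable properties, one at a time."""
    for _ in range(k):
        worst = min(current, key=lambda j: count_errors(current - {j}))
        current.remove(worst)
        remaining.add(worst)

def ad_del(g, n, n1, n2):
    """AdDel relaxation: Ad adds n1 properties, Del drops the n2 least
    valuable, and the cycle repeats until n properties are accumulated."""
    current, remaining = set(), set(range(g))
    ad_step(current, remaining, n1)           # initial Ad selection of n1
    while len(current) < n:
        del_step(current, remaining, n2)      # back off by n2 ...
        ad_step(current, remaining, n1)       # ... then grow to 2*n1 - n2, etc.
    del_step(current, remaining, len(current) - n)  # final trim to exactly n
    return current

print(ad_del(g=12, n=5, n1=3, n2=1))  # -> {0, 1, 2, 3, 4} for the toy criterion
```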
Method of random search with adaptation (SPA Algorithm) [3,4].
Let us divide the segment (0, 1) into g equal pieces and associate each piece with its property: the 1st piece relates to the first property, the 2nd piece to the second property, and so on. Every piece has a width of 1/g. A generator of random numbers uniformly distributed in the range (0, 1) is switched on. If a generated number falls into the j-th piece, the j-th property is included in the property set to be tested. After n generations of random numbers, n properties are selected. The quality of this randomly selected subsystem is estimated by one of the criteria, e.g. by the number of recognition errors v obtained.
The described procedure of random selection of n-dimensional property subsystems and estimation of their quality is repeated r times. Examination of the obtained list of estimates v1, v2, ..., vi, ..., vr makes it possible to choose the best and the worst tested subsystems. On this basis the «encouragement» and «punishment» procedures are performed: the pieces related to the properties from the best subsystem are «encouraged» by expanding their width by the value h, and the pieces related to the properties from the worst subsystem are «punished» by decreasing their width by h (h < 1/g). The total width of all pieces remains equal to 1.
Then the next r subsystems are randomly selected and tested, but at this step the probability of a property being chosen into a subsystem is no longer uniform: encouraged properties, represented by wider pieces, have more chances of joining the following subsystems than punished ones. According to the results of testing this batch of subsystems, the adaptation procedure (encouragement and punishment) is repeated.
If some property randomly joins both the best and the worst subsystems, its width remains unchanged. If it regularly joins the most informative subsystems, the width of the related piece grows with every adaptation step. Likewise, a property which systematically joins the worst subsystems shrinks in width with every step, thus decreasing the probability of its selection into the property set to be tested.
After some number (R) of search and adaptation cycles the process stabilises: the pieces related to the successful properties capture practically the whole segment (0, 1), and the same properties are repeatedly chosen into the subsystem to be tested. This is the signal to stop the process of selecting the n-dimensional subsystem of the most informative properties.
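A minimal sketch of this search-with-adaptation loop; the batch size r, cycle count R, step h and the toy count_errors below are illustrative assumptions, not the paper's settings:

```python
import random

def count_errors(subset):
    # same toy stand-in for the informativeness criterion as above
    target = {0, 1, 2, 3, 4}
    s = set(subset)
    return 10 * len(target - s) + len(s - target)

def spa(g, n, r=20, R=200, h=0.005, seed=0):
    """Random search with adaptation: property j is drawn with
    probability equal to its piece width; pieces of the best batch's
    properties widen by h, pieces of the worst batch's shrink by h."""
    rng = random.Random(seed)
    width = [1.0 / g] * g                  # g equal pieces of (0, 1)
    props = list(range(g))
    for _ in range(R):
        batch = []
        for _ in range(r):
            subset = set()                 # draw n distinct properties
            while len(subset) < n:
                subset.add(rng.choices(props, weights=width)[0])
            batch.append((count_errors(subset), subset))
        best = min(batch, key=lambda t: t[0])[1]
        worst = max(batch, key=lambda t: t[0])[1]
        for j in best - worst:             # encouragement
            width[j] += h
        for j in worst - best:             # punishment (width stays positive)
            width[j] = max(width[j] - h, 1e-9)
        total = sum(width)                 # keep the total width equal to 1
        width = [w / total for w in width]
    # the widest pieces indicate the selected informative properties
    return set(sorted(props, key=lambda j: -width[j])[:n])

print(spa(g=17, n=5))  # -> {0, 1, 2, 3, 4} for the toy criterion
```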
The rate of convergence and the quality of the obtained solution depend on the value of h. For large h the process stops earlier, but the quality of the obtained solution is usually lower than for smaller h; lower h values correspond to a milder strategy of encouragement and punishment.
The first tests of the SPA algorithm were performed on a problem of medical diagnostics. Three patterns were represented by a training set of 250 realisations in a 17-dimensional space. It was necessary to find the most informative subspaces of dimensions 3 and 6. Initially the optimal solutions of this problem were found by exhaustive search; then the same best solutions were found by the SPA method. It was discovered that the SPA method requires less time than exhaustive search by a factor of 5 for n = 3 and by a factor of 40 for n = 6. This time saving increases with the growth of g and n.
In more complicated cases, where exhaustive search was impossible, the quality of the property subsystem selected by the SPA method was compared with the quality of the initial system of g properties. As a rule, within an acceptable time the SPA algorithm selected a subsystem with informativeness close to that of the initial system, which makes it possible to accept the chosen subsystem as close to the optimum. The SPA algorithm showed better results than the described Del and Ad algorithms on the same problems.
Directed taxonomic search of properties (NTPP Algorithm) [4,5].
If we have a method to measure the closeness (dependence) between properties, then we can perform a taxonomy of the set of g properties into n taxons. And if in object taxonomy the most similar objects, those closest to each other, are united into one taxon, then here a taxon unites the properties that reveal themselves identically in the objects of the learning sample. Since the properties are united in a taxon according to the principle of maximum likeness (dependence), maximum unlikeness (independence) is provided between taxons.
Having selected one «typical» property from each taxon, we obtain an n-dimensional subsystem of properties that are maximally independent of each other. Such a subsystem corresponds best to a decision rule oriented towards the application of independent properties.
In actual problems one can meet the case when the initial system contains groups of properties connected with each other by strong dependence but carrying no useful information for recognition of the given patterns. Such properties are united into taxons of their own, and their representatives, once they join the n-dimensional subsystem, will destroy the compactness of the patterns and stand in the way of decision making.
To avoid this trap, the NTPP algorithm applies the following trick. The taxonomy of the g properties is made into n' taxons (g > n' > n); n' typical properties are selected, and an exhaustive search over all combinations of n out of the n' properties is performed. These combinations are compared with each other by the informativeness criterion. As a result it becomes possible to choose the most informative n-dimensional subsystem of properties.
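A sketch of the NTPP idea follows. Since the text does not fix the taxonomy procedure, a farthest-point heuristic over a property dissimilarity matrix stands in for it here; the toy dist and count_errors are illustrative assumptions:

```python
from itertools import combinations

def ntpp(dist, n, n_prime, count_errors):
    """Split g properties into n' groups by closeness, keep one
    'typical' property per group, then exhaustively test all C(n', n)
    combinations of them by the informativeness criterion."""
    g = len(dist)
    # farthest-point heuristic standing in for a proper taxonomy:
    # pick n' mutually distant properties, each representing the
    # taxon of properties close to it
    typical = [0]
    while len(typical) < n_prime:
        far = max(range(g), key=lambda p: min(dist[p][t] for t in typical))
        typical.append(far)
    # exhaustive search over all n-subsets of the n' typical properties
    return min(combinations(typical, n), key=count_errors)

# toy setup: 12 properties falling into 3 strongly dependent groups,
# with properties 0, 4 and 8 carrying the useful information
g = 12
group = [i // 4 for i in range(g)]
dist = [[abs(group[i] - group[j]) for j in range(g)] for i in range(g)]

def count_errors(subset):
    target = {0, 4, 8}
    s = set(subset)
    return 10 * len(target - s) + len(s - target)

print(ntpp(dist, n=3, n_prime=4, count_errors=count_errors))  # -> (0, 8, 4)
```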
REFERENCES
1. Marill T., Green D.M. On the effectiveness of receptors in recognition systems. IEEE Trans. Inform. Theory, 1963, vol. IT-9, p. 11-17.
2. Barabash Yu.L., Varskii B.V. et al. Automated Pattern Recognition. KVAIU, Kiev, 1964 (in Russian).
3. Lbov G.S. Methods of Treating Different-Type Experimental Data. Nauka (Siberian Branch), Novosibirsk, 1981 (in Russian).
4. Zagoruiko N.G., Elkina V.N., Lbov G.S., Emelianov S.V. OTEKS Applied Software. Finances and Statistics, Moscow, 1986 (in Russian).
5. Elkina V.N., Zagoruiko N.G., Temirkaev V.S. Algorithms of directed taxonomic search of informative property subsystems. Computing Systems, Institute of Mathematics SB AS USSR, Novosibirsk, vol. 59, 1974 (in Russian).