Method of consecutive elimination of properties (Del method) [1].

     Imagine the following problem: the most informative subsystem of 25 properties must be selected from a set of 50 properties. Let us use the number of recognition errors v as the criterion of informativeness.
     First, let us estimate the recognition error obtained with all 50 properties (v0). Then let us exclude the first property and find the error (v1) given by the remaining 49 properties. Next, let us return the first property, exclude the second, and find the error (v2) in the new 49-dimensional space. This operation of excluding one property at a time is repeated 50 times.
     Let us find the smallest value in the obtained series v1, ..., vj, ..., v50. It indicates the property whose exclusion is least perceptible to the system. We exclude this property and begin to investigate the remaining 49. Excluding each of them in turn helps us find the next least informative property and reduce the dimension of the space to 48. These procedures are repeated (g-n) times, until the number of properties falls to the given value n.
     The number L of property subsystems tested in this way is far smaller than the number of all possible variants: here L = 50 + 49 + ... + 26 = 950, which is about 11 orders of magnitude below the volume of a full enumeration, C(50,25) ≈ 1.3·10^14. Examples of sufficiently good solutions obtained by this method are given in the literature [1].
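     A minimal sketch of the Del procedure in Python is given below. The function name and the criterion errors() (any user-supplied estimate of the recognition-error count on a control sample) are illustrative assumptions, not part of the original method description.

     def del_method(current, n, errors):
         # Greedy backward elimination (Del): at every step drop the
         # property whose exclusion is least perceptible, i.e. leaves
         # the smallest number of recognition errors.
         current = list(current)
         while len(current) > n:
             _, worst = min((errors([p for p in current if p != q]), q)
                            for q in current)
             current.remove(worst)
         return current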

Method of consecutive addition of properties (Ad method) [2].

     This algorithm differs from the previous one only in the order in which property subsystems are tested: testing starts not from the g-dimensional space but from the one-dimensional one. First, all g properties are tested for informativeness: recognition of a control sequence is performed with each of the g properties separately, and the property giving the minimum number of errors is included in the informative subsystem.
     Then each of the remaining (g-1) properties is added in turn to the first one, the resulting two-dimensional spaces are evaluated by the number of recognition errors, and the most informative pair of properties is selected. In the same way the best third property is chosen from the remaining (g-2), and the process is repeated until a system of n properties is built.
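     A matching sketch of the Ad procedure (same illustrative conventions as above; the optional start argument is an assumption added here so the function can be reused for the relaxation schemes described below):

     def ad_method(properties, n, errors, start=()):
         # Greedy forward selection (Ad): starting from `start`, repeatedly
         # add the property whose inclusion gives the fewest errors.
         current = list(start)
         remaining = [p for p in properties if p not in current]
         while len(current) < n:
             _, best = min((errors(current + [q]), q) for q in remaining)
             current.append(best)
             remaining.remove(best)
         return current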
     The labour consumption of this algorithm is approximately the same as for Del, but the results of Ad are usually better. This is explained by the low representativeness of the learning set: for a given sample size, the higher the dimension of the property space, the lower the validity of the statistical conclusions obtained from it (in our case, the informativeness estimates). The average dimension of the search space is (g+n)/2 for the Del algorithm and n/2 for Ad, so the risk of mistaking an informative property for a noninformative one is higher in Del than in Ad.
     Both algorithms give the optimal solution at every separate step, but this does not guarantee reaching the global optimum. The reason may be illustrated by an example from the psychology of small groups: it is known that the two best friends are not always members of the best trio of friends.
     To decrease the influence of errors made at the first steps, a relaxation method is applied. The Ad algorithm selects a certain number (n1) of informative properties, and then a part of them (n2 < n1) is excluded by Del. Afterwards the Ad algorithm again adds n1 properties, raising the dimension to (2n1-n2); at that moment Del is switched on once more to exclude the n2 least valuable properties from the system. This alternation of the Ad and Del algorithms, called the AdDel algorithm, is repeated until the given number of properties n is reached.
     The reversed strategy is also possible: the solution starts with the Del algorithm, and after the initial system has been reduced by n1 properties the Ad algorithm switches on and returns n2 wrongly excluded properties to the system. Alternation of these procedures (the DelAd algorithm) is repeated until a system of the n most informative properties is built.
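     A schematic AdDel loop, reusing the del_method and ad_method sketches above (DelAd is symmetric); the capping of the intermediate dimension is an implementation assumption made here so that the final Del pass lands exactly on n:

     def addel(properties, n, n1, n2, errors):
         # AdDel relaxation (n2 < n1): each cycle extends the subsystem by
         # n1 properties with Ad and then excludes the n2 least valuable
         # with Del, for a net gain of (n1 - n2) properties per cycle.
         current = []
         while len(current) < n:
             target = min(len(current) + n1, n + n2)
             current = ad_method(properties, target, errors, start=current)
             current = del_method(current, target - n2, errors)
         return current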
     Our experiments with these algorithms (with n2 taken equal to n1/3) showed that the AdDel algorithm leads to better results than Ad, Del and DelAd.

Method of random search with adaptation (SPA algorithm) [3, 4].

     Let us divide the segment (0, 1) into g equal pieces and associate each piece with its property: the 1st piece corresponds to the first property, the 2nd piece to the second property, and so on. Every piece has a width of 1/g.
     A generator of random numbers uniformly distributed in the range (0, 1) is switched on. If a generated number falls into the j-th piece, the j-th property is included in the set of properties to be tested. After n generations of random numbers, n properties are selected. The quality of this randomly selected subsystem is estimated by one of the criteria, e.g. by the number of recognition errors v obtained.
     The described procedure of randomly selecting an n-dimensional subsystem of properties and estimating its quality is repeated r times. The resulting list of estimates v1, v2, ..., vi, ..., vr makes it possible to identify the best and the worst of the tested subsystems. On this basis the «encouragement» and «punishment» procedures are performed: the pieces corresponding to the properties of the best subsystem are «encouraged» by widening them by the value h, and the pieces corresponding to the properties of the worst subsystem are «punished» by narrowing them by h (h < 1/g). The total width of all pieces remains equal to 1.
     Then the next r subsystems are randomly selected and tested, but at this step the probability of a property being chosen into a subsystem is no longer uniform: the encouraged properties, represented by wider pieces, have a better chance of joining the next subsystem than the punished ones. According to the results of testing this batch of subsystems, the adaptation procedure (encouragement and punishment) is repeated.
     If some property happens to join both the best and the worst subsystem, the width of its piece remains unchanged. If it regularly joins the most informative subsystems, the width of its piece grows with every adaptation step. Likewise, a property that systematically joins the worst subsystems has its piece narrowed with every step, which decreases the probability of its selection into the next set of properties to be tested.
     After some number (R) of search and adaptation cycles the process stabilises: the pieces corresponding to the successful properties capture practically the whole segment (0, 1), and the same properties are chosen into every subsystem to be tested. This is the signal to stop the process of selecting the n-dimensional subsystem of the most informative properties.
     The rate of convergence and the quality of the obtained solution depend on the value of h. For large h the process stops earlier, but the quality of the solution is usually lower than for smaller h; lower values of h correspond to a milder strategy of encouragement and punishment.
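     A compact sketch of the SPA search loop under some simplifying assumptions made here: names are illustrative, a fixed number R of cycles replaces the stabilisation test, widths are clamped at zero after punishment, and the sampling routine normalises by the current total width. Note that a property appearing in both the best and the worst subset is widened and then narrowed by h, leaving its piece unchanged, as in the text.

     import bisect, itertools, random

     def sample_subset(width, n):
         # Draw n distinct property indices; piece j is hit with
         # probability proportional to its current width.
         cum = list(itertools.accumulate(width))
         chosen = set()
         while len(chosen) < n:
             x = random.uniform(0.0, cum[-1])
             chosen.add(bisect.bisect_left(cum, x))
         return sorted(chosen)

     def spa(g, n, r, h, R, errors):
         width = [1.0 / g] * g              # equal pieces of the segment (0, 1)
         best_err, best_subset = float('inf'), None
         for _ in range(R):                 # R cycles of search and adaptation
             batch = [(errors(s), s)
                      for s in (sample_subset(width, n) for _ in range(r))]
             (e_min, s_best), (_, s_worst) = min(batch), max(batch)
             if e_min < best_err:
                 best_err, best_subset = e_min, s_best
             for j in s_best:               # «encouragement»
                 width[j] += h
             for j in s_worst:              # «punishment»
                 width[j] = max(width[j] - h, 0.0)
         return best_subset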
     The first tests of the SPA algorithm were performed on a problem of medical diagnostics. Three patterns were represented by a training set of 250 realisations in a 17-dimensional space, and it was necessary to find the most informative subspaces of dimensions 3 and 6. The optimal solutions of this problem were first found by full enumeration; the same best solutions were then found by the SPA method. It turned out that SPA requires less time than full enumeration by a factor of 5 for n = 3 and by a factor of 40 for n = 6, and this time saving increases with the growth of g and n.
     In more complicated cases, where full enumeration was impossible, the quality of the property subsystem selected by SPA was compared with the quality of the initial system of g properties. As a rule, within an acceptable time the SPA algorithm selected a subsystem with informativeness close to that of the initial system, which makes it possible to accept the chosen subsystem as close to the optimum.
     On the same problems the SPA algorithm showed better results than the Del and Ad algorithms described above.

Directed taxonomic search of properties (NTPP algorithm) [4, 5].

     If we have a method of measuring the closeness (dependence) between properties, then we can perform a taxonomy of the set of g properties into n taxons. Just as object taxonomy unites the most similar, mutually close objects into one taxon, property taxonomy unites in one taxon the properties that reveal themselves identically in the objects of the learning sample.
     Since the properties within a taxon are united by the principle of maximum likeness (dependence), maximum unlikeness (independence) is provided between taxons.
     Having selected one «typical» property from each taxon, we obtain an n-dimensional subsystem consisting of properties maximally independent of each other. Such a subsystem corresponds best to a decision rule oriented towards the use of independent properties.
     In real problems one may encounter a case where the initial system contains groups of properties strongly dependent on each other but carrying no information useful for recognising the given patterns. Such properties are united into taxons of their own, and their representatives, once they join the n-dimensional subsystem, destroy the compactness of the patterns and hinder decision making.
     To avoid this trap, the NTPP algorithm applies the following trick. The taxonomy of the g properties is made into n' taxons (g > n' > n), n' representative properties are selected, and a full enumeration of all combinations of n properties out of these n' is performed. The combinations are compared with each other by the informativeness criterion, and as a result it becomes possible to choose the most informative n-dimensional subsystem of properties.
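     A sketch of this scheme under loud assumptions: the taxonomy and similarity routines are callables supplied by the user (the original NTPP taxonomy procedure is not reproduced here), and taking the most central property of a taxon as its «typical» representative is one plausible convention, not the method's prescription.

     from itertools import combinations

     def typical(taxon, similarity):
         # One possible «typical» property: the greatest total similarity
         # (dependence) to the other properties of the same taxon.
         return max(taxon, key=lambda p: sum(similarity(p, q)
                                             for q in taxon if q != p))

     def ntpp(properties, n, n_prime, taxonomy, similarity, errors):
         # Group the g properties into n' taxons (g > n' > n), take one
         # typical representative per taxon, then exhaustively test all
         # C(n', n) combinations of the representatives.
         taxons = taxonomy(properties, n_prime, similarity)
         reps = [typical(t, similarity) for t in taxons]
         best = min(combinations(reps, n), key=lambda c: errors(list(c)))
         return list(best)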

REFERENCES

1. Marill T., Green D.M. On the effectiveness of receptors in recognition systems. IEEE Trans. Inform. Theory, 1963, vol. IT-9, p. 11-17.
2. Barabash Yu.L., Varskii B.V. et al. Automated pattern recognition. KVAIU, Kiev, 1964 (in Russian).
3. Lbov G.S. Methods of processing different-type experimental data. «Nauka» (Siberian Branch), Novosibirsk, 1981 (in Russian).
4. Zagoruiko N.G., Elkina V.N., Lbov G.S., Emelianov S.V. OTEKS applied software. «Finances and Statistics», Moscow, 1986 (in Russian).
5. Elkina V.N., Zagoruiko N.G., Temirkaev V.S. Algorithms of directed taxonomic search of informative property subsystems. Computing Systems, Institute of Mathematics SB AS USSR, Novosibirsk, vol. 59, 1974 (in Russian).