THE PROBLEM OF TAXONOMY
A substantive formulation of the taxonomy problem can be found in a
work written as long ago as the 2nd century BC [1]. In his «Letter to a
Scientific Neighbour», Democritus writes: «If you, my friend, need to
investigate a complex conglomeration of facts or things, first
distribute them into a small number of heaps according to similarity. The
picture will become clearer, and you will understand the nature of these things».
Taxonomy, the grouping of objects according to the similarity of their
properties (the terms «automatic classification», «self-learning»,
«cluster analysis», etc. are also often used), simplifies the
solution of many practical problems of data analysis.
The same set of m objects may be divided into k taxons (k<m) in
different ways. The person performing the grouping is guided by some criterion
(let us denote it F), which helps to tell good groupings from bad
ones and to select the best taxonomy variant.
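To give a sense of how many different groupings exist, the sketch below
counts the ways to split m distinct objects into exactly k non-empty
taxons (the Stirling numbers of the second kind). The function name
count_groupings is an illustrative assumption and does not come from
the FOREL or KRAB algorithms.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_groupings(m: int, k: int) -> int:
    """Number of ways to split m distinct objects into k non-empty taxons."""
    if k == 0:
        # Zero groups are possible only when there are no objects at all.
        return 1 if m == 0 else 0
    if k > m:
        # More groups than objects: some group would have to be empty.
        return 0
    # The last object either forms a taxon of its own,
    # or joins one of the k taxons built from the other m - 1 objects.
    return count_groupings(m - 1, k - 1) + k * count_groupings(m - 1, k)


if __name__ == "__main__":
    print(count_groupings(4, 2))   # 7
    print(count_groupings(10, 3))  # 9330
```

Even for ten objects and three taxons there are thousands of variants,
which is why a criterion F is needed to choose among them.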
The algorithms of the FOREL family use an F criterion based on the
compactness hypothesis: one taxon must include objects that are similar
in their properties to some «central» object. As a result, sphere-shaped
taxons are created. The algorithms of the KRAB family use the hypothesis
of λ-compactness and unite objects into a taxon according to the
similarity of objects to their neighbours. In that case taxons of
arbitrary shape are constructed.
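The idea of grouping objects around a «central» object can be sketched
as follows. This is a minimal illustration under stated assumptions, not
the published FOREL procedure: it assumes Euclidean distance, a fixed
sphere radius supplied by the user, and the mean of the enclosed objects
as the «central» point. The names forel_sketch, radius and max_iter are
hypothetical.

```python
import numpy as np


def forel_sketch(points, radius, max_iter=100):
    """Group points into sphere-shaped taxons; returns one label per point."""
    points = np.asarray(points, dtype=float)
    labels = np.full(len(points), -1)  # -1 marks objects not yet assigned
    taxon = 0
    while np.any(labels == -1):
        free = np.flatnonzero(labels == -1)
        center = points[free[0]]  # start the sphere at any unassigned object
        inside = free[:1]
        for _ in range(max_iter):
            dist = np.linalg.norm(points[free] - center, axis=1)
            candidates = free[dist <= radius]
            if candidates.size == 0:
                break  # guard against rounding at the sphere boundary
            inside = candidates
            # Move the sphere to the mean of the objects it currently holds.
            new_center = points[inside].mean(axis=0)
            if np.allclose(new_center, center):
                break  # the sphere has stopped moving
            center = new_center
        labels[inside] = taxon  # objects in the settled sphere form a taxon
        taxon += 1
    return labels


if __name__ == "__main__":
    data = [[0, 0], [0.1, 0], [0, 0.1], [5, 5], [5.1, 5], [5, 5.1]]
    print(forel_sketch(data, radius=1.0).tolist())  # [0, 0, 0, 1, 1, 1]
```

Because every taxon is the content of one sphere, this kind of procedure
yields compact, roughly spherical groups; elongated or curved groups
would be split into several spheres.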
Reference:
1. Materialists of Ancient Greece. Moscow: Mir, 1957 (in Russian).