MATCHING GEOGRAPHICAL DATA USING EVIDENCE THEORY
A.M. Olteanu
Cogit Laboratory, IGN
France
ana-maria.olteanu@ign.fr
Currently, multiple sources of geographical information describe the same reality. This multiplicity stems from the growing volume of geographical data: data capture has become easier, and the needs for geographical data and for updates keep increasing. Data are represented at different scales, are intended for various applications, and come from different acquisition processes. Nevertheless, databases representing the same reality remain independent of one another, which affects both data users and data producers. Integration appears to be the answer to this problem. In order to integrate databases, redundancy and inconsistency between data must be identified. Database integration requires many steps, one of which is automatic data matching.
Our goal is to develop an automatic and generic matching algorithm that takes into account the imperfection in both geographical data and data specifications. The purpose of this paper is thus to model this imperfection explicitly through a mathematical theory and to use it for data matching.
Firstly, we study the taxonomies of imperfection in both geographical information and the Artificial Intelligence (AI) field. These taxonomies vary widely: many concepts are in use, there is no standard definition of the terms, and conflicts may appear between their definitions. We adopt the taxonomy commonly used in AI, which employs the concepts of imprecision, uncertainty and incompleteness.
Secondly, we focus on how to model and compute imperfection. Many probabilistic and non-probabilistic theories exist in the literature, and it appears that no single theory satisfies all applications. After briefly reviewing these theories, we present our approach based on Evidence Theory. This theory is expressed as a pair {Bel(p), Pl(p)}, where Bel(p) represents belief and Pl(p) represents plausibility. These functions are computed from the mass of belief m(p), which is calculated for each source of information.
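To make the belief and plausibility functions concrete, the following sketch computes Bel and Pl from a mass function over a small, hypothetical frame of discernment of candidate matches {a, b, c}. The frame, the focal elements and the mass values are illustrative assumptions, not values from the paper; only the standard definitions of Bel and Pl are taken as given.

```python
# Hypothetical frame of discernment: three candidate matches.
frame = frozenset({"a", "b", "c"})

# Illustrative mass function m over subsets of the frame; masses sum to 1.
m = {
    frozenset({"a"}): 0.5,
    frozenset({"a", "b"}): 0.3,
    frame: 0.2,  # mass assigned to total ignorance
}

def bel(hypothesis, m):
    """Belief: total mass of the focal elements contained in the hypothesis."""
    return sum(mass for focal, mass in m.items() if focal <= hypothesis)

def pl(hypothesis, m):
    """Plausibility: total mass of the focal elements intersecting the hypothesis."""
    return sum(mass for focal, mass in m.items() if focal & hypothesis)

h = frozenset({"a"})
print(bel(h, m))  # 0.5: only {a} is contained in {a}
print(pl(h, m))   # 1.0: {a}, {a, b} and the frame all intersect {a}
```

As expected, Bel(p) ≤ Pl(p) for every hypothesis, and the interval [Bel(p), Pl(p)] expresses the imperfection of the available evidence.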
The main difficulty in using Evidence Theory is knowledge modelling. Thus, in order to initialise a belief structure, we use a probabilistic approach: the masses of belief are modelled by a Gaussian function, whose standard deviation and mean are calculated using a supervised learning algorithm.
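The Gaussian initialisation of a belief structure can be sketched as follows. The distance-based criterion, the parameter values and the assignment of the residual mass to ignorance are assumptions for illustration; in the paper, the mean and standard deviation would come from the supervised learning step.

```python
import math

def gaussian_mass(distance, mu, sigma):
    """Mass of belief for the 'matched' hypothesis, modelled by a Gaussian
    of the distance between two candidate features (illustrative)."""
    return math.exp(-((distance - mu) ** 2) / (2 * sigma ** 2))

# Hypothetical learned parameters: matched features tend to lie within ~15 m.
mu, sigma = 0.0, 15.0

m_match = gaussian_mass(10.0, mu, sigma)
# The remaining mass is assigned to ignorance (the whole frame of
# discernment), so that the masses of the belief structure sum to 1.
m_ignorance = 1.0 - m_match
```

Assigning the residual mass to the whole frame rather than to the "not matched" hypothesis is one conventional way of representing the incompleteness of a single source of evidence.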
Finally, we present a matching algorithm based on Evidence Theory and its evaluation using two geographical datasets that contain point features representing the relief.