A GEOMETRY-DRIVEN APPROACH FOR THE SEMANTIC INTEGRATION OF GEODATA SETS
Institute of Cartography and Geoinformatics, Leibniz Universitaet Hannover, Germany
With the availability of geo-information systems and methods of automatic data acquisition, a huge number of spatial data sets of different origins is available. Although they all describe our environment through the same real-world objects, they differ in resolution, thematic focus, quality, topicality and accuracy. Users cannot benefit from this richness of information because the data lack explicit semantics and are poorly documented. Without prior expert knowledge it is therefore difficult to choose the appropriate data for a given task. Integrating different data sets requires techniques for both semantic and geometric harmonization. Whereas methods for geometric integration are being developed on the basis of geometric matching techniques, semantic integration poses a major challenge, because relationships between different data sets are difficult to discover automatically. Currently, this is typically done manually, because identifying correspondences requires experts who know the precise terminology used by the organizations that capture and model the data sets. This makes the procedure time-consuming and cost-intensive.
Our approach therefore aims at developing a framework and a prototype that automatically identifies semantic correspondences between different topographic data sets and establishes semantic translation rules without requiring specific prior knowledge. The idea is based on the fact that the same real-world object appears in different data sets, so transformation rules can be inferred from the different descriptions of that object. These rules describe the semantic relationships between object classes, such as equivalence, inclusion, overlap or difference, and thereby make an automatic semantic integration of the data sets possible. The identification of corresponding objects starts with a simple geometric overlay and a statistical analysis of two different data sets covering the same geographical extent. However, because of the inhomogeneity of the data, a unique matching is not always possible, so additional criteria, such as a comparison of area and shape, must be introduced. The derived rules can then be validated on test data outside the training area.
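The overlay-and-statistics step can be illustrated with a minimal sketch. It is not the prototype itself: axis-aligned rectangles stand in for real polygons, objects within one data set are assumed not to overlap each other, and the names `class_overlap_shares`, `relation` and `infer_rules`, as well as the thresholds `high` and `low`, are hypothetical choices made only for illustration.

```python
from collections import defaultdict

# Each object is (class_name, (xmin, ymin, xmax, ymax)).
# Rectangles are a simplifying assumption; real data would use polygons.

def area(box):
    xmin, ymin, xmax, ymax = box
    return max(0.0, xmax - xmin) * max(0.0, ymax - ymin)

def intersection_area(a, b):
    # Overlap of two axis-aligned rectangles (0.0 if they are disjoint).
    return area((max(a[0], b[0]), max(a[1], b[1]),
                 min(a[2], b[2]), min(a[3], b[3])))

def class_overlap_shares(dataset_a, dataset_b):
    """For every class pair, return the share of each class's total
    area that is covered by the other class."""
    total_a, total_b = defaultdict(float), defaultdict(float)
    shared = defaultdict(float)
    for cls_a, box_a in dataset_a:
        total_a[cls_a] += area(box_a)
    for cls_b, box_b in dataset_b:
        total_b[cls_b] += area(box_b)
    for cls_a, box_a in dataset_a:
        for cls_b, box_b in dataset_b:
            shared[(cls_a, cls_b)] += intersection_area(box_a, box_b)
    # Assumes every class has a positive total area.
    return {pair: (s / total_a[pair[0]], s / total_b[pair[1]])
            for pair, s in shared.items()}

def relation(share_a, share_b, high=0.8, low=0.05):
    # Hypothetical thresholds; they would need tuning on real data.
    if share_a >= high and share_b >= high:
        return "equivalence"
    if share_a >= high:
        return "inclusion (A in B)"
    if share_b >= high:
        return "inclusion (B in A)"
    if share_a > low or share_b > low:
        return "overlap"
    return "difference"

def infer_rules(dataset_a, dataset_b):
    shares = class_overlap_shares(dataset_a, dataset_b)
    return {pair: relation(sa, sb) for pair, (sa, sb) in shares.items()}

# Invented toy data for two data sets covering the same extent.
dataset_a = [("Lake", (0, 0, 10, 10)), ("Forest", (20, 0, 30, 10))]
dataset_b = [("Water", (0, 0, 10, 9)),
             ("Vegetation", (20, 0, 30, 10)),
             ("Vegetation", (40, 0, 50, 10))]

rules = infer_rules(dataset_a, dataset_b)
for pair, rel in sorted(rules.items()):
    print(pair, rel)
```

In this toy example, "Lake" and "Water" cover each other almost completely and are labelled as equivalent, while "Forest" lies entirely within the larger "Vegetation" class and is labelled as included in it. Where such area shares are ambiguous, further criteria such as shape would have to be added, as noted above.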
In a next step, the approach will be extended to data sets that do not cover the same area. The idea is to establish a function that provides a uniform, generic and objective description of subjectively perceived object groups such as lakes, woodland or roads. This function comprises geometrical and topological characteristics as well as existing attribute values, and allows the semantics to be assigned in the form of an overall domain ontology.
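As a rough illustration of what such a descriptive function might collect, the following sketch combines a few geometrical measures, one simple topological characteristic and the existing attribute values into a single record. The function name `describe`, the choice of measures (area, perimeter, isoperimetric compactness) and the representation of topology as a list of neighbouring classes are all assumptions made for this example, not part of the approach as stated.

```python
import math

def polygon_area(points):
    # Shoelace formula for a simple polygon given as a list of (x, y) vertices.
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def polygon_perimeter(points):
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))

def describe(points, neighbour_classes, attributes):
    """Return a class-independent description of one object.

    neighbour_classes: classes of adjacent objects (hypothetical stand-in
    for topological characteristics).
    attributes: attribute values already present in the data set.
    """
    a = polygon_area(points)
    p = polygon_perimeter(points)
    return {
        "area": a,
        "perimeter": p,
        # Isoperimetric quotient: 1.0 for a circle, near 0 for elongated shapes.
        "compactness": 4.0 * math.pi * a / (p * p) if p > 0 else 0.0,
        "neighbour_classes": sorted(set(neighbour_classes)),
        "attributes": dict(attributes),
    }

# Invented examples: a compact, lake-like square and an elongated, road-like strip.
lake_like = describe([(0, 0), (10, 0), (10, 10), (0, 10)],
                     ["Forest", "Forest"], {"name": "example"})
road_like = describe([(0, 0), (100, 0), (100, 1), (0, 1)],
                     ["Settlement"], {})

print(lake_like["compactness"], road_like["compactness"])
```

Both example objects have the same area, but their compactness values differ strongly, which suggests how shape measures could help separate object groups independently of the class names used in a particular data set.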