V. Walter, S. Volz

University of Stuttgart


Clustering spatial data can be interpreted as a segmentation or classification process that finds meaningful patterns within geospatial databases. Such patterns can be used to adapt subsequent algorithms to the individual characteristics of the detected classes or clusters. For example, cartographic generalisation works differently in urban and in rural areas. If this kind of information is not explicitly available in the database, it can be derived automatically with clustering algorithms.

Spatial data clustering has been studied extensively, and various approaches are available. However, to the best of our knowledge, none of the existing techniques performs the clustering of vector data in the raster domain. As we will show in this paper, this is a simple and straightforward approach that allows clusters to be computed quickly.

In our study, we use vector street data from the Geographic Data Files (GDF) to derive clusters of different degrees of urbanity. At the beginning of the process, an operator defines two parameters for generating the clusters: the grid size of the resulting raster map, and the cluster radius, i.e. the radius around the centre of each grid cell that delimits the area within which the cluster indicators are evaluated (the area of influence). As indicators for recognizing different levels of urbanity, we use node density and rectangularity of streets, since we assume that (at least in Germany) city centres usually contain more topological nodes and more irregular, non-orthogonal streets than suburbs or rural areas.
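The role of the two parameters can be illustrated with a minimal sketch of the node density indicator. It is not the implementation used in the study: the function name `node_density_raster`, its arguments, and the choice to express density as nodes per unit area of the circular area of influence are assumptions made for illustration. Rectangularity is left out because its computation is not specified here.

```python
import math


def node_density_raster(nodes, x_min, y_min, n_cols, n_rows,
                        grid_size, cluster_radius):
    """Return a row-major raster of node densities (nodes per unit area).

    nodes          -- iterable of (x, y) coordinates of topological nodes
    grid_size      -- edge length of a square grid cell (operator parameter)
    cluster_radius -- radius of the area of influence (operator parameter)
    """
    # Assumption: the area of influence is a circle around the cell centre.
    area = math.pi * cluster_radius ** 2
    raster = []
    for row in range(n_rows):
        cy = y_min + (row + 0.5) * grid_size  # y coordinate of the cell centre
        values = []
        for col in range(n_cols):
            cx = x_min + (col + 0.5) * grid_size  # x coordinate of the cell centre
            # Count the nodes that fall inside the area of influence.
            count = sum(
                1 for (x, y) in nodes
                if math.hypot(x - cx, y - cy) <= cluster_radius
            )
            values.append(count / area)
        raster.append(values)
    return raster
```

This brute-force version tests every node against every cell, which is adequate for a sketch; for a large street network a spatial index would be the natural refinement. Note that a cluster radius larger than half the grid size makes neighbouring areas of influence overlap, so one node can contribute to several cells.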

After the operator has chosen the clustering parameters, the whole area of investigation is subdivided into equally sized, square grid cells. For each cell, the area of influence is then determined and the indicators, i.e. node density and rectangularity, are calculated. The result is one raster layer per indicator. To join the different layers and achieve a final categorization of each individual grid cell, a function that combines the raster layers has to be defined. Before the layers are joined, they can be pre-processed with image processing techniques. For example, a Gaussian filter can be used to smooth the raster layers or to reduce noise. The same filtering can also be applied to the final result layer.
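The joining step can be sketched as follows, under clearly stated assumptions. The combination function is not specified here, so the weighted sum, the weights, the thresholds and the three class labels below are hypothetical. The smoothing uses a 3x3 binomial kernel, a common discrete approximation of a Gaussian filter, rather than a true Gaussian with a configurable standard deviation. The only property taken from the text is the direction of the indicators: high node density and low rectangularity both point towards urban areas.

```python
def smooth(layer):
    """Smooth a raster with a 3x3 binomial kernel (Gaussian approximation).

    Border cells are handled by clamping indices to the raster edge.
    """
    kernel = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]  # weights sum to 16
    n_rows, n_cols = len(layer), len(layer[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), n_rows - 1)
                    cc = min(max(c + dc, 0), n_cols - 1)
                    total += kernel[dr + 1][dc + 1] * layer[rr][cc]
            out[r][c] = total / 16.0
    return out


def normalise(layer):
    """Rescale a raster linearly to the range [0, 1]."""
    flat = [v for row in layer for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in layer]
    return [[(v - lo) / (hi - lo) for v in row] for row in layer]


def combine(density, rectangularity, w_density=0.5, w_rect=0.5,
            thresholds=(0.33, 0.66)):
    """Classify each cell: 0 = rural, 1 = suburban, 2 = urban.

    Weights, thresholds and class labels are hypothetical.
    """
    d = normalise(density)
    q = normalise(rectangularity)
    classes = []
    for d_row, q_row in zip(d, q):
        row = []
        for dv, qv in zip(d_row, q_row):
            # High density and LOW rectangularity both indicate urbanity.
            score = w_density * dv + w_rect * (1.0 - qv)
            # The class index is the number of thresholds the score exceeds.
            row.append(sum(score > t for t in thresholds))
        classes.append(row)
    return classes
```

A typical call would smooth each indicator layer first, for example `combine(smooth(density), smooth(rectangularity))`. Because both layers are normalised before they are weighted, the weights express relative importance independently of the units of each indicator.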