Spatial association between regionalizations using the information-theoretical V-measure
There is keen interest in calculating spatial associations between two variables spanning the same study area. Many methods for calculating such associations have been proposed, but the case when both variables are categorical is underdeveloped, despite the fact that many datasets of interest take the form of either regionalizations or thematic maps. In this paper, we advance this case by adapting the so-called V-measure method from its original information-theoretical formulation to an analysis of variance formulation, which provides more insight for spatial analysis. We present a step-by-step derivation of the V-measure from the perspective of the analysis of variance. The method produces three indices of global association and two sets of local association indicators, which can be mapped to show the spatial distribution of association strength. Open-source software for calculating all indices from vector datasets accompanies the paper. To showcase the utility of the V-measure, we identify three application contexts (comparative, associative, and derivative) and present an example of each. The V-measure method has several advantages over the widely used Mapcurves method: it has clear interpretations in terms of mutual information as well as in terms of analysis of variance, it provides a more precise assessment of association, it is ready to use through the accompanying software, and the examples given in the paper serve as a guide to the gamut of its possible applications. Two specific contributions stemming from our re-analysis of the V-measure are the finding of a conceptual flaw in the Geographical Detector, a method to quantify associations between numerical and categorical spatial variables, and a proposal for a new, cartographically based algorithm for finding an optimal number of regions in clustering-derived regionalizations.
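To make the three global indices concrete, the following is a minimal sketch of the standard information-theoretical V-measure (homogeneity, completeness, and their weighted harmonic mean) applied to two regionalizations. It is not the accompanying software. It assumes the two maps have already been overlaid and summarized as a matrix of intersection areas; the function name `v_measure_from_areas`, the row/column convention, and the labels "regions" and "zones" for the first and second map are illustrative assumptions.

```python
import numpy as np


def entropy(p):
    """Shannon entropy (natural log) of a probability vector, ignoring zeros."""
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())


def v_measure_from_areas(areas, beta=1.0):
    """Return (homogeneity, completeness, V) for two overlaid regionalizations.

    areas[i, j] is the area of the intersection between region i of the
    first map and zone j of the second map (hypothetical input layout).
    """
    a = np.asarray(areas, dtype=float)
    joint = a / a.sum()              # area shares act as joint probabilities
    p_regions = joint.sum(axis=1)    # marginal shares of first-map regions
    p_zones = joint.sum(axis=0)      # marginal shares of second-map zones

    h_regions = entropy(p_regions)
    h_zones = entropy(p_zones)
    h_joint = entropy(joint.ravel())

    # Conditional entropies via the chain rule H(X|Y) = H(X,Y) - H(Y).
    h_regions_given_zones = h_joint - h_zones
    h_zones_given_regions = h_joint - h_regions

    # By convention an index is 1 when the corresponding map has one class.
    homogeneity = 1.0 if h_regions == 0 else 1.0 - h_regions_given_zones / h_regions
    completeness = 1.0 if h_zones == 0 else 1.0 - h_zones_given_regions / h_zones

    denom = beta * homogeneity + completeness
    v = 0.0 if denom <= 0 else (1.0 + beta) * homogeneity * completeness / denom
    return homogeneity, completeness, v


# Each of two regions is split evenly into two zones of the second map.
nested = [[1, 1, 0, 0],
          [0, 0, 1, 1]]
h, c, v = v_measure_from_areas(nested)
print(h, c, v)
```

In this nested example every zone lies within a single region, so homogeneity is 1, while each region is spread over two zones, so completeness is lower; V summarizes both. The intersection areas would in practice come from a polygon overlay of the two vector datasets, which is omitted here.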