A method for universal superpixels-based regionalization (preliminary results)

FOSS4G

Open Source Geospatial Foundation (OSGeo)

Nowosad, Jakub; Iwicki, Mateusz

Formal Metadata

Title

A method for universal superpixels-based regionalization (preliminary results)

Title of Series

FOSS4G Firenze 2022

Number of Parts

351

Author

Nowosad, Jakub

Iwicki, Mateusz

Contributors

Stepinski, Tomasz

License

CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Identifiers

10.5446/68930 (DOI)

Publisher

FOSS4G

Open Source Geospatial Foundation (OSGeo)

Release Date

2024

Language

English

Production Year

2022

Content Metadata

Subject Area

Computer Science

Genre

Conference/Talk

Abstract

Generalization is one of the fundamentals of scientific research. In the context of spatial information, generalization needs to allow for finding common properties but also for spatial contiguity. Therefore, such generalization is often made through regionalization: the partitioning of space into spatial clusters or regions. This process is vital for environmental studies, where many patterns and processes are spatially autocorrelated. Examples of regionalizations include delineation of ecoregions, detection of homogeneous zones for precision agriculture, definition of climate regions, and so on. Traditionally, spatial generalization was performed manually, often based on a compilation of pre-existing, independently conducted studies. This approach lacks a quantitative framework, and thus no systematic checks, modifications, or objective updates are possible. Currently, the abundance of remote sensing spatial data, such as satellite imagery, gridded climate data, or land cover maps, allows fast extraction of relevant spatial information on regional and global scales, making possible studies rooted in a clear quantitative framework. Such data, however, still require spatially-aware generalization to formulate general concepts or claims. Remote sensing data stores information as a set of raster cells, where a single cell is unaware of its spatial context. This is often not enough to understand underlying objects or processes. (Geographic) object-based image analysis (OBIA) (Blaschke 2010) is frequently applied to resolve this issue. It is an approach that partitions space consisting of raster cells into homogeneous objects and thus makes spatial regionalization possible. Several generalization techniques were developed for OBIA, including a superpixels approach that proved to perform best for image processing and remote sensing data analysis (Csillik 2017).
The main idea of superpixels is to create connected groupings of cells with similar values (Ren and Malik 2003; Achanta et al. 2012). Each superpixel represents a desired level of homogeneity while at the same time maintaining spatial structures. Superpixels also carry more information than each cell alone, and thus they can speed up the subsequent processing efforts (Ren and Malik 2003; Achanta et al. 2012). The original superpixels algorithm has, however, two major drawbacks for spatial data problems other than RGB images. Firstly, the algorithm uses the Euclidean distance, which is adequate in many cases, such as RGB images. However, this limits its usability for environmental datasets: Euclidean distance is not suitable for many types of spatial raster data (e.g., categorical rasters) and has undesirable properties for multi-dimensional data (e.g., a set of monthly climate data), where the results based on Euclidean distance contradict human intuition (Aggarwal, Hinneburg, and Keim 2001). Secondly, the superpixels technique does not result in regions per se but rather in over-segmentation: some spatial objects/regions could be represented by one superpixel, while others could consist of many very similar superpixels. Our preliminary results presented during the GIScience 2021 conference (Nowosad and Stepinski 2021) provide a basis for using other distance measures to create superpixels. The proposed extension can also be used for various scenarios, such as creating regions of similar multi-dimensional spatial and temporal patterns or similarly ranked areas. The extension is also already available as open-source software in the form of an R package. The supercells package has extensive documentation in the form of a help file and additional vignettes that can be found, together with its installation instructions, at https://jakubnowosad.com/supercells/. The second issue is, however, still not resolved.
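To make the first drawback concrete, the following is a minimal Python sketch of a SLIC-style distance in which the value-space measure is pluggable. It is an illustration only, not the supercells implementation: the function names, the `step` and `compactness` parameters, and the choice of the Jensen-Shannon distance as the alternative measure are assumptions made for this example.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def jensen_shannon(p, q):
    """Jensen-Shannon distance (base-2 logs) between two probability vectors.

    Ranges from 0 (identical vectors) to 1 (no shared categories).
    """
    def kl(a, b):
        # Kullback-Leibler divergence; zero-weight terms contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    mid = [(a + b) / 2 for a, b in zip(p, q)]
    # max() guards against tiny negative values caused by rounding.
    return math.sqrt(max(0.0, (kl(p, mid) + kl(q, mid)) / 2))


def combined_distance(values_a, values_b, xy_a, xy_b, step, compactness,
                      value_dist=euclidean):
    """SLIC-style distance: value distance combined with scaled spatial distance."""
    d_value = value_dist(values_a, values_b)
    d_space = euclidean(xy_a, xy_b)
    return math.sqrt(d_value ** 2 + (compactness * d_space / step) ** 2)


# Hypothetical land-cover compositions (class shares) of a cell and a cluster center.
cell = [0.7, 0.3, 0.0]
center = [0.0, 0.3, 0.7]
print(combined_distance(cell, center, (2, 3), (5, 7), step=10, compactness=0.5,
                        value_dist=jensen_shannon))
```

Keeping the value-space measure as an argument means the same assignment step could, in principle, serve RGB images (Euclidean) and categorical compositions (a distribution-based measure) without changing the spatial term.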
Many clustering methods exist that could be used for merging similar connected superpixels, including traditional ones such as hierarchical clustering and spatially aware ones such as SKATER or REDCAP. Wang et al. (2018) developed a REDCAP-based workflow for merging superpixels, which showed good results on images and outperformed similar techniques; however, their work was based on the original superpixels algorithm and thus used Euclidean distance on 3-dimensional RGB images only. Additionally, it could be worth testing how well modern unsupervised machine learning techniques would perform in this task. Our main goal is to present the work in progress related to developing a robust method for merging superpixels and thus creating high-quality regionalization. We will test clustering/grouping methods based on three main criteria: accuracy, universality, and computational performance. Accuracy will be assessed based on the resulting regions’ internal homogeneity and their isolation compared to the neighbors. Universality will be tested on several datasets to check if the method works for various scenarios, including RGB images, categorical rasters, spatial time-series, etc. The computational performance will be evaluated based on the time needed for each method’s calculation and their use of computer resources.
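As an illustration of what merging similar connected superpixels can look like, the sketch below greedily merges adjacent superpixels until the closest adjacent pair is farther apart than a threshold. It is a simple contiguity-constrained agglomerative stand-in, not SKATER, REDCAP, or the method under development; the function name, the input layout (mean values, sizes, adjacency pairs), and the threshold-based stopping rule are all assumptions made for this example.

```python
import math


def euclidean(p, q):
    """Euclidean distance; any other value-space measure could be passed instead."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def merge_superpixels(means, sizes, edges, threshold, dist=euclidean):
    """Greedily merge adjacent superpixels whose mean values are similar.

    means: {id: tuple of mean values}, sizes: {id: number of cells},
    edges: iterable of (id, id) pairs of superpixels sharing a border.
    Returns {original superpixel id: region id}.
    """
    means, sizes = dict(means), dict(sizes)
    neighbors = {i: set() for i in means}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    label = {i: i for i in means}

    while True:
        # Find the most similar pair among spatially adjacent regions only.
        best = None
        for a in neighbors:
            for b in neighbors[a]:
                if a < b:
                    d = dist(means[a], means[b])
                    if best is None or d < best[0]:
                        best = (d, a, b)
        if best is None or best[0] > threshold:
            break
        _, a, b = best

        # Merge b into a, using a size-weighted mean for the new region.
        total = sizes[a] + sizes[b]
        means[a] = tuple((x * sizes[a] + y * sizes[b]) / total
                         for x, y in zip(means[a], means[b]))
        sizes[a] = total
        for n in neighbors.pop(b):
            neighbors[n].discard(b)
            if n != a:
                neighbors[n].add(a)
                neighbors[a].add(n)
        for i, region in label.items():
            if region == b:
                label[i] = a
        del means[b], sizes[b]
    return label


# Hypothetical chain of five superpixels: 1-2-3-4-5.
means = {1: (0.0,), 2: (0.1,), 3: (5.0,), 4: (5.2,), 5: (0.05,)}
sizes = {i: 10 for i in means}
edges = [(1, 2), (2, 3), (3, 4), (4, 5)]
regions = merge_superpixels(means, sizes, edges, threshold=1.0)
print(regions)  # {1: 1, 2: 1, 3: 3, 4: 3, 5: 5}
```

In this toy example, superpixel 5 has values close to those of region 1 but remains a separate region because the two are not adjacent, which is the spatial contiguity requirement that distinguishes regionalization from ordinary clustering.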

Keywords

foss4g2022

academictrack