SCCS Colloquium - Jan 9, 2020

From Sccswiki
Date: January 9, 2020
Room: 00.08.053
Time: 15:00 - 16:00

Nikolaos Ioannis Bountos: Subpixel Classification of Anthropogenic Features Using Deep Learning on Sentinel-2 Data

Master's thesis submission talk. Nikolaos is advised by Prof. Thomas Huckle.

Urban landscapes are characterized as the fastest-changing areas on the planet. The classification of specific urban features is important for monitoring and managing the growth of settlements. To tackle this task, many studies have utilized advances in the field of remote sensing. The growing availability of high-resolution images from unmanned aerial vehicles (UAVs) as well as airborne data has led to many studies and good performance in the classification and detection of features such as building footprints or roads. However, regular monitoring of larger areas using UAVs or costly airborne data is not feasible. In these situations, satellite data with a high temporal resolution and large field of view are more appropriate but suffer from a lower spatial resolution (tens of meters). In the present study we show that by using freely available Sentinel-2 data from the Copernicus program, we can extract features such as rivers and anthropogenic classes such as roads, railways and building footprints that are partly or completely at the subpixel level in this kind of data. Additionally, we propose a new metric for evaluating our methods on subpixel objects. This metric measures the performance of the detection of an object while penalizing false positive classifications. Given that our training samples contain one class, we define two thresholds that represent the lower bounds of accuracy for the object to be classified and for the background. We thus avoid a good score in cases where we correctly classify our object but a wide area of the background has been misclassified as the object. Our approach is inspired by the recent success of deep learning in solving complex problems and by the growing availability of high-resolution data collected by UAVs, which allows accurate labeling for training and testing deep networks.
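The two-threshold metric described above can be sketched as follows. This is a minimal illustration, not the thesis code: the function name, the threshold defaults and the exact accuracy definitions (object recall and background retention) are assumptions.

```python
import numpy as np

def subpixel_score(pred, truth, t_obj=0.5, t_bg=0.9):
    """Hypothetical sketch of a two-threshold subpixel metric.

    pred, truth: boolean arrays (True = object pixel).
    t_obj: lower bound on the fraction of object pixels detected.
    t_bg:  lower bound on the fraction of background pixels kept
           as background (this is what penalizes false positives).
    """
    obj_acc = (pred & truth).sum() / max(truth.sum(), 1)
    bg_acc = (~pred & ~truth).sum() / max((~truth).sum(), 1)
    # The object only counts as detected when both bounds hold, so a
    # correct object mask cannot compensate for a widely
    # misclassified background.
    detected = obj_acc >= t_obj and bg_acc >= t_bg
    return detected, obj_acc, bg_acc
```

For example, a prediction that marks the whole scene as "object" reaches perfect object accuracy but zero background accuracy and therefore does not count as a detection.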
We investigate the performance of different deep-learning architectures for subpixel classification on Sentinel-2 multispectral data with labels derived from the UAV data. Our study area is located close to Bern, Switzerland, where very high-resolution UAV data was available from the University of Applied Sciences of Bern. Highly accurate labels for the respective classes were digitized in ArcGIS Pro and used as ground truth for the Sentinel data. We trained different deep-learning models based on state-of-the-art architectures for semantic segmentation, such as DeepLabv3 and U-Net. Our approach focuses on exploiting the multispectral information to improve on the performance of the RGB channels alone. For that purpose, we make use of the NIR and SWIR 10 m and 20 m bands of the Sentinel-2 data. We investigate early and late fusion approaches as well as the behavior and contribution of the extra multispectral bands in improving performance compared to using only the RGB channels. In the early fusion approach, we stack nine (RGB, NIR, SWIR) Sentinel-2 bands together, pass them through two convolutions followed by batch normalization and ReLU layers, and then feed the result to DeepLabv3. In the late fusion approach, we create a CNN with two branches, the first branch processing the RGB channels and the second the NIR/SWIR bands. We use modified DeepLabv3 layers for the two branches and then concatenate the outputs into a total of 512 feature maps. We then reduce the dimensionality of the result to a final output equal to the number of classes; this dimension-reduction step happens in two convolution layers. Furthermore, we propose a data augmentation method based on acquiring images of the same area at different times of the year in order to improve the model's generalization. We provide an extensive quantitative evaluation of our methods as well as visual experiments.
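The input side of the early fusion approach — combining 10 m and 20 m Sentinel-2 bands into a single nine-channel stack — can be sketched as below. This is an illustrative assumption, not the thesis code: the band grouping (3 RGB + 4 NIR at 10 m, 2 SWIR at 20 m) and the nearest-neighbour upsampling are placeholders for whatever resampling the thesis actually uses.

```python
import numpy as np

def early_fusion_stack(rgb, nir, swir20):
    """Sketch: stack 10 m bands with upsampled 20 m bands.

    rgb:    (3, H, W) array, 10 m resolution
    nir:    (4, H, W) array, 10 m resolution
    swir20: (2, H/2, W/2) array, 20 m resolution
    Returns a (9, H, W) nine-band input tensor.
    """
    # Bring the 20 m bands onto the 10 m grid (factor 2 per axis,
    # nearest-neighbour; other resampling schemes are possible).
    swir10 = swir20.repeat(2, axis=1).repeat(2, axis=2)
    # Stack along the channel axis: 3 + 4 + 2 = 9 bands.
    return np.concatenate([rgb, nir, swir10], axis=0)
```

In the early fusion pipeline described above, this nine-band stack is then passed through two convolutions with batch normalization and ReLU before being fed to DeepLabv3.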
Additionally, we compare the visual results with traditional methods such as SVMs and random forests. Furthermore, we compare our methods with approaches from other researchers. These methods use various datasets with different spatial resolutions and classes of interest, achieving overall accuracies between 76% and 99.8%. We achieve a maximum overall accuracy of 87% for the models trained on the augmented dataset and 89% on the initial dataset. For the models trained on the augmented dataset we observe three models with equal overall accuracies of 87%. In these models we measure building accuracies in the range of 30%-36%, street accuracies between 44% and 50%, railway accuracies in the range of 67%-74% and river accuracies in the range of 94%-95%, while the background class accuracy is between 93% and 94%. For the best model on the initial, smaller dataset, we observe 60% building accuracy, 59% street accuracy, 73% railway accuracy, 92% river accuracy and 94% background accuracy.

Keywords: Remote Sensing, Computer Vision, Subpixel classification