Comparison of Multimodal RGB-Thermal Fusion Techniques for Exterior Wall Multi-Defect Detection

Published in Journal of Infrastructure Intelligence and Resilience, 2023

Recommended citation: Yang, X., Guo, R., & Li, H. (2023). Comparison of multimodal RGB-thermal fusion techniques for exterior wall multi-defect detection. Journal of Infrastructure Intelligence and Resilience, 2(2), 100029. doi:https://doi.org/10.1016/j.iintel.2023.100029 https://authors.elsevier.com/sd/article/S277299152300004X

Highlights

  • Three multimodal RGB-Thermal fusion techniques (early, intermediate, and late) are compared for multi-defect detection in facades
  • The early fusion technique is simple yet efficient and effective at detecting detached and missing tiles
  • Intermediate and late fusion techniques can be effective with an appropriate model structure

Abstract

Exterior wall inspections are critical to ensuring public safety around ageing buildings in cities. Conventional manual approaches are dangerous, time-consuming, and labor-intensive. AI-enabled drone platforms have recently become popular and offer an alternative that enables automated and intelligent inspections. However, current identification methods investigate only RGB images of visual defects or thermal images of thermal anomalies, without considering continuous monitoring or the conversion between multiple defects. To gain new insights from modality-specific information, this research compares the performance of early, intermediate, and late multimodal RGB-Thermal image fusion techniques for multi-defect detection in facades, especially for detached tiles and missing tiles. Numerous RGB and thermal images from an ageing campus building were collected as a dataset, and the classical UNet for image segmentation was modified as a benchmark. The comparative results regarding accuracy (mAP, ROC, and AUC) showed that the early fusion model ranked first in distinguishing detached tiles and missing tiles from complex and congested facades. Nevertheless, the intermediate fusion model proved to be more efficient and effective with an optimal architecture, achieving high mean average accuracy with far fewer parameters. In addition, the results showed that multimodal fusion techniques can significantly improve the performance of multi-defect detection without adding a large number of parameters to single-modal AI models.
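The three fusion strategies named in the abstract differ mainly in where the RGB and thermal streams are combined. The following is a minimal NumPy sketch of that difference, not the paper's modified UNet: the helpers `toy_encoder` and `toy_head` are hypothetical stand-ins with random weights, and the image size, feature width, three-class layout, and equal-weight averaging in the late fusion branch are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_encoder(x, out_channels, rng):
    """Hypothetical stand-in for a learned encoder: random 1x1 conv + ReLU."""
    w = rng.standard_normal((x.shape[-1], out_channels))
    return np.maximum(x @ w, 0.0)

def toy_head(feat, n_classes, rng):
    """Hypothetical stand-in for a segmentation head: 1x1 conv + per-pixel softmax."""
    w = rng.standard_normal((feat.shape[-1], n_classes))
    logits = feat @ w
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=-1, keepdims=True)

# Assumed toy sizes; classes assumed to be background, detached tile, missing tile.
H, W, N_CLASSES, N_FEAT = 8, 8, 3, 16
rgb = rng.random((H, W, 3))       # 3-channel visual image
thermal = rng.random((H, W, 1))   # 1-channel thermal image, assumed co-registered

# Early fusion: stack the modalities at the input and run a single model.
early_in = np.concatenate([rgb, thermal], axis=-1)           # (H, W, 4)
early_out = toy_head(toy_encoder(early_in, N_FEAT, rng), N_CLASSES, rng)

# Intermediate fusion: one encoder per modality, features merged before the head.
f_rgb = toy_encoder(rgb, N_FEAT, rng)
f_thermal = toy_encoder(thermal, N_FEAT, rng)
mid_feat = np.concatenate([f_rgb, f_thermal], axis=-1)       # (H, W, 2 * N_FEAT)
mid_out = toy_head(mid_feat, N_CLASSES, rng)

# Late fusion: two complete single-modal models, predictions combined at the end.
p_rgb = toy_head(toy_encoder(rgb, N_FEAT, rng), N_CLASSES, rng)
p_thermal = toy_head(toy_encoder(thermal, N_FEAT, rng), N_CLASSES, rng)
late_out = 0.5 * (p_rgb + p_thermal)                         # assumed equal weights
```

In this sketch early fusion adds only one extra input channel to a single model, whereas intermediate and late fusion duplicate the encoder or the whole model, which is one reason the parameter count depends so strongly on the architecture chosen for each strategy.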