Multiscale self-attention for unmanned ariel vehicle-based infrared thermal images detection
Department of Computer and Software Engineering, National University of Sciences and Technology, Islamabad, Pakistan. ORCID iD: 0000-0002-7895-8353
Department of Computer and Software Engineering, National University of Sciences and Technology, Islamabad, Pakistan.
Mälardalen University, School of Innovation, Design and Engineering, Embedded Systems.
2025 (English)
In: Engineering applications of artificial intelligence, ISSN 0952-1976, E-ISSN 1873-6769, Vol. 149, article id 110488
Article in journal (Refereed) Published
Abstract [en]

Object detection and recognition in unmanned aerial vehicle-based images is critical for various applications but is often challenged by complex backgrounds, diverse object scales, densely clustered small objects, and uneven object distributions. This paper introduces a novel deep learning-based artificial intelligence framework that integrates the Multiscale Self-Attention Guidance and Feature Fusion Network with the You Only Look Once model, tailored explicitly for artificial intelligence-driven unmanned aerial vehicle-based infrared thermal image analysis. The proposed methodology offers four key advancements in the You Only Look Once architecture to enhance object detection performance. First, the Multi-Head Self-Attention Transformer module combines global and local information, enabling precise object localization while mitigating the influence of complex backgrounds. Second, the Multiscale Parallel Sampling Feature Fusion module optimizes the fusion of multiscale features. Third, fine-grained shallow feature maps are integrated into the fusion process to detect densely packed small objects accurately. Lastly, the Inverse-Residual Feature Enhancement module, positioned before the detection head, enhances feature extraction for small objects. Experimental evaluations on the High Altitude Infrared Thermal Unmanned Aerial Vehicle dataset demonstrate significant improvements, achieving a Mean Average Precision of 95.1%, Recall of 92.0%, and F1-Score of 91.0%. The framework's robustness is further validated on the Wildland-fire Infrared Thermal Unmanned Aerial System dataset, achieving a Mean Average Precision of 82.1%, Recall of 88.0%, and F1-Score of 82.0%. Comparative analyses with state-of-the-art methods confirm its superiority, and the framework offers a scalable artificial intelligence-driven solution for unmanned aerial vehicle applications, advancing object detection capabilities in critical scenarios.
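
The abstract reports Recall and F1-Score but not Precision. As a minimal sketch, assuming the F1-Score is the usual harmonic mean of precision and recall measured at the same operating point (the abstract does not confirm this), the relation F1 = 2PR / (P + R) can be solved for P. The function names below are illustrative, and the values printed are implied estimates, not figures reported by the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both as fractions)."""
    return 2.0 * precision * recall / (precision + recall)


def implied_precision(recall: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, given R and F1 as fractions."""
    denom = 2.0 * recall - f1
    if denom <= 0:
        raise ValueError("no valid precision: F1 must be below 2 * recall")
    return f1 * recall / denom


# (Recall, F1-Score) pairs as stated in the abstract, as fractions.
reported = {
    "High Altitude Infrared Thermal Unmanned Aerial Vehicle": (0.920, 0.910),
    "Wildland-fire Infrared Thermal Unmanned Aerial System": (0.880, 0.820),
}

for dataset, (recall, f1) in reported.items():
    p = implied_precision(recall, f1)
    # Assumption: this is an estimate from rounded figures, not a reported result.
    print(f"{dataset}: implied precision ~ {p:.3f}")
```

Because the reported figures are rounded to one decimal place, the implied precision is only approximate.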

Place, publisher, year, edition, pages
Elsevier, 2025. Vol. 149, article id 110488
Keywords [en]
Artificial intelligence, Deep learning, Feature fusion network, Infrared thermal image analysis, Object detection, Unmanned aerial vehicle
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:mdh:diva-70507
DOI: 10.1016/j.engappai.2025.110488
ISI: 001449577900001
Scopus ID: 2-s2.0-86000792512
OAI: oai:DiVA.org:mdh-70507
DiVA, id: diva2:1947757
Available from: 2025-03-26 Created: 2025-03-26 Last updated: 2025-10-10
Bibliographically approved

Authority records

Anwar, Muhammad Waseem

Search in DiVA

By author/editor
Ali, Muhammad Shahroze; Anwar, Muhammad Waseem