2026 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]
Deep learning has led to major progress in computer vision, but modern Deep Neural Networks (DNNs) are still highly vulnerable to input perturbations, which limits their reliability in safety-critical applications. This challenge becomes even more critical in real-world industrial environments, such as autonomous machinery operating on construction sites, where visual data is influenced by unpredictable weather conditions, variable lighting, and physical wear and degradation. In addition, data scarcity, privacy constraints, and domain shift prevent the direct application of conventional large-scale training pipelines.
This thesis addresses these challenges by proposing a comprehensive, multi-level framework that strengthens model-level robustness against adversarial attacks, enhances data-level robustness to natural environmental perturbations, and improves adaptive learning under distributed and data-constrained conditions, enabling reliable deployment of visual perception models in complex, safety-critical environments.
The first contribution focuses on model-level robustness against adversarial attacks. A meta-heuristic search method is proposed to automatically discover activation functions that increase resistance to adversarial perturbations without requiring adversarial training. A hybrid search strategy further improves convergence efficiency, yielding Convolutional Neural Networks (CNNs) that outperform standard architectures under adversarial attacks while maintaining competitive clean-data accuracy.
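To make "adversarial perturbation" concrete, the following is a minimal sketch of a sign-gradient (FGSM-style) attack on a toy linear classifier. It is an illustration only, not the thesis's search method or attack setup: the linear model, the function names `linear_score` and `fgsm_linear`, and the numbers are all assumptions.

```python
from typing import List, Sequence


def sign(value: float) -> float:
    """Return -1.0, 0.0, or 1.0 depending on the sign of value."""
    return float((value > 0) - (value < 0))


def linear_score(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Score of a toy linear classifier; the predicted label is the sign."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def fgsm_linear(w: Sequence[float], x: Sequence[float],
                y: int, epsilon: float) -> List[float]:
    """One sign-gradient step against a linear classifier.

    For a label y in {-1, +1}, the margin y * score has gradient y * w
    with respect to x, so stepping against sign(y * w) lowers the margin
    by epsilon * sum(|w_i|) while changing each input by at most epsilon.
    """
    return [xi - epsilon * y * sign(wi) for wi, xi in zip(w, x)]


# Hypothetical example: a correctly classified input is flipped.
w, b = [1.0, -2.0], 0.0
x, y = [1.0, 0.0], 1
x_adv = fgsm_linear(w, x, y, epsilon=0.5)
clean_score = linear_score(w, b, x)      # positive: classified as +1
adv_score = linear_score(w, b, x_adv)    # negative: classified as -1
```

A small, bounded change to each input component is enough to flip the prediction here, which is the kind of failure that more robust activation functions are meant to resist.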
The second contribution introduces ConstScene, a large-scale semantic segmentation dataset representing real and synthetic construction-site imagery under diverse weather and sensor degradation conditions. Experiments reveal significant performance drops when models trained on clean data are exposed to perturbed inputs, demonstrating the need for environment-specific robustness benchmarks.
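The performance drop described above can be quantified with a segmentation metric. The sketch below assumes mean Intersection over Union (mIoU), a common choice for semantic segmentation; the abstract does not state which metric the thesis reports, and the function name `mean_iou` and the toy label lists are hypothetical.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int],
             num_classes: int) -> float:
    """Mean IoU over classes that occur in the prediction or the target.

    pred and target are flat sequences of per-pixel class indices.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both pred and target
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical four-pixel example with two classes.
target = [0, 0, 1, 1]
pred_clean = [0, 0, 1, 1]       # prediction on the clean image
pred_perturbed = [0, 1, 1, 1]   # prediction on a degraded copy
drop = mean_iou(pred_clean, target, 2) - mean_iou(pred_perturbed, target, 2)
```

Reporting the difference between clean and perturbed scores per condition (for example, per weather type) gives one number per degradation that can be compared across models.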
The third contribution introduces an integrated framework that combines Federated Learning (FL) for decentralized collaborative training with Few-Shot Learning (FSL) for sample-efficient domain adaptation, supported by server-side Hyperparameter Optimization (HPO). The proposed approach enables effective model adaptation across distributed construction sites without sharing raw data, significantly improving robustness across heterogeneous client datasets.
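As a minimal sketch of training without sharing raw data, the example below assumes the widely used federated averaging (FedAvg) aggregation rule, in which the server combines client parameter vectors weighted by local sample count. The abstract does not confirm that the thesis uses this exact rule; `federated_average` and the toy values are hypothetical.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Sample-size-weighted mean of client parameter vectors.

    Only parameters and sample counts reach the server; the images
    used for local training stay on each client.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need at least one client and one size per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all clients must send vectors of the same length")
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        share = size / total  # this client's share of all samples
        for i, value in enumerate(weights):
            averaged[i] += share * value
    return averaged


# Hypothetical round: two sites, the second holding three times more data.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In a few-shot setting each client's sample count is small, so the weighting matters: a site with very few labelled images contributes proportionally less to the shared model.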
In summary, this thesis makes three contributions to the robustness of visual perception systems: a search method for activation functions that improves model-level robustness against adversarial attacks; the ConstScene dataset, which benchmarks data-level robustness against natural perturbations and real-world degradations; and an integrated framework enabling decentralized, sample-efficient model adaptation across heterogeneous environments.
Place, publisher, year, edition, pages
Västerås: Mälardalen University, 2026. p. 225
Series
Mälardalen University Press Dissertations, ISSN 1651-4238 ; 452
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:mdh:diva-74467 (URN)
978-91-7485-736-8 (ISBN)
Public defence
2026-01-23, Gamma, Mälardalens universitet, Västerås, 10:00 (English)
Opponent
Supervisors
2025-11-25 2025-11-21 2026-01-02 Bibliographically approved