Hierarchical Interpretable Imitation Learning for End-to-End Autonomous Driving
Hong Kong Baptist University, Kowloon, China. ORCID iD: 0000-0002-6860-9547
State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China. ORCID iD: 0000-0003-4925-0572
University of Chinese Academy of Sciences, Beijing, China. ORCID iD: 0000-0002-2673-1368
Mälardalen University, School of Business, Society and Engineering, Future Energy Center.
2023 (English). In: IEEE Transactions on Intelligent Vehicles, ISSN 2379-8858, E-ISSN 2379-8904, Vol. 8, no. 1, p. 673-683. Article in journal (Refereed). Published.
Abstract [en]

End-to-end autonomous driving provides a simple and efficient framework for autonomous driving systems, obtaining control commands directly from raw perception data. However, it fails to address stability and interpretability problems in complex urban scenarios. In this paper, we construct a two-stage end-to-end autonomous driving model for complex urban scenarios, named HIIL (Hierarchical Interpretable Imitation Learning), which integrates an interpretable BEV mask and a steering angle to solve these problems. In Stage One, we propose a pretrained Bird's Eye View (BEV) model that leverages a BEV mask to present an interpretation of the surrounding environment. In Stage Two, we construct an Interpretable Imitation Learning (IIL) model that fuses the BEV latent feature from Stage One with an additional steering angle from the Pure-Pursuit (PP) algorithm. In the HIIL model, visual information is converted to semantic images by a semantic segmentation network; the semantic images are encoded to extract the BEV latent feature, which is decoded to predict BEV masks and fed to the IIL model as perception data. In this way, the BEV latent feature bridges the BEV and IIL models. Because the visual information is supplemented by the PP steering angle, the speed vector, and location information, the model performs better in complex and adverse scenarios. Our HIIL model meets an urgent requirement for interpretability and robustness in autonomous driving. We validate the proposed model in the CARLA simulator with extensive experiments, which show remarkable interpretability, generalization, and robustness in unknown scenarios for navigation tasks.
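
The Stage-Two input includes a steering angle from the Pure-Pursuit (PP) algorithm. The abstract does not give the paper's exact parameterization, but the classic PP steering law it refers to can be sketched as follows; the function name and the vehicle-frame convention (x forward, y left) are our own assumptions.

```python
import math

def pure_pursuit_steering(target_x: float, target_y: float,
                          wheelbase: float, lookahead: float) -> float:
    """Classic Pure-Pursuit steering law (a sketch, not the paper's code).

    (target_x, target_y) is the lookahead point on the reference path,
    expressed in the vehicle frame; `wheelbase` is the axle-to-axle
    distance and `lookahead` the distance to the target point.
    """
    # Heading error between the vehicle's forward axis and the target point
    alpha = math.atan2(target_y, target_x)
    # Steering angle that drives the vehicle along an arc through the target
    return math.atan2(2.0 * wheelbase * math.sin(alpha), lookahead)
```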
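The abstract also describes fusing the BEV latent feature with the PP steering angle, the speed vector, and location information before predicting control commands. Below is a minimal PyTorch sketch of such a fusion head; all layer sizes, the five-element state layout, and the three-command output are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class IILFusionHead(nn.Module):
    """Sketch of the Stage-Two fusion described in the abstract:
    concatenate the BEV latent feature with vehicle-state inputs,
    then map the result to control commands."""

    def __init__(self, bev_dim: int = 256, state_dim: int = 5):
        super().__init__()
        # Assumed state vector: [pp_steering, speed_x, speed_y, loc_x, loc_y]
        self.fuse = nn.Sequential(
            nn.Linear(bev_dim + state_dim, 256),
            nn.ReLU(),
            nn.Linear(256, 128),
            nn.ReLU(),
        )
        # Assumed command set: steer, throttle, brake
        self.control = nn.Linear(128, 3)

    def forward(self, bev_latent: torch.Tensor, state: torch.Tensor) -> torch.Tensor:
        x = torch.cat([bev_latent, state], dim=-1)
        return self.control(self.fuse(x))
```

In the paper's pipeline, `bev_latent` would come from the pretrained Stage-One BEV encoder; here it is simply a tensor of assumed width 256.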

Place, publisher, year, edition, pages
2023. Vol. 8, no. 1, p. 673-683
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:mdh:diva-61142
DOI: 10.1109/tiv.2022.3225340
ISI: 000965615200001
Scopus ID: 2-s2.0-85144050330
OAI: oai:DiVA.org:mdh-61142
DiVA, id: diva2:1716993
Available from: 2022-12-07. Created: 2022-12-07. Last updated: 2025-10-10. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
Scopus

Authority records

Zhou, Yuanye

Search in DiVA

By author/editor
Teng, Siyu; Chen, Long; Ai, Yunfeng; Zhou, Yuanye
By organisation
Future Energy Center
In the same journal
IEEE Transactions on Intelligent Vehicles
Computer and Information Sciences
