This paper investigates the potential of Vision-Language Models (VLMs) to enhance Human–Vehicle Interaction (HVI) in Autonomous Driving (AD) scenarios, particularly in interactions between vehicles and other traffic participants, with a focus on rationality and safety in external HVI. Leveraging recent advancements in large language models, VLMs demonstrate remarkable capabilities in understanding real-world contexts and have consequently generated significant interest in HVI applications. This paper provides an overview of AD, HVI, and VLMs, along with the historical context of large language model applications in HVI. The HVI discussed herein involves dynamic game processes encompassing perception and decision-making between vehicles and traffic participants, such as pedestrians. Furthermore, we examine the perceptual challenges associated with applying VLMs to HVI and compile relevant datasets. This research fills a gap in the existing literature by systematically analyzing the current status, challenges, and future opportunities of VLM applications in HVI. To advance VLM integration in AD, various implementation strategies are discussed. The findings highlight the potential of VLMs to transform HVI in AD, improving both passenger experience and driving safety. Overall, this study contributes to a comprehensive understanding of VLM applications in HVI and provides insights to guide future research and development.
The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).