Research Article | Open Access | Just Accepted

Found-RL: Foundation model-enhanced reinforcement learning via asynchronous VLM feedback for autonomous driving

Yansong Qu1, Zihao Sheng2, Zilin Huang2, Jiancong Chen1, Yuhao Luo2, Tianyi Wang3, Yiheng Feng1, Samuel Labi1, Sikai Chen2

1 Lyles School of Civil and Construction Engineering, Purdue University, West Lafayette 47907, USA.

2 Department of Civil and Environmental Engineering, University of Wisconsin–Madison, Madison 53706, USA.

3 Department of Civil, Architectural, and Environmental Engineering, University of Texas at Austin, Austin 78712, USA.


Abstract

Reinforcement Learning (RL) has emerged as a dominant paradigm for end-to-end autonomous driving (AD) with real-time inference. However, RL typically suffers from sample inefficiency and a lack of semantic interpretability in complex scenarios. Foundation models, particularly Vision-Language Models (VLMs), can mitigate these limitations because they offer rich, context-aware knowledge. Yet deploying such computationally intensive models within high-frequency, multi-environment RL training loops is severely hindered by prohibitive inference latency and the absence of unified integration platforms. To bridge this gap, we present Found-RL, a platform designed to use foundation models to enhance RL for AD efficiently. A core innovation of the platform is its asynchronous batch inference framework, which decouples heavy VLM reasoning from the simulation loop. This design resolves latency bottlenecks and supports real-time or near-real-time RL training from VLM feedback. Using the platform, we introduce diverse supervision mechanisms to address domain-specific challenges. First, we implement Value-Margin Regularization (VMR) and Advantage-Weighted Action Guidance (AWAG) to distill expert-like VLM action suggestions into the RL policy. Second, for dense supervision, we adopt high-throughput CLIP for reward shaping, and we mitigate CLIP’s dynamic blindness and probability dilution via Conditional Contrastive Action Alignment, which conditions prompts on discretized speed and command and yields a normalized, margin-based bonus from context-specific action-anchor scoring. Found-RL delivers an end-to-end pipeline for fine-tuned VLM integration with modular support, and shows that a lightweight RL model with millions of parameters can approach the performance of billion-parameter VLMs while sustaining real-time inference (~500 FPS).
Code, data, and models will be publicly available at https://github.com/ys-qu/found-rl.
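
The idea of decoupling slow VLM inference from the simulation loop can be illustrated with a small sketch. This is not the Found-RL implementation: the class `AsyncBatchInferencer`, the stub `fake_vlm_batch`, and the parameters `max_batch` and `max_wait_s` are hypothetical names chosen for illustration, and a background thread with standard-library queues stands in for whatever serving mechanism the platform actually uses. The simulation loop submits observations without blocking, a worker groups them into batches, and feedback is collected whenever it becomes available.

```python
import queue
import threading
import time


def fake_vlm_batch(observations):
    """Hypothetical stand-in for a batched VLM call: one action label per observation."""
    time.sleep(0.01)  # simulate inference latency
    return ["brake" if obs["speed"] > 10.0 else "accelerate" for obs in observations]


class AsyncBatchInferencer:
    """Illustrative worker that batches requests off the simulation thread."""

    def __init__(self, infer_fn, max_batch=8, max_wait_s=0.005):
        self._infer_fn = infer_fn
        self._max_batch = max_batch
        self._max_wait_s = max_wait_s
        self._requests = queue.Queue()
        self._results = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, request_id, observation):
        """Enqueue a request and return immediately."""
        self._requests.put((request_id, observation))

    def poll(self):
        """Return all (request_id, output) pairs finished so far, without blocking."""
        ready = []
        while True:
            try:
                ready.append(self._results.get_nowait())
            except queue.Empty:
                return ready

    def close(self):
        """Send a stop sentinel and wait until earlier requests are processed."""
        self._requests.put(None)
        self._worker.join()

    def _run(self):
        while True:
            item = self._requests.get()  # block until the first request arrives
            if item is None:
                return
            batch, stop = [item], False
            deadline = time.monotonic() + self._max_wait_s
            # Fill the batch until it is full or the wait budget is spent.
            while len(batch) < self._max_batch:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                try:
                    nxt = self._requests.get(timeout=remaining)
                except queue.Empty:
                    break
                if nxt is None:
                    stop = True
                    break
                batch.append(nxt)
            ids, observations = zip(*batch)
            for rid, out in zip(ids, self._infer_fn(list(observations))):
                self._results.put((rid, out))
            if stop:
                return


def run_demo(num_steps=20):
    """Toy simulation loop: submit each step's observation, never wait on the VLM."""
    inferencer = AsyncBatchInferencer(fake_vlm_batch)
    feedback = {}
    for step in range(num_steps):
        inferencer.submit(step, {"speed": float(step)})
        for rid, label in inferencer.poll():
            feedback[rid] = label
        # An environment step and policy update would happen here.
    inferencer.close()  # flush outstanding requests
    for rid, label in inferencer.poll():
        feedback[rid] = label
    return feedback
```

In this sketch the feedback for a given step may arrive several steps later, so a learner consuming it would need to match labels to stored transitions by `request_id`. How Found-RL handles that matching is not described in the abstract.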

Communications in Transportation Research

Cite this article:
Qu Y, Sheng Z, Huang Z, et al. Found-RL: Foundation model-enhanced reinforcement learning via asynchronous VLM feedback for autonomous driving. Communications in Transportation Research, 2026, https://doi.org/10.26599/COMMTR.2026.9640027

Views: 278 | Downloads: 45 | Crossref: 0 | Web of Science: 0 | Scopus: 0

Received: 15 February 2026
Revised: 05 April 2026
Accepted: 12 May 2026
Available online: 15 May 2026

©The Author(s) 2026.

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).