Research Article | Open Access | Just Accepted

Found-RL: Foundation model-enhanced reinforcement learning via asynchronous VLM feedback for autonomous driving

Yansong Qu¹, Zihao Sheng², Zilin Huang², Jiancong Chen¹, Yuhao Luo², Tianyi Wang³, Yiheng Feng¹, Samuel Labi¹, Sikai Chen²

¹ Lyles School of Civil and Construction Engineering, Purdue University, West Lafayette 47907, USA.

² Department of Civil and Environmental Engineering, University of Wisconsin–Madison, Madison 53706, USA.

³ Department of Civil, Architectural, and Environmental Engineering, University of Texas at Austin, Austin 78712, USA.


Abstract

Reinforcement learning (RL) has emerged as a dominant paradigm for end-to-end autonomous driving (AD) with real-time inference. However, RL typically suffers from sample inefficiency and a lack of semantic interpretability in complex scenarios. Foundation models, particularly vision-language models (VLMs), can mitigate these limitations because they offer rich, context-aware knowledge. Deploying such computationally intensive models within high-frequency, multi-environment RL training loops, however, is severely hindered by prohibitive inference latency and the absence of unified integration platforms. To bridge this gap, we present Found-RL, a platform designed to use foundation models to efficiently enhance RL for AD. A core innovation of the platform is its asynchronous batch inference framework, which decouples heavy VLM reasoning from the simulation loop. This design resolves latency bottlenecks and supports real-time or near-real-time RL training from VLM feedback. Using the platform, we introduce several supervision mechanisms to address domain-specific challenges. First, we implement Value-Margin Regularization (VMR) and Advantage-Weighted Action Guidance (AWAG) to distill expert-like VLM action suggestions into the RL policy. Second, for dense supervision, we adopt high-throughput CLIP for reward shaping. We mitigate CLIP's dynamic blindness and probability dilution via Conditional Contrastive Action Alignment, which conditions prompts on discretized speed/command and yields a normalized, margin-based bonus from context-specific action-anchor scoring. Found-RL delivers an end-to-end pipeline for fine-tuned VLM integration with modular support, and shows that a lightweight RL model with millions of parameters can approach the performance of billion-parameter VLMs while sustaining real-time inference (~500 FPS).
Code, data, and models will be publicly available at https://github.com/ys-qu/found-rl.
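
To illustrate the asynchronous batch inference idea described in the abstract, the following is a minimal Python sketch. It is not the authors' implementation. The `AsyncBatchAdvisor` class, its `max_batch` and `max_wait_s` parameters, the observation format, and the `fake_vlm_batch` stub (which stands in for a real VLM) are all hypothetical names introduced for illustration. The sketch only shows the general pattern: the simulation loop submits observations and polls for finished feedback without blocking, while a background worker groups pending requests into batches.

```python
import queue
import threading
import time


def fake_vlm_batch(observations):
    """Hypothetical stand-in for a slow VLM: one suggestion per observation."""
    time.sleep(0.05)  # simulated inference latency for the whole batch
    return ["brake" if obs["speed"] > 10.0 else "accelerate" for obs in observations]


class AsyncBatchAdvisor:
    """Runs batched inference in a worker thread, off the simulation loop."""

    def __init__(self, infer_fn, max_batch=8, max_wait_s=0.01):
        self._infer_fn = infer_fn
        self._max_batch = max_batch
        self._max_wait_s = max_wait_s
        self._requests = queue.Queue()
        self._results = queue.Queue()
        self._stop = threading.Event()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, step_id, observation):
        """Enqueue a request; returns immediately."""
        self._requests.put((step_id, observation))

    def poll(self):
        """Return all feedback finished so far, without blocking."""
        done = []
        while True:
            try:
                done.append(self._results.get_nowait())
            except queue.Empty:
                return done

    def close(self):
        """Stop the worker; requests still queued at this point are dropped."""
        self._stop.set()
        self._worker.join()

    def _run(self):
        while not self._stop.is_set():
            try:
                first = self._requests.get(timeout=0.05)
            except queue.Empty:
                continue
            # Collect more requests until the batch is full or the wait expires.
            batch = [first]
            deadline = time.monotonic() + self._max_wait_s
            while len(batch) < self._max_batch:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                try:
                    batch.append(self._requests.get(timeout=remaining))
                except queue.Empty:
                    break
            ids, observations = zip(*batch)
            suggestions = self._infer_fn(list(observations))
            for step_id, suggestion in zip(ids, suggestions):
                self._results.put((step_id, suggestion))


def run_demo(num_steps=20):
    """Toy simulation loop that never waits on the (fake) VLM while stepping."""
    advisor = AsyncBatchAdvisor(fake_vlm_batch)
    feedback = {}
    for step in range(num_steps):
        advisor.submit(step, {"speed": float(step)})
        for step_id, suggestion in advisor.poll():
            feedback[step_id] = suggestion
        # An environment step and RL update would happen here.
    # After the loop, wait (bounded) for the remaining feedback to arrive.
    deadline = time.monotonic() + 5.0
    while len(feedback) < num_steps and time.monotonic() < deadline:
        for step_id, suggestion in advisor.poll():
            feedback[step_id] = suggestion
        time.sleep(0.01)
    advisor.close()
    return feedback
```

Calling `run_demo()` returns a dictionary mapping each step index to the stub's suggestion. In this toy setup the feedback arrives some steps after the observation was submitted, so a learner built on this pattern would need to associate delayed feedback with the stored transition it refers to; how Found-RL handles that is not specified in the abstract.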

Keywords

autonomous driving; vision-language models; reinforcement learning; asynchronous inference


Communications in Transportation Research

DOI: 10.26599/COMMTR.2026.9640027


Cite this article:

Qu Y, Sheng Z, Huang Z, et al. Found-RL: Foundation model-enhanced reinforcement learning via asynchronous VLM feedback for autonomous driving. Communications in Transportation Research, 2026, https://doi.org/10.26599/COMMTR.2026.9640027


Received: 15 February 2026

Revised: 05 April 2026

Accepted: 12 May 2026

Available online: 15 May 2026

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).