Large Deviation Algorithms for Thresholding Bandit Problem

Manjing Zhang; Guangwu Liu; Shan Dai; Jiaqi Chen; Philippe Fournier-Viger

doi:10.26599/BDMA.2025.9020028

Open Access


Manjing Zhang¹, Guangwu Liu², Shan Dai³, Jiaqi Chen⁴, Philippe Fournier-Viger⁴

¹ Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen 518107, China

² Department of Decision Analytics and Operations, City University of Hong Kong, Hong Kong 518057, China

³ Shenzhen Research Institute of Big Data, Shenzhen 518172, China, and also with The Chinese University of Hong Kong (Shenzhen), Shenzhen 518172, China

⁴ College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China


Abstract

The Thresholding Bandit (TB) problem is a popular sequential decision-making problem whose goal is to identify the systems whose means are greater than a threshold. Unlike conventional approaches, which work on an upper bound of a loss function, our approach directly minimizes the loss itself. Leveraging large deviation theory, we first provide an asymptotically optimal allocation rule for the TB problem, and then propose a parameter-free Large Deviation (LD) algorithm to make the allocation rule implementable. A Central Limit Theorem-based Large Deviation (CLD) algorithm is further proposed as a supplement to improve computational efficiency using a normal approximation. Extensive experiments validate the superiority of our algorithms over existing methods and demonstrate their broader applicability to more general distributions and various kinds of loss functions.
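To make the problem setting concrete, the following is a minimal sketch of the TB task using a naive uniform-allocation baseline. It is not the LD or CLD algorithm of the paper, whose allocation rule is not given in this abstract. The function names, the Gaussian reward model, and the loss (the number of misclassified systems) are assumptions chosen for illustration only.

```python
import random


def classify_arms(means, threshold, budget, sigma=1.0, seed=0):
    """Uniform-allocation baseline (hypothetical helper, not the paper's method).

    Splits the sampling budget equally across all systems, then returns the
    indices of the systems whose sample mean exceeds the threshold.
    """
    rng = random.Random(seed)
    pulls_per_arm = budget // len(means)
    selected = []
    for index, mu in enumerate(means):
        # Assumed reward model: Gaussian with known standard deviation sigma.
        samples = [rng.gauss(mu, sigma) for _ in range(pulls_per_arm)]
        sample_mean = sum(samples) / pulls_per_arm
        if sample_mean > threshold:
            selected.append(index)
    return selected


def misclassified(means, threshold, selected):
    """Assumed loss: count of systems placed on the wrong side of the threshold."""
    truth = {i for i, mu in enumerate(means) if mu > threshold}
    return len(truth.symmetric_difference(selected))


if __name__ == "__main__":
    true_means = [0.0, 0.4, 0.6, 1.0]
    chosen = classify_arms(true_means, threshold=0.5, budget=400)
    print("selected systems:", chosen)
    print("misclassified:", misclassified(true_means, 0.5, chosen))
```

A uniform split spends as many samples on systems far from the threshold as on those close to it. An adaptive allocation rule, such as the one the abstract describes, would instead be expected to concentrate samples where misclassification is most likely.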

Keywords

Thresholding Bandit (TB) problem; Large Deviation (LD) theory; optimal allocation rule; parameter-free policy; asymptotical optimality


Big Data Mining and Analytics

Volume 8, Issue 5, October 2025

Pages 1189-1209

DOI: 10.26599/BDMA.2025.9020028


Cite this article:

Zhang M, Liu G, Dai S, et al. Large Deviation Algorithms for Thresholding Bandit Problem. Big Data Mining and Analytics, 2025, 8(5): 1189-1209. https://doi.org/10.26599/BDMA.2025.9020028



Received: 12 December 2024

Revised: 04 March 2025

Accepted: 05 March 2025

Published: 14 July 2025

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).