Abstract
Open-world object detection (OWOD) is a challenging task that requires models to detect both known and unknown objects while incrementally learning from new data. Current OWOD methods typically label regions with high objectness scores as unknown objects; because these scores rely heavily on known-object supervision, the resulting pseudo-labels are biased toward known categories. To address this, we propose object reconstruction error modeling, which uses object-level semantic information for unsupervised foreground and background modeling. In addition, we introduce an unsupervised proposal generation method that leverages the zero-shot capability of the Segment Anything Model (SAM) to generate pseudo-labels for unknown objects. However, classifiers trained on known categories tend to be biased toward them during inference. To resolve this, we propose a location-enhanced network that reframes classification as a location quality prediction task. Our method achieves a significant 37% improvement in unknown-category recall (52.1%) on the Microsoft Common Objects in Context (MS-COCO) dataset, outperforming previous state-of-the-art methods while maintaining competitive performance on known objects. Furthermore, it runs at 10.95 frames per second, surpassing deformable detection transformer (DETR)-based models and retaining a speed advantage over faster region-based convolutional neural network (Faster R-CNN)-based methods.