Regular Paper Issue
RC-Net: Row and Column Network with Text Feature for Parsing Floor Plan Images
Journal of Computer Science and Technology 2023, 38 (3): 526-539
Published: 30 May 2023

The popularity of online home design and floor plan customization has been steadily increasing. However, manually converting floor plan images from books or paper materials into electronic resources is a challenging task because of the vast amount of historical data involved. Leveraging neural networks to recognize and parse floor plans can significantly streamline this conversion. In this paper, we present a novel learning framework for automatically parsing floor plan images. Our key insight is that room-type text is common and crucial in floor plan images, as it identifies the essential semantic information of the corresponding room; however, this clue is rarely considered in previous learning-based methods. We therefore propose the Row and Column network (RC-Net), which recognizes floor plan elements by integrating the text feature. Specifically, we add a text feature branch to the network that extracts text features corresponding to the room type and uses them to guide room-type predictions. More importantly, we formulate the Row and Column constraint module (RC constraint module), which shares and constrains features across entire rows and columns of the feature maps so that, as far as possible, only one type is predicted within each room, making the segmentation boundaries between different rooms more regular and cleaner. Extensive experiments on three benchmark datasets validate that our framework substantially outperforms other state-of-the-art approaches in terms of the FWIoU, mACC, and mIoU metrics.
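The RC constraint module is described above only at a high level; the following is a minimal sketch of one way to share features across entire rows and columns of a feature map. The mean-pooling and 1x1-convolution fusion used here are illustrative assumptions, not the authors' exact design.

```python
# Minimal sketch of a row/column feature-sharing block in the spirit of the
# RC constraint module. Pooling and fusion choices are assumptions.
import torch
import torch.nn as nn

class RowColumnContext(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # 1x1 convolution fuses local features with their row/column context.
        self.fuse = nn.Conv2d(3 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W) feature map from the segmentation backbone.
        row_ctx = x.mean(dim=3, keepdim=True).expand_as(x)  # share along each row
        col_ctx = x.mean(dim=2, keepdim=True).expand_as(x)  # share along each column
        return self.fuse(torch.cat([x, row_ctx, col_ctx], dim=1))

# Usage: refine backbone features before the per-pixel room-type classifier.
feats = torch.randn(2, 64, 128, 128)
refined = RowColumnContext(64)(feats)   # shape (2, 64, 128, 128)
```

In this sketch, every position is concatenated with the average feature of its full row and full column before fusion, which is one simple way to encourage consistent room-type predictions along horizontal and vertical extents.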

Open Access Research Article Issue
Joint specular highlight detection and removal in single images via Unet-Transformer
Computational Visual Media 2023, 9 (1): 141-154
Published: 18 October 2022

Specular highlight detection and removal is a fundamental problem in computer vision and image processing. In this paper, we present an efficient end-to-end deep learning model for automatically detecting and removing specular highlights in a single image. In particular, an encoder–decoder network is utilized to detect specular highlights, and a novel Unet-Transformer network then performs highlight removal; within the Unet architecture, we append transformer modules in place of plain feature maps. We also introduce the highlight detection result as a mask to guide the removal task, so the two networks can be jointly trained in an effective manner. Thanks to the hierarchical and global properties of the transformer mechanism, our framework is able to establish relationships between consecutive self-attention layers, making it possible to directly model the mapping between the diffuse area and the specular highlight area and to reduce indeterminacy within areas containing strong specular highlight reflection. Experiments on a public benchmark and on real-world images demonstrate that our approach outperforms state-of-the-art methods for both highlight detection and removal tasks.
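The abstract describes the Unet-Transformer only at a high level; below is a minimal sketch of applying a transformer block to flattened Unet bottleneck features and of using the detected highlight mask to guide removal by concatenation. The layer sizes, the bottleneck placement, and the concatenation-based mask guidance are assumptions for illustration, not the published architecture.

```python
# Sketch: global self-attention over flattened Unet bottleneck features, plus
# mask-guided input for the removal network. Details are assumptions.
import torch
import torch.nn as nn

class BottleneckTransformer(nn.Module):
    def __init__(self, channels: int, heads: int = 4, layers: int = 2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=channels, nhead=heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=layers)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (N, C, H, W) bottleneck features -> (N, H*W, C) token sequence.
        n, c, h, w = feat.shape
        tokens = feat.flatten(2).transpose(1, 2)
        tokens = self.encoder(tokens)            # self-attention over all positions
        return tokens.transpose(1, 2).reshape(n, c, h, w)

# Usage: refine 64-channel bottleneck features, then form the mask-guided input
# by stacking the predicted highlight mask with the RGB image.
feat = torch.randn(1, 64, 16, 16)
refined = BottleneckTransformer(64)(feat)        # (1, 64, 16, 16)
image = torch.randn(1, 3, 256, 256)
mask = torch.rand(1, 1, 256, 256)                # predicted highlight probability map
removal_input = torch.cat([image, mask], dim=1)  # (1, 4, 256, 256)
```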

Regular Paper Issue
Fast and Error-Bounded Space-Variant Bilateral Filtering
Journal of Computer Science and Technology 2019, 34 (3): 550-568
Published: 10 May 2019

The traditional space-invariant isotropic kernel utilized by a bilateral filter (BF) frequently leads to blurry edges and gradient-reversal artifacts because of the large number of outliers in the local averaging window. However, efficiently and accurately estimating space-variant kernels that adapt to image structures, and quickly realizing the corresponding space-variant bilateral filtering, are challenging problems. To address them, we present a space-variant BF (SVBF) together with a linear-time, error-bounded acceleration method. First, we accurately estimate space-variant anisotropic kernels that vary with image structures in linear time using the structure tensor and a minimum spanning tree. Second, we perform SVBF in linear time using two error-bounded approximation methods, namely low-rank tensor approximation via higher-order singular value decomposition and exponential sum approximation. The proposed SVBF therefore efficiently achieves good edge-preserving results. We validate its advantages in applications including image denoising, image enhancement, and image focus editing. Experimental results demonstrate that our fast and error-bounded SVBF is superior to state-of-the-art methods.
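As a point of reference for the filter definition, here is a brute-force sketch of a bilateral filter whose spatial sigma varies per pixel. It uses a per-pixel isotropic sigma map rather than the paper's anisotropic kernels, and it does not reproduce the linear-time HOSVD or exponential-sum acceleration; it only illustrates what "space-variant" means for the weights.

```python
# Brute-force space-variant bilateral filter: the spatial sigma is a per-pixel
# map instead of a single constant. O(radius^2) work per pixel; reference only.
import numpy as np

def space_variant_bilateral(img, sigma_s_map, sigma_r, radius=5):
    """img: 2D grayscale array; sigma_s_map: per-pixel spatial sigma."""
    h, w = img.shape
    pad = np.pad(img, radius, mode="reflect")
    out = np.zeros_like(img, dtype=np.float64)
    offsets = range(-radius, radius + 1)
    for y in range(h):
        for x in range(w):
            center = img[y, x]
            ss = sigma_s_map[y, x]
            acc = wsum = 0.0
            for dy in offsets:
                for dx in offsets:
                    v = pad[y + radius + dy, x + radius + dx]
                    w_s = np.exp(-(dx * dx + dy * dy) / (2 * ss * ss))       # spatial term
                    w_r = np.exp(-((v - center) ** 2) / (2 * sigma_r ** 2))  # range term
                    acc += w_s * w_r * v
                    wsum += w_s * w_r
            out[y, x] = acc / wsum
    return out

# Usage: the sigma map would typically shrink near edges (e.g., derived from a
# structure tensor) and grow in flat regions; here it is constant for brevity.
img = np.random.rand(32, 32)
sigma_s = np.full_like(img, 2.0)
smoothed = space_variant_bilateral(img, sigma_s, sigma_r=0.1, radius=3)
```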
