Open Access Short Communication Issue
Can attention enable MLPs to catch up with CNNs?
Computational Visual Media 2021, 7 (3): 283-288
Published: 27 July 2021

Open Access Research Article Issue
PCT: Point cloud transformer
Computational Visual Media 2021, 7 (2): 187-199
Published: 10 April 2021

The irregular domain and lack of ordering make it challenging to design deep neural networks for point cloud processing. This paper presents a novel framework named Point Cloud Transformer (PCT) for point cloud learning. PCT is based on the Transformer, which has achieved huge success in natural language processing and displays great potential in image processing. It is inherently permutation invariant for processing a sequence of points, making it well-suited for point cloud learning. To better capture local context within the point cloud, we enhance input embedding with the support of farthest point sampling and nearest neighbor search. Extensive experiments demonstrate that PCT achieves state-of-the-art performance on shape classification, part segmentation, semantic segmentation, and normal estimation tasks.
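The abstract names farthest point sampling and nearest neighbor search without giving details. As a rough illustration only, the sketch below shows the generic textbook versions of both steps in plain Python. It is not the PCT implementation: the function names, the brute-force distance computation, and the choice of the first point as the starting sample are all assumptions made for this example.

```python
import math


def farthest_point_sampling(points, k, start=0):
    """Return k indices, each new one being the point farthest from those already chosen."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    chosen = [start]  # assumption: sampling starts from an arbitrary fixed index
    # min_dist[i] is the distance from point i to its nearest chosen point so far
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(chosen) < k:
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return chosen


def k_nearest_neighbors(points, center_idx, k):
    """Return indices of the k points closest to points[center_idx], the center included."""
    center = points[center_idx]
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], center))
    return order[:k]


def local_groups(points, n_samples, k):
    """Map each sampled center index to the indices of its k nearest neighbors."""
    centers = farthest_point_sampling(points, n_samples)
    return {c: k_nearest_neighbors(points, c, k) for c in centers}


if __name__ == "__main__":
    cloud = [(0, 0, 0), (1, 0, 0), (0.1, 0, 0), (10, 0, 0), (10.1, 0, 0)]
    print(local_groups(cloud, n_samples=2, k=2))
```

Each sampled center together with its neighbors forms a local patch of the kind that could feed a neighborhood-aware embedding. A practical system would likely replace the brute-force search with a spatial index or batched tensor operations.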

Open Access Research Article Issue
Comfort-driven disparity adjustment for stereoscopic video
Computational Visual Media 2016, 2 (1): 3-17
Published: 01 March 2016

Pixel disparity—the offset of corresponding pixels between left and right views—is a crucial parameter in stereoscopic three-dimensional (S3D) video, as it determines the depth perceived by the human visual system (HVS). Unsuitable pixel disparity distribution throughout an S3D video may lead to visual discomfort. We present a unified and extensible stereoscopic video disparity adjustment framework which improves the viewing experience for an S3D video by keeping the perceived 3D appearance as unchanged as possible while minimizing discomfort. We first analyse disparity and motion attributes of S3D video in general, then derive a wide-ranging visual discomfort metric from existing perceptual comfort models. An objective function based on this metric is used as the basis of a hierarchical optimisation method to find a disparity mapping function for each input video frame. Warping-based disparity manipulation is then applied to the input video to generate the output video, using the desired disparity mappings as constraints. Our comfort metric takes into account disparity range, motion, and stereoscopic window violation; the framework could easily be extended to use further visual comfort models. We demonstrate the power of our approach using both animated cartoons and real S3D videos.
