Scholar - SciOpen

Generating selfie images on the surface of a celestial body poses several challenges, including the position of the robotic arm, camera field of view, and limited shooting time. To address these challenges, the PCMIS (3D Point Cloud Matching Based Image Stitching) algorithm is designed, along with a corresponding shooting plan. This algorithm establishes a correspondence between depth and color information, enabling the generation of stitching views under any given view parameter. Furthermore, the algorithm is accelerated using GPU processing, resulting in a significant reduction in stitching time. The algorithm is successfully applied to generate selfie images for the Chang’e-5 mission.

Open Access Research Article Issue

CTSN: Predicting cloth deformation for skeleton-based characters with a two-stream skinning network

Yudi Li, Min Tang, Yun Yang, Ruofeng Tong, Shuangcai Yang, Yao Li, Bailin An, Qilong Kou

Computational Visual Media 2024, 10(3): 471-485

Published: 19 April 2024

Abstract

We present a novel learning method using a two-stream network to predict cloth deformation for skeleton-based characters. The characters processed in our approach are not limited to humans, and can be other targets with skeleton-based representations such as fish or pets. We use a novel network architecture, which consists of skeleton-based and mesh-based residual networks, to learn the coarse features and wrinkle features forming the overall residual from the template cloth mesh. Our network may be used to predict the deformation for loose or tight-fitting clothing. The memory footprint of our network is low, thereby resulting in reduced computational requirements. In practice, a prediction for a single cloth mesh for a skeleton-based character takes about 7 ms on an NVIDIA GeForce RTX 3090 GPU. Compared to prior methods, our network can generate finer deformation results with details and wrinkles.
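The residual formulation in this abstract can be illustrated with a minimal sketch. This is not the authors' network: the function name `apply_residuals` and the toy inputs are hypothetical, and in the paper the coarse and wrinkle residuals would be predicted by the skeleton-based and mesh-based streams rather than supplied by hand.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def apply_residuals(
    template: List[Vec3], coarse: List[Vec3], wrinkle: List[Vec3]
) -> List[Vec3]:
    """Return template + coarse + wrinkle for every vertex.

    `coarse` and `wrinkle` stand in for the outputs of the two streams
    (assumption: both are per-vertex 3D offsets from the template mesh).
    """
    if not (len(template) == len(coarse) == len(wrinkle)):
        raise ValueError("all inputs must have one entry per vertex")
    return [
        (t[0] + c[0] + w[0], t[1] + c[1] + w[1], t[2] + c[2] + w[2])
        for t, c, w in zip(template, coarse, wrinkle)
    ]
```

Keeping the two residuals separate mirrors the split described in the abstract: one term carries the overall shape change, the other carries fine detail.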

Open Access Research Article Issue

Sphere Face Model: A 3D morphable model with hypersphere manifold latent space using joint 2D/3D training

Diqiong Jiang, Yiwei Jin, Fang-Lue Zhang, Zhe Zhu, Yun Zhang, Ruofeng Tong, Min Tang

Computational Visual Media 2023, 9(2): 279-296

Published: 03 January 2023

Abstract

3D morphable models (3DMMs) are generative models for face shape and appearance. Recent works impose face recognition constraints on 3DMM shape parameters so that the face shapes of the same person remain consistent. However, the shape parameters of traditional 3DMMs follow a multivariate Gaussian distribution. In contrast, the identity embeddings follow a hypersphere distribution, and this conflict makes it challenging for face reconstruction models to preserve faithfulness and shape consistency simultaneously. In other words, the recognition loss and the reconstruction loss cannot decrease jointly because their distributions conflict. To address this issue, we propose the Sphere Face Model (SFM), a novel 3DMM for monocular face reconstruction, preserving both shape fidelity and identity consistency. The core of our SFM is the basis matrix, which can be used to reconstruct 3D face shapes. The basis matrix is learned by adopting a two-stage training approach where 3D and 2D training data are used in the first and second stages, respectively. We design a novel loss to resolve the distribution mismatch, enforcing that the shape parameters have a hyperspherical distribution. Our model accepts 2D and 3D data for constructing the sphere face models. Extensive experiments show that SFM has high representation ability and clustering performance in its shape parameter space. Moreover, it produces high-fidelity face shapes consistently in challenging conditions in monocular face reconstruction. The code will be released at https://github.com/a686432/SIR
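To make the distribution mismatch concrete, the sketch below projects a parameter vector onto a hypersphere and measures how far its norm is from the sphere radius. The abstract does not give the paper's loss, so `norm_penalty` is an assumed stand-in, and both function names are hypothetical.

```python
import math
from typing import List, Sequence


def project_to_hypersphere(params: Sequence[float], radius: float = 1.0) -> List[float]:
    """Rescale `params` so that its Euclidean norm equals `radius`."""
    norm = math.sqrt(sum(p * p for p in params))
    if norm == 0.0:
        raise ValueError("cannot project the zero vector")
    return [radius * p / norm for p in params]


def norm_penalty(params: Sequence[float], radius: float = 1.0) -> float:
    """Squared gap between the vector norm and `radius`.

    Illustrative only: zero exactly when `params` lies on the hypersphere.
    """
    norm = math.sqrt(sum(p * p for p in params))
    return (norm - radius) ** 2
```

Gaussian-distributed parameters have norms that vary from sample to sample, so a penalty of this kind is nonzero for most of them, while unit-normalized identity embeddings incur none.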

Regular Paper Issue

BADF: Bounding Volume Hierarchies Centric Adaptive Distance Field Computation for Deformable Objects on GPUs

Xiao-Rui Chen, Min Tang, Cheng Li, Dinesh Manocha, Ruo-Feng Tong

Journal of Computer Science and Technology 2022, 37(3): 731-740

Published: 31 May 2022

Abstract

We present a novel algorithm, BADF (Bounding Volume Hierarchy Based Adaptive Distance Fields), for accelerating the construction of ADFs (adaptive distance fields) of rigid and deformable models on graphics processing units (GPUs). Our approach is based on constructing a bounding volume hierarchy (BVH), and we use that hierarchy to generate an octree-based ADF. We exploit the coherence between successive frames and sort the grid points of the octree to accelerate the computation. Our approach is applicable to rigid and deformable models. Our GPU-based algorithm is about 20x–50x faster than current mainstream CPU-based (central processing unit based) algorithms. Our BADF algorithm can construct the distance fields for deformable models with 60k triangles at interactive rates on an NVIDIA GeForce GTX 1060. Moreover, we observe a 3x speedup over prior GPU-based ADF algorithms.

Open Access Research Article Issue

3D corrective nose reconstruction from a single image

Yanlong Tang, Yun Zhang, Xiaoguang Han, Fang-Lue Zhang, Yu-Kun Lai, Ruofeng Tong

Computational Visual Media 2022, 8(2): 225-237

Published: 06 December 2021

Abstract

There is a steadily growing range of applications that can benefit from facial reconstruction techniques, leading to an increasing demand for reconstruction of high-quality 3D face models. While it is an important expressive part of the human face, the nose has received less attention than other expressive regions in the face reconstruction literature. When applying existing reconstruction methods to facial images, the reconstructed nose models are often inconsistent with the desired shape and expression. In this paper, we propose a coarse-to-fine 3D nose reconstruction and correction pipeline to build a nose model from a single image, where 3D and 2D nose curve correspondences are adaptively updated and refined. We first correct the reconstruction result coarsely using constraints of 3D-2D sparse landmark correspondences, and then heuristically update a dense 3D-2D curve correspondence based on the coarsely corrected result. A final refinement step is performed to correct the shape based on the updated 3D-2D dense curve constraints. Experimental results show the advantages of our method for 3D nose reconstruction over existing methods.

Open Access Research Article Issue

A three-stage real-time detector for traffic signs in large panoramas

Yizhi Song, Ruochen Fan, Sharon Huang, Zhe Zhu, Ruofeng Tong

Computational Visual Media 2019, 5(4): 403-416

Published: 04 September 2019

Abstract

Traffic sign detection is one of the key components in autonomous driving. Advanced autonomous vehicles armed with high quality sensors capture high definition images for further analysis. Detecting traffic signs, moving vehicles, and lanes is important for localization and decision making. Traffic signs, especially those that are far from the camera, are small, and so are challenging to traditional object detection methods. In this work, in order to reduce computational cost and improve detection performance, we split the large input images into small blocks and then recognize traffic signs in the blocks using another detection module. Therefore, this paper proposes a three-stage traffic sign detector, which connects a BlockNet with an RPN-RCNN detection network. BlockNet, which is composed of a set of CNN layers, is capable of performing block-level foreground detection, making inferences in less than 1 ms. Then, the RPN-RCNN two-stage detector is used to identify traffic sign objects in each block; it is trained on a derived dataset named TT100KPatch. Experiments show that our framework can achieve both state-of-the-art accuracy and recall; its fastest detection speed is 102 fps.
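The block-splitting and block-filtering steps described above can be sketched as follows. This is a simplified illustration, not the paper's pipeline: the function names, the non-overlapping tiling, and the score threshold are assumptions, and the per-block foreground scores (produced by BlockNet in the paper) are simply passed in by the caller.

```python
from typing import List, Sequence, Tuple

# (x0, y0, x1, y1) in pixels, with exclusive upper bounds
Block = Tuple[int, int, int, int]


def split_into_blocks(width: int, height: int, block: int) -> List[Block]:
    """Tile a width x height image into blocks, row by row.

    Blocks on the right and bottom edges are clipped to the image bounds.
    """
    if width <= 0 or height <= 0 or block <= 0:
        raise ValueError("width, height and block must be positive")
    blocks: List[Block] = []
    for y0 in range(0, height, block):
        for x0 in range(0, width, block):
            blocks.append((x0, y0, min(x0 + block, width), min(y0 + block, height)))
    return blocks


def select_foreground(
    blocks: Sequence[Block], scores: Sequence[float], threshold: float = 0.5
) -> List[Block]:
    """Keep the blocks whose foreground score reaches `threshold`."""
    if len(blocks) != len(scores):
        raise ValueError("need exactly one score per block")
    return [b for b, s in zip(blocks, scores) if s >= threshold]
```

Only the blocks returned by `select_foreground` would be handed to the more expensive second-stage detector, which is where the saving in computation comes from.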

Open Access Research Article Issue

Efficient and robust strain limiting and treatment of simultaneous collisions with semidefinite programming

Zhendong Wang, Tongtong Wang, Min Tang, Ruofeng Tong

Computational Visual Media 2016, 2(2): 119-130

Published: 07 April 2016

Abstract

We present an efficient and robust method which performs well for both strain limiting and treatment of simultaneous collisions. Our method formulates strain constraints and collision constraints as a series of linear matrix inequalities (LMIs) and linear polynomial inequalities (LPIs), and solves an optimization problem with standard convex semidefinite programming solvers. When performing strain limiting, our method acts on strain tensors to constrain the singular values of the deformation gradient matrix in a specified interval. Our method can be applied to both triangular surface meshes and tetrahedral volume meshes. Compared with prior strain limiting methods, our method converges much faster and guarantees triangle flipping does not occur when applied to a triangular mesh. When performing treatment of simultaneous collisions, our method eliminates all detected collisions during each iteration, leading to higher efficiency and faster convergence than prior collision treatment methods.
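The goal of strain limiting, keeping the singular values of the deformation gradient inside an interval, can be illustrated with a direct clamp on a single 2x2 matrix. This sketch deliberately swaps in a closed-form per-element clamp and is not the semidefinite programming formulation of the paper; the function name and the assumption of a non-singular input are illustrative.

```python
import math
from typing import List

Mat2 = List[List[float]]


def clamp_singular_values(F: Mat2, s_min: float, s_max: float) -> Mat2:
    """Clamp both singular values of a 2x2 matrix F into [s_min, s_max].

    With F = U S V^T, the result is F' = U S' V^T = F V diag(s'_i / s_i) V^T,
    where V holds the eigenvectors of F^T F. Assumes F is non-singular.
    """
    # Symmetric matrix C = F^T F with entries [[a, b], [b, c]].
    a = F[0][0] * F[0][0] + F[1][0] * F[1][0]
    b = F[0][0] * F[0][1] + F[1][0] * F[1][1]
    c = F[0][1] * F[0][1] + F[1][1] * F[1][1]
    # Rotation angle that diagonalises C.
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    cs, sn = math.cos(theta), math.sin(theta)
    eigvecs = [(cs, sn), (-sn, cs)]
    eigvals = [
        a * cs * cs + 2.0 * b * cs * sn + c * sn * sn,
        a * sn * sn - 2.0 * b * cs * sn + c * cs * cs,
    ]
    # M = sum_i (s'_i / s_i) v_i v_i^T
    M = [[0.0, 0.0], [0.0, 0.0]]
    for (vx, vy), lam in zip(eigvecs, eigvals):
        s = math.sqrt(max(lam, 0.0))  # singular value of F
        if s == 0.0:
            raise ValueError("F must be non-singular")
        ratio = min(max(s, s_min), s_max) / s
        M[0][0] += ratio * vx * vx
        M[0][1] += ratio * vx * vy
        M[1][0] += ratio * vy * vx
        M[1][1] += ratio * vy * vy
    # F' = F M
    return [
        [F[i][0] * M[0][j] + F[i][1] * M[1][j] for j in range(2)]
        for i in range(2)
    ]
```

For example, a 90-degree rotation combined with stretches of 2 and 0.5, `[[0.0, -0.5], [2.0, 0.0]]`, clamped to `[0.8, 1.2]` keeps its rotation while the stretches become 1.2 and 0.8. A clamp like this treats each element in isolation, whereas the abstract describes solving for all constraints jointly.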

Open Access Research Article Issue

GPU based real-time simulation of massive falling leaves

Chengyang Li, Jingye Qian, Ruofeng Tong, Jian Chang, Jianjun Zhang

Computational Visual Media 2015, 1(4): 351-358

Published: 14 November 2015

Abstract

As an important autumn feature, scenes with large numbers of falling leaves are common in movies and games. However, it is a challenge for computer graphics to simulate such scenes in an authentic and efficient manner. This paper proposes a GPU based approach for simulating the falling motion of many leaves in real time. Firstly, we use a motion-synthesis based method to analyze the falling motion of the leaves, which enables us to describe complex falling trajectories using low-dimensional features. Secondly, we transmit a primitive-motion trajectory dataset together with the low-dimensional features of the falling leaves to video memory, allowing us to execute the appropriate calculations on the GPU.
