Survey Issue
A Survey on 360° Images and Videos in Mixed Reality: Algorithms and Applications
Journal of Computer Science and Technology 2023, 38 (3): 473-491
Published: 30 May 2023

Mixed reality technologies provide real-time and immersive experiences, which bring tremendous opportunities in entertainment, education, and enriched experiences that are not directly accessible owing to safety or cost. Research in this field has been in the spotlight in the last few years as the metaverse went viral. The recently emerging omnidirectional video streams, i.e., 360° videos, provide an affordable way to capture and present dynamic real-world scenes. In the last decade, fueled by the rapid development of artificial intelligence and computational photography technologies, research interest in mixed reality systems that use 360° videos to deliver richer and more realistic experiences has increased dramatically, with the aim of unlocking the true potential of the metaverse. In this survey, we cover recent research aimed at addressing the above issues in 360° image and video processing technologies and applications for mixed reality. The survey summarizes the contributions of recent research and describes potential future research directions for 360° media in the field of mixed reality.

Open Access Research Article Issue
Focusing on your subject: Deep subject-aware image composition recommendation networks
Computational Visual Media 2023, 9 (1): 87-107
Published: 18 October 2022
Downloads:54

Photo composition is one of the most important factors in the aesthetics of photographs. As a popular application, composition recommendation for a photo focusing on a specific subject has been ignored by recent deep-learning-based composition recommendation approaches. In this paper, we propose a subject-aware image composition recommendation method, SAC-Net, which takes an RGB image and a binary subject window mask as input, and returns good compositions as crops containing the subject. Our model first determines candidate scores for all possible coarse cropping windows. The crops with high candidate scores are selected and further refined by regressing their corner points to generate the output recommended cropping windows. The final scores of the refined crops are predicted by a final score regression module. Unlike existing methods that need to preset several cropping windows, our network is able to automatically regress cropping windows with arbitrary aspect ratios and sizes. We propose novel stability losses for maximizing smoothness when changing cropping windows along with view changes. Experimental results show that our method outperforms state-of-the-art methods not only on the subject-aware image composition recommendation task, but also for general-purpose composition recommendation. We have also designed a multi-stage labeling scheme so that a large number of ranked pairs can be produced economically. We use this scheme to propose the first subject-aware composition dataset, SACD, which contains 2777 images and more than 5 million composition ranked pairs. The SACD dataset is publicly available at https://cg.cs.tsinghua.edu.cn/SACD/.

Open Access Research Article Issue
Coherent video generation for multiple hand-held cameras with dynamic foreground
Computational Visual Media 2020, 6 (3): 291-306
Published: 03 September 2020
Downloads:25

For many social events such as public performances, multiple hand-held cameras may capture the same event. This footage is often collected by amateur cinematographers who typically have little control over the scene and may not pay close attention to the camera. For these reasons, each individually captured video may fail to cover the whole time of the event, or may lose track of interesting foreground content such as a performer. We introduce a new algorithm that can synthesize a single smooth video sequence of moving foreground objects captured by multiple hand-held cameras. This allows later viewers to gain a cohesive narrative experience that can transition between different cameras, even though the input footage may be less than ideal. We first introduce a graph-based method for selecting a good transition route. This allows us to automatically select good cut points for the hand-held videos, so that smooth transitions can be created between the resulting video shots. We also propose a method to synthesize a smooth photorealistic transition video between each pair of hand-held cameras, which preserves dynamic foreground content during this transition. Our experiments demonstrate that our method outperforms previous state-of-the-art methods, which struggle to preserve dynamic foreground content.
