Open Access Issue
Discriminatively Constrained Semi-Supervised Multi-View Nonnegative Matrix Factorization with Graph Regularization
Big Data Mining and Analytics 2024, 7 (1): 55-74
Published: 25 December 2023

Nonnegative Matrix Factorization (NMF) is one of the most popular feature learning techniques in machine learning and pattern recognition. Because of its effectiveness, it has been widely used and studied in multi-view clustering tasks. This study proposes a general semi-supervised multi-view nonnegative matrix factorization algorithm. The algorithm incorporates discriminative and geometric information about the data to learn a better fused representation, and adopts a feature normalization strategy to align the different views. Two specific implementations of this algorithm are developed to validate the effectiveness of the proposed framework: Graph regularization based Discriminatively Constrained Multi-View Nonnegative Matrix Factorization (GDCMVNMF) and Extended Multi-View Constrained Nonnegative Matrix Factorization (ExMVCNMF). The intrinsic connection between these two implementations is discussed, and an optimization procedure based on multiplicative update rules is presented. Experiments on six datasets show that GDCMVNMF and ExMVCNMF outperform several representative unsupervised and semi-supervised multi-view NMF approaches.
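
For readers unfamiliar with multiplicative update rules, the sketch below may help. It shows only the classic single-view, unsupervised NMF updates for the squared Frobenius objective (the Lee-Seung rules), not GDCMVNMF or ExMVCNMF: the abstract does not give their update formulas, so the label constraints, graph regularizer, and view alignment are all omitted. The function names, the toy matrix `X`, and the parameters `rank`, `n_iter`, and `eps` are assumptions made for illustration. Plain Python lists are used to keep the example dependency-free.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    B_cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in B_cols] for row in A]


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def frobenius_error(X, W, H):
    """Squared Frobenius norm of the residual X - W H."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry))


def scale(M, num, den, eps):
    """Elementwise M * num / (den + eps); eps guards against division by zero."""
    return [[m * a / (b + eps) for m, a, b in zip(rm, ra, rb)]
            for rm, ra, rb in zip(M, num, den)]


def nmf_multiplicative(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Plain NMF (X ~ W H, with W, H >= 0) via multiplicative updates.

    Illustrative only: single view, no labels, no graph regularization.
    """
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    # Strictly positive initialization so the updates can move every entry.
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    errors = [frobenius_error(X, W, H)]
    for _ in range(n_iter):
        # H <- H * (W^T X) / (W^T W H)
        Wt = transpose(W)
        H = scale(H, matmul(Wt, X), matmul(matmul(Wt, W), H), eps)
        # W <- W * (X H^T) / (W H H^T)
        Ht = transpose(H)
        W = scale(W, matmul(X, Ht), matmul(W, matmul(H, Ht)), eps)
        errors.append(frobenius_error(X, W, H))
    return W, H, errors


# Toy nonnegative data matrix (4 samples x 3 features), made up for this sketch.
X = [[1.0, 2.0, 3.0],
     [2.0, 4.0, 6.0],
     [1.0, 0.0, 1.0],
     [3.0, 2.0, 5.0]]

W, H, errors = nmf_multiplicative(X, rank=2)
print("initial error:", errors[0])
print("final error:  ", errors[-1])
```

Because each update multiplies the current factor by a ratio of nonnegative quantities, W and H stay nonnegative without any explicit projection step, which is the main appeal of this family of rules. Regularized variants typically add further terms to the numerator and denominator of each ratio.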

Regular Paper Issue
SOCA-DOM: A Mobile System-on-Chip Array System for Analyzing Big Data on the Move
Journal of Computer Science and Technology 2022, 37 (6): 1271-1289
Published: 30 November 2022

Analyzing big data on the move has recently been booming. It requires hardware that is compact, low-power, lightweight, high-performance, and highly scalable, and management software that is flexible and consumes few hardware resources. To meet these requirements, we present a system named SOCA-DOM that encompasses a mobile system-on-chip array architecture and a two-tier “software-defined” resource manager named Chameleon. First, we design an Ethernet communication board to support an array of mobile system-on-chips. Second, we propose a two-tier software architecture for Chameleon to make it flexible. Third, we devise data, configuration, and control planes for Chameleon to make it “software-defined” and, in turn, consume hardware resources on demand. Fourth, we design an accurate synthetic metric that represents the computational power of a computing node. We employ 12 Apache Spark benchmarks to evaluate SOCA-DOM. Surprisingly, SOCA-DOM consumes up to 9.4x less CPU resources and 13.5x less memory than Mesos, an existing resource manager. In addition, we show that a 16-node SOCA-DOM consumes up to 4x less energy than two standard Xeon servers. Based on these results, we conclude that an array architecture with fine-grained hardware resources and a software-defined resource manager works well for analyzing big data on the move.
