Sort:
Open Access Issue
MetaOJ: A Massive Distributed Online Judge System
Tsinghua Science and Technology 2021, 26 (4): 548-557
Published: 04 January 2021
Downloads:43

Online Judge (OJ) systems are a basic and important component of computer education. Here, we present MetaOJ, an OJ system that can be used for holding massive programming tests online. MetaOJ is designed to create a distributed, fault-tolerant, and easy-to-scale OJ system from an existing ordinary OJ system by adding several interfaces into it and creating multiple instances of it. Our case on modifying the TUOJ system shows that the modification adds no more than 3% lines of code and the performance loss on a single OJ instance is no more than 12%. We also introduce mechanisms to integrate the system with cloud infrastructure to automate the deployment process. MetaOJ provides a solution for those OJ systems that are designed for a specific programming contest and are now facing performance bottlenecks.

Open Access Issue
Heterogeneous Parallel Algorithm Design and Performance Optimization for WENO on the Sunway TaihuLight Supercomputer
Tsinghua Science and Technology 2020, 25 (1): 56-67
Published: 22 July 2019
Downloads:31

A Weighted Essentially Non-Oscillatory scheme (WENO) is a solution to hyperbolic conservation laws, suitable for solving high-density fluid interface instability with strong intermittency. These problems have a large and complex flow structure. To fully utilize the computing power of High Performance Computing (HPC) systems, it is necessary to develop specific methodologies to optimize the performance of applications based on the particular system’s architecture. The Sunway TaihuLight supercomputer is currently ranked as the fastest supercomputer in the world. This article presents a heterogeneous parallel algorithm design and performance optimization of a high-order WENO on Sunway TaihuLight. We analyzed characteristics of kernel functions, and proposed an appropriate heterogeneous parallel model. We also figured out the best division strategy for computing tasks, and implemented the parallel algorithm on Sunway TaihuLight. By using access optimization, data dependency elimination, and vectorization optimization, our parallel algorithm can achieve up to 172× speedup on one single node, and additional 58× speedup on 64 nodes, with nearly linear scalability.

Regular Paper Issue
Cacheap: Portable and Collaborative I/O Optimization for Graph Processing
Journal of Computer Science and Technology 2019, 34 (3): 690-706
Published: 10 May 2019

Increasingly there is a need to process graphs that are larger than the available memory on today’s machines. Many systems have been developed with graph representations that are efficient and compact for out-of-core processing. A necessary task in these systems is memory management. This paper presents a system called Cacheap which automatically and efficiently manages the available memory to maximize the speed of graph processing, minimize the amount of disk access, and maximize the utilization of memory for graph data. It has a simple interface that can be easily adopted by existing graph engines. The paper describes the new system, uses it in recent graph engines, and demonstrates its integer factor improvements in the speed of large-scale graph processing.

Open Access Issue
Auxo: A Temporal Graph Management System
Big Data Mining and Analytics 2019, 2 (1): 58-71
Published: 19 November 2018
Downloads:24

As real-world graphs are often evolving over time, interest in analyzing the temporal behavior of graphs has grown. Herein, we propose Auxo, a novel temporal graph management system to support temporal graph analysis. It supports both efficient global and local queries with low space overhead. Auxo organizes temporal graph data in spatio-temporal chunks. A chunk spans a particular time interval and covers a set of vertices in a graph. We propose chunk layout and chunk splitting designs to achieve the desired efficiency and the abovementioned goals. First, by carefully choosing the time split policy, Auxo achieves linear complexity in both space usage and query time. Second, graph splitting further improves the worst-case query time, and reduces the performance variance introduced by splitting operations. Third, Auxo optimizes the data layout inside chunks, thereby significantly improving the performance of traverse-based graph queries. Experimental evaluation showed that Auxo achieved 2.9× to 12.1× improvement for global queries, and 1.7× to 2.7× improvement for local queries, as compared with state-of-the-art open-source solutions.

total 4