Scholar - SciOpen

Online testing is critical to ensuring reliable operations of the next generation of supercomputers based on a kilo-core network-on-chip (NoC) interconnection fabric. We present a parallel software-based self-testing (SBST) solution that makes use of the bounded model checking (BMC) technique to generate test sequences and parallel packets. In this method, the parallel SBST with BMC derives the leading sequence for each router’s internal function and detects all functionally-testable faults related to the function. A Monte-Carlo simulation algorithm is then used to search for the approximately optimum configuration of the parallel packets, which guarantees the test quality and minimizes the test cost. Finally, a multi-threading technology is used to ensure that the Monte-Carlo simulation can reach the approximately optimum configuration in a large random space and reduce the generating time of the parallel test. Experimental results show that the proposed method achieves a high fault coverage with a reduced test overhead. Moreover, by performing online testing in the functional mode with SBST, it effectively avoids the over-testing problem caused by functionally untestable turns in kilo-core NoCs.

Open Access Issue

Loop Subgraph-Level Greedy Mapping Algorithm for Grid Coarse-Grained Reconfigurable Array

Naijin Chen, Fei Cheng, Chenghao Han, Jianhui Jiang, Xiaoqing Wen

Tsinghua Science and Technology 2023, 28 (2): 330-343

Published: 29 September 2022

Abstract

PDF (6.8 MB)

Download citation

GB/T 7714-2015

EndNote(RIS)

BibTeX

NoteExpress

Refworks

Collect Collected

Downloads：129

To solve the problem of grid coarse-grained reconfigurable array task mapping under multiple constraints, we propose a Loop Subgraph-Level Greedy Mapping (LSLGM) algorithm using parallelism and processing element fragmentation. Under the constraint of a reconfigurable array, the LSLGM algorithm schedules node from a ready queue to the current reconfigurable cell array block. After mapping a node, its successor’s indegree value will be dynamically updated. If its successor’s indegree is zero, it will be directly scheduled to the ready queue; otherwise, the predecessor must be dynamically checked. If the predecessor cannot be mapped, it will be scheduled to a blocking queue. To dynamically adjust the ready node scheduling order, the scheduling function is constructed by exploiting factors, such as node number, node level, and node dependency. Compared with the loop subgraph-level mapping algorithm, experimental results show that the total cycles of the LSLGM algorithm decreases by an average of 33.0 $%$ ( ${PEA}_{4 \times 4}$ ) and 33.9 $%$ ( ${PEA}_{7 \times 7}$ ). Compared with the epimorphism map algorithm, the total cycles of the LSLGM algorithm decrease by an average of 38.1 $%$ ( ${PEA}_{4 \times 4}$ ) and 39.0 $%$ ( ${PEA}_{7 \times 7}$ ). The feasibility of LSLGM is verified.

Open Access Issue

Accurate Reliability Analysis Methods for Approximate Computing Circuits

Zhen Wang, Guofa Zhang, Jing Ye, Jianhui Jiang, Fengyong Li, Yong Wang

Tsinghua Science and Technology 2022, 27 (4): 729-740

Published: 09 December 2021

Abstract

PDF (581.7 KB)

Download citation

GB/T 7714-2015

EndNote(RIS)

BibTeX

NoteExpress

Refworks

Collect Collected

Downloads：55

In recent years, Approximate Computing Circuits (ACCs) have been widely used in applications with intrinsic tolerance to errors. With the increased availability of approximate computing circuit approaches, reliability analysis methods for assessing their fault vulnerability have become highly necessary. In this study, two accurate reliability evaluation methods for approximate computing circuits are proposed. The reliability of approximate computing circuits is calculated on the basis of the iterative Probabilistic Transfer Matrix (PTM) model. During the calculation, the correlation coefficients are derived and combined to deal with the correlation problem caused by fanout reconvergence. The accuracy and scalability of the two methods are verified using three sets of approximate computing circuit instances and more circuits in EvoApprox8b, which is an approximate computing circuit open source library. Experimental results show that relative to the Monte Carlo simulation, the two methods achieve average error rates of 0.46% and 1.29% and time overheads of 0.002% and 0.1%. Different from the existing approaches to reliability estimation for approximate computing circuits based on the original PTM model, the proposed methods reduce the space overheads by nearly 50% and achieve time overheads of 1.78% and 2.19%.

Open Access Issue

Efficient Scheduling Mapping Algorithm for Row Parallel Coarse-Grained Reconfigurable Architecture

Naijin Chen, Zhen Wang, Ruixiang He, Jianhui Jiang, Fei Cheng, Chenghao Han

Tsinghua Science and Technology 2021, 26 (5): 724-735

Published: 20 April 2021

Abstract

PDF (923.9 KB)

Download citation

GB/T 7714-2015

EndNote(RIS)

BibTeX

NoteExpress

Refworks

Collect Collected

Downloads：51

Row Parallel Coarse-Grained Reconfigurable Architecture (RPCGRA) has the advantages of maximum parallelism and programmable flexibility. Designing an efficient algorithm to map the diverse applications onto RPCGRA is difficult due to a number of RPCGRA hardware constraints. To solve this problem, the nodes of the data flow graph must be partitioned and scheduled onto the RPCGRA. In this paper, we present a Depth-First Greedy Mapping (DFGM) algorithm that simultaneously considers the communication costs and the use times of the Reconfigurable Cell Array (RCA). Compared with level breadth mapping, the performance of DFGM is better. The percentage of maximum improvement in the use times of RCA is 33% and the percentage of maximum improvement in non-original input and output times is 64.4% (Given Discrete Cosine Transfor 8 (DCT8), and the area of reconfigurable processing unit is 56). Compared with level-based depth mapping, DFGM also obtains the lowest averages of use times of RCA, non-original input and output times, and the reconfigurable time.

Open Access Issue

Uniform Non-Bernoulli Sequences Oriented Locating Method for Reliability-Critical Gates

Jie Xiao, Zhanhui Shi, Weidong Zhu, Jianhui Jiang, Qianwei Zhou, Jungang Lou, Yujiao Huang, Qiou Ji, Ziwen Sun

Tsinghua Science and Technology 2021, 26 (1): 24-35

Published: 19 June 2020

Abstract

PDF (2.4 MB)

Download citation

GB/T 7714-2015

EndNote(RIS)

BibTeX

NoteExpress

Refworks

Collect Collected

Downloads：32

Hardening reliability-critical gates in a circuit is an important step to improve the circuit reliability at a low cost. However, accurately locating the reliability-critical gates is a key prerequisite for the efficient implementation of the hardening operation. In this paper, a probabilistic-based calculation method developed for locating the reliability-critical gates in a circuit is described. The proposed method is based on the generation of input vectors and the sampling of reliability-critical gates using uniform non-Bernoulli sequences, and the criticality of the gate reliability is measured by combining the structure information of the circuit itself. Both the accuracy and the efficiency of the proposed method have been illustrated by various simulations on benchmark circuits. The results show that the proposed method has an efficient performance in locating accuracy and algorithm runtime.

Open Access Issue

A Multi-Objective Optimization Method of Initial Virtual Machine Fault-Tolerant Placement for Star Topological Data Centers of Cloud Systems

Wei Zhang, Xiao Chen, Jianhui Jiang

Tsinghua Science and Technology 2021, 26 (1): 95-111

Published: 19 June 2020

Abstract

PDF (1.4 MB)

Download citation

GB/T 7714-2015

EndNote(RIS)

BibTeX

NoteExpress

Refworks

Collect Collected

Downloads：49

Virtualization is the most important technology in the unified resource layer of cloud computing systems. Static placement and dynamic management are two types of Virtual Machine (VM) management methods. VM dynamic management is based on the structure of the initial VM placement, and this initial structure will affect the efficiency of VM dynamic management. When a VM fails, cloud applications deployed on the faulty VM will crash if fault tolerance is not considered. In this study, a model of initial VM fault-tolerant placement for star topological data centers of cloud systems is built on the basis of multiple factors, including the service-level agreement violation rate, resource remaining rate, power consumption rate, failure rate, and fault tolerance cost. Then, a heuristic ant colony algorithm is proposed to solve the model. The service-providing VMs are placed by the ant colony algorithms, and the redundant VMs are placed by the conventional heuristic algorithms. The experimental results obtained from the simulation, real cluster, and fault injection experiments show that the proposed method can achieve better VM fault-tolerant placement solution than that of the traditional first fit or best fit descending method.

Regular Paper Issue

A Locating Method for Reliability-Critical Gates with a Parallel-Structured Genetic Algorithm

Jie Xiao, Zhan-Hui Shi, Jian-Hui Jiang, Xu-Hua Yang, Yu-Jiao Huang, Hai-Gen Hu

Journal of Computer Science and Technology 2019, 34 (5): 1136-1151

Published: 06 September 2019

Abstract

Download citation

GB/T 7714-2015

EndNote(RIS)

BibTeX

NoteExpress

Refworks

Collect Collected

The reliability allowance of circuits tends to decrease with the increase of circuit integration and the application of new technology and materials, and the hardening strategy oriented toward gates is an effective technology for improving the circuit reliability of the current situations. Therefore, a parallel-structured genetic algorithm (GA), PGA, is proposed in this paper to locate reliability-critical gates to successfully perform targeted hardening. Firstly, we design a binary coding method for reliability-critical gates and build an ordered initial population consisting of dominant individuals to improve the quality of the initial population. Secondly, we construct an embedded parallel operation loop for directional crossover and directional mutation to compensate for the deficiency of the poor local search of the GA. Thirdly, for combination with a diversity protection strategy for the population, we design an elitism retention based selection method to boost the convergence speed and avoid being trapped by a local optimum. Finally, we present an ordered identification method oriented toward reliability-critical gates using a scoring mechanism to retain the potential optimal solutions in each round to improve the robustness of the proposed locating method. The simulation results on benchmark circuits show that the proposed method PGA is an efficient locating method for reliability-critical gates in terms of accuracy and convergence speed.

total 7