Open Access

HW/SW Co-optimization for Stencil Computation: Beginning with a Customizable Core

Yanhua Li, Youhui Zhang, Weiming Zheng
Department of Computer Science, Tsinghua University, Beijing 100084, China.

Abstract

Energy efficiency is one of the most important issues in High Performance Computing (HPC) today. A heterogeneous HPC platform with energy-efficient customizable cores (used as application-specific accelerators) is considered one of the promising solutions for meeting ever-increasing computing needs and overcoming power density limitations. In this paper, we focus on using customizable processor cores to optimize typical stencil computations, the kernel of many high-performance applications. We develop a series of effective software/hardware co-optimization strategies that exploit instruction-level and memory-computation parallelism and decrease energy consumption. These optimizations include loop tiling, prefetching, cache customization, Single Instruction Multiple Data (SIMD), and Direct Memory Access (DMA), as well as the necessary ISA extensions. Detailed power-efficiency tests evaluate the effect of all these optimizations comprehensively. Combined, the optimizations improve application performance by 341% while decreasing energy consumption by 35%; a preliminary comparison with x86, GPU, and FPGA platforms also shows that the design can achieve an order of magnitude higher performance efficiency. We believe this work can help in understanding the sources of inefficiency in general-purpose chips and can serve as a starting point for customizing an energy-efficient Chip Multiprocessor (CMP) for further improvement.
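The abstract names loop tiling as one of the software optimizations but does not show a kernel. As a rough illustration only, the sketch below applies tiling to one sweep of a 2D 5-point stencil in plain Python. The stencil shape, the 0.2 coefficient, the tile size, and the function names `make_grid`, `stencil_naive`, and `stencil_tiled` are all assumptions made for this example; this is not the authors' implementation, which targets a customizable core rather than an interpreter.

```python
def make_grid(n, m):
    """Build a small deterministic n x m test grid (hypothetical helper)."""
    return [[float((i * 7 + j * 3) % 11) for j in range(m)] for i in range(n)]


def stencil_naive(grid):
    """One sweep of a 5-point stencil; boundary cells are copied unchanged."""
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            out[i][j] = 0.2 * (grid[i][j] + grid[i - 1][j] + grid[i + 1][j]
                               + grid[i][j - 1] + grid[i][j + 1])
    return out


def stencil_tiled(grid, tile=4):
    """Same sweep, but the interior is visited tile by tile.

    Each tile touches a small block of rows and columns, so on real
    hardware its working set is more likely to stay in cache or local
    memory. The tile size of 4 is an arbitrary assumption.
    """
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for ii in range(1, n - 1, tile):          # tile origin, rows
        for jj in range(1, m - 1, tile):      # tile origin, columns
            # Clamp the tile so it never reaches the boundary cells.
            for i in range(ii, min(ii + tile, n - 1)):
                for j in range(jj, min(jj + tile, m - 1)):
                    out[i][j] = 0.2 * (grid[i][j] + grid[i - 1][j] + grid[i + 1][j]
                                       + grid[i][j - 1] + grid[i][j + 1])
    return out
```

Because both functions read from the input grid and write to a separate output, tiling only reorders independent updates, so `stencil_tiled(g, tile=t)` should match `stencil_naive(g)` for any positive `t`. In Python the tiled version is not expected to run faster; the point is the loop structure, which is what a compiler or hand-tuned kernel on a cache-constrained core would exploit.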

Tsinghua Science and Technology
Pages 570-580

Cite this article:
Li Y, Zhang Y, Zheng W. HW/SW Co-optimization for Stencil Computation: Beginning with a Customizable Core. Tsinghua Science and Technology, 2016, 21(5): 570-580. https://doi.org/10.1109/TST.2016.7590326

Views: 940
Downloads: 46
Crossref: 4
Web of Science: N/A
Scopus: 4
CSCD: 0

Received: 07 September 2015
Accepted: 03 February 2016
Published: 18 October 2016
© The author(s) 2016