Loop Subgraph-Level Greedy Mapping Algorithm for Grid Coarse-Grained Reconfigurable Array

Naijin Chen; Fei Cheng; Chenghao Han; Jianhui Jiang; Xiaoqing Wen

doi:10.26599/TST.2022.9010001

Tsinghua Science and Technology 2023, 28(2): 330-343 https://doi.org/10.26599/TST.2022.9010001

Open Access | Issue | Published: 29 September 2022

Loop Subgraph-Level Greedy Mapping Algorithm for Grid Coarse-Grained Reconfigurable Array

Show Author's Information Hide Author's Information Naijin Chen^¹(

), Fei Cheng^¹(

), Chenghao Han^¹, Jianhui Jiang^², Xiaoqing Wen^³

1School of Computer and Information Science, Anhui Polytechnic University, Wuhu 241000, China

2School of Software Engineering, Tongji University, Shanghai 201804, China

3Department of Computer Science and Networks, Kyushu Institute of Technology, Fukuoka 820-8502, Japan

Keywords:

scheduling, mapping, Grid Coarse-Grained Reconfigurable Array (GCGRA), loop subgraph

Cite this article:

Chen N, Cheng F, Han C, et al. Loop Subgraph-Level Greedy Mapping Algorithm for Grid Coarse-Grained Reconfigurable Array. Tsinghua Science and Technology, 2023, 28(2): 330-343. https://doi.org/10.26599/TST.2022.9010001

Download citation

EndNote(RIS)

BibTeX

742

Views

129

Downloads

Citations

Crossref

WoS

Scopus

CSCD

Abstract Full text About this article

Abstract

To solve the problem of grid coarse-grained reconfigurable array task mapping under multiple constraints, we propose a Loop Subgraph-Level Greedy Mapping (LSLGM) algorithm using parallelism and processing element fragmentation. Under the constraint of a reconfigurable array, the LSLGM algorithm schedules node from a ready queue to the current reconfigurable cell array block. After mapping a node, its successor’s indegree value will be dynamically updated. If its successor’s indegree is zero, it will be directly scheduled to the ready queue; otherwise, the predecessor must be dynamically checked. If the predecessor cannot be mapped, it will be scheduled to a blocking queue. To dynamically adjust the ready node scheduling order, the scheduling function is constructed by exploiting factors, such as node number, node level, and node dependency. Compared with the loop subgraph-level mapping algorithm, experimental results show that the total cycles of the LSLGM algorithm decreases by an average of 33.0 $%$ ( ${PEA}_{4 \times 4}$ ) and 33.9 $%$ ( ${PEA}_{7 \times 7}$ ). Compared with the epimorphism map algorithm, the total cycles of the LSLGM algorithm decrease by an average of 38.1 $%$ ( ${PEA}_{4 \times 4}$ ) and 39.0 $%$ ( ${PEA}_{7 \times 7}$ ). The feasibility of LSLGM is verified.

Full text

Abstract

Full text

Outline

About this article

Loop Subgraph-Level Greedy Mapping Algorithm for Grid Coarse-Grained Reconfigurable Array

Show Author's information Hide Author's Information Naijin Chen^¹(

), Fei Cheng^¹(

), Chenghao Han^¹, Jianhui Jiang^², Xiaoqing Wen^³

1School of Computer and Information Science, Anhui Polytechnic University, Wuhu 241000, China

2School of Software Engineering, Tongji University, Shanghai 201804, China

3Department of Computer Science and Networks, Kyushu Institute of Technology, Fukuoka 820-8502, Japan

Abstract

Keywords: scheduling, mapping, Grid Coarse-Grained Reconfigurable Array (GCGRA), loop subgraph

References(15)

[1]

M. Brandalero, L. Carro, A. C. S. Beck, and M. Shafique, Multi-target adaptive reconfigurable acceleration for low-power IoT processing, IEEE Trans. Comput., vol. 70, no. 1, pp. 83–98, 2021.

DOI Google Scholar

[2]

I. Bae, B. Harris, H. Min, and B. Egger, Auto-tuning CNNs for coarse-grained reconfigurable array-based accelerators, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 37, no. 11, pp. 2301–2310, 2018.

DOI Google Scholar

[3]

J. M. P. Cardoso, P. C. Diniz, and M. Weinhardt, Compiling for reconfigurable computing: A survey, ACM Comput. Surv., vol. 42, no. 4, p. 13, 2010.

DOI Google Scholar

[4]

G. Charitopoulos, D. N. Pnevmatikatos, and G. Gaydadjiev, MC-DeF: Creating customized CGRAs for dataflow applications, ACM Trans. Archit. Code Optim., vol. 18, no. 3, p. 26, 2021.

DOI Google Scholar

[5]

L. B. Liu, J. F. Zhu, Z. S. Li, Y. N. Lu, Y. D. Deng, J. Han, S. Y. Yin, and S. J. Wei, A survey of coarse-grained reconfigurable architecture and design: Taxonomy, challenges, and applications, ACM Comput. Surv., vol. 52, no. 6, p. 118, 2020.

DOI Google Scholar

[6]

N. J. Chen, Z. Wang, R. X. He, J. H. Jiang, F. Cheng, and C. H. Han, Efficient scheduling mapping algorithm for row parallel coarse-grained reconfigurable architecture, Tsinghua Science and Technology, vol. 26, no. 5, pp. 724–735, 2021.

DOI Google Scholar

[7]

N. J. Chen and J. H. Jiang, Mapping algorithm of coarse grained reconfigurable cell array for multi-branch tree data flow graph, (in Chinese), J. Comput. -Aided Des. Comput. Graphics, vol. 28, no. 7, pp. 1180–1187, 2016.

Google Scholar

[8]

N. J. Chen and Z. Y. Feng, Interconnect delay performance evaluation for non-crossing level and row operands parallel RCA, (in Chinese), J. Tianjin Univ. (Sci. Technol.), vol. 50, no. 4, pp. 429–436, 2017.

Google Scholar

[9]

M. Balasubramanian and A. Shrivastava, CRIMSON: Compute-intensive loop acceleration by randomized iterative modulo scheduling and optimized mapping on CGRAs, IEEE Trans. Comput. -Aided Des. Integr. Circuits Syst., vol. 39, no. 11, pp. 3300–3310, 2020.

DOI Google Scholar

[10]

G. Lee, E. Cetin, and O. Diessel, Fault recovery time analysis for coarse-grained reconfigurable architectures, ACM Trans. Embedded Comput. Syst., vol. 17, no. 2, p. 42, 2018.

DOI Google Scholar

[11]

T. Kojima, N. A. V. Doan, and H. Amano, GenMap: A genetic algorithmic approach for optimizing spatial mapping of coarse-grained reconfigurable architectures, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 28, no. 11, pp. 2383–2396, 2020.

DOI Google Scholar

[12]

M. Hamzeh, A. Shrivastava, and S. Vrudhula, EPIMap: Using Epimorphism to map applications on CGRAs, in Proc. 2012 DAC Design Automation Conf., San Francisco, CA, USA, 2012, pp. 1280–1287.

DOI Google Scholar

[13]

M. Hamzeh, A. Shrivastava, and S. Vrudhula, REGIMap: Register-aware application mapping on Coarse-Grained Reconfigurable Architectures (CGRAs), in Proc.50^th ACM/EDAC/IEEE Design Automation Conf. (DAC), Austin, TX, USA, 2013, pp. 1–10.

DOI Google Scholar

[14]

N. J. Chen, J. H. Jiang, X. Chen, Z. Zhou, and Y. Xu, An improved level partitioning algorithm considering minimum execution delay and resource restraints, (in Chinese), Acta Electron. Sin., vol. 40, no. 5, pp. 1055–1066, 2012.

Google Scholar

[15]

N. J. Chen, Z. Y. Feng, and J. H. Jiang, Bypass node non-redundant adding algorithm for crossing-level data transmission in two-dimension reconfigurable cell array, (in Chinese), J. Commun., vol. 36, no. 4, p. 2015132, 2015.

Google Scholar

About this article

Publication history

Acknowledgements

Rights and permissions

Publication history

Received: 29 October 2021

Revised: 27 December 2021

Accepted: 24 January 2022

Published: 29 September 2022

Issue date: April 2023

Copyright

Acknowledgements

This research was supported by the Natural Science Foundation of Anhui Province (No. 1808085MF203) and the Natural Science Foundation of China (Nos. 61972438 and 61432017).

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).