Efficient Knowledge Graph Embedding Training Framework with Multiple GPUs

Ding Sun, Zhen Huang (corresponding author), Dongsheng Li, and Min Guo
College of Computer, National University of Defense Technology, Changsha 410073, China

Abstract

When training a large-scale knowledge graph embedding (KGE) model with multiple graphics processing units (GPUs), partition-based methods are necessary for parallel training. However, existing partition-based training methods suffer from low GPU utilization and high input/output (IO) overhead between memory and disk. To address the high IO overhead between disk and memory, we optimize twice partitioning with fine-grained GPU scheduling, reducing the IO overhead between CPU memory and disk. To address the low GPU utilization caused by GPU load imbalance, we propose balanced partitioning and dynamic scheduling methods that accelerate training in different cases. Combining these methods, we propose fine-grained partitioning KGE, an efficient KGE training framework with multiple GPUs. Experiments on several knowledge graph benchmarks show that our method achieves a speedup over existing frameworks for KGE training.

Keywords: knowledge graph embedding, parallel algorithm, graph partitioning framework, graphics processing unit (GPU)

Publication history

Received: 25 July 2021
Revised: 26 August 2021
Accepted: 27 August 2021
Published: 21 July 2022
Issue date: February 2023

Copyright

© The author(s) 2023.

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
