Volume 2, Issue 3


A bibliometric analysis of worldwide cancer research using machine learning methods

Lianghong Lin (1), Likeng Liang (2), Maojie Wang (3, 4, 5), Runyue Huang (3, 4, 5), Mengchun Gong (6), Guangjun Song (7), Tianyong Hao (1, 2)
1. School of Artificial Intelligence, South China Normal University, Guangzhou, China
2. School of Computer Science, South China Normal University, Guangzhou, China
3. Guangdong Provincial Hospital of Chinese Medicine, Guangzhou, China
4. Guangdong Provincial Key Laboratory of Clinical Research on Traditional Chinese Medicine Syndrome, Guangzhou, China
5. State Key Laboratory of Dampness Syndrome of Chinese Medicine, The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China
6. Institute of Health Management, Southern Medical University, Guangzhou, China
7. Guangzhou BiaoQi Optoelectronics Co., Ltd., Guangzhou, China

Abstract

With advances in computer technology, applying machine learning methods to cancer research has become an important research field. To analyze the field's recent status and trends, main research topics, topic evolution, research collaborations, and potential directions, this study conducts a bibliometric analysis of 6206 research articles worldwide on cancer research using machine learning methods, collected from PubMed and covering 2011 to 2021. Python is used for the bibliometric analysis, Gephi for social network analysis, and the Latent Dirichlet Allocation model for topic modeling. The trend analysis of articles reflects innovative research at the intersection of machine learning and cancer and shows the field's vigorous development and increasing impact. Among journals, Nature Communications is the most influential and Scientific Reports is the most prolific. Among countries and institutions, the United States and Harvard University have contributed the most. In terms of research topics, “Support Vector Machine,” “classification,” and “deep learning” have been the core focuses of the field. The findings can help scholars and practitioners better understand the development status, trends, and research hotspots of cancer research using machine learning methods.

Keywords: machine learning, Latent Dirichlet Allocation, cancer, bibliometric analysis, research topic, topic evolution
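
The abstract names Python as the tool for the bibliometric analysis but does not describe the procedure. As a minimal sketch only, the kind of counting behind a yearly publication trend and a "most prolific journal" ranking could look like the following. The record fields (`year`, `journal`), the sample values, and the function names are hypothetical and are not taken from the study.

```python
from collections import Counter

# Hypothetical article records; fields and values are illustrative only,
# not data from the study.
records = [
    {"year": 2019, "journal": "Journal A"},
    {"year": 2020, "journal": "Journal B"},
    {"year": 2020, "journal": "Journal B"},
    {"year": 2021, "journal": "Journal A"},
    {"year": 2021, "journal": "Journal B"},
    {"year": 2021, "journal": "Journal C"},
]


def articles_per_year(records):
    """Count articles per publication year, sorted by year (trend analysis)."""
    counts = Counter(r["year"] for r in records)
    return dict(sorted(counts.items()))


def most_prolific(records, field):
    """Return the (value, article count) pair with the most articles for a field."""
    counts = Counter(r[field] for r in records)
    return counts.most_common(1)[0]


trend = articles_per_year(records)
top_journal = most_prolific(records, "journal")
print(trend)
print(top_journal)
```

Here "prolific" is taken to mean the highest article count. Measuring influence would need citation data and a metric that the abstract does not specify, so it is left out of the sketch.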

Publication history

Received: 20 October 2022
Accepted: 05 March 2023
Published: 11 April 2023
Issue date: June 2023

Copyright

© 2023 The Authors. Tsinghua University Press.

Acknowledgements

None.

Rights and permissions

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.