D. S. Kermany, M. Goldbaum, W. J. Cai, C. C. S. Valentim, H. Y. Liang, S. L. Baxter, A. McKeown, G. Yang, X. K. Wu, F. B. Yan, et al., Identifying medical diagnoses and treatable diseases by image-based deep learning, Cell, vol. 172, no. 5, pp. 1122–1131.e9, 2018.
J. De Fauw, J. R. Ledsam, B. Romera-Paredes, S. Nikolov, N. Tomasev, S. Blackwell, H. Askham, X. Glorot, B. O’Donoghue, D. Visentin, et al., Clinically applicable deep learning for diagnosis and referral in retinal disease, Nat. Med., vol. 24, no. 9, pp. 1342–1350, 2018.
N. Coudray, P. S. Ocampo, T. Sakellaropoulos, N. Narula, M. Snuderl, D. Fenyö, A. L. Moreira, N. Razavian, and A. Tsirigos, Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning, Nat. Med., vol. 24, no. 10, pp. 1559–1567, 2018.
R. H. Xu, W. Wei, M. Krawczyk, W. Q. Wang, H. Y. Luo, K. Flagg, S. H. Yi, W. Shi, Q. L. Quan, K. Li, et al., Circulating tumour DNA methylation markers for diagnosis and prognosis of hepatocellular carcinoma, Nat. Mater., vol. 16, no. 11, pp. 1155–1161, 2017.
S. M. McKinney, M. Sieniek, V. Godbole, J. Godwin, N. Antropova, H. Ashrafian, T. Back, M. Chesus, G. S. Corrado, A. Darzi, et al., International evaluation of an AI system for breast cancer screening, Nature, vol. 577, no. 7788, pp. 89–94, 2020.
H. Y. Liang, B. Y. Tsui, H. Ni, C. C. S. Valentim, S. L. Baxter, G. J. Liu, W. J. Cai, D. S. Kermany, X. Sun, J. C. Chen, et al., Evaluation and accurate diagnoses of pediatric diseases using artificial intelligence, Nat. Med., vol. 25, no. 3, pp. 433–438, 2019.
N. Bien, P. Rajpurkar, R. L. Ball, J. Irvin, A. Park, E. Jones, M. Bereket, B. N. Patel, K. W. Yeom, K. Shpanskaya, et al., Deep-learning-assisted diagnosis for knee magnetic resonance imaging: Development and retrospective validation of MRNet, PLoS Med., vol. 15, no. 11, p. e1002699, 2018.
A. Park, C. Chute, P. Rajpurkar, J. Lou, R. L. Ball, K. Shpanskaya, R. Jabarkheel, L. H. Kim, E. McKenna, J. Tseng, et al., Deep learning-assisted diagnosis of cerebral aneurysms using the HeadXNet model, JAMA Netw. Open, vol. 2, no. 6, p. e195600, 2019.
J. Chen, L. L. Wu, J. Zhang, L. Zhang, D. X. Gong, Y. L. Zhao, Q. X. Chen, S. L. Huang, M. Yang, X. Yang, et al., Deep learning-based model for detecting 2019 novel coronavirus pneumonia on high-resolution computed tomography, Sci. Rep., vol. 10, no. 1, p. 19196, 2020.
A. Rajkomar, E. Oren, K. Chen, A. M. Dai, N. Hajaj, M. Hardt, P. J. Liu, X. B. Liu, J. Marcus, M. M. Sun, et al., Scalable and accurate deep learning with electronic health records, npj Digital Med., vol. 1, no. 1, p. 18, 2018.
K. Yan, X. S. Wang, L. Lu, and R. M. Summers, DeepLesion: Automated mining of large-scale lesion annotations and universal lesion detection with deep learning, J. Med. Imaging, vol. 5, no. 3, p. 036501, 2018.
J. Deng, W. Dong, R. Socher, L. J. Li, K. Li, and F. F. Li, ImageNet: A large-scale hierarchical image database, in Proc. 2009 IEEE Conf. on Computer Vision and Pattern Recognition, Miami, FL, USA, 2009, pp. 248–255.
T. Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, Microsoft COCO: Common objects in context, in Proc. 13th European Conf. on Computer Vision, Zurich, Switzerland, 2014, pp. 740–755.
G. Mujtaba, L. Shuib, R. G. Raj, R. Rajandram, K. Shaikh, and M. A. Al-Garadi, Automatic ICD-10 multi-class classification of cause of death from plaintext autopsy reports through expert-driven feature selection, PLoS One, vol. 12, no. 2, p. e0170242, 2017.
M. Li, Z. H. Fei, M. Zeng, F. X. Wu, Y. H. Li, Y. Pan, and J. X. Wang, Automated ICD-9 coding via a deep learning approach, IEEE/ACM Trans. Comput. Biol. Bioinform., vol. 16, no. 4, pp. 1193–1202, 2019.
J. Martineau and T. Finin, Delta TFIDF: An improved feature space for sentiment analysis, in Proc. 3rd Int. AAAI Conf. on Weblogs and Social Media, San Jose, CA, USA, 2009.
P. Soucy and G. W. Mineau, Beyond TFIDF weighting for text categorization in the vector space model, in Proc. 19th Int. Joint Conf. on Artificial Intelligence, Edinburgh, UK, 2005, pp. 1130–1135.
X. M. Ye, X. M. Mao, J. C. Xia, and B. Wang, Improved approach to TF-IDF algorithm in text classification, (in Chinese), Comput. Eng. Appl., vol. 55, no. 2, pp. 104–109, 161, 2019.
D. M. Blei, A. Y. Ng, and M. I. Jordan, Latent Dirichlet allocation, J. Mach. Learn. Res., vol. 3, pp. 993–1022, 2003.
L. Li and Y. Zhang, An empirical study of text classification using latent Dirichlet allocation, Cs. Cmu. Edu., no. 1, 2018.
T. François and E. Miltsakaki, Do NLP and machine learning improve traditional readability formulas? in Proc. 1st Workshop on Predicting and Improving Text Readability for Target Reader Populations, Montréal, Canada, 2012, pp. 49–57.
X. E. Liu, X. X. You, X. Zhang, J. Wu, and P. Lv, Tensor graph convolutional networks for text classification, in Proc. 34th AAAI Conf. on Artificial Intelligence, New York, NY, USA, 2020, pp. 8409–8416.
Y. LeCun, Y. Bengio, and G. Hinton, Deep learning, Nature, vol. 521, no. 7553, pp. 436–444, 2015.
W. C. Sun, Z. P. Cai, Y. Y. Li, F. Liu, S. Q. Fang, and G. Y. Wang, Data processing and text mining technologies on electronic medical records: A review, J. Healthc. Eng., vol. 2018, p. 4302425, 2018.
W. C. Sun, Z. P. Cai, F. Liu, S. Q. Fang, and G. Y. Wang, A survey of data mining technology on electronic medical records, in Proc. IEEE 19th Int. Conf. on e-Health Networking, Applications and Services (Healthcom), Dalian, China, 2017, pp. 1–6.
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, Attention is all you need, in Proc. 31st Int. Conf. on Neural Information Processing Systems, Long Beach, CA, USA, 2017, pp. 6000–6010.
J. Devlin, M. W. Chang, K. Lee, and K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, in Proc. 2019 Conf. of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, MN, USA, 2019, pp. 4171–4186.
L. Floridi and M. Chiriatti, GPT-3: Its nature, scope, limits, and consequences, Minds Mach., vol. 30, no. 4, pp. 681–694, 2020.
J. Lee, W. J. Yoon, S. Kim, D. Kim, S. Kim, C. H. So, and J. Kang, BioBERT: A pre-trained biomedical language representation model for biomedical text mining, Bioinformatics, vol. 36, no. 4, pp. 1234–1240, 2020.
K. X. Huang, J. Altosaar, and R. Ranganath, ClinicalBERT: Modeling clinical notes and predicting hospital readmission, arXiv preprint arXiv: 1904.05342, 2019.
X. Rong, word2vec parameter learning explained, arXiv preprint arXiv: 1411.2738, 2014.
Y. Liu, T. Ge, K. Mathews, H. Ji, and D. McGuinness, Exploiting task-oriented resources to learn word embeddings for clinical abbreviation expansion, in Proc. 2015 Workshop on Biomedical Natural Language Processing, Beijing, China, 2015, pp. 92–97.
T. Mikolov, Q. V. Le, and I. Sutskever, Exploiting similarities among languages for machine translation, arXiv preprint arXiv: 1309.4168, 2013.
P. Kim, Convolutional neural network, in MATLAB Deep Learning: With Machine Learning, Neural Networks and Artificial Intelligence, P. Kim, ed. Berkeley, CA, USA: Springer, 2017, pp. 121–147.
I. Freeman, L. Roese-Koerner, and A. Kummert, Effnet: An efficient structure for convolutional neural networks, in Proc. 25th IEEE Int. Conf. on Image Processing, Athens, Greece, 2018, pp. 6–10.
Y. Kim, Convolutional neural networks for sentence classification, in Proc. 2014 Conf. on Empirical Methods in Natural Language Processing, Doha, Qatar, 2014, pp. 1746–1751.
L. Breiman, Random forests, Mach. Learn., vol. 45, no. 1, pp. 5–32, 2001.
A. Cutler, D. R. Cutler, and J. R. Stevens, Random forests, in Ensemble Machine Learning: Methods and Applications, C. Zhang and Y. Q. Ma, eds. Boston, MA, USA: Springer, 2012, pp. 157–176.
E. Shelhamer, J. Long, and T. Darrell, Fully convolutional networks for semantic segmentation, IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 4, pp. 640–651, 2017.
Y. Doval and C. Gómez-Rodríguez, Comparing neural- and N-gram-based language models for word segmentation, J. Assoc. Inform. Sci. Technol., vol. 70, no. 2, pp. 187–197, 2019.
K. C. Chang and H. T. Chang, Is it possible to use chatbot for the Chinese word segmentation? in Proc. 3rd Int. Conf. on Natural Language Processing and Information Retrieval, Tokushima, Japan, 2019, pp. 20–24.
H. Saif, M. Fernández, Y. L. He, and H. Alani, On stopwords, filtering and data sparsity for sentiment analysis of Twitter, in Proc. 9th Int. Language Resources and Evaluation Conf., Reykjavik, Iceland, 2014, pp. 810–817.
B. K. Yang, G. Z. Dai, Y. J. Yang, D. R. Tang, Q. Li, D. N. Lin, J. Zheng, and Y. P. Cai, Automatic text classification for label imputation of medical diagnosis notes based on random forest, in Proc. 7th Int. Conf. on Health Information Science, Cairns, Australia, 2018, pp. 87–97.
D. Kostrzewa and R. Brzeski, Adjusting parameters of the classifiers in multiclass classification, in Proc. 13th Int. Conf. on Beyond Databases, Architectures and Structures. Towards Efficient Solutions for Data Analysis and Knowledge Representation, Ustroń, Poland, 2017, pp. 89–101.
L. van der Maaten and G. Hinton, Visualizing data using t-SNE, J. Mach. Learn. Res., vol. 9, no. 86, pp. 2579–2605, 2008.
M. T. Ribeiro, S. Singh, and C. Guestrin, “Why should I trust you?”: Explaining the predictions of any classifier, in Proc. 22nd ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 2016, pp. 1135–1144.