Volume 4, Issue 1


Improvement in Automated Diagnosis of Soft Tissues Tumors Using Machine Learning

Department of Computer Sciences, Faculty of Sciences and Technologies, My Ismail University, Errachidia 52000, Morocco.
Department of Computer Science, Faculty of Sciences, Chouaib Doukkali University, El Jadida 24000, Morocco.
Department of Mathematics, Universitas Indonesia, Depok 16424, Indonesia.

Abstract

Soft Tissue Tumors (STT) are a form of sarcoma found in the tissues that connect, support, and surround body structures. Because of their low frequency in the body and their great diversity, they appear heterogeneous when observed through Magnetic Resonance Imaging (MRI). They are easily confused with other diseases such as fibroadenoma mammae, lymphadenopathy, and struma nodosa, and these diagnostic errors have a considerable detrimental effect on the medical treatment of patients. Researchers have proposed several machine learning models to classify tumors, but none has adequately addressed this misdiagnosis problem. Moreover, similar studies that propose models for evaluating such tumors mostly do not consider the heterogeneity and the size of the data. Therefore, we propose a machine learning-based approach that combines a new preprocessing technique for feature transformation, resampling techniques to eliminate bias and the deviation caused by instability, and classifier tests based on the Support Vector Machine (SVM) and Decision Tree (DT) algorithms. Tests carried out on a dataset collected at Nur Hidayah Hospital in Yogyakarta, Indonesia, show a great improvement over previous studies. These results confirm that machine learning methods could provide efficient and effective tools to reinforce automatic decision-making in STT diagnosis.

Keywords: classification, machine learning, Support Vector Machine (SVM), soft tissues tumours, preprocessing techniques, Decision Tree (DT), predictive diagnosis
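
The abstract does not say which resampling technique is used, so the following is only a minimal sketch of one common option, stratified k-fold splitting, which keeps class proportions stable across evaluation folds. The function name `stratified_k_fold`, the toy labels, and the fold count are illustrative assumptions and are not taken from the article.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose test folds keep class proportions.

    Illustrative helper (an assumption, not the article's method).
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal indices round-robin into k folds
    # so every fold receives a near-equal share of each class.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    # Each fold serves once as the test set; the others form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Hypothetical toy labels: 20 "benign" and 10 "malignant" cases.
labels = ["benign"] * 20 + ["malignant"] * 10
splits = list(stratified_k_fold(labels, k=5, seed=42))
for train_idx, test_idx in splits:
    malignant = sum(labels[i] == "malignant" for i in test_idx)
    print(len(train_idx), len(test_idx), malignant)
```

In a full experiment, each training split would be passed to the classifiers under test (for example an SVM and a decision tree) and the scores averaged over the folds. The classifiers are omitted here to keep the sketch free of third-party dependencies.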

Publication history

Received: 01 August 2020
Accepted: 23 September 2020
Published: 12 January 2021
Issue date: March 2021

Copyright

© The author(s) 2021

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
