Volume 19, Issue 3




Multiple-Instance Learning with Instance Selection via Constructive Covering Algorithm

Yanping Zhang, Heng Zhang, Huazhen Wei, Jie Tang, and Shu Zhao
Department of Computer Science and Technology and Key Lab of Intelligent Computing and Signal Processing, Anhui University, Hefei 230601, China.
Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China.

Abstract

Multiple-Instance Learning (MIL) predicts the labels of unlabeled bags by learning from labeled positive and negative training bags. Each bag is made up of several unlabeled instances; a bag is labeled positive if at least one of its instances is positive, and negative otherwise. Existing MIL methods with instance selection ignore the representative degree of the selected instances. For example, if an instance is surrounded by many similar instances with the same label, it should be considered more representative than others. Based on this idea, this paper proposes multiple-instance learning with instance selection via a constructive covering algorithm (MilCa). In MilCa, we first use the maximal Hausdorff distance to select initial positive instances from the positive bags, and then use a Constructive Covering Algorithm (CCA) to restructure the original instances of the negative bags. An inverse testing process then excludes false positive instances from the positive bags and selects highly representative instances, ordered by the number of instances they cover, from the training bags. Finally, a similarity measure function converts each training bag into a single sample, and CCA is used again to classify the converted samples. Experimental results on synthetic data and standard benchmark datasets demonstrate that MilCa reduces the number of selected instances and is competitive with state-of-the-art MIL algorithms.
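The first step of the pipeline above relies on the maximal Hausdorff distance between bags of instances. As a rough illustration only, the sketch below computes that distance and applies one plausible reading of the initial selection step: from each positive bag, keep the instance whose minimal Euclidean distance to all negative instances is largest. The function names and the exact selection criterion are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def max_hausdorff(bag_a, bag_b):
    """Maximal Hausdorff distance H(A, B) = max(h(A, B), h(B, A)),
    where h(A, B) = max over a in A of min over b in B of ||a - b||."""
    # Pairwise Euclidean distances, shape (len(bag_a), len(bag_b)).
    d = np.linalg.norm(bag_a[:, None, :] - bag_b[None, :, :], axis=2)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

def initial_positive_instances(pos_bags, neg_bags):
    """From each positive bag, pick the instance farthest (by minimal
    Euclidean distance) from all negative instances -- an assumed
    stand-in for the maximal-Hausdorff initial selection."""
    neg_all = np.vstack(neg_bags)  # pool all negative instances
    selected = []
    for bag in pos_bags:
        d = np.linalg.norm(bag[:, None, :] - neg_all[None, :, :], axis=2)
        # The instance least similar to any negative instance is the
        # most likely true positive under the MIL assumption.
        selected.append(bag[d.min(axis=1).argmax()])
    return np.array(selected)
```

On one-instance bags the measure reduces to plain Euclidean distance, which makes the behavior easy to sanity-check before using it on real bags.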

Keywords: multiple-instance learning, instance selection, constructive covering algorithm, maximal Hausdorff


Publication history

Received: 02 May 2014
Accepted: 04 May 2014
Published: 18 June 2014
Issue date: June 2014

Copyright

© The author(s) 2014

Acknowledgements

This research was supported by the National Natural Science Foundation of China (No. 61175046), the Provincial Natural Science Research Program of Higher Education Institutions of Anhui Province (No. KJ2013A016), the Outstanding Young Talents in Higher Education Institutions of Anhui Province (No. 2011SQRL146), and the Recruitment Project of Anhui University for Academic and Technology Leader.
