
Natural Neighborhood-Based Classification Algorithm Without Parameter k

Ji Feng, Yan Wei, and Qingsheng Zhu
Chongqing Normal University, Chongqing 401331, China.
Chongqing Key Lab. of Software Theory and Technology, College of Computer Science, Chongqing University, Chongqing 400044, China.

Abstract

k-Nearest Neighbor (KNN) based classification methods underlie many well-established, high-performance pattern recognition techniques. However, such methods are sensitive to the choice of the neighborhood parameter k; the core difficulty is detecting an appropriate neighborhood for datasets whose characteristics are not known in advance. This article introduces a new supervised classification algorithm, the Natural Neighborhood Based Classification Algorithm (NNBCA), which produces good classification results without requiring the neighborhood parameter to be chosen by hand. Unlike the original KNN-based methods, which need k specified a priori, NNBCA infers a different k for each sample, so it can exploit flexible neighbor information in both the training and testing stages. As a result, NNBCA yields better classification results than comparable methods.
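The page carries no code, so the following is only a minimal sketch of the natural-neighbor idea the abstract describes: grow the neighborhood round by round until every point has at least one reverse neighbor, and take the terminating round as a data-driven k. The function names (natural_neighbor_search, nnbca_predict), the simplified stopping rule, and the use of a single global k at prediction time are all assumptions for illustration; the paper itself adapts k per sample.

```python
# Hedged sketch of a natural-neighbor search, NOT the authors' implementation.
import numpy as np

def natural_neighbor_search(X, max_k=None):
    """Grow the neighborhood round by round until every point is some other
    point's neighbor (has at least one reverse neighbor). Returns the
    terminating round lam (a data-driven 'k') and each point's lam nearest
    neighbors. The stopping rule here is a simplification."""
    n = len(X)
    max_k = max_k or n - 1
    # Pairwise distances; argsort orders neighbors by distance.
    # Column 0 is the point itself, so true neighbors start at column 1.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    order = np.argsort(d, axis=1)
    reverse_count = np.zeros(n, dtype=int)
    for lam in range(1, max_k + 1):
        # In round lam, each point nominates its lam-th nearest neighbor,
        # which gains one reverse neighbor.
        nominated = order[:, lam]
        np.add.at(reverse_count, nominated, 1)
        if np.all(reverse_count > 0):   # every point is now "natural"
            break
    return lam, order[:, 1:lam + 1]

def nnbca_predict(X_train, y_train, X_test, lam):
    """Classify each test point by majority vote among its lam nearest
    training points, with lam supplied by natural_neighbor_search."""
    preds = []
    for x in X_test:
        d = np.linalg.norm(X_train - x, axis=1)
        nearest = np.argsort(d)[:lam]
        labels, counts = np.unique(y_train[nearest], return_counts=True)
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)

# Toy usage on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
y = (X[:, 0] > 0).astype(int)
lam, _ = natural_neighbor_search(X)
print(lam, nnbca_predict(X, y, X[:5], lam))
```

The point of the sketch is the search loop: k is not supplied by the user but emerges from the data, which is the property the abstract claims for NNBCA.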

Keywords: classification, nearest neighbor, self-adaptive neighborhood


Publication history

Received: 14 January 2018
Accepted: 02 March 2018
Published: 02 July 2018
Issue date: December 2018

Copyright

© The author(s) 2018

Acknowledgements

This work was supported by the Foundation of Chongqing Normal University (No. 17XLB003) and the Natural Science Foundation Project of CQ CSTC (No. cstc2016jcyjA1362).
