
Natural Neighborhood-Based Classification Algorithm Without Parameter k

Ji Feng, Yan Wei, and Qingsheng Zhu
Chongqing Normal University, Chongqing 401331, China.
Chongqing Key Lab. of Software Theory and Technology, College of Computer Science, Chongqing University, Chongqing 400044, China.

Abstract

k-Nearest Neighbor (KNN) based classification methods underlie many well-established, high-performance pattern recognition techniques. However, such methods are sensitive to the choice of the neighborhood parameter k; the core difficulty is detecting an appropriate neighborhood for datasets whose characteristics are not known in advance. This article introduces a new supervised classification algorithm, the Natural Neighborhood Based Classification Algorithm (NNBCA), which produces good classification results without requiring the neighborhood parameter to be chosen by hand. Unlike the original KNN-based methods, which need k specified a priori, NNBCA infers a different k for each sample, so it can exploit flexible neighbor information in both the training and testing stages. As a result, NNBCA yields better classification results than comparable methods.
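The page carries no code, so the following is only a minimal sketch of the natural-neighbor idea the abstract describes: grow the neighborhood round by round until every point has at least one reverse neighbor, and take the terminating round as a data-driven k. The function names (natural_neighbor_search, nnbca_predict), the simplified stopping rule, and the use of a single global k at prediction time are all assumptions for illustration; the paper itself adapts k per sample.

```python
# Hedged sketch of a natural-neighbor search, NOT the authors' implementation.
import numpy as np

def natural_neighbor_search(X, max_k=None):
    """Grow the neighborhood round by round until every point is some other
    point's neighbor (has at least one reverse neighbor). Returns the
    terminating round lam (a data-driven 'k') and each point's lam nearest
    neighbors. The stopping rule here is a simplification."""
    n = len(X)
    max_k = max_k or n - 1
    # Pairwise distances; argsort orders neighbors by distance.
    # Column 0 is the point itself, so true neighbors start at column 1.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    order = np.argsort(d, axis=1)
    reverse_count = np.zeros(n, dtype=int)
    for lam in range(1, max_k + 1):
        # In round lam, each point nominates its lam-th nearest neighbor,
        # which gains one reverse neighbor.
        nominated = order[:, lam]
        np.add.at(reverse_count, nominated, 1)
        if np.all(reverse_count > 0):   # every point is now "natural"
            break
    return lam, order[:, 1:lam + 1]

def nnbca_predict(X_train, y_train, X_test, lam):
    """Classify each test point by majority vote among its lam nearest
    training points, with lam supplied by natural_neighbor_search."""
    preds = []
    for x in X_test:
        d = np.linalg.norm(X_train - x, axis=1)
        nearest = np.argsort(d)[:lam]
        labels, counts = np.unique(y_train[nearest], return_counts=True)
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)

# Toy usage on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
y = (X[:, 0] > 0).astype(int)
lam, _ = natural_neighbor_search(X)
print(lam, nnbca_predict(X, y, X[:5], lam))
```

The point of the sketch is the search loop: k is not supplied by the user but emerges from the data, which is the property the abstract claims for NNBCA.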

Keywords: classification, nearest neighbor, self-adaptive neighborhood


Publication history

Received: 14 January 2018
Accepted: 02 March 2018
Published: 02 July 2018
Issue date: December 2018

Copyright

© The author(s) 2018

Acknowledgements

This work was supported by the Foundation of Chongqing Normal University (No. 17XLB003) and the Natural Science Foundation Project of CQ CSTC (No. cstc2016jcyjA1362).
