Volume 20, Issue 6




Audio Retrieval Based on Manifold Ranking and Relevance Feedback

Jing Qin, Xinyue Liu, Hongfei Lin
School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
College of Information Engineering, Dalian University, Dalian 116622, China

Abstract

An audio information retrieval model based on Manifold Ranking (MR) is proposed, and its ranking results are refined with a Relevance Feedback (RF) algorithm. Timbre is the model's main feature. To compute timbre similarity, spectral features are extracted from each frame, and the resulting large set of frames is clustered using a Gaussian Mixture Model (GMM) fitted by expectation maximization. The representative spectral frames obtained from the GMM serve as data points, and MR assigns each data point a relative ranking score, which is treated as a distance in place of traditional similarity metrics based on pairwise distance. Furthermore, the MR algorithm generalizes readily to incorporate the positive and negative examples supplied by the RF algorithm, which improves the final result. Experimental results show that the proposed approach improves the ranking capabilities of existing distance functions.

Keywords: manifold ranking, audio information retrieval, relevance feedback
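
The ranking step described in the abstract can be illustrated with a minimal sketch of the commonly used iterative formulation of manifold ranking. The abstract does not state the graph construction or parameter values, so everything specific here is an assumption for illustration only: the fully connected Gaussian affinity graph, the values of `sigma` and `alpha`, the function name `manifold_ranking`, the toy 2-D points standing in for timbre data points, and the encoding of relevance feedback as +1 (query or positive example) and -1 (negative example).

```python
import math


def manifold_ranking(points, labels, sigma=1.0, alpha=0.9, iters=200):
    """Propagate initial scores over a Gaussian affinity graph.

    points: list of feature vectors (toy stand-ins for timbre data points).
    labels: initial score per point; assumed encoding is +1 for the query or
            a positive feedback example, -1 for a negative one, 0 otherwise.
    Returns one ranking score per point (higher means more relevant).
    """
    n = len(points)

    # Gaussian affinity matrix W with a zero diagonal (assumed graph choice).
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            w[i][j] = w[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))

    # Symmetric normalization S = D^(-1/2) W D^(-1/2).
    deg = [sum(row) for row in w]
    s = [
        [w[i][j] / math.sqrt(deg[i] * deg[j]) if w[i][j] > 0.0 else 0.0
         for j in range(n)]
        for i in range(n)
    ]

    # Iterate f <- alpha * S f + (1 - alpha) * y until (approximately) converged.
    f = list(labels)
    for _ in range(iters):
        f = [
            alpha * sum(s[i][j] * f[j] for j in range(n)) + (1.0 - alpha) * labels[i]
            for i in range(n)
        ]
    return f


# Two toy clusters; point 0 is the query.
points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.0, 5.0), (5.1, 5.0), (5.2, 5.0)]

# Initial ranking: only the query carries a label.
base = manifold_ranking(points, [1, 0, 0, 0, 0, 0])

# Hypothetical feedback round: the user marks point 2 as irrelevant.
with_fb = manifold_ranking(points, [1, 0, -1, 0, 0, 0])

# Indices sorted from most to least relevant after feedback.
ranking = sorted(range(len(points)), key=lambda i: -with_fb[i])
```

In this sketch, points in the query's cluster receive higher scores than points in the distant cluster, and marking a point as a negative example lowers its score relative to the initial ranking. For larger collections, the same scores could be obtained by solving the equivalent linear system instead of iterating.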

Publication history

Received: 18 March 2015
Revised: 24 April 2015
Accepted: 26 May 2015
Published: 17 December 2015
Issue date: December 2015

Copyright

© The author(s) 2015
