MARSYAS: a framework for audio analysis

George Tzanetakis; Perry Cook

doi:10.1017/S1355771800003071

Abstract

Existing audio tools handle the increasing amount of computer audio data inadequately. The typical tape-recorder paradigm for audio interfaces is inflexible and time-consuming, especially for large data sets. On the other hand, completely automatic audio analysis and annotation is impossible using current techniques. An alternative is semi-automatic user interfaces that let users interact with sound in flexible ways based on content. This approach offers significant advantages over manual browsing, annotation and retrieval. Furthermore, it can be implemented using existing techniques for audio content analysis in restricted domains. This paper describes MARSYAS, a framework for experimenting with, evaluating and integrating such techniques. As a test of the architecture, some recently proposed techniques have been implemented and tested. In addition, a new method for temporal segmentation based on audio texture is described. This method is combined with audio analysis techniques and used for hierarchical browsing, classification and annotation of audio files.

