Faculty of Engineering and Information Sciences - Papers: Part B

Scalogram Neural Network Activations with Machine Learning for Domestic Multi-channel Audio Classification

Abigail Copiaco, University of Wollongong
Christian H. Ritz, University of Wollongong
Stefano Fasciani, University of Wollongong
Nidhal Abdulaziz, University of Wollongong

RIS ID

142088

Publication Details

A. Copiaco, C. Ritz, S. Fasciani & N. Abdulaziz, "Scalogram Neural Network Activations with Machine Learning for Domestic Multi-channel Audio Classification," in 2019 IEEE 19th International Symposium on Signal Processing and Information Technology, ISSPIT 2019, 2019.

Abstract

© 2019 IEEE. Current methodologies for audio classification, particularly of multi-channel audio, commonly rely on individual deep learning approaches. In this paper, we examine domestic multi-channel audio classification by comparing various combinations of existing pre-trained Neural Network (NN) models with a Support Vector Machine (SVM) for classification. The NN model is first trained with spectro-temporal features extracted from the audio, in the form of scalogram images generated through the Continuous Wavelet Transform (CWT). Activations extracted from a selected layer of the neural network model are then used as features to train the machine learning classifier. Using the network activations learnt by the deep learning component of the classifier strengthens the time-frequency features of the signal that are extracted from the spectrogram, which allows a further improvement in accuracy. On the full SINS development database, the best result was an F1-score of over 97%, obtained with the tenth layer of the Xception network combined with a multi-class Linear SVM, a substantial improvement over the top-performing F1-score in the DCASE 2018 Task 5 challenge, which stands at around 89%.
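As a rough illustration of the first stage described in the abstract, the sketch below computes a scalogram (the squared magnitude of a CWT) for a short test tone using only the Python standard library. It is not the authors' implementation: the Morlet wavelet, the value `w0 = 6.0`, the scale list, the zero-padding at the edges, and the helper names `morlet` and `scalogram` are all assumptions made for this example. The later stages (extracting activations from a pre-trained network and training a linear SVM on them) are not shown.

```python
import cmath
import math


def morlet(t, w0=6.0):
    """Complex Morlet wavelet (assumed here; small admissibility correction omitted)."""
    return math.pi ** -0.25 * cmath.exp(1j * w0 * t) * math.exp(-0.5 * t * t)


def scalogram(signal, scales, w0=6.0):
    """Return |CWT|^2 as a list of rows: one row per scale, one column per sample."""
    n = len(signal)
    rows = []
    for a in scales:
        # Truncate the wavelet where its Gaussian envelope is negligible.
        half = int(4 * a)
        offsets = range(-half, half + 1)
        # Conjugated, scaled wavelet sampled at integer offsets from the centre.
        kernel = [morlet(k / a, w0).conjugate() / math.sqrt(a) for k in offsets]
        row = []
        for b in range(n):
            acc = 0j
            for j, k in enumerate(offsets):
                idx = b + k
                if 0 <= idx < n:  # treat samples outside the signal as zero
                    acc += signal[idx] * kernel[j]
            row.append(abs(acc) ** 2)
        rows.append(row)
    return rows


# Test tone at 0.05 cycles per sample (hypothetical input, not SINS data).
x = [math.sin(2 * math.pi * 0.05 * i) for i in range(256)]
scales = [5, 10, 19, 40]
S = scalogram(x, scales)

# Sum the energy over the central samples to limit edge effects, then pick
# the scale that carries the most energy.
energy = [sum(row[64:192]) for row in S]
best_scale = scales[energy.index(max(energy))]
```

For a Morlet wavelet, a scale of `a` samples responds most strongly to roughly `w0 / (2 * pi * a)` cycles per sample, so the 0.05 cycles-per-sample tone should concentrate its energy near `a ≈ 19`. In a pipeline like the one the abstract describes, the rows of `S` would be rendered as an image before being passed to the network; a library such as PyWavelets would normally replace this slow direct convolution.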

Please refer to publisher version or contact your library.

Link to publisher version (DOI)

http://dx.doi.org/10.1109/ISSPIT47144.2019.9001814
