The paper explores automated classification techniques for classroom sounds to capture the sequence of diverse learning and teaching activities. Manually labelling entire recordings, especially over long durations spanning multiple lessons, is impractical. This study investigates an automated approach that uses scalogram acoustic features as input to an ensemble of a Convolutional Neural Network (CNN) and a Bidirectional Gated Recurrent Unit (BiGRU), hybridized with an Extreme Gradient Boosting (XGBoost) classifier, for automatic classification of classroom sounds. The research analyses real classroom recordings to identify distinct sound segments encompassing the teacher's voice, student voices, babble noise, classroom noise, and silence. A sound event classifier that feeds scalogram features into an XGBoost framework is proposed. Comparative evaluations against various other machine learning and neural network methods demonstrate that the proposed hybrid model achieves the most accurate classification performance, at 95.38%.
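To make the pipeline concrete, the following is a minimal sketch of one way such a system could be wired together: continuous wavelet transform (scalogram) features, a small CNN + BiGRU encoder, and an XGBoost classifier trained on the encoder's embeddings over the five sound classes. The specific wavelet, segment length, layer sizes, and the choice to feed encoder embeddings into XGBoost are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch (assumptions: 'morl' wavelet, 64 scales, short segments,
# small layer sizes; XGBoost trained on CNN-BiGRU embeddings).
import numpy as np
import pywt
import torch
import torch.nn as nn
from xgboost import XGBClassifier

CLASSES = ["teacher", "students", "babble", "classroom_noise", "silence"]

def scalogram(audio, scales=np.arange(1, 65), wavelet="morl"):
    """Continuous wavelet transform magnitude -> (scales, time) image."""
    coeffs, _ = pywt.cwt(audio, scales, wavelet)
    return np.abs(coeffs).astype(np.float32)

class CNNBiGRUEncoder(nn.Module):
    """CNN front-end over the scalogram, BiGRU over the time axis."""
    def __init__(self, n_scales=64, hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                               # halves scale and time axes
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        feat_dim = 32 * (n_scales // 4)                    # channels x reduced scale axis
        self.bigru = nn.GRU(feat_dim, hidden, batch_first=True, bidirectional=True)

    def forward(self, x):                                  # x: (batch, 1, scales, time)
        z = self.cnn(x)                                    # (batch, 32, scales/4, time/4)
        b, c, s, t = z.shape
        z = z.permute(0, 3, 1, 2).reshape(b, t, c * s)     # sequence over time frames
        _, h = self.bigru(z)                               # h: (2, batch, hidden)
        return torch.cat([h[0], h[1]], dim=1)              # (batch, 2*hidden) embedding

def embed(encoder, segments):
    """Turn raw audio segments into fixed-length embeddings."""
    with torch.no_grad():
        imgs = torch.stack([torch.from_numpy(scalogram(s)).unsqueeze(0)
                            for s in segments])
        return encoder(imgs).numpy()

if __name__ == "__main__":
    # Synthetic stand-in data; real use would load labelled classroom segments.
    rng = np.random.default_rng(0)
    segments = [rng.standard_normal(4000).astype(np.float32) for _ in range(40)]
    labels = np.array([i % len(CLASSES) for i in range(40)])

    encoder = CNNBiGRUEncoder()                            # would normally be trained first
    X = embed(encoder, segments)
    clf = XGBClassifier(n_estimators=100, max_depth=4)
    clf.fit(X, labels)
    print(clf.predict(X[:5]))
```

In this arrangement the neural encoder acts as a feature extractor and the gradient-boosted trees perform the final decision, which is one common way to hybridize deep networks with XGBoost; the paper's reported 95.38% refers to its own trained hybrid, not to this sketch.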
Funding
Data for this research were obtained from Australian Research Council (ARC) Discovery Project DP130100481, "Pedagogies for knowledge-building: investigating subject-appropriate, cumulative teaching".