A Machine Learning Approach to Classify Biomedical Acoustic Features for Baby Cries

Publication Name

Journal of Voice


Communication is essential for living beings to exchange information. For newborns, crying is the only way of communicating with the world, and the only medium through which caregivers can learn about their children's needs. Responding to a baby's cries promptly is important so that the child is relieved as early as possible, yet doing so has been a challenge, especially for new parents. According to the literature, newborn babies communicate using the Dunstan Baby Language, which has five words for understanding a baby's needs: “Neh” (hungry), “Eh” (needs to burp), “Owh/Oah” (fatigue), “Eair/Eargghh” (cramps), and “Heh” (feels hot or wet; physical discomfort). This research aims to develop a model that recognizes baby cries and distinguishes between different kinds of cries. Here we focus more broadly on whether the infant is in pain due to hunger or discomfort. The study proposes a comparative approach using four classification models: random forest, support vector machine, logistic regression, and decision tree. These algorithms learn from spectral features extracted from the infant cry: chroma_stft, spectral_centroid, bandwidth, spectral_rolloff, mel-frequency cepstral coefficients, linear predictive coding, res, and zero_crossing_rate. The support vector machine model outperforms the other classifiers at correctly classifying infant cries.
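
The abstract names the spectral features without defining them. As a minimal sketch, assuming the usual textbook definitions (the abstract does not say how the authors computed them), two of the listed features, zero_crossing_rate and spectral_centroid, can be computed on synthetic tones using only the Python standard library. The function names, the 8 kHz sample rate, the 256-sample frame, and the test tones are illustrative assumptions, not details from the paper.

```python
import cmath
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency in Hz, from a naive DFT."""
    n = len(frame)
    magnitudes = []
    # Only the non-negative frequency bins (0 .. n/2) are needed for a real signal.
    for k in range(n // 2 + 1):
        acc = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        magnitudes.append(abs(acc))
    total = sum(magnitudes)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / n * m for k, m in enumerate(magnitudes)) / total


def sine_frame(freq_hz, sample_rate=8000, n=256):
    """Hypothetical stand-in for one frame of a recorded cry."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


low = sine_frame(250)    # lower-pitched synthetic tone
high = sine_frame(1000)  # higher-pitched synthetic tone

print(zero_crossing_rate(low), zero_crossing_rate(high))
print(spectral_centroid(low, 8000), spectral_centroid(high, 8000))
```

Both features rise with the pitch of the tone, which is the kind of separation a classifier such as those compared in the study could exploit. On real recordings, per-frame values would typically be aggregated (for example, averaged) into one feature vector per cry before training.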

Open Access Status

This publication is not available as open access



Link to publisher version (DOI)