Identification of Diseases in Newborns Using Advanced Acoustic Features of Cry Signals.

Yasmina Kheddache, Chakib Tadj.

Abstract

Our challenge in the current study is to extend research on the cries of newborns for the early diagnosis of different pathologies. This paper proposes a recognition system for healthy and pathological cries using a probabilistic neural network classifier. Two kinds of features were used to characterize newborn cry signals: 1) acoustic features such as fundamental frequency glide (F0glide) and resonance frequency dysregulation (RFsdys); 2) conventional features such as mel-frequency cepstral coefficients. This paper describes the automatic estimation of the proposed characteristics and evaluates their performance in identifying pathological cries. The methods adopted for F0glide and RFsdys estimation are based on the derivative of the F0 contour and on the jump "J" of the RFs between two subsequent tunings, respectively. The database contains 3250 cry samples from full-term and preterm newborns and includes healthy and pathologic cries. The results indicate an important association between the quantified features and some of the studied pathologies, as well as an improvement in the identification of pathologic cries. The best results obtained are 88.71% for the correct identification of the health status of preterm newborns and 82% for the correct identification of full-term infants with a specific disease. We conclude that using the proposed characteristics improves the diagnosis of pathologies in newborns. Moreover, the method applied to estimate these characteristics allows us to extend this study to other pathologies not yet investigated.
© 2019 The Author(s).

Keywords:  Classification; F0 glides; Mel-frequency cepstral coefficients; Pathologic cry; Probabilistic neural network; RF dysregulations

Year:  2019        PMID: 33281921      PMCID: PMC7672377          DOI: 10.1016/j.bspc.2019.01.010

Source DB:  PubMed          Journal:  Biomed Signal Process Control        ISSN: 1746-8094            Impact factor:   3.880


1. Introduction

The majority of sick babies appear healthy at birth. Early diagnosis of hidden pathologies during the first week of life is therefore crucial, because quick and effective treatment could save the lives of these babies. However, a systematic neonatal diagnosis procedure for all newborns is costly because it involves numerous health professionals and specialized equipment. Our aim is therefore to develop a low-cost diagnostic system that allows pediatricians to detect pathologies affecting newborns using spontaneous cry signals. Cry signal analysis is a valuable tool for predicting neonatal diseases. It allows a sick infant to be recognized when signs of illness are absent. Previous studies of infant cries have adopted two different approaches: 1) automatic recognition, which consists of cry classification using advanced signal processing techniques [1, 2, 3, 4, 5], and 2) spectral cry analysis based on observing the spectrograms of cry signals with software tools [6, 7, 8, 9, 10]. However, the cries of newborns provide important acoustic parameters that are not considered while monitoring the first days of infant life or in the standard measurements of the Apgar score (appearance, pulse, grimace, activity, and respiration), which is used to verify a baby’s health immediately after birth. Moreover, studies on the development of reliable estimation procedures for the most important acoustic characteristics, as well as on the identification of their pathological markers, are scarce. Unlike previous works, our study aims to propose an automated tool that uses important characteristics of cry signals to support the diagnosis of diseases in newborn infants.

These characteristics are described in Section 4.2.1. They include acoustic parameters that qualify the vocal tract, such as mel-frequency cepstral coefficients (MFCCs) and resonance frequency dysregulation (RFsdys), and the vocal folds, such as fundamental frequency glide (F0glide). F0glide has been associated with central nervous system diseases, asphyxia, and malformations of the orolaryngeal tract; RFsdys indicates poor neural control of the vocal tract and breathing and has been associated with hyperbilirubinemia and with prenatal tobacco and cocaine exposure [11]. MFCCs can properly represent various models of cries [2]; they allow the features of the vocal tract to be decoupled from the features generated by the excitation source. To differentiate pathologic cries from healthy ones, we strengthened the previous studies with data based on cry analysis. Hence, the primary purpose of our work is to complete a detailed analysis of the acoustic phenomena produced in newborn cries, focusing specifically on the prevalence of F0glide and RFsdys. Moreover, we study the effectiveness of their use as inputs to the proposed newborn pathological cry identification system (PCIS). We performed specific procedures to estimate the average percentage of F0glide, the rising and falling times of F0glide (Tglide), and the percentage of RFsdys in the cry signals of healthy and sick infants. By implementing these procedures as described in Section 4.2.2 and by using the median and interquartile range, we managed to 1) determine the quantitative relationship between these acoustic parameters and the various studied pathologies, and 2) facilitate their use in the proposed diagnostic system.

This paper is organized as follows. Section 2 provides an overview of previous works. Section 3 presents the details of the database used. Section 4 explains our methodology for cry signal feature extraction. Section 5 describes the proposed PCIS. Section 6 provides a statistical analysis of the estimated characteristics and an evaluation of the classification results. Finally, Section 7 presents the conclusion and summarizes suggestions and directions for further research.

2. Previous studies

Studies of neonatal cries have focused primarily on the classification of cries and on the relationship between the characteristics of cry signals and diseases. These two research directions have progressed separately. Hence, the characteristics of cries that are closely related to the pathologies of newborns have not been considered in cry classification studies. In cry recognition studies, different types of classifiers have been used, such as artificial neural networks [12], Gaussian mixture models [13], hidden Markov models [2], probabilistic neural networks (PNN) [3], support vector machines [14], and radial basis function networks [15]. The automatic recognition systems developed were applied to the cry signals of normal infants and of infants suffering from deafness, asphyxia, cleft lip, or cleft palate. The most used features were MFCCs, wavelet packet transform coefficients, and linear predictive coding coefficients. To our knowledge, none of these studies considered the acoustic characteristics of cry signals in the classification of pathologic cries. In this study, we are interested in the early diagnosis of newborn pathologies using the acoustic characteristics of cry signals.

Most direct methods used in previous studies to assess the acoustic properties of cry signals were based on spectrographic analysis using visual techniques [8, 9]. The studied parameters were measured manually on the spectrogram of the cry signal. These studies focused primarily on babies at risk with low neurological factors such as prematurity, hyperbilirubinemia, and prenatal exposure to drugs, and on neurological damage such as Krabbe’s disease, Down syndrome, asphyxia, and meningitis [16]. Several characteristics have been associated with these diseases, such as hyperphonic cries, F0 irregularity, extremely high-pitched cries, dysphonic cries, change in phonation mode, variability of F0, glide of F0, and dysregulation of RFs [7, 8, 10].

In our previous studies [10, 17, 18], we adopted an automatic approach to evaluate the prevalence of these features. The results indicated that the studied characteristics depend on the pathology itself. We also provided experimental results for a proposed PCIS that separates pathologic cries from healthy ones, with and without the studied characteristics, such as the average percentage of hyperphonic cries and F0 irregularity [17]. The cry classification results were better when these features were used in the PCIS.

3. Database and cry recording

The acoustic analysis presented herein was performed on a real database. The cry signals were recorded specifically to study the possibility of early diagnosis of various pathologies using spontaneous cries during the first days of infant life. The recordings were made in the Pediatrics Department of the Sainte-Justine Hospital in Montreal, using a small recorder placed 10 cm from the baby’s mouth, with a sampling rate of 44.1 kHz and a resolution of 16 bits. This database is similar to that used in [17], [18], and [19]. A description of the recording sessions was presented in a previous work [13]. The recordings were made in the presence of background noise, for healthy and pathological newborns and for different kinds of cries, such as those caused by hunger, blood sampling, and diaper changes. The categories of health conditions considered in our cry database are as follows: healthy, heart problems, neurological disorders, respiratory diseases, and blood abnormalities. The constructed dataset contains 3250 cry samples of 1-s duration from 66 babies, including preterm and full-term newborns aged from 1 day to 1 month. Its distribution by pathology and gestational age is given in Table 1. It is noteworthy that the age of the babies does not exceed one month because infants acquire voluntary control of their vocal tracts beyond this age [10].
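Table 1

Studied pathologies for different gestational ages

Gestational age | Pathology | Sample size
Full-term newborn (t) | Healthy | 1010
Full-term newborn (t) | Hyperbilirubinemia | 250
Full-term newborn (t) | Vena cava thrombosis | 77
Full-term newborn (t) | Meningitis | 115
Full-term newborn (t) | Peritonitis | 20
Full-term newborn (t) | Asphyxia | 190
Full-term newborn (t) | Lingual frenum | 141
Preterm newborn (P) | Healthy | 764
Preterm newborn (P) | IUGR-microcephaly (in utero growth retardation) | 78
Preterm newborn (P) | Tetralogy of Fallot | 53
Preterm newborn (P) | Gastroschisis | 134
Preterm newborn (P) | IUGR-asphyxia (intra-uterine growth retardation) | 148
Preterm newborn (P) | RDS (respiratory distress syndrome) | 270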
All the samples in the database were used to investigate and analyze the proposed acoustic features in cry signals. However, because more healthy samples are available than pathology samples, and the number of cry samples differs from one disease to another, only part of this database was used in the study of the newborn’s PCIS, as shown in Table 4 in Section 6.2.
Table 4

Distribution of cry samples per class

Experiment | Number of classes | Gestational age | Samples per class | Classes
First | 2 classes | Preterm | 50×5/250 | Pathologic (tetralogy of Fallot, gastroschisis, IUGR-microcephaly, RDS, IUGR-asphyxia) / healthy
First | 2 classes | Full-term | 50×5/250 | Pathologic (hyperbilirubinemia, lingual frenum, vena cava thrombosis, asphyxia, meningitis) / healthy
Second | 5 classes | Preterm | 50×5 | Tetralogy of Fallot / gastroschisis / RDS / IUGR-asphyxia / healthy
Second | 5 classes | Full-term | 50×5 | Hyperbilirubinemia / vena cava thrombosis / asphyxia / meningitis / healthy

4. Cry signal features extraction

Feature extraction is a crucial step in the classification task; it affects the accuracy and reliability of pattern recognition. In our work, new features were extracted in addition to MFCCs. They were used to characterize different models of pathological cries as well as the cries of healthy newborns. These new high-level features are F0glide, Tglide, and RFsdys.

4.1 MFCC

MFCCs are the most widely used feature coefficients in automatic infant cry classification [1, 2, 4]. Because cry signals can be considered statistically stationary over short periods of time, MFCCs were extracted using a frame duration of 23.2 ms (1024 samples) with 50% overlap and a sampling rate of 44.1 kHz. For every cry signal, we extracted 13 MFCC parameters for each 23.2-ms frame, which yields an MFCC matrix of 13 rows × N columns, where N is the total number of frames in the whole cry signal. The calculation of MFCCs was performed as follows:
1) Perform glottal inverse filtering to attenuate the influence of the vocal tract.
2) Multiply each frame by the Hamming window.
3) Estimate the power spectrum sequence using the fast Fourier transform (FFT).
4) Map the frequency outputs of the FFT onto the mel scale and take the logarithm of all filterbank energies.
5) Apply the discrete cosine transform to the log filterbank energies.
Hence, the MFCCs are obtained from the following relation:

$$c_m = \sum_{k=1}^{K} \log(S_k)\,\cos\left[m\left(k - \tfrac{1}{2}\right)\frac{\pi}{K}\right], \quad m = 1, 2, \ldots, M$$

where K is the number of filterbanks, which is 20 in this study, M is the length of the cepstrum, which is chosen to be 13, and S_k represents the energy output of the kth triangular band-pass filter.
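As an illustration of the last step only, the following minimal Python sketch applies the standard MFCC discrete cosine transform to the log filterbank energies of a single frame. It is not the authors' Matlab implementation: the function name `mfcc_from_filterbank` and the example energies are hypothetical, and the mel filterbank outputs are assumed to have been computed beforehand and to be strictly positive.

```python
import math


def mfcc_from_filterbank(energies, num_coeffs=13):
    """Return num_coeffs cepstral coefficients from mel-filterbank energies.

    energies: the K filterbank energy outputs S_k of one frame (assumed > 0).
    Uses the textbook relation c_m = sum_k log(S_k) * cos(m * (k - 1/2) * pi / K),
    written here with a 0-based k, so (k + 0.5) replaces (k - 1/2).
    """
    num_filters = len(energies)
    log_energies = [math.log(s) for s in energies]
    coeffs = []
    for m in range(1, num_coeffs + 1):
        c_m = sum(
            log_energies[k] * math.cos(m * (k + 0.5) * math.pi / num_filters)
            for k in range(num_filters)
        )
        coeffs.append(c_m)
    return coeffs


# Hypothetical frame with K = 20 filterbank energies (illustrative values only).
example_energies = [1.0 + 0.1 * k for k in range(20)]
example_mfcc = mfcc_from_filterbank(example_energies)
```

A spectrally flat frame (all energies equal) gives coefficients that are numerically zero, which is a quick sanity check that the transform removes the constant spectral level.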

4.2 Acoustic features

The set of acoustic parameters used in our study is presented in Table 2. The first five characteristics were investigated previously in [19]. The automatic estimation of the remaining features is described below.
Table 2

Studied cry characteristics

Characteristic | Definition
Fundamental frequency (F0) | Vibration frequency (in Hz) of the vocal folds.
F0 harmonics | Multiples of the fundamental frequency.
RF1, RF2 (Hz) | The first and second vocal tract resonance frequencies.
Tuning (TUP) | Blocks of 1024 samples during which an RF remains close to the harmonics of F0 (distance < 100 Hz).
Transition (TRP) | Blocks of 1024 samples between two subsequent TUPs.
F0glide | Rapid increase or decrease in F0 of 600 Hz or more.
Glide duration (Tglide) | Defined by the start time and finish time of the F0 glide.
Dysregulation of RFs (RFsdys) | High frequency variability of RFs (RF1, RF2).

4.2.1 Features description

Research in acoustic phonetics has focused on the mechanisms of speech production rather than on cry production in newborns. Moreover, the acoustic properties of the vocal folds and many aspects of source-filter interaction have not been clearly defined for newborn cries. Therefore, we investigated the acoustic properties of speech and singing to understand the F0glide and RFsdys phenomena in cry signals. In the theory of vowel and voiced consonant production, the glottis operates independently of the vocal tract [20]. This theory applies to male speech more than to female and child speech. However, glide production is influenced by the interaction between the vocal tract and the vocal folds [21]. When the oscillation frequency of the vocal folds, F0, approaches the vocal tract resonances RFs (RF1, RF2), a nonlinear coupling of the source-filter system occurs. This interaction is more important for female and child speech and even more so for singing [22]. It has been reported [23] that rising and falling F0 occur with upward and downward movement of the larynx, respectively, and that the control mechanisms of F0 are influenced by the movements of the laryngeal frame and cervical spine. Based on the average pressure decrease across a narrow constriction in normal voice, Stevens [24] defined glides as a class of consonants with a constriction that is not sufficiently narrow. Thus, the glides were produced with a greater degree of constriction in the vocal tract, and their categories were differentiated by the aerodynamics of different degrees of vocal tract constriction.

In cry signals, F0glide has been defined as a rapid change in F0 (> 600 Hz) within 0.1 s; it can be either rising or falling [8]. This characteristic is shown in Figure 1(a), which presents the spectrogram of a hyperphonic and dysphonic cry of a newborn suffering from asphyxia. The F0 contour of this cry shows an upward glide from 1102 Hz to 2390 Hz between 0.053 s and 0.210 s; the duration of this glide (0.157 s) is therefore longer than 0.1 s.
Figure 1

Spectrograms of cry signals using PRAAT [29]. a) Rising F0glide in the cry of full-term infant suffering from asphyxia. b) RF1dys between two successive TUPs in the cry of preterm infant with gastroschisis.

Some studies have investigated the relationship between the glottal properties and the vocal tract acoustics of opera singers [25, 26, 27, 28]. According to [25], an increase or decrease in resonance frequencies is related to possible changes in the configuration of the organs involved in voice production: glottal-opening duration, glottal vibratory amplitude, glottal area, laryngeal height, mouth opening, and tongue shape. The tuning (TUP) phenomenon is defined as an adjustment of an RF such that it is close to F0 or one of its harmonics, and it has been shown that professional singers use many resonance tuning strategies for vowels (RF1:2F0, RF1:3F0, RF2:4F0, RF2:5F0, RF2:6F0, and RF2:8F0 tuning) [25]. In our previous work [19], an automated approach was used to estimate and evaluate the duration and percentage of TUP between RFs and the harmonics of F0 for healthy and pathologic newborn cries. The results encouraged us to further investigate the variability of RFs, termed RFsdys here. Furthermore, this characteristic has been associated with neonatal diseases [10]. However, to the best of our knowledge, no study has evaluated the variation pattern of RFsdys in either cry signals or speech signals. To quantify RFsdys and analyze its variation patterns according to the studied pathologies, we define this characteristic by the RF jumps between two successive TUPs. For example, during the transition period (TRP) shown in Figure 1(b), RF1 suddenly jumps downward from 3178 Hz to 2196 Hz at t = 0.031 s. Because F0 = 450 Hz, this jump occurs between two successive tunings with the 7th and 5th harmonics of F0 (RF1:7F0 and RF1:5F0). Thus, in this case, the jump “J” is equal to two.

4.2.2 Feature estimation

The adopted approach for the estimation of F0glide, Tglide, and RFsdys is illustrated in Figure 2, and the principal steps are detailed below.
Figure 2

Flowchart of adopted approach.

In the first step, cry signals were processed manually using PRAAT (a freeware program for the analysis and reconstruction of acoustic speech signals) [29]. An automated preprocessing method is currently being investigated by other researchers studying the PCIS [30]. Because the recorded sounds include background noise, speech, sounds of medical equipment, and silence, some of these parts may distort the results of the analysis. Therefore, this step consists of noise filtering and segmentation of the recordings into useful and non-useful segments. Subsequently, the database was organized by pathology and gestational age.

In the second step, we applied the following procedure for acoustic characteristic measurement using self-developed functions in Matlab. It was performed for healthy and pathologic cries by pathology and gestational age. Each cry segment of 1-s duration was divided into 50% overlapping frames of 1024 samples, and each frame was multiplied by the Hamming window. Thereafter, for each data frame, we estimated F0 and its 10 harmonics using the simplified inverse filter tracking (SIFT) algorithm [31, 32], and we estimated the RFs (RF1, RF2) using the modified covariance method based on the autoregressive power spectral density (AR-PSD) [33, 34]. The primary steps of the estimation algorithms for F0 and RFs are detailed in [18]. The performance of these algorithms was tested on a real newborn cry database [16, 32]. The derivative M(t) was estimated from the F0 contour measurements and subsequently used for F0glide detection. The TUPs and the jumps of RFs were estimated from the measurements of the harmonics of F0 and of the RFs. In contrast to a previous study [35], where the TUPs were investigated using a special visualization concept, the TUPs in our work were detected automatically according to the definitions given in Table 2 and subsequently used for RFsdys estimation. For each cry sample, F0glides, Tglide, and RFsdys (RF1dys, RF2dys) were calculated using the algorithms detailed below.
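The framing and windowing stage described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' Matlab functions: the names `hamming` and `windowed_frames` are hypothetical, the window uses the common symmetric definition w[n] = 0.54 - 0.46 cos(2πn/(N-1)), and any incomplete frame at the end of the segment is simply dropped (an assumption, since the paper does not say how the tail is handled).

```python
import math


def hamming(n_points):
    """Symmetric Hamming window of n_points samples."""
    return [
        0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_points - 1))
        for n in range(n_points)
    ]


def windowed_frames(signal, frame_len=1024, overlap=0.5):
    """Split a signal into overlapping frames and apply a Hamming window.

    With frame_len = 1024 and overlap = 0.5 the hop is 512 samples.
    Trailing samples that do not fill a whole frame are discarded.
    """
    hop = int(frame_len * (1.0 - overlap))
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames


# A 1-s segment at 44.1 kHz, represented here by a constant dummy signal.
one_second = [1.0] * 44100
frames = windowed_frames(one_second)
```

Each frame then feeds the F0 and RF estimators; those estimators (SIFT and the modified covariance AR method) are not reproduced here.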

4.2.2.1 F0glide and Tglide estimation

The following approach was adopted in this work to detect F0glide and its corresponding Tglide in 1-s cry segments:
1) Estimate F0 and obtain the derivative M(t) of the F0 contour.
2) Obtain the time indexes x at which M(t) is equal to zero: $M(x_i) = 0$, $i = 1, \ldots, N$, where N is the number of zeros of the function M(t).
3) Calculate the corresponding values $F_0(x_i)$.
4) Identify the glides of F0 by the following condition: $|F_0(x_{i+1}) - F_0(x_i)| \geq 600\ \text{Hz}$, $i = 1, \ldots, N-1$.
5) Obtain the duration Tglide of each F0glide by the following formula: $T_{glide} = T(x_{i+1}) - T(x_i)$, where T(x) is the time at index x.
6) Calculate the average percentage of F0glide (PF0glide) and the average of Tglide (ATglide) by pathology and gestational age, using these formulas: $PF0_{glide} = \frac{N_{glide}}{N_{Total}} \times 100$ and $AT_{glide} = \frac{1}{N_{glide}} \sum_{j=1}^{N_{glide}} T_{glide}(j)$, where NTotal is the total number of segments in cry samples of 1-s duration and Nglide is the number of F0glides that occurred.
7) Calculate the median and interquartile range of the evaluated characteristics PF0glide and ATglide.
Herein, we did not apply the time constraint of 0.1 s to the durations of F0glide. Thus, Tglide, which could be of any length, was estimated and analyzed according to the pathologies.
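The detection steps above can be sketched as follows. This is a hedged Python illustration rather than the authors' code: the function name `find_glides` is hypothetical, the derivative M(t) is approximated by first differences of the frame-wise F0 values, a zero of M(t) is taken to be any frame where that difference changes sign or vanishes, and the first and last frames are also treated as candidate end points (an assumption, so that a glide touching the segment boundary is not lost).

```python
def find_glides(times, f0, min_change_hz=600.0):
    """Return (start_time, end_time, delta_f0) for each detected F0 glide.

    times, f0: frame times in seconds and F0 estimates in Hz, same length.
    A glide is a change of at least min_change_hz between two successive
    turning points of the F0 contour; no duration limit is applied.
    """
    if len(f0) < 2:
        return []
    # First differences stand in for the derivative M(t).
    slopes = [f0[i + 1] - f0[i] for i in range(len(f0) - 1)]
    turning = [0]
    for i in range(1, len(slopes)):
        if slopes[i - 1] * slopes[i] <= 0:  # sign change or flat: M crosses zero
            turning.append(i)
    turning.append(len(f0) - 1)

    glides = []
    for a, b in zip(turning, turning[1:]):
        delta = f0[b] - f0[a]
        if abs(delta) >= min_change_hz:
            glides.append((times[a], times[b], delta))
    return glides


# Illustrative contour (made-up values): flat, then a rise of 1290 Hz, then flat.
t = [0.00, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
contour = [1100.0, 1100.0, 1500.0, 1950.0, 2390.0, 2390.0, 2300.0]
glides = find_glides(t, contour)
durations = [end - start for start, end, _ in glides]
```

A positive `delta_f0` marks a rising glide and a negative one a falling glide; the list of durations is what would be averaged per pathology to obtain ATglide.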

4.2.2.2 RFsdys estimation

The estimation of RFsdys was performed by computing the jump “J” of the RFs between two subsequent TUPs, that is, during the transition period (TRP). The jump represents the number of harmonics between two subsequent TUPs. This study examines jumps in the range of 1 to 9. In our work, the TUPs and TRPs were detected automatically and used for the evaluation of RFsdys. RFsdys was investigated separately for RF1dys and RF2dys, with the same procedure for each data frame in 1-s cry segments, as follows:
1) Estimate F0 and its 10 harmonics.
2) Estimate RF1 and RF2.
3) Identify TUPs separately for RF1 and RF2 by the following condition: $|RF_j(p) - n\,F_0(p)| < 100\ \text{Hz}$, $j = 1, 2$, where n is the order of the harmonic (1 to 9) and p is the index of the frame in the cry segment.
4) Find the time indexes of the start and the end of the TRPs, i.e., (s1, e1) and (s2, e2), for RF1 and RF2, respectively.
5) Calculate the corresponding RF1(s1), RF1(e1), RF2(s2), and RF2(e2).
6) Obtain the jumps J1 and J2 for RF1 and RF2, respectively, as the number of F0 harmonics spanned during each TRP: $J_1 = C_{RF1} / F_0$ and $J_2 = C_{RF2} / F_0$, rounded to the nearest integer, with $C_{RF1} = |RF_1(e_1) - RF_1(s_1)|$ and $C_{RF2} = |RF_2(e_2) - RF_2(s_2)|$, where CRF1 and CRF2 are the differences in RF1 and RF2, respectively, during the TRPs, evaluated for each of the N1 and N2 TRPs of RF1 and RF2.
Thereafter, for each jump J1 and J2 in the range of 1 to 9, the average percentages of RF1dys (PRF1dys) and of RF2dys (PRF2dys) were estimated for each pathology and gestational age, using these formulas, respectively: $PRF1_{dys} = \frac{N_{J1}}{N_{Total}} \times 100$ and $PRF2_{dys} = \frac{N_{J2}}{N_{Total}} \times 100$, where NJ1 and NJ2 are the numbers of jumps J1 and J2, respectively, in cry samples of 1-s duration, and NTotal is the total number of segments in cry samples of 1-s duration. Finally, the median and interquartile range of the evaluated characteristics PRF1dys and PRF2dys were calculated.
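The tuning test and the jump count can be illustrated with a short Python sketch. It is a simplified reading of the procedure, not the authors' code: the names `tuned_order` and `rf_jumps` are hypothetical, the 100 Hz tolerance comes from the definition of TUP in Table 2, orders up to the 10th harmonic are searched (an assumption based on the 10 harmonics estimated per frame), and a jump is taken as the difference between the harmonic orders of two successive tuned frames, as in the Figure 1(b) example (7th to 5th harmonic, J = 2).

```python
def tuned_order(rf_hz, f0_hz, max_order=10, tol_hz=100.0):
    """Harmonic order n with |RF - n*F0| < tol_hz, or None if the frame is untuned."""
    n = min(range(1, max_order + 1), key=lambda k: abs(rf_hz - k * f0_hz))
    return n if abs(rf_hz - n * f0_hz) < tol_hz else None


def rf_jumps(f0_track, rf_track):
    """List the jumps J between successive tuning periods of one RF track.

    f0_track, rf_track: per-frame F0 and RF estimates in Hz, same length.
    Untuned frames are treated as transition (TRP) frames and skipped.
    """
    jumps = []
    previous = None  # harmonic order of the most recent tuned frame
    for f0_hz, rf_hz in zip(f0_track, rf_track):
        order = tuned_order(rf_hz, f0_hz)
        if order is None:
            continue  # inside a TRP: wait for the next tuned frame
        if previous is not None and order != previous:
            jumps.append(abs(order - previous))
        previous = order
    return jumps


# Frames loosely modelled on the Figure 1(b) example (F0 = 450 Hz);
# the intermediate RF1 values are made up for illustration.
f0_track = [450.0] * 5
rf1_track = [3178.0, 3160.0, 2480.0, 2196.0, 2230.0]
jumps = rf_jumps(f0_track, rf1_track)
```

One limitation of this simplification is that an RF sweeping slowly through an intermediate harmonic would be counted as two smaller jumps; a fuller implementation would work on explicit TUP and TRP blocks.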

5. PCIS description

The proposed diagnostic system was built around an efficient PNN classifier and significant infant cry features. In this study, we investigate the effectiveness of the extracted characteristics (PF0glide + ATglide + PRF1dys + PRF2dys) concatenated with MFCC features for the classification of healthy and sick newborn infants according to their cries. The resulting feature sets were used as inputs to the PNN classifier. The PNN classifier is widely used for classification problems in the medical domain [36, 37]. It is well suited to real-time applications and is computationally inexpensive. Using the conjugate gradient method, it can learn from new incoming training data without having to repeat the whole training process [38]. The PNN classifier implements the Bayesian decision rule. Its network architecture is based on three layers: (1) the input layer, which computes the distances between the input vector and the training inputs; (2) the radial basis layer, which produces a vector of probabilities; and (3) the competitive layer, which selects the maximum of these probabilities. The network assigns the input vector to the class with the highest probability [36]. Matlab was used for the development of the test system. The primary steps are as follows:
1) Recovery of the wav files from each folder of the studied pathologies.
2) Division of the 1-s cry signals into frames of 1024 samples with 50% overlap.
3) For each 1-s cry segment: estimation of PF0glide, ATglide, PRF1dys, and PRF2dys; extraction of 13 MFCCs for each overlapping frame; conversion of the MFCC matrix of each cry sample into a single vector; and concatenation of this vector with the corresponding PF0glide, ATglide, PRF1dys, and PRF2dys. The resulting vectors were used as the final inputs to the PNN classifier.
4) Application of five-fold cross validation, with four folds for the training set and one fold for the testing set.
5) Data classification using the PNN classifier; the strategy proposed in [36] and the function newpnn() in Matlab [39] were used.
6) Calculation of the correct identification rate, namely the accuracy.
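To make the three-layer description concrete, the following Python sketch implements a generic Parzen-window PNN decision for a single input vector. It is not Matlab's `newpnn` and does not reproduce the strategy of [36]: the function name `pnn_classify`, the Gaussian kernel width `sigma`, and the toy feature vectors are all assumptions, and equal class priors are assumed by averaging the kernel responses within each class.

```python
import math


def pnn_classify(train_x, train_y, x, sigma=1.0):
    """Assign x to the class with the largest mean Gaussian kernel response.

    train_x: list of training feature vectors; train_y: their class labels.
    Layer 1 computes squared distances to every training vector, layer 2
    turns them into radial basis responses summed per class, and layer 3
    picks the class with the highest (class-size normalized) response.
    """
    sums = {}
    counts = {}
    for xi, yi in zip(train_x, train_y):
        dist2 = sum((a - b) ** 2 for a, b in zip(xi, x))
        response = math.exp(-dist2 / (2.0 * sigma ** 2))
        sums[yi] = sums.get(yi, 0.0) + response
        counts[yi] = counts.get(yi, 0) + 1
    return max(sums, key=lambda label: sums[label] / counts[label])


# Toy two-dimensional feature vectors (illustrative values only).
train_x = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
train_y = ["healthy", "healthy", "pathologic", "pathologic"]
label_a = pnn_classify(train_x, train_y, [0.2, 0.3])
label_b = pnn_classify(train_x, train_y, [5.5, 5.0])
```

In practice the kernel width would need tuning, and features with very different ranges (MFCCs versus percentages) would normally be scaled before the distances are computed.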

6. Results and discussion

6.1 Acoustic and statistical analysis

The cry signals were analyzed using the studied characteristics (PF0glide, ATglide, PRF1dys, and PRF2dys), according to gestational age (full-term and preterm) and the diseases affecting the newborns. The cry features given in Table 2 were estimated for the complete database presented in Table 1. An example of F0glide is shown in Figure 3, which presents the estimated F0 for a 1-s cry of a newborn suffering from asphyxia. The irregular F0 contour shows a rising glide between 0.21 s and 0.24 s and a falling glide between 0.58 s and 0.63 s.
Figure 3

Estimated F0 for a cry signal of a newborn suffering from asphyxia; rising and falling F0glide (red ellipses).

The average percentage of F0glide (PF0glide) and the average durations of the glides (ATglide) are given in Table 3 and presented by pathology and gestational age in Figures 4(a) and 4(b), respectively.
Table 3

PF0glide and ATglide estimation results

Pathology | PF0glide (%) median | PF0glide (%) interquartile range | ATglide (s) median | ATglide (s) interquartile range
Healthy (t) | 0 | 0 | 0 | 0
Hyperbilirubinemia (t) | 8.33 | 3.9 | 0.16 | 0.10
Vena cava thrombosis (t) | 7.14 | 4.87 | 0.25 | 0.11
Meningitis (t) | 0 | 0 | 0 | 0.00
Peritonitis (t) | 12.5 | 6.52 | 0.22 | 0.12
Asphyxia (t) | 8 | 4.17 | 0.15 | 0.09
Lingual frenum (t) | 0 | 0 | 0 | 0.00
Healthy (P) | 0 | 0 | 0 | 0.00
IUGR-microcephaly (P) | 18.46 | 12.5 | 0.28 | 0.10
Tetralogy of Fallot (P) | 10 | 4.91 | 0.31 | 0.17
Gastroschisis (P) | 16.66 | 8 | 0.27 | 0.11
IUGR-asphyxia (P) | 11.11 | 4.95 | 0.23 | 0.12
RDS (P) | 10 | 5.71 | 0.18 | 0.11
Figure 4

Box-and-whisker plots for a) PF0glide and b) Tglide, by pathology.

The results indicate that F0 glides are absent from the cry signals of both healthy preterm and healthy full-term newborns, and also from the studied cases of lingual frenum and meningitis. They also indicate that, in the group of full-term newborns, the highest median and interquartile range of PF0glide are found in peritonitis, with average glide durations compared to other diseases. The lowest percentage is found in vena cava thrombosis, with longer F0glide durations. In the same group, hyperbilirubinemia and asphyxia differ only slightly in both the occurrence of F0glide and ATglide. Across the two groups of newborns, the highest median and interquartile range of PF0glide were observed in the cries of preterm newborns affected by IUGR-microcephaly, with rather long glide durations. However, Figure 4 and Table 3 show that the highest median and interquartile range of ATglide characterize tetralogy of Fallot, which presents the same median PF0glide as RDS.

Concerning RFsdys, examples of this feature are shown in Figure 5, which presents the estimates of F0, its harmonics, RF1, and RF2 for the 1-s cry of a newborn suffering from hyperbilirubinemia. An RF1dys was observed at approximately 0.38 s between the third and sixth harmonics (J = 3), and an RF2dys at approximately 0.26 s between the seventh and ninth harmonics (J = 2).
Figure 5

Full-term newborn suffering from hyperbilirubinemia, F0 and their harmonics (red lines), RFs: RF1 (black dots) and RF2 (blue dots), RFsdys between TUPs (green rectangles).

Comparative results are presented in Figure 6. They show the variation patterns of RFsdys by pathology. This dysregulation is expressed by the percentage of jumps “J” defined in Section 4.2.2.2. Figures 6(a) and 6(b) show that the majority of cries of preterm newborns are characterized by more RF1dys for J = 1 than the cries of full-term newborns. The percentage of RF1 jumps is higher for all jumps J = 1 to 9 in the cries of healthy preterm newborns than in the cries of healthy full-term newborns.
Figure 6

Average percentage of RFsdys by jumps “J”. a) RF1dys for full-term newborns, b) RF1dys for preterm newborns, c) RF2dys for full-term newborns, d) RF2dys for preterm newborns.

According to Figure 6(a), the cry signals associated with vena cava thrombosis and asphyxia present the most RF1dys, with the highest percentage of jumps for J from 1 to 9, except for J = 3 and J = 4, where the largest percentage is found in the cry signals associated with lingual frenum. In the case of hyperbilirubinemia, for J = 1, 2, and 3, the cry signals show more RF1dys than those of healthy newborns. Further, in peritonitis, the percentage of jumps for J = 1, 2, 5, and 6 is higher than in healthy newborns. Unlike these pathologies, the cry signals in the case of meningitis present less RF1dys for all jumps J (1 to 9) than those of healthy newborns. From Figure 6(b), we can observe the highest RF1dys for jumps J = 1 to 6 in the case of tetralogy of Fallot compared to the other studied cases. We also noticed a higher percentage of RF1 jumps for J = 1 in both RDS and gastroschisis, and less RF1dys in the cry signals of IUGR-microcephaly and IUGR-asphyxia, compared to the cries of healthy preterm newborns.

The variation patterns of RF2dys for full-term and preterm newborn cry signals are shown in Figures 6(c) and 6(d), respectively. We observed that the percentage of RF2 jumps depends on the studied pathology. Depending on the jump J, it can be lower or higher than in healthy newborn cries. For both full-term and preterm newborns, all studied diseases are characterized by a small difference in the percentage of RF2 jumps compared to healthy preterm newborn cries, except for the cry signals of tetralogy of Fallot, which present a greater percentage of RF2dys than the healthy ones for J = 1 to 5.

6.2 Classification results

In this work, we present the results of systems that identify healthy and sick infants (with specific diseases, see Table 1). To test and validate the use of the studied characteristics, two experiments were performed for both full-term and preterm infants: 1) separating cries by the health status of the infant (pathologic cries from healthy ones); 2) separating cries into those of healthy infants and those of infants with specific diseases. The effectiveness of the studied characteristics (PF0glide, ATglide, PRF1dys, and PRF2dys) in the recognition of pathological cries was compared to the use of MFCC parameters alone as the input to the PCIS. The performance of the PCIS was evaluated based on the rate of correct identification (overall accuracy) [4]. The distribution of cry samples according to classes and pathologies is shown in Table 4. In the first experiment, the cry samples are distributed into two classes: healthy cries and pathological cries. In the second experiment, the cry samples are distributed into five classes according to the pathologies used. There are many healthy samples in our database and limited numbers of cry samples from sick infants. However, five-fold cross validation was used, in which each fold is of equal size and contains the same proportion of samples from each target class.

The best correct identification rates obtained for all implemented experiments are shown in Table 5. In the first test, MFCCs alone were used as features in the two experiments; the best performance obtained (72%) was for the identification of the pathologies of full-term infants. The second test used the studied features in addition to MFCCs and provided a better identification rate (88.71%), obtained for the health status of preterm infants.
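The fold construction described above, with equal-sized folds that each preserve the class proportions, is commonly called stratified cross validation. The sketch below is one simple way to build such folds in Python; the function name `stratified_folds` is hypothetical, the assignment is deterministic (round-robin within each class, with no shuffling), and the 250/250 class sizes are taken from the first experiment in Table 4.

```python
def stratified_folds(labels, n_folds=5):
    """Split sample indices into n_folds folds with matching class proportions.

    Samples of each class are dealt to the folds in round-robin order, so
    every fold receives the same share of each class when the class sizes
    are multiples of n_folds.
    """
    folds = [[] for _ in range(n_folds)]
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    for indices in by_class.values():
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


# First experiment of Table 4: 250 pathologic and 250 healthy cry samples.
labels = ["pathologic"] * 250 + ["healthy"] * 250
folds = stratified_folds(labels)

# Each fold serves once as the test set; the other four form the training set.
splits = []
for test_id, test_idx in enumerate(folds):
    train_idx = [i for k, fold in enumerate(folds) if k != test_id for i in fold]
    splits.append((train_idx, test_idx))
```

With these sizes every fold holds 100 samples, 50 from each class, and each training set holds the remaining 400.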
Table 5

Overall accuracy results

| Experiment | Number of classes | Gestational age | MFCC | PF0glide + ATglide + PRF1dys + PRF2dys + MFCC |
|---|---|---|---|---|
| First | 2 | Preterm | 70.16% | 88.71% |
| First | 2 | Full term | 60.00% | 67.00% |
| Second | 5 | Preterm | 60.00% | 72.00% |
| Second | 5 | Full term | 72.00% | 82.00% |
According to the comparative results presented in Table 5, using the studied characteristics (PF0glide, ATglide, PRF1dys, and PRF2dys) in addition to MFCCs improves the performance of the PCIS compared to using MFCCs alone. In the first experiment, the overall accuracy improved by 18.55 percentage points for the identification of the health status of preterm infants and by 7 percentage points for the identification of the health status of full-term infants. In the second experiment, the overall accuracy improved by 12 percentage points for the identification of preterm infants with a specific disease, and by 10 percentage points for the identification of full-term infants with a specific disease.
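The improvements quoted above are absolute differences in accuracy (percentage points), not relative gains. The short sketch below recomputes them from the Table 5 values; the dictionary layout and the function names `gain_points` and `relative_gain` are illustrative assumptions, and the relative gain is shown only to make the distinction explicit.

```python
# Overall accuracy (%) from Table 5: (MFCC only, studied features + MFCC).
table5 = {
    ("first", "preterm"): (70.16, 88.71),
    ("first", "full term"): (60.00, 67.00),
    ("second", "preterm"): (60.00, 72.00),
    ("second", "full term"): (72.00, 82.00),
}


def gain_points(baseline, combined):
    """Absolute improvement in percentage points."""
    return round(combined - baseline, 2)


def relative_gain(baseline, combined):
    """Improvement relative to the baseline accuracy, in percent."""
    return 100.0 * (combined - baseline) / baseline


gains = {key: gain_points(*values) for key, values in table5.items()}

for (experiment, age), points in gains.items():
    print(f"{experiment} experiment, {age}: +{points} percentage points")
```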

7. Conclusion

In this work, we proposed an automated classification method for different cry patterns. We used new features that improved the accuracy of the recognition of pathological cries. This study focused on the following: 1) improvement of PCIS performance using the studied features, 2) description of our method for estimating the proposed features, and 3) description of associations between infant medical conditions and the considered characteristics. The proposed estimation methods permitted the analysis of a considerable number of pathologic cry signals in a short time, and allow the study to be extended to pathologies not yet considered. Hence, we are recording a larger database with a greater variety of pathologies and more subjects for each pathology.

We successfully identified the occurrence of the proposed characteristics in cry signals according to the studied pathologies. The glide of F0 was found to be absent in healthy newborn cry signals and also in some of the pathologies. The obtained results suggested that the prevalence of relatively slow glides increased in sick preterm infants, especially in babies with diseases affecting the CNS such as IUGR-microcephaly. Our results indicated a more complex variation pattern of RF2dys than of RF1dys in the pathologic cry signals. From the presented work and the results obtained in this study, we conclude that occurrences of F0glides and RF1dys, in addition to other characteristics investigated in our previous works, are useful in the diagnosis of the studied pathologies.

The encouraging classification results indicated the highly discriminatory nature of the proposed features. These characteristics can be explored as inputs to a diagnostic system using other modeling and classification methods to ultimately provide a basis for alerting healthcare workers to intervene. We also expect to improve the results by studying other classifiers.
The results obtained using the PNN classifier and the proposed acoustic features were not compared with the results of previous studies. It is noteworthy that this study investigated cases of full-term and premature babies, both with and without pathologies, which significantly increased the complexity of the PCIS. By contrast, the cry signals of babies with neonatal asphyxia and those of healthy babies, for example, differ significantly in their acoustic characteristics, which eases the identification process. In conclusion, our research underlined the importance of using acoustic features in the task of cry recognition and provided support and additional results to investigations regarding neonatal cries.