Veronika C Beeck, Gunnar Heilmann, Michael Kerscher, Angela S Stoeger.
Abstract
Sound production mechanisms set the parameter space available for transmitting biologically relevant information in vocal signals. Low-frequency rumbles play a crucial role in coordinating social interactions in elephants' complex fission-fusion societies. By emitting rumbles through either the oral or the three-times longer nasal vocal tract, African elephants alter their spectral shape significantly. In this study, we used an acoustic camera to visualize the sound emission of rumbles in Asian elephants, which have received far less research attention than African elephants. We recorded nine adult captive females and analyzed the spectral parameters of 203 calls, including vocal tract resonances (formants). We found that the majority of rumbles (64%) were nasally emitted, 21% orally, and 13% simultaneously through the mouth and trunk, demonstrating velopharyngeal coupling. Some of the rumbles were combined with orally emitted roars. The nasal rumbles concentrated most spectral energy in lower frequencies and exhibited two formants, whereas the oral and mixed rumbles contained higher formants and higher spectral energy concentrations, and were louder. The roars were the loudest, highest and broadest in frequency. This study is the first to demonstrate velopharyngeal coupling in a non-human animal. Our findings provide a foundation for future research into the adaptive functions of the elephant acoustic variability for information coding, localizability or sound transmission, as well as vocal flexibility across species.
Keywords: elephant; formant; functional morphology; graded repertoire; sound production; source-filter theory; vocal communication; vocal complexity; vocal tract; vocalization
Year: 2022 PMID: 36009709 PMCID: PMC9404934 DOI: 10.3390/ani12162119
Source DB: PubMed Journal: Animals (Basel) ISSN: 2076-2615 Impact factor: 3.231
Figure 1. Schematic figure of the vocal tract in sagittal view: (1) larynx (yellow), vocal folds (red), epiglottis (orange), trachea (green), esophagus (brown); (2) velum (dark blue); (3) tongue (pink); (4) nasal cartilages (violet). In this figure, the epiglottis is slightly open, unifying the oral and nasal vocal tract. When the (2) velum (dark blue) is lowered and touches the epiglottis (orange), the oral vocal tract can be closed and the nasal vocal tract opened for purely nasal sound emission (adapted from [67]).
Subjects and call sample sizes. Number of call types combined with emission per subject; the subjects' estimated age according to their handlers at the time of recording; and the subjects' shoulder height as a mean value of two measurements.
| Subject Name | Age | Shoulder Height (m) | Rumble Nasal | Rumble Oral and Nasal | Rumble Oral | Roar Oral | Sum |
|---|---|---|---|---|---|---|---|
| Champa | 41 | 2.40 | 24 | 9 | 10 | 10 | 53 |
| Chan Chun | 45 | 2.47 | 5 | 0 | 0 | 0 | 5 |
| Dhibya | 48 | 2.50 | 11 | 0 | 16 | 0 | 27 |
| Dipendra | 60 | 2.43 | 28 | 2 | 2 | 0 | 32 |
| Hira | 45 | 2.55 | 6 | 9 | 0 | 4 | 19 |
| Pawan | 55 | 2.41 | 36 | 4 | 0 | 0 | 40 |
| Raj | 42 | 2.49 | 2 | 1 | 14 | 0 | 17 |
| Saraswati | 27 | 2.40 | 1 | 0 | 0 | 0 | 1 |
| Sunder | 46 | 2.41 | 7 | 2 | 0 | 0 | 9 |
Acoustic parameters measured and their description.
| Acoustic Parameter | Description |
|---|---|
| Duration in s | Time from the onset until the end of the vocalization measured from the spectrogram. |
| Mean Fundamental Frequency (F0) | Mean of fundamental frequency values of 60 points spaced evenly across the tracked contour. |
| Frequency Variability Index (FVI) | Calculated variable that represents the magnitude of frequency modulation across a call, computed by dividing the variance in frequency by the square of the average frequency of a call and then multiplying the value by 10. |
| Inflection Factor (IF) | Percentage of points along the fundamental frequency’s contour showing a reversal in slope. |
| Jitter Factor (JF) | Calculated variable that represents a weighted measure of the amount of frequency modulation. The sum of the absolute differences between sequential frequencies is divided by the mean frequency; the result is then divided by the total number of points measured minus 1 and multiplied by 100. |
| Dominant Frequency (DF) in Hz | Frequency with the highest amplitude peak in the power spectrum. |
| Wiener Entropy | Measurement of tonality defined as the ratio of a power spectrum’s geometric mean to its arithmetic mean, expressed on a log scale. Lower values relate to higher tonality. |
| Quartile 25 (Q25) | Parameter characterizing the spectral energy distribution, i.e., the frequency value where 25% of the total energy is located below this value. |
| Quartile 50 (Q50) | Parameter characterizing the spectral energy distribution, i.e., the frequency value where 50% of the total energy is located below this value. |
| Quartile 75 (Q75) | Parameter characterizing the spectral energy distribution, i.e., the frequency value where 75% of the total energy is located below this value. |
| Spectral Centroid Frequency (SCF) | Weighted average frequency where the weights are the normalized energy of each frequency component. |
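
The verbal definitions in the table above can be written out as a minimal Python sketch. This is an illustration, not the authors' analysis code. All function names are hypothetical, and three points are assumptions the table leaves open: the variance estimator used for FVI (sample variance here), the denominator of IF (the total number of contour points here), and the log scaling of Wiener entropy (the function returns the plain ratio, to which a logarithm can be applied afterwards).

```python
import math
import statistics


def fvi(f0):
    """Frequency Variability Index: 10 * variance / mean^2."""
    mean = statistics.fmean(f0)
    # Assumption: sample variance; the table does not name the estimator.
    return 10 * statistics.variance(f0) / mean ** 2


def inflection_factor(f0):
    """Share of contour points at which the slope changes sign."""
    slopes = [b - a for a, b in zip(f0, f0[1:])]
    reversals = sum(1 for s1, s2 in zip(slopes, slopes[1:]) if s1 * s2 < 0)
    # Assumption: normalised by the total number of contour points.
    return reversals / len(f0)


def jitter_factor(f0):
    """100 * (sum of |sequential differences| / mean) / (N - 1)."""
    mean = statistics.fmean(f0)
    summed = sum(abs(b - a) for a, b in zip(f0, f0[1:])) / mean
    return 100 * summed / (len(f0) - 1)


def spectral_flatness(power):
    """Geometric mean over arithmetic mean of a strictly positive spectrum."""
    geometric = math.exp(statistics.fmean([math.log(p) for p in power]))
    return geometric / statistics.fmean(power)


def energy_quantile(freqs, power, q):
    """Lowest frequency bin at which the cumulative energy reaches q * total."""
    total = sum(power)
    running = 0.0
    for f, p in zip(freqs, power):
        running += p
        if running >= q * total:
            return f
    return freqs[-1]


def spectral_centroid(freqs, power):
    """Energy-weighted average frequency."""
    return sum(f * p for f, p in zip(freqs, power)) / sum(power)
```

Here `f0` stands for the tracked fundamental frequency contour (60 evenly spaced points in the table's definition of mean F0), and `freqs` and `power` stand for the bin frequencies and energies of a power spectrum. Q25, Q50 and Q75 correspond to `energy_quantile` with `q` set to 0.25, 0.5 and 0.75.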
Figure 2. Examples of rumble emission types. Rumbles emitted in succession by Pawan (f, 55y) during the separation phase of the separation–reunion experiment showing the acoustic camera images (left), spectrograms (middle) and power spectra (right). The LPC function lines in red indicate the formant positions (F1–F13). (a) Nasal rumble; (b) Mixed emission, with oral emission but with a distorted sound sphere appearing when the whole call is selected; (c) Nasal part of the mixed rumble when lower harmonics are selected; (d) Oral part of a mixed rumble when only upper harmonics are selected; with (e) a more dominant nasal part when only the lower harmonics are selected. Note that during (a) nasal rumble emission the elephant was chewing and the mouth was at times wide open, as defined by jaw lowering.
Figure 3. Example of a roar–rumble combination, emitted during the separation phase of a separation–reunion experiment by Hira (f, 45y). This shows acoustic camera images of different time and frequency ranges (a–f) and the corresponding spectrogram, with a zoom-in on the rumble part (e,f) and the power spectrum (a,b) taken in the middle of the roar between selection windows a and b. The LPC function line in red indicates the formant positions (F1–F9); (a,b,d) show the oral-only emission of the roar parts; (c) shows the partial oral emission of the mixed emission rumble part, exhibiting a distorted sphere, and (e) the nasal emission of the lower harmonics; (f) shows a pattern of simultaneous sound radiation over the front, the mouth and trunk, as well as a strong ground reflection, here at the transition between roar and rumble.
Behavioral context. Number of call types combined with call emission per behavioral context.
| Behavioral Context | Rumble Nasal | Rumble Oral and Nasal | Rumble Oral | Roar Oral | Sum |
|---|---|---|---|---|---|
| Arousal | 18 | 0 | 0 | 0 | 18 |
| Command | 17 | 2 | 5 | 0 | 24 |
| Contact | 26 | 15 | 6 | 5 | 52 |
| Greeting | 59 | 10 | 31 | 9 | 109 |
Mouth opening defined by visual jaw lowering during call emission.
| Mouth Opening | Rumble Nasal | Rumble Oral and Nasal | Rumble Oral | Roar Oral |
|---|---|---|---|---|
| Closed | 65 | 3 | 8 | 0 |
| Slightly open | 19 | 13 | 8 | 3 |
| Wide open | 6 | 9 | 12 | 9 |
| Chewing | 14 | 0 | 5 | 0 |
| Unknown | 16 | 2 | 9 | 2 |
Results of acoustic parameters measured per call emission type. The table shows sample sizes, median and mean values ± SD.
| Parameter | Rumble Nasal: n | Rumble Nasal: Median | Rumble Nasal: Mean ± SD | Rumble Oral and Nasal: n | Rumble Oral and Nasal: Median | Rumble Oral and Nasal: Mean ± SD | Rumble Oral: n | Rumble Oral: Median | Rumble Oral: Mean ± SD | Roar Oral: n | Roar Oral: Median | Roar Oral: Mean ± SD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Duration in s | 120 | 5.25 | 5.93 ± 3.42 | 27 | 3.94 | 4.12 ± 1.74 | 42 | 7.36 | 7.753 ± 3.86 | 14 | 1.28 | 1.42 ± 0.79 |
| SPL in dB | 72 | 51.7 | 51.57 ± 5.05 | 24 | 74 | 73.5 ± 11.28 | 27 | 65.2 | 62.159 ± 11.524 | 12 | 88.7 | 88.7 ± 5.87 |
| Mean F0 | 75 | 14.96 | 14.98 ± 3.87 | 21 | 21.78 | 22.39 ± 3.37 | 37 | 15.78 | 15.594 ± 3.634 | 11 | 172.4 | 170.93 ± 45.41 |
| FVI | 75 | 0.06 | 0.074 ± 0.05 | 21 | 0.12 | 0.12 ± 0.07 | 37 | 0.06 | 0.074 ± 0.068 | 11 | 0.04 | - |
| IF | 75 | 0.39 | 0.39 ± 0.11 | 21 | 0.27 | 0.28 ± 0.12 | 37 | 0.35 | 0.33 ± 0.11 | 11 | 0.17 | - |
| JF | 75 | 3.05 | 3.28 ± 1.08 | 21 | 3.16 | 3.27 ± 0.92 | 37 | 2.7 | 2.74 ± 1.09 | 11 | 2.07 | - |
| DF in Hz | 64 | 16.6 | 16.22 ± 3.55 | 20 | 97.81 | 69.99 ± 43.61 | 27 | 18.55 | 25.643 ± 23.51 | 14 | 378.29 | 505.52 ± 392.27 |
| Wiener Entropy | 66 | 0.17 | 0.23 ± 0.13 | 21 | 0.21 | 0.25 ± 0.11 | 29 | 0.20 | 0.23 ± 0.09 | 15 | 0.47 | 0.45 ± 0.13 |
| Q25 in Hz | 66 | 13.68 | 20.80 ± 53.34 | 21 | 42.10 | 53.93 ± 28.46 | 29 | 27.36 | 30.93 ± 12.21 | 15 | 300.00 | 278.36 ± 111.92 |
| Q50 in Hz | 66 | 34.73 | 53.52 ± 82.40 | 21 | 92.63 | 102.46 ± 48.84 | 29 | 63.15 | 70.54 ± 27.87 | 15 | 408.42 | 388.93 ± 138.15 |
| Q75 in Hz | 66 | 84.73 | 142.68 ± 129.58 | 21 | 134.73 | 185.67 ± 105.94 | 29 | 140.00 | 173.92 ± 99.84 | 15 | 510.52 | 529.67 ± 112.71 |
| SCF in Hz | 66 | 93.74 | 121.29 ± 79.97 | 21 | 133.44 | 157.93 ± 58.24 | 29 | 148.06 | 149.83 ± 44.92 | 15 | 421.55 | 417.90 ± 95.06 |
| F1 in Hz | 61 | 21.48 | 21.12 ± 2.71 | 19 | 33.2 | 33.51 ± 4.48 | 28 | 26.86 | 28.39 ± 6.58 | 13 | 157.47 | 188.38 ± 89.26 |
| F2 in Hz | 34 | 103.25 | 103.34 ± 8.96 | 18 | 110.84 | 110.03 ± 6.82 | 19 | 105.47 | 101.26 ± 14.12 | 13 | 405.03 | 438.91 ± 99.84 |
| F3 in Hz | 0 | 0 | 0 | 17 | 242.19 | 260.28 ± 56.47 | 16 | 240.23 | 259.32 ± 83.66 | 13 | 631.71 | 692.63 ± 161.82 |
| F4 in Hz | 0 | 0 | 0 | 17 | 437.5 | 408.72 ± 75.59 | 14 | 464.36 | 440.64 ± 79.13 | 13 | 869.04 | 936.07 ± 200.37 |
| F5 in Hz | 0 | 0 | 0 | 16 | 598.14 | 583.26 ± 125.44 | 13 | 545.9 | 577.96 ± 117.67 | 13 | 1097.53 | 1244.10 ± 335.26 |
| F6 in Hz | 0 | 0 | 0 | 16 | 750.49 | 755.17 ± 187.56 | 12 | 704.59 | 694.59 ± 144.73 | 13 | 1314.15 | 1467.41 ± 389.24 |
| F7 in Hz | 0 | 0 | 0 | 15 | 889.65 | 908.2 ± 196.67 | 12 | 814.45 | 823.24 ± 176.55 | 13 | 1644.11 | 1719.8 ± 410.19 |
| F8 in Hz | 0 | 0 | 0 | 14 | 960.94 | 1030.41 ± 229.54 | 9 | 878.91 | 899.31 ± 160.21 | 13 | 1881.59 | 1949.07 ± 439.4 |
| F9 in Hz | 0 | 0 | 0 | 14 | 1145.51 | 1162.25 ± 243.24 | 6 | 1021.6 | 1033.32 ± 155.65 | 13 | 2139.03 | 2270.22 ± 540.77 |
| F10 in Hz | 0 | 0 | 0 | 13 | 1244.14 | 1293.17 ± 289.66 | 6 | 1127.93 | 1167.91 ± 199.95 | 13 | 2315.92 | 2533.59 ± 555.53 |
| ΔF1–F10 in Hz | 34 | - | 81.11 ± 9.42 | 18 | - | 148.99 ± 40.02 | 19 | - | 127.61 ± 36.38 | 14 | - | 258.95 ± 66.65 |
| ΔF1–F2 in Hz | 34 | - | 81.11 ± 9.42 | 18 | - | 76.66 ± 4.56 | 19 | - | 70.73 ± 12.91 | 14 | - | 249.73 ± 56.82 |
| ΔF3–F10 in Hz | 0 | - | 0 | 17 | - | 160.96 ± 53.85 | 14 | - | 144.93 ± 31.99 | 14 | - | 261.33 ± 71.15 |
| ΔF1–F10 in cm | 34 | - | 218.76 ± 26.99 | 18 | - | 126.6 ± 38.13 | 19 | - | 151.46 ± 58.11 | 14 | - | 70.88 ± 14.27 |
| ΔF1–F2 in cm | 34 | - | 218.76 ± 26.99 | 18 | - | 229.08 ± 14.26 | 19 | - | 255.57 ± 47.83 | 14 | - | 73.11 ± 14.92 |
| ΔF3–F10 in cm | 0 | - | 0 | 17 | - | 119.46 ± 36.13 | 14 | - | 126.02 ± 26.19 | 14 | - | 70.77 ± 15.7 |
Figure 4. Formant position values in Hz per call and emission type plotted against the formant increment. The regression line gives the values for the formant spacing (ΔF1–F10) that is used to calculate the vocal tract length. Formant position values varied more among calls in higher formants.
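
The step from formant positions to formant spacing to vocal tract length can be sketched as follows. This section does not state the increments, the tube model or the speed of sound, so the sketch assumes the widely used uniform-tube model (closed at one end), in which the i-th formant is expected at (2i − 1)/2 times the formant spacing, a regression through the origin, and a speed of sound of 350 m/s. Function and constant names are hypothetical.

```python
# Assumed speed of sound in the vocal tract (350 m/s), expressed in cm/s.
SPEED_OF_SOUND_CM_S = 35000.0


def formant_spacing(formants_hz):
    """Slope of a least-squares line through the origin.

    Assumption: formant i (1-based) is regressed on the increment
    (2i - 1) / 2, as expected for a uniform tube closed at one end.
    """
    increments = [(2 * i - 1) / 2 for i in range(1, len(formants_hz) + 1)]
    numerator = sum(x * f for x, f in zip(increments, formants_hz))
    denominator = sum(x * x for x in increments)
    return numerator / denominator


def vocal_tract_length_cm(delta_f_hz, c=SPEED_OF_SOUND_CM_S):
    """Vocal tract length implied by a formant spacing: c / (2 * delta_f)."""
    return c / (2 * delta_f_hz)


# Usage: median roar formants F1-F10 taken from the results table above.
roar_medians = [157.47, 405.03, 631.71, 869.04, 1097.53,
                1314.15, 1644.11, 1881.59, 2139.03, 2315.92]
roar_spacing = formant_spacing(roar_medians)
roar_length = vocal_tract_length_cm(roar_spacing)
```

Because the tabulated lengths are means of per-call estimates, a length computed from median formants in this way should not be expected to reproduce the table values exactly.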