
Ontogeny of vocal rhythms in harbor seal pups: an exploratory study.

Andrea Ravignani^1,2, Christopher T Kello^3, Koen de Reus^1, Sonja A Kotz^4,5,6, Simone Dalla Bella^6,7,8, Margarita Méndez-Aróstegui^1, Beatriz Rapado-Tamarit^1, Ana Rubio-Garcia^1, Bart de Boer^2.

Abstract

Puppyhood is a very active social and vocal period in the life of a harbor seal (Phoca vitulina). An important feature of vocalizations is their temporal and rhythmic structure, and understanding vocal timing and rhythms in harbor seals is critical to a cross-species hypothesis in evolutionary neuroscience that links vocal learning, rhythm perception, and synchronization. This study utilized analytical techniques that may best capture rhythmic structure in pup vocalizations with the goal of examining whether (1) harbor seal pups show rhythmic structure in their calls and (2) rhythms evolve over time. Calls of 3 wild-born seal pups were recorded daily over the course of 1-3 weeks; 3 temporal features were analyzed using 3 complementary techniques. We identified temporal and rhythmic structure in pup calls across different time windows. The calls of harbor seal pups exhibit some degree of temporal and rhythmic organization, which evolves over puppyhood and resembles that of other species' interactive communication. We suggest next steps for investigating call structure in harbor seal pups and propose comparative hypotheses to test in other pinniped species.


Keywords: bioacoustics; pinnipeds; rhythm; timing; vocal development

Year: 2018 PMID: 30697246 PMCID: PMC6347067 DOI: 10.1093/cz/zoy055

Source DB: PubMed Journal: Curr Zool ISSN: 1674-5507 Impact factor: 2.624

Acoustic communication in animals can be investigated along several dimensions. Historically, the study of animal bioacoustics has focused on spectral and combinatorial features of vocalizations (see Table 1 for these and other definitions; Janik and Slater 1997; Bradbury and Vehrencamp 1998; Gerhardt and Huber 2002; Ravignani and Norton 2017). Comparatively, especially in nonavian vertebrates, little attention has been paid to vocal timing, intended as the perception and production of single events in time. If there is a paucity of studies on mammal vocal timing, even less is known about mammal vocal rhythms, intended as structured patterns of multiple temporal events (see Table 1).

Table 1.

Definition of terms and concepts in order of appearance in the article

Term	Definition
Temporal	Referring to the timing of a vocalization.
Spectral	Referring to the frequency features of a vocalization, for example, fundamental frequency, harmonics, formants, and harmonicity.
Socioecology	Study of interactions among members of a species, and of how an organism’s environment affects its social structure.
Duet	Result of 2 individuals vocalizing, possibly interactively.
Chorus	Result of 2 or more individuals vocalizing, possibly interactively.
Combinatorial	A type of structure resulting from joining constituent elements, where the result may be more than the simple sum of its elements.
Beat perception	Extraction of a main periodicity—the beat or tactus—from a complex acoustical signal (e.g., music embedding different metrical levels).
Synchronization	Process by which events of a temporal sequence occur at the same time as events in another temporal sequence.
Vocal (production) learning	Ability to produce vocalizations not belonging to one’s default repertoire, often via imitation or social learning.
Rhythm	Sequence of durations marked by acoustic events. Some rhythms may include repeating regular patterns, one or more periodicities, a pattern of accents/prominences, and hierarchical grouping (see isochrony and grouping below).
Polygynous	Social organization by which few dominant males mate with all receptive females.
Lek	Spatial aggregation of male conspecifics who engage in competitive displays to attract females.
Oestrus	Period of sexual fertility in most female mammals.
Lanugo	Natal hair coat that is typically shed in utero in harbor seals.
Weaning	Developmental phase during which pups transition from breastfeeding to independent foraging and life without their mothers.
Hearing threshold	Sound level above which an organism can hear a specific sound.
Intensity	Power carried by sound waves.
Pitch	Perceptual quality of sounds, and psychological counterpart of the frequency of a signal.
Distributional	Relating to the statistical distribution of a quantity.
Structural	Relating to the sequential ordering of elements composing a signal and their frequency of co-occurrence.
Hypothesis-free metric	Measurement which makes little or no assumptions on the underlying structure of the measured quantity.
Periodicity	Feature of a sequence in which events (e.g., sounds of a metronome) occur at equal time intervals.
Sequence	A set of events following each other in a particular order.
Transition probability	Probability that an element type in a sequence is adjacent and preceded by another (or the same) element type.
Isochrony	Property of a pattern in which all temporal intervals have equal duration.
Grouping	Organization of temporal events based on their relative proximity or on their relative acoustic properties.

A cross-species hypothesis in evolutionary neuroscience makes the study of timing particularly relevant (Patel 2006, 2014). The “vocal learning-beat perception and synchronization” (VL-BPS) hypothesis suggests that only species capable of vocal production learning may show a particular form of rhythm and timing, namely the ability to extract a regular pulse from sound and synchronize movement to it (Patel 2006). At present, only 4 clades of mammals are known to be capable of vocal production learning, often entailing vocal imitation (Adret 1992; Janik and Slater 1997): elephants, bats, pinnipeds, and cetaceans. Likewise, whereas elaborate timing and rhythmicity exist in many species such as crickets, anurans and fireflies (see Ravignani et al. 2014 for a review), few animals are capable of flexible beat perception and synchronization (Wilson and Cook 2016). In principle, the VL-BPS hypothesis predicts that we should find vocal learning abilities in all species that can perceive a beat in a rhythmic sequence and synchronize with it. Unfortunately, vocal imitation and rhythm synchronization have rarely been investigated in the same species and especially not in mammals (Ravignani and Cook 2016; Wilson and Cook 2016; Lattenkamp and Vernes 2018). However, the study of timing and rhythm in vocal learning mammals becomes particularly important in light of the VL-BPS hypothesis. Such studies may shed light on the evolution of human cognition and the neural circuitry for rhythm and vocal learning (Patel 2014; Ravignani et al. 2016; Vernes 2017). Research on pinnipeds has uncovered vocal learning capacities in some species (Schusterman 1977; Ralls et al. 1985; Sanvito et al. 2007; Reichmuth and Casey 2014; Casey et al. 2015) and vocal rhythmicity or beat perception and synchronization in others (Page et al. 2002; Cook et al. 2013; Mathevon et al. 2017; Rogers 2017).
Pinnipeds are therefore closely related species particularly appropriate to test the VL-BPS hypothesis (Ravignani et al. 2016; Ravignani 2018a). To this aim, it is necessary to study timing and rhythm in a species displaying vocal learning capabilities such as the harbor seal Phoca vitulina (Ralls et al. 1985). This study focuses on harbor seals and in particular on the vocal behavior of pups related to mother–pup interactions. Mother–pup recognition, like most social behaviors in pinnipeds such as the competition for space and access to females, is maintained through vocal signaling (Schusterman 2008). Harbor seal mothers frequently go for foraging trips at sea to nurse pups during lactation (Insley et al. 2003). As harbor seals nurse their pups in colonies, efficient vocal communication is particularly important in mother–pup reunions, as mothers need to identify their pup within the colony when returning from foraging. Unlike most pinniped species that use pup attraction calls produced by the mother to maintain contact, harbor seal pups emit mother attraction calls (MACs) during the lactation period (Renouf 1984). Harbor seal pups are, therefore, very vocal during this time and the pups’ individually distinctive calls play an important role in mother–pup recognition (Renouf 1984; Perry and Renouf 1988). Accurate recognition of offspring by the mother increases reproductive success (Insley et al. 2003). The offspring’s survival therefore depends on (1) the mother’s capacity to perceive and recognize individual calls, building on sound perception, and (2) the caller’s ability to emit distinguishable individualized signals, building on sound production (Tibbetts and Dale 2007). Mother–pup recognition is therefore an area where both vocal learning and rhythmic abilities may be important for distinguishing pup calls from each other. 
Sound perception in harbor seals has been measured both in air and underwater (sound propagates further underwater than in air; Renouf 1991). In air, adults’ hearing thresholds span 1 kHz and 22.5 kHz, with best sensitivity at 16 kHz (Lucke et al. 2016). In contrast, adults’ hearing thresholds span 0.125 kHz and 100 kHz underwater, with best sensitivities below 4 kHz (Kastelein et al. 2009). Harbor seals also show higher thresholds for shorter repeating signals (≤50 ms) than longer repeating signals (≥100 ms) (Terhune 1988). When tested with more natural sounds, harbor seals are capable of discriminating between calls from different pups (Renouf 1985) and mothers are capable of acoustically recognizing their own pup 3 days after birth (Sauvé et al. 2015a). However, it is currently unclear which acoustic parameters of pups’ calls a mother is most sensitive to, and which parameters she employs to recognize her own pup. Sound production in harbor seals can be described by the source-filter framework of phonation. According to this theory, vocal production occurs when a source signal, generated by the vibration of the vocal folds, is filtered by the cavities of the supralaryngeal tract (Fant 1960). These anatomical structures impose constraints on the acoustic structure of the sounds and result in individualized vocalizations that vary due to growth and development in puppyhood (Charrier et al. 2009). This means that the mother must adapt to her pup’s calls as it continues to grow. Indeed, the resonant properties of the vocal tract change as the pup grows (Ravignani et al. 2017). In addition, in this sexually dimorphic species, sex steroids produced during growth might act on laryngeal structures causing different vocal characteristics between males and females (Aufdemorte et al. 1983; de Reus 2017). The ontogeny of the harbor seal pup MACs has been investigated in at least 3 subspecies (Khan et al. 2006; Sauvé et al. 2015b; de Reus 2017). 
All studies reported an effect of age and sex on acoustic parameters but only some showed an effect of body size (Khan et al. 2006; Sauvé et al. 2015b). As pups grow older, their calls become longer in duration, more frequent, and less harmonic (de Reus 2017). With age, MACs also show a consistent decrease in fundamental frequency and frequency modulation (Khan et al. 2006; Sauvé et al. 2015b). In addition, male pups have a lower pitch than females (Sauvé et al. 2015b; de Reus 2017). However, there exist similarities and differences in the acoustic call parameters found in the aforementioned studies. For example, the fundamental call frequency is higher in captive individuals (Khan et al. 2006), and call durations are much shorter in wild conspecifics (Sauvé et al. 2015b). A third study (de Reus 2017) focused on wild-born pups that were opportunistically recorded during their short rehabilitation period coinciding with their weaning period before they were returned to their natural environment. Compared with the previous 2 studies, this work revealed differences in fundamental frequency (Khan et al. 2006) and call duration (Sauvé et al. 2015b). The vocal repertoire of harbor seal pups clearly shows call variation between captive and wild animals and also between animals inhabiting different geographical locations (Khan et al. 2006; Sauvé et al. 2015b; de Reus 2017). In brief, (1) breeding colonies are dense enough that individual identification of pups by their mother is critical, and (2) mothers are able to recognize their own pup based on acoustic properties of pups’ MACs. This raises the question: which aspects of the vocalization make identification possible? This question is common in animal bioacoustics (e.g., Aubin and Jouventin 1998; Aubin et al. 2000; Jouventin et al. 1999 in penguins) and has been tackled in pinnipeds in several studies (Charrier et al. 2002, 2003, 2009, 2010; Dobson and Jouventin 2003; Aubin et al. 2015). 
However, prior studies have not addressed whether mother–pup recognition relies on rhythm and temporal features of calls. To understand which features mothers can employ for recognition, previous research has focused on spectral features of pups’ calls. We hypothesize that information on the timing between calls and their respective silences can further be used to study the individuality of MACs in the temporal domain (Ravignani 2018a, b). In particular, individuality of MACs and individual recognition may rely on temporal and rhythmic features, which have been neglected until now. Predictable temporal structure may help to separate calls from different individuals. If temporal structure plays an important role, we expect to find such structure in pups’ calls. To test this hypothesis, we focus on the temporal dimension of pups’ calls, and look for temporal and rhythmic structure by analyzing the following measures: (1) call durations, (2) inter-onset intervals (IOIs), and (3) inter-peak intervals (IPIs) of calls (Ravignani 2018b). The duration is the time between the onset and offset of one vocalization. The IOI is the time elapsed between the onsets of 2 successive calls (Ravignani and Norton 2017). The IPI is the time between the maximum-intensity peak of a call and the maximum-intensity peak of the next call (Ravignani 2018b; see also Jadoul et al. 2016). Although previous preliminary work did not find statistical differences between IOIs and IPIs (Ravignani 2018b), IPI has potential biological significance due to basic psychophysics. In fact, the perceived acoustic structure of pups’ MACs obviously varies depending on the distance between emitter and receiver (see spectrograms in Sauvé et al. 2015a). Hence, although the onset of a call may be clear to a receiver at close distance, it may not be perceived (or be perceived as occurring later) with increasing distance from the caller. Maximum intensity peaks, however, do not suffer from this limitation. 
Given a few acoustic assumptions, and for reasonably short distances, the intensity peak of a call will always occur at approximately the same point in time independently of the distance from the observer. IPIs are, therefore, worth scrutiny, both to probe their potential temporal structure and to test potential similarities with IOIs. In this study, we address 2 empirical questions: Do harbor seal pups display rhythmic structure in their calls and does this structure change over time? We also examine 2 methodological questions: What are the best analytical techniques to capture different types of temporal structure in pups’ vocalizations and which temporal metrics extracted from pups’ calls are appropriate to study rhythm?

Materials and Methods

Subjects

We recorded 3 wild-born harbor seal pups. A female pup (tagged 292) was brought in for rehabilitation at the Sealcentre Pieterburen, The Netherlands (Ravignani et al. 2017), at the estimated age of 7 days (Ravignani 2018b). Another female pup (tagged 192) was admitted at the estimated age of 2 days, whereas a male pup (tagged 201) was admitted at the estimated age of 10 days. Pups 192 and 201 still had lanugo upon arrival, suggesting that they were born prematurely. The animals were individually housed in a cabin or room with access to a pool. Seals in rehabilitation are usually housed socially (de Reus 2017). These recordings and analyses took advantage of the rare occurrences of individual housing.

Sound recordings

Individual recordings were performed daily between age 7 and 28 days, depending on the individual (de Reus 2017; Ravignani 2018b). Here we report on recordings between day 9 and 27 for individual 292, between day 6 and 18 for individual 192, and between day 12 and 27 for individual 201. Recordings were performed at a random time out of 4 possibilities: 7 AM, 11 AM, 3 PM, and 7 PM. Recordings were collected right before feeding (and 4–12 h after the previous feeding). Ten minutes of vocalizations were recorded in air each day at 0.5–2 m distance from the seal. Recordings were collected using a unidirectional microphone Sennheiser ME-66 (frequency response: 40–20,000 Hz ± 2.5 dB; Sennheiser electronic GmbH & Co. KG, Wedemark, Germany), equipped with a MZW-66 foam windshield. A Zoom H6 (Zoom Corporation, Tokyo, Japan) digital recorder recorded and saved the sounds as uncompressed “.wav” files with a sampling frequency of 48 kHz and a 24-bit quantization.

Call annotations

The recorded audio files were manually annotated (de Reus 2017; Ravignani 2018b) in Praat version 6.0.1 (Boersma and Weenink 2017). In particular, all onsets and offsets of pup vocalizations were annotated and further analyzed.

Extraction of temporal variables

The .wav sounds and Praat’s annotations were imported in Python 2.7 using a custom script. The script used the package TextGridTools 1.4.3 to parse “.Textgrid” annotation files (Buschmeier and Wlodarczak 2013) and the package Parselmouth to process “.wav” sound files by calling Praat (Jadoul et al. 2018). The custom script extracted and combined annotations and sound features, and it computed 3 measures: (1) durations, (2) IOIs, and (3) IPIs of calls (Ravignani 2018b). In computing durations and IOIs, the accuracy of the onset was further refined using Praat’s pitch tracking function. The maximum-intensity peaks used to compute IPIs were extracted using Praat’s intensity function, called via Parselmouth (Jadoul et al. 2018).
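The three measures can be illustrated with a minimal sketch. It assumes that call onsets, offsets, and maximum-intensity peak times are already available as time-ordered lists in seconds; the function name `temporal_measures` and the example values are hypothetical and are not taken from the authors' custom script.

```python
def temporal_measures(onsets, offsets, peaks):
    """Return (durations, iois, ipis) in the same time unit as the inputs.

    onsets, offsets, peaks: equally long, time-ordered lists with one
    entry per call (hypothetical input format, assumed for illustration).
    """
    if not (len(onsets) == len(offsets) == len(peaks)):
        raise ValueError("onsets, offsets and peaks must have equal length")
    # Duration: time between onset and offset of one call.
    durations = [off - on for on, off in zip(onsets, offsets)]
    # IOI: time between the onsets of two successive calls.
    iois = [nxt - cur for cur, nxt in zip(onsets, onsets[1:])]
    # IPI: time between the intensity peaks of two successive calls.
    ipis = [nxt - cur for cur, nxt in zip(peaks, peaks[1:])]
    return durations, iois, ipis


# Made-up example with three calls.
durations, iois, ipis = temporal_measures(
    onsets=[0.0, 1.0, 2.5],
    offsets=[0.4, 1.6, 3.0],
    peaks=[0.2, 1.3, 2.7],
)
```

A sequence of n calls yields n durations but only n − 1 IOIs and IPIs, which is consistent with the slightly smaller sample sizes reported for IOIs and IPIs than for durations.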

Statistics

A number of metrics, statistical tests, and visual methods were used to explore and test temporal and rhythmic features. Except for Allan Factor (AF) and burstiness, all analyses focused on differences within individuals rather than between individuals. For each individual, Anderson–Darling tests were employed to examine whether the distributions of durations, IOIs, or IPIs were drawn (across days) from the same underlying distribution. Friedman tests were used to examine whether the sample distributions of durations, IOIs or IPIs differed across days. Two-sample Kolmogorov–Smirnov tests were used to examine pairwise differences in distributions across days, and differences between IOI and IPI distributions for each pup. In the Kolmogorov–Smirnov tests across days, an alpha value of 0.05 was Bonferroni corrected for multiple comparisons: alpha was divided by the binomial coefficient C(d, 2)=d(d−1)/2, where d was the number of days compared pairwise. For each pup, Kendall’s Tau non-parametric correlation was used to compare IOIs and IPIs. The Augmented Dickey–Fuller (ADF) unit root test and the Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test for stationarity were used on the daily time series of median duration, IOI, or IPI to test whether these medians were random walks (Kwiatkowski et al. 1992; Hamilton 1994; MacKinnon 1994, 2010). If the ADF test cannot reject its null hypothesis, whereas the KPSS does reject its null, then the data provide evidence that the series of IOIs (or IPIs or durations) has a unit root, that is, is a random walk over days. Phase space plots were used to visualize structural regularities in IOIs beyond simple distributional regularities (Wagner 2007; Ravignani 2017). In fact, while the above methods do not differentiate between sequences having elements with similar distributions but arranged in different orders, phase space plots are sensitive to durational sequencing and ordering (Ravignani and Norton 2017). 
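The Bonferroni correction described above can be sketched as follows. This is only an illustration of the arithmetic; the function name `bonferroni_alpha` and the example of 10 recording days are hypothetical.

```python
from math import comb


def bonferroni_alpha(n_days, alpha=0.05):
    """Per-comparison alpha when all pairs of days are compared.

    The number of pairwise comparisons is C(d, 2) = d(d - 1)/2.
    """
    if n_days < 2:
        raise ValueError("at least 2 days are needed for a pairwise comparison")
    n_pairs = comb(n_days, 2)
    return alpha / n_pairs


# Hypothetical example: 10 recording days give 45 pairwise comparisons.
corrected = bonferroni_alpha(10)
```

With 10 days, each two-sample test would be evaluated against 0.05/45, roughly 0.0011, rather than 0.05.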
Transition matrices, representing Markov chains, were further employed to assess structural regularities in call durations beyond distributional ones (Ravignani and Madison 2017; Ravignani 2018a). For each pup and recording day, we ran a K-means clustering algorithm to find potential categorical distributions in durations (Ravignani et al. 2016). A custom Python script ran alternative versions of the K-means clustering algorithm for each K (the hypothesized number of clusters), ranging from 2 to 10. Each clustering version received a Silhouette score, quantifying the goodness of clustering (Rousseeuw 1987). For each pup and day, (1) the final K was chosen to be the number of clusters minimizing the Silhouette score; (2) each durational value in a sequence of durations was assigned to its category (i.e., cluster), and (3) the transitions between durational categories were plotted. For each plot, a darker blue in a transition matrix corresponds to a higher transition probability, that is, a higher probability that the durational category on the vertical axis d(t) is followed by the durational category on the horizontal axis d(t + 1).

Burstiness and AF over days were adopted as hypothesis-free metrics to assess the degree of clustering of temporal events (Abney et al. 2017; Falk and Kello 2017; Kello et al. 2017). Burstiness is a measure borrowed from dynamical systems and statistical physics (Goh and Barabási 2008). It quantifies to what extent events are isochronous vs. clustered in time. A burstiness value close to −1 indicates perfect periodicity (i.e., isochrony). A burstiness value close to 1 indicates high burstiness, which is where periods of clustered activity are followed by periods of inactivity. Burstiness was calculated by dividing the difference between standard deviation and mean IOI by their sum.

AF for each pup was computed in Matlab (Kello et al. 2017) using “.wav” files where all sounds except the pup calls were silenced using a custom Python script. AF is a hypothesis-free metric quantifying the degree of grouping, that is, how events are hierarchically organized at different time scales (Kello et al. 2017). As such, AF is a curve over windowed periods of time, rather than a scalar value.
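Both metrics can be sketched in a few lines. The burstiness function follows the verbal definition above, B = (σ − μ)/(σ + μ); the use of the population standard deviation is an assumption, as the text does not specify it. The Allan Factor function uses the textbook definition based on event counts in adjacent windows of length T, AF(T) = ⟨(N_{k+1} − N_k)²⟩ / (2⟨N_k⟩); it is not the Matlab implementation of Kello et al. (2017), which may differ in detail. Function names and example values are hypothetical.

```python
from statistics import mean, pstdev


def burstiness(intervals):
    """Burstiness B = (sigma - mu) / (sigma + mu) of a list of IOIs.

    Assumption: sigma is the population standard deviation.
    """
    mu = mean(intervals)
    sigma = pstdev(intervals)
    if sigma + mu == 0:
        raise ValueError("burstiness is undefined when sigma + mu is 0")
    return (sigma - mu) / (sigma + mu)


def allan_factor(event_times, window, total_duration):
    """Allan Factor at one time scale (textbook definition).

    Events are counted in adjacent, non-overlapping windows of length
    `window`; AF is the mean squared difference between counts in
    neighbouring windows, divided by twice the mean count.
    """
    n_windows = int(total_duration // window)
    if n_windows < 2:
        raise ValueError("need at least 2 windows")
    counts = [0] * n_windows
    for t in event_times:
        idx = int(t // window)
        if 0 <= idx < n_windows:
            counts[idx] += 1
    mean_count = sum(counts) / n_windows
    if mean_count == 0:
        raise ValueError("no events fall inside the analysed windows")
    sq_diffs = [(b - a) ** 2 for a, b in zip(counts, counts[1:])]
    return (sum(sq_diffs) / len(sq_diffs)) / (2 * mean_count)


# Isochronous intervals give the minimum burstiness of -1.
b_regular = burstiness([1.0, 1.0, 1.0, 1.0])

# One event per 1-s window: neighbouring counts never differ, so AF is 0.
af_regular = allan_factor([0.5, 1.5, 2.5, 3.5], window=1.0, total_duration=4.0)
```

Evaluating `allan_factor` over a range of window lengths yields a curve rather than a single number, which is why AF is reported as a function of time scale.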

Results

We first tested whether, for each pup, the IOI distributions differed from the IPI distributions, which was not the case (Two-sample Kolmogorov–Smirnov tests. Pup 192: N = 1240, D = 0.02, P = 0.92; pup 201: N = 3335, D = 0.01, P = 0.86; pup 292: N = 2059, D = 0.02, P = 0.62). Similarly, for each pup, IOIs strongly correlate with IPIs (Kendall’s Tau. Pup 192: N = 1240, Tau = 0.91, P < 0.001; pup 201: N = 3335, Tau = 0.90, P < 0.001; pup 292: N = 2059, Tau = 0.90, P < 0.001). As the IPI distributions closely resemble IOI distributions, IPI analyses are mostly omitted in the rest of the paper. Figure 1 shows daily distributions of durations (top) and IOIs (bottom) for each pup.

Figure 1.

Violin plots depicting the distribution of durations (top) and IOIs (bottom) in milliseconds over days.

We tested the hypothesis that the sampled distributions of durations were drawn across days from the same underlying distribution. The same hypothesis was tested for the distributions of IOI and IPI. Three k-samples Anderson–Darling tests for each pup suggested that durations are drawn from different probability distributions across days; the same holds for IOIs and IPIs (for all individuals, all A > 9 and all P < 0.001; see Table 2).

Table 2.

Anderson–Darling tests

Timing measure	Test statistic, p-value, sample size	r17–192	r17–201	r17–292
Duration	A	17.07	23.79	33.84
	P	<0.001	<0.001	<0.001
	n	1253	3352	2078
IPI	A	9.41	39.69	11.55
	P	<0.001	<0.001	<0.001
	n	1240	3335	2059
IOI	A	9.59	40.16	12.40
	P	<0.001	<0.001	<0.001
	n	1240	3335	2059

All tests were significant at P < 0.001.

We also tested the hypothesis that the sampled distributions of durations differed across days. The same hypothesis was tested for the distributions of IOI and IPI. For each of 2 pups (see Table 3), 3 Friedman tests suggested that durations differed across days, and the same held for IOIs and IPIs (all Q > 36, all P < 0.01). For a third individual, however, only IOIs and IPIs differed across days, whereas durations did not.

Table 3.

Friedman tests

Timing measure	Test statistic, p-value, sample size	r17–192	r17–201	r17–292
Duration	Q	18.90	82.80	87.41
	P	0.09	<0.001	<0.001
	n	156	512	1102
IPI	Q	26.39	77.33	36.19
	P	<0.01	<0.001	<0.01
	n	143	496	1083
IOI	Q	23.66	74.54	40.18
	P	<0.05	<0.001	<0.01
	n	143	496	1083

All tests but 1 (duration for pup r17–192, P = 0.09) were significant at P < 0.05.

Finally, we tested the hypothesis of equal variances in IOI, IPI and duration distributions within pups and across days. The null hypothesis could be rejected in 8 out of 9 cases (Levene’s tests, all W > 1.9, all P < 0.05), suggesting that IOIs, IPIs, and durations have different variances across days. In contrast, only 1 case of homoskedasticity was found, namely the distribution of durations for pup 192 (Levene’s test, W = 1.2, P = 0.23). Kolmogorov–Smirnov tests on pairs of days showed heterogeneity of distributions across days. This finding held both for durations (Figure 2, top) and IOIs (bottom). ADF and KPSS tests on the time series of median durations and IOIs (and IPIs, not shown) across days were not significant for any pup (all P > 0.05). Lack of significance in the KPSS tests suggests that the daily medians of durations and IOIs are not random walks over days.

Figure 2.

Comparison of distributions of durations (top) and IOIs (bottom) between any pair of recording days (x and y axes). A black square denotes a significant 2-sample Kolmogorov–Smirnov test, with alpha = 0.05/[days×(days−1)/2], adjusted for all multiple comparisons. The 45° lines denote adjacent days (i.e., d and d + 1). For instance, in the top-left panel, the square at the bottom-left of the graph denotes a significant difference between distributions of durations of Days 6 and 7. The whole graph suggests some heterogeneity but little divergence over time. Crucially, adjacent days are rarely statistically different, suggesting a punctuated slow change.

Two main results emerge from these tests. First, in the time window analyzed, there is very slow change in durations and IOIs. Second, very few of the comparisons between adjacent days, corresponding to the squares lying on the 45° gray lines, are significant. Distributions of adjacent days resemble each other more often than not. Phase space plots of all variables, days, and pups (not shown) do not show clear geometrical patterns (as seen, instead, in Ravignani 2017). However, some runs of adjacent days (Figure 3) do show pairwise similarities. Phase space plots are intended as exploratory rather than inferential tools. Hence no scalar metric can be readily assigned to a plot, although this would be a desirable feature. To quantify the degree of structure within and between plots, we performed some preliminary post-hoc extraction of metrics. Using Python’s “PIL” and “skimage” packages, we calculated Shannon’s entropy of each phase space plot of individual 201. The plots of days 13, 14, and 15 showed similar levels of entropy. The plots of days 23, 24, and 25 also showed similar levels of entropy. In addition, the entropy of days 13, 14, and 15 differed from that of the previous and following days. We take this as very preliminary quantitative support for the visual intuitions derived from Figure 3: low visual entropy corresponds to a higher overlap of edges, hence more repeating patterns within 1 recording day.
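As a rough illustration of the entropy metric, Shannon's entropy of an image can be computed from the relative frequencies of its pixel values. The sketch below is a plain-Python stand-in, not the authors' PIL/skimage pipeline; it assumes the plot has already been flattened into a list of grayscale pixel values, and the function name `shannon_entropy` is hypothetical.

```python
from collections import Counter
from math import log2


def shannon_entropy(pixels):
    """Shannon entropy (in bits) of a flat list of pixel values.

    H = -sum(p_i * log2(p_i)), where p_i is the relative frequency
    of each distinct pixel value.
    """
    if not pixels:
        raise ValueError("pixel list is empty")
    n = len(pixels)
    counts = Counter(pixels)
    return -sum((c / n) * log2(c / n) for c in counts.values())


# A two-tone image with equal shares of black and white has 1 bit of entropy.
h = shannon_entropy([0, 0, 255, 255])
```

Under this measure, an image dominated by a few pixel values scores lower than one in which many values occur with similar frequency.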

Figure 3.

Phase space plots of individual 201’s IOI at Days 13, 14, and 15 (top), and Days 23, 24, and 25 (bottom). Although no clear geometrical pattern emerges, consecutive days appear as a “smeared” version of the previous ones (see Ravignani 2017). The fact that most edges connect at the bottom-left of the figure suggests that short IOIs often occur in pairs, rather than an individual short IOI being followed by an individual long IOI, or pairs of long IOI.

Clustering and transition matrices of call durations of the 3 pups (Figures 4–6) show several clear properties. First, within each pup and across days, the algorithm does not converge towards a stable number of categories. Second, a partition of durations in 2 durational categories appears to be the most common, both within and between pups. Third, the transition probabilities are not evenly spread in the matrices, but concentrated in a few cells. In other words, a small number of transitions is very probable, whereas many others have low probability. Fourth, the high probability transitions do not lie on the diagonal, especially not in the bottom-right side of the matrices. This means that transitions between categories (i.e., adjacent calls of different durations) are very common while transitions within categories (i.e., 2 adjacent calls of the same duration) are uncommon. This is particularly true for long durations; transitions between pairs of short durations still occur.
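A first-order transition matrix of the kind discussed here can be sketched as follows. The code assumes that each call duration has already been assigned an integer category label (for example, by a clustering step); the function name `transition_matrix` and the example label sequence are hypothetical.

```python
def transition_matrix(labels, n_categories):
    """Row-normalized first-order transition probabilities.

    Entry [i][j] estimates the probability that category i at position t
    is followed by category j at position t + 1. Rows for categories that
    are never followed by anything are left as zeros.
    """
    counts = [[0] * n_categories for _ in range(n_categories)]
    for current, following in zip(labels, labels[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Made-up sequence: 0 = short call, 1 = long call.
probs = transition_matrix([0, 1, 0, 1, 0, 0], n_categories=2)
```

In this made-up sequence a long call is always followed by a short one, so the bottom-right cell of the matrix is 0, the kind of off-diagonal concentration described above.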

Figure 4.

Transition matrices between centroids of duration clusters for individual 192. Each matrix represents 1 day (First row: Days 6, 7, 8, 9, etc.). Darker blue corresponds to a higher transition probability, that is, a higher probability that the durational category on the vertical axis d(t) is followed by the durational category on the horizontal axis d(t + 1). Categories were calculated via K-means clustering algorithms, computing a Silhouette score for each possible K ≤ 10, and choosing the K minimizing the Silhouette score.

Figure 5.

Transition matrices between centroids of duration clusters for individual 201. Each matrix represents 1 day (First row: Days 12, 13, 14, 15, etc.). See Figure 4 for details.

Figure 6.

Transition matrices between centroids of duration clusters for individual 292. Each matrix represents 1 day (First row: Days 9, 10, 11, 12, etc.). See Figure 4 for details.

Daily burstiness of IOIs was computed for each pup. Figure 7 shows daily values, and their within-pup average across days. The 2 female pups, 192 and 292, exhibit higher burstiness values, statistically greater than 0 (Wilcoxon signed-rank test: W = 1.0, P < 0.01). The male pup 201 instead has a mean value statistically equal to 0 (W = 59.0, P > 0.05), with daily values oscillating above and below 0. So, while little can be said about 201, the other 2 pups’ rhythms are quite bursty. Nonparametric correlations (Spearman r and Kendall’s tau) between day and burstiness are mostly negative but non-significant.

Figure 7.

Daily and mean IOI burstiness of the 3 pups. A value close to 0 denotes randomness. A value close to 1 denotes bursts of activity followed by periods of inactivity. A value close to −1 denotes isochrony.

AF was computed for each pup using the raw audio files. A few properties of the AF curves can be observed. First, the AF curves depicted in Figure 8 (left) are quite similar across pups. In other words, all 3 pups have a similar hierarchical organization of call onsets. Second, seal pup calls are quite clustered, especially when compared with other environmental sounds recorded outside vocalization bouts (not shown). Third, the AF curves for all 3 pups are relatively steep.
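For readers unfamiliar with this kind of measure, the following sketch assumes that AF denotes the Allan factor and computes its textbook form at a single timescale T: the mean squared difference between event counts in adjacent windows of length T, divided by twice the mean count. This is only an illustration on a list of event onsets. It is not the pipeline applied to the raw audio files in the study, and the function name, parameters, and example onsets are hypothetical.

```python
def allan_factor(onsets, window, total_duration):
    """Textbook Allan factor at one timescale.

    AF(T) = mean((N[k+1] - N[k])**2) / (2 * mean(N[k])),
    where N[k] is the number of events in the k-th window of length T.
    """
    n_windows = int(total_duration // window)
    if n_windows < 2:
        raise ValueError("need at least 2 windows")
    counts = [0] * n_windows
    for t in onsets:
        k = int(t // window)
        if 0 <= k < n_windows:
            counts[k] += 1
    mean_count = sum(counts) / n_windows
    if mean_count == 0:
        raise ValueError("no events inside the analyzed duration")
    sq_diffs = [(b - a) ** 2 for a, b in zip(counts[:-1], counts[1:])]
    return (sum(sq_diffs) / len(sq_diffs)) / (2 * mean_count)


# Hypothetical onsets in seconds (illustrative only).
periodic = [0.5 + i for i in range(10)]                 # one event per second
clustered = [0.1, 0.2, 0.3, 0.4, 5.1, 5.2, 5.3, 5.4]    # two tight bouts

af_periodic = allan_factor(periodic, window=1.0, total_duration=10.0)
af_clustered = allan_factor(clustered, window=1.0, total_duration=10.0)
```

Evaluating such a function over a range of window sizes and plotting the result in logarithmic coordinates gives a curve of the general kind discussed here; steeper growth with timescale indicates stronger clustering of events.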

Figure 8.

(Left) AF curves of the 3 seal pups (analyzed here) and other species (from Kello et al. 2017). Each curve (i.e., function) consists of 11 orthonormal (independent) variances. Below 1 s, the curves show within-species similarities and between-species variability. Above 1 s, all species show different patterns, with harbor seals and killer whales exhibiting the steepest curves. (Right) AF curves plotted in terms of the linear and quadratic coefficients of a third-order polynomial fit to each individual AF function, in logarithmic coordinates. AF functions from animal vocalizations analyzed in Kello et al. (2017) are shown for comparison. Seal vocalizations have larger linear coefficients because their AF functions are steepened by the scarcity of seal calls compared with other animal vocalization recordings. Note also that calls were segmented and isolated for seal recordings, but not for other recordings. Despite their steepness, AF functions for seal vocalizations clustered with other animal vocalizations, and particularly with killer whales, relative to human speech and music recordings not plotted here but analyzed in Kello et al. (2017).

Discussion

This study investigated the presence and development of vocal rhythms and timing in captive harbor seal pups. We recorded 3 wild-born pups daily over a 1–3 week period, annotated their calls, and extracted temporal measures. These temporal parameters were analyzed using a range of techniques. We found that rhythmic distributional and structural regularities appear within and across individuals, and partly develop over puppyhood.

We began by investigating statistical distributions of durations, IOIs, and IPIs within individuals. We found that these variables vary daily over puppyhood, and that intensity peaks within a call occur at predictable positions in time. Classical frequentist statistics suggested variability across days for all temporal measures, with one exception: the call durations of female seal 192 did not change across days. This was, however, the animal with the lowest sample size, and a type II error may therefore have prevented the detection of a small effect. IOIs and IPIs were distributionally similar to each other and highly correlated (see Tables 2 and 3, and Ravignani 2018b). This similarity suggests that IOIs and IPIs either provide interchangeable measures of between-calls timing, or contain some fine-grained differences, which we were unable to detect. IOIs and IPIs may indirectly provide information on the internal temporal structure of calls, suggesting that the peak of each call is always reached at a constant delay from the onset. Contrasting the IOI-IPI similarity with the distributions of durations, similar across days, it may also be that the peak of each call is reached at a relative fixed proportion of the call duration computed from the call onset. Either way, the data show that intensity peaks do not occur at random positions with respect to onsets (see also de Reus 2017).

We found several pairwise differences in distributions of durations, and in distributions of IOIs, between days.
Crucially, most detected differences were between non-adjacent, rather than adjacent days. This suggests the presence of a gradual rather than a punctuated change. We hypothesize that changes in durations and IOIs accumulate over days, until they become statistically detectable. Note that seal 192’s IOIs stood out as outliers, as only 1 significant difference was detected in the 78 performed statistical comparisons between pairs of days.

We then investigated how neighboring durations or IOIs are mutually affected, by focusing on structural regularities beyond distributional statistics (Jadoul et al. 2016). Visual representations of IOIs’ structural regularities using phase space plots revealed that the seals’ IOIs have some, though limited, adjacent rhythmic structure that is definitely less stereotyped than other rhythmic behaviors in other species (e.g., human music, Ravignani 2017; humpback whales, Schneider and Mercado 2018). However, when considering visual similarities among adjacent days, it is apparent that there are streaks of days where call onsets were similarly timed. In other words, while the rhythmic pattern of IOIs is not clearly quantifiable for individual plots, adjacent days do show some rhythmic similarities (Ravignani and Norton 2017). Thus, while we could not uncover the exact structure of consecutive IOIs, there were regularities in vocalization onsets, which were repeated and slowly evolved across adjacent days (i.e., visually showing noise reduction and a tendency towards shorter IOIs). Notably, noise reduction over days as visualized in Figure 3 can be mapped to a rhythmic interpretation. In phase space plots, noise reduction often corresponds to geometrical shapes recurring within the same plot (Ravignani 2017). A recurrent geometrical shape with k edges corresponds to a recurring rhythmic pattern of k + 1 IOIs, or equivalently, k + 2 vocalizations.
Hence, while Figure 3 tells us little about the rhythmic organization of call onsets in absolute terms, the noise reduction over days suggests that the succession of call onsets becomes more structured. The older the pup, the more predictable the onset of a call becomes given the onsets of previous calls. In brief, while phase space plots do not provide a clear picture of the structure per se, they do show an increase in structure. Higher density towards shorter IOIs is usually generated by a vertex in the short IOI range. Such a vertex in the short IOI range corresponds to 3 vocalizations separated by 2 short IOIs (Ravignani 2017). This means that series of 3 (or more) calls happening within a short time are separated by a longer pause, followed again by 3 (or more) calls happening within a short time, etc. Conversely, 2 long pauses in a row are quite uncommon. In the wild, when pups are looking for their mother and vocalizing, it might be advantageous to vocalize more than once in different directions.

Durational categories were inferred by applying clustering algorithms to sequences of durations. Transition matrices, summarizing the probability of transitioning from 1 durational category to another, also showed some structural organization. Especially in later days of recordings, call durations alternated between 2 or a few categories; these predictable runs were only rarely interspersed by less frequent call durations. Finally, the matrices’ diagonals in general did not show high probabilities. This corresponds to alternation of durations rather than repetition of durations. In comparison, we speculate that a similar analysis on the metronomic barks of a California sea lion would yield fundamentally different results, namely matrices with high probabilities on the diagonal, hence many repetitions (Schusterman 1977; Ravignani 2018a).
Functionally, we hypothesize that call duration might convey emotional information, as call duration is a vocal correlate of arousal (Filippi et al. 2017). We show how analyses beyond distributional statistics, in particular transition matrices (Ravignani 2018a), capture the development of seals’ rhythmic regularities. We see a few features emerging across individuals and ages: durations become more categorical, with fewer categories and possible combinations (i.e., transitions) decreasing in number. This temporal development is reminiscent of song learning in zebra finches and speech ontogeny in humans (Feher et al. 2009; Lipkind et al. 2013): Little-structured vocalizations (i.e., before babbling) turn into precisely-timed speech or song.

The sort of structural analyses performed here could be extended to other pinniped species. On the one hand, comparative work could test the hypotheses we formulated, such as the isochronous structure of California sea lions’ barks. On the other hand, hypothesis-free analyses could be performed across all 33 pinniped species, tracing the evolution of rhythmic traits within the pinniped phylogenetic tree (e.g., Gingras and Fitch 2013; Gingras et al. 2013).

Two additional metrics rarely used in bioacoustics and animal behavior, namely burstiness and AF analysis, provided additional evidence that the pups’ calls are organized according to a temporal structure. Burstiness measured the clustering of call onsets over time. The calls of the 2 younger, female pups were relatively bursty, meaning that periods of activity were separated by periods of rest. The burstiness of the older, male pup 201 did not exhibit a strong trend, hovering around 0. Although our data are too limited to provide solid inference, these differences in burstiness might suggest potential sex or age differences.
In addition, while the link between laryngeal anatomy and temporal features remains unexplored, sex steroids may act on laryngeal structures causing different vocal characteristics between males and females, and among different developmental stages. Alternatively, differences in burstiness may derive from sex differences in vocalization rates (de Reus 2017). Sex and age differences in pups’ vocal burstiness may, in turn, provide mothers with a cue to individuality to recognize their pup. Nonetheless, given the variable number of events from one day to another, the current results of burstiness should be interpreted with caution and await further research. Comparing the harbor seal pups’ burstiness with a hypothetical similar analysis of California sea lions’ metronomic barks, we would predict an opposite result for sea lions: a negative burstiness approaching −1, mirroring the near periodicity (“empirical isochrony” in Ravignani and Madison 2017) of sea lions’ barks (Schusterman 1977; Ravignani 2018a). Here as well, we suggest that future work should measure the individuality and species-specificity of burstiness in California sea lions to explicitly test this hypothesis, and extend the measure to more pinniped species and to other organisms.

Finally, AF analysis showed relatively steep and monotonic slopes for all pups. This is particularly noticeable when comparing the seal pups’ curves to other species’ and when looking at longer timescales (right side of the left panel in Figure 8). Killer whale vocalizations (Kello et al. 2017) are the closest to seal pups’ calls (Figure 8). All 3 pups had similar AF slopes, close to those previously found in killer whales and instrumental music (Kello et al. 2017). Steep and monotonic slopes were interpreted in killer whales as proxies of a communication system shaped by social interactions (Kello et al. 2017). This could also be the case for harbor seal pups, as pup calls are used during very active and socially intense weeks in a seal’s life.
Cumulative AF variances (not shown) did not exhibit a clear pattern.

In brief, all 3 classes of analytical approaches we adopted (i.e., distributional, structural, and dynamical systems approaches) proved fruitful and complementary. Distributional approaches suggested similarities between IOIs and IPIs, and moderate heterogeneity of durations, IOIs, and IPIs across days. Structural approaches showed rhythms: structural timing regularities where the occurrence of 1 call can be predicted from the time of occurrence of previous calls. Dynamical systems approaches showed that the occurrence of pups’ calls is bursty, as opposed to periodic. Their hierarchical temporal structure is reminiscent of the interactive signaling found in other species. At present, it is difficult to say which analyses may be most suitable for future research, as most techniques are only now being employed in non-human animal research. We recommend adopting our tripartite analytical approach to the study of vocal rhythm and temporal structure in other species and domains.

The picture we paint of pups’ vocal rhythms is only a first attempt, and more work is needed. Beyond what we can infer from the positive results just discussed, our analytical methods could have provided more clear-cut insights than they actually did. Ideally, phase space plots could have provided a clearer picture, showing not only the existence, but also the structure, of rhythmic patterns. Likewise, the transition matrices could have shown a comparable number of categories across days, unlike the highly variable number of categories we observe in the actual matrices. Four non-mutually exclusive reasons may account for these apparent shortcomings. It may be that our analytical methods are still too rough to capture fine-grained rhythmic regularities in harbor seal pups’ calls, or that our daily recording times were too short.
Alternatively, it may be that the moderate rhythmicity we found here represents the true amount of rhythmicity in pup calls. Harbor seals might express some communicative information in the spectral domain, by modulating the fundamental frequency or formants of calls (Ralls et al. 1985). As a third reason, the temporal clustering of pups’ vocalizations may be primarily triggered by interactive communication (Kello et al. 2017; Pika et al. 2018; Ravignani 2018b). Although directed towards their mostly silent mothers, pup calls are often produced at hearing distance of other seals of the same age. Hence, there might be pressures spurring a pup to precisely time her calls interactively with other pups (Pika et al. 2018; Kotz et al. (Forthcoming)). Playback experiments testing the effect of temporal parameters are needed, both for pups’ vocal interaction and for mothers’ recognition (Ravignani 2018b). Fourth and finally, the semi-captive conditions of data collection, including the absence of mothers, may trigger temporal properties in pups’ vocalization that are different from wild calls.

The 4 hypotheses listed above could be tested while improving upon other limitations of this study. Most notably, future research should increase the sample size to enable quantitative inference at a population level, rather than individual level. In addition, a larger sample would enable testing of age and sex effects on burstiness and periodicity of calls. Future data collection should also attempt to record several animals in interaction, rather than isolation. The pups sampled here were already vocalizing before admittance to rehabilitation and such vocal behaviors did not develop in a vacuum, but in a social medium. So, while collection of our individual recordings took advantage of a “vocal momentum” from the wild, attrition due to captivity might have altered the amount of calling or its temporal properties.
Recording animals in interaction will enable us to infer whether the rhythmic properties we observed had been molded by social interaction, or if they were partly modified by isolation. Ideally, and for the same purposes, our study could be replicated in the wild to directly disentangle the effects of isolated captivity vs. group captivity vs. natural conditions on vocal rhythms. In particular, recordings of pups belonging to rookeries of different sizes could be compared: smaller rookeries might be less interactive and hence show a flatter AF curve.

In conclusion, this work is a first step towards understanding the presence of vocal rhythms in harbor seal pups, their development, and the appropriate quantitative tools to study them. We hope our work can be expanded and complemented with other findings on pinniped vocal learning and rhythm to provide an integrative, cross-species framework (Ralls et al. 1985; Cook et al. 2013; Patel 2014; Ravignani et al. 2016; Wilson and Cook 2016; Mathevon et al. 2017; Lattenkamp and Vernes 2018; Ravignani 2018a).

Author Contributions

A.R., S.A.K., S.D.B., and B.d.B. conceived the research; A.R., K.d.R., and M.M.A. performed the research; A.R., C.T.K., K.d.R., and S.D.B. analyzed the data; A.R., S.A.K., K.d.R., M.M.A., B.R.T., and A.R.G. wrote the manuscript. All authors listed have read, edited, and approved the manuscript for publication.


1. Penguins use the two-voice system to recognize each other.

2. Vocal development in captive harbor seal pups, Phoca vitulina richardii: age, sex, and individual differences.

Review 3. Individual recognition: it is good to be different.

4. Finding a parent in a king penguin colony: the acoustic system of individual recognition.

5. Mother-calf vocal communication in Atlantic walrus: a first field experimental study.

6. A neural network model for generating complex birdsong syntax.

7. How mothers find their pups in a colony of Antarctic fur seals.

8. Underwater detection of tonal signals between 0.125 and 100 kHz by harbor seals (Phoca vitulina).

9. How does a fur seal mother recognize the voice of her pup? An experimental study of Arctocephalus tropicalis.

10. De novo establishment of wild-type song culture in the zebra finch.

Review 1. Rhythm in speech and animal vocalizations: a cross-species perspective.