Like humans, songbirds are one of the few animal groups that learn vocalization. Vocal learning requires coordination of auditory input and vocal output, using auditory feedback to guide one's own vocalizations during a specific developmental stage known as the critical period. Songbirds are good animal models for understanding the neural basis of vocal learning, a complex form of imitation, because they have many parallels to humans with regard to the features of vocal behavior and the neural circuits dedicated to vocal learning. In this review, we summarize the behavioral, neural, and genetic traits of birdsong. We also discuss how studies of birdsong can help us understand how the development of neural circuits for vocal learning and production is driven by sensory input (auditory information) and motor output (vocalization).
Many animal species communicate by vocalization. Although the vocalizations of most animal
species are innate, a few groups of mammals (humans,
cetaceans, bats, elephants, and pinnipeds) and birds (oscine songbirds, parrots, and
hummingbirds) develop complex vocal patterns through vocal learning [23, 24]. The songbird is an
attractive animal model for understanding the mechanisms underlying vocal learning because
non-human primates and rodents have a limited ability to modify their vocalization [39]. There are approximately 3,500 songbird species all
over the world, and their songs show readily quantifiable species-specific variation,
which makes them ideal for investigating developmental changes in acoustic and sequential song structure
(Fig. 1). Some species, such as the zebra finch and canary, are easily bred under
laboratory conditions [48]. These features mean that
studies of songbirds can provide excellent insights into the evolution, function,
development, and mechanisms of vocal learning. Here we review vocal learning in songbirds,
with particular focus on auditory input as a developmental epigenetic factor of vocal
development. First, we highlight the parallels between human speech and birdsong and
introduce the neural mechanisms involved in vocal production and learning. We then provide
an overview of the contribution of auditory input during vocal development and
maintenance.
Fig. 1.
Song learning and species differences in song pattern. (A) Examples of song
development in a zebra finch. The zebra finch is known as a closed-ended learner,
meaning that once a stable species-specific song pattern “motif” is developed, the
song structure remains unchanged throughout life [8, 22, 74]. This stereotypy of crystallized song enables precise
quantification of the similarities and differences in vocal development and song
patterns between experiments, allowing for examination of genetic and epigenetic
factors that contribute to the acquisition and maintenance of complex vocal patterns.
(B, C) Examples of adult song patterns of two Bengalese finches (B) and two Java
sparrows (C).
Human Speech and Birdsong
Although birds and mammals diverged from a common ancestor approximately three hundred
million years ago [25], birdsong broadly possesses
three behavioral traits similar to those of human speech [7].

First, sensory and sensorimotor learning are crucial for the development of both
birdsong and human speech [7]. Sensory learning is the
initial phase. Animals listen to and memorize conspecific adult vocalization as their
template (Fig. 1A). The sensorimotor learning
follows, and animals start vocalizing, gradually matching their vocalization to the
memorized template (Fig. 1A). At the early stage
of sensorimotor learning, fledgling juvenile songbirds produce unstructured sounds. These
sounds are referred to as subsong, which is similar to the babbling of
human infants [4]. Juveniles compare these sounds with
the memorized template and achieve vocal imitation through a process of trial-and-error
vocalizations using auditory feedback (Fig. 1A).
Because of this reliance on tutor experience and auditory feedback, birds raised in
complete social and acoustic isolation develop abnormal song (Figs. 2B and C) [33, 40].
Fig. 2.
Examples of song development and syllable scatter plots [duration versus mean
frequency modulation (FM)] in an intact, a socially isolated, an early-deafened, and
an adult-deafened bird. (A, B) Colored portions (blue and green) highlight stable song
motifs. The intact and socially isolated birds exhibited song stability around dph
110. The crystallized song pattern of the socially isolated bird is similar to that of
the intact (normal) bird, except for a prolonged and variable syllable (green
bracket). (C) Orange shading highlights stable song motifs. (D) Song before and after
adult deafening. Blue shading indicates crystallized motifs, which developed at dph
100–150.
Second, learned vocalizations consist of a complex motor sequence, quantifiable at the
phonological and syntactical levels in both songbirds and humans. Although human speech
and birdsong share common features, and both vocal patterns are defined as ordered strings
of sounds, they differ in one critical respect. Human speech has the flexible
capacity to convey meaning through distinct sound (phonology) and word (syntax)
order, whereas songbirds use their songs for territorial advertisement and mate
attraction, conveying only information about the individual identity of the singer to
receivers [42].

Third, vocal learning occurs within a critical period, usually at an early developmental
stage before adulthood. Neither songbirds nor humans learn vocalizations equally
well throughout life. Although both humans and birds must be provided with
appropriate auditory and social conditions during the critical period to achieve vocal
learning, they differ in one respect: humans are able to learn
new words and languages throughout life. Some species of songbirds categorized as
closed-ended vocal learners, e.g., the zebra finch and Bengalese finch, which are commonly
used in research, are unable to learn new songs at the adult stage, while others categorized
as open-ended vocal learners, e.g., the canary, have the ability to imitate new sounds to
some extent as adults. Neurogenesis in the adult avian brain was first reported in canaries
[14]. This neurogenesis allows for replacement of
old neurons with new ones and results in a seasonal fluctuation in the neuron number that
correlates with the capacity for song plasticity [30].

As outlined above, birdsong shares numerous behavioral traits with human speech. In
contrast, with regard to their vocal organs and respiratory systems, there are subtle
differences in the functional morphology [56]. Birds
generate sound using an organ named the syrinx, which is part of the respiratory system,
whereas humans generate sound using the larynx, which contains the vocal folds. However, the
basic mechanism underlying sound generation in birds shows strong analogies to the human
source-filter mechanism. In both cases, vocalizations are generated by airflow-induced
oscillation (of the vocal folds in the human larynx, or of elements in the wall of the syrinx in birds),
followed by filtering and tuning of sound by the upper airway. To generate vocal sounds, the
components of the peripheral vocal system, such as respiratory organs, vocal organs, and
vocal tract structures, must be precisely coordinated through the neural control of a
number of different muscles [65]. The following text
elaborates on the neural substrates involved in vocal development and how they contribute to
this process.
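The source-filter analogy described above can be illustrated with a small numerical sketch. It is not a model of the syrinx or the larynx: the pulse-train source, the single two-pole resonator, and every parameter value (sample rate, 500 Hz fundamental, 2000 Hz resonance, 200 Hz bandwidth) are assumptions chosen only to show how an oscillating source and a downstream filter combine.

```python
import math

SAMPLE_RATE = 8000  # samples per second (arbitrary choice for this sketch)

def pulse_train(f0_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Source: one unit impulse per oscillation cycle at fundamental f0_hz."""
    n_samples = int(duration_s * sample_rate)
    period = round(sample_rate / f0_hz)  # samples per cycle
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]

def resonator(signal, center_hz, bandwidth_hz, sample_rate=SAMPLE_RATE):
    """Filter: a two-pole resonator standing in for one upper-airway resonance."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius
    theta = 2.0 * math.pi * center_hz / sample_rate      # pole angle
    a1, a2 = 2.0 * r * math.cos(theta), -r * r
    out, y1, y2 = [], 0.0, 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2  # recursive difference equation
        out.append(y)
        y1, y2 = y, y1
    return out

def magnitude_at(signal, freq_hz, sample_rate=SAMPLE_RATE):
    """Magnitude of a single Fourier component of the signal at freq_hz."""
    w = 2.0 * math.pi * freq_hz / sample_rate
    re = sum(s * math.cos(w * i) for i, s in enumerate(signal))
    im = sum(s * math.sin(w * i) for i, s in enumerate(signal))
    return math.hypot(re, im)

# Hypothetical settings: 500 Hz oscillation shaped by a 2000 Hz resonance.
source = pulse_train(f0_hz=500, duration_s=0.1)
voiced = resonator(source, center_hz=2000, bandwidth_hz=200)
```

With these settings the unfiltered source carries equal energy at its 500 Hz and 2000 Hz harmonics, whereas the filtered signal is dominated by the harmonic closest to the resonance. This is the sense in which the upper airway "tunes" a sound whose fundamental frequency is set at the source.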
Neural Substrates of Vocal Learning and Production
In vertebrates such as mammals and birds, the central nervous system is divided into five
basic regions: the hindbrain, the midbrain, the thalamus, the cerebellum, and the cerebrum.
Across vertebrate species, there is similar structural organization throughout most of these
five brain regions, except the cerebrum. In birds, the cerebrum is organized into large cell
clusters; on the other hand, in mammals, the cerebrum is divided into subcortical nuclei,
such as the basal ganglia, and the cerebral cortex, which consists of six main layers.
However, recent studies have indicated that the avian striatal and pallidal domains are well
conserved in relation to their counterparts in the cerebrum of mammals (Fig. 3) [25, 26]. Both humans and songbirds have specific brain regions involved in vocal
learning and production. Humans have a specialized circuit that forms a network of brain
areas (including Broca’s area and temporal areas) devoted to speech perception and
production. Syntax-related networks are reported to exist in the opercular/triangular parts
of left inferior frontal gyrus and the left lateral premotor cortex [29]. In addition, the basal ganglia are considered to be involved in
prosodic modulation and language acquisition [1, 12]. Several studies have indicated that
the basal ganglia show different activity during speech production and
syntactic processing in a second language than in the native language [11, 31]. The basal ganglia are thus engaged in language learning
in adults. The identity and function of the neural networks contributing to vocalization
have been particularly well studied in songbirds through a variety of neurophysiological and
molecular biology methods.
Fig. 3.
Schematic diagrams of the brain areas involved in vocal learning and production
(modified from Horita and Wada, 2011 [20], and Pfenning et al., 2014 [50]).
(A, B) Upper drawings illustrate a brain section from a male zebra finch (A) and a
human (B). Solid black arrows denote connections within the posterior vocal motor
circuit (from HVC to RA to brainstem motor nuclei). White arrows denote connections
within the basal ganglia–forebrain circuit. Dashed black arrows denote connections
between the two circuits. Red arrows show the direct connections found only in vocal
learners, which project from vocal motor cortex regions to brain stem vocal motor
neurons. (C, D) Lower drawings illustrate comparative and simplified connectivity of
anterior and posterior vocal circuits in a songbird (C) and a human (D). DLM: dorsal
lateral medial nucleus of the thalamus, DM: dorsal medial nucleus of the midbrain,
HVC: a vocal nucleus (no acronym), LMAN: lateral MAN, MAN: magnocellular nucleus of
the anterior nidopallium, nXIIts: twelfth nucleus, tracheosyringeal part, RA: robust
nucleus of the arcopallium, Ram/Pam: nucleus retroambiguus/parambiguus.
The brain areas associated with song learning and production, the song nuclei, are
organized into two major circuits: the posterior vocal motor circuit and the anterior basal
ganglia–forebrain circuit (Figs. 3A and C). The
vocal motor circuit is involved in the generation of vocal patterns through a hierarchical
process of regulation of syllable sequence and acoustic features [17, 73]. Furthermore, the premotor
HVC nucleus is the only song system nucleus that receives direct projections from auditory
areas [5], and it has a crucial role in encoding the
experience of the tutor song [58]. Mirror neurons
have been reported in the HVC of some songbirds, such as swamp sparrows and Bengalese
finches [13, 53]. These neurons display a precise form of vocal–auditory mirroring analogous to the
motor–visual mirroring found in human and nonhuman primate cortical motor areas. This form of
sensorimotor correspondence is considered to be important for vocal learning and
communication [44]. In contrast, the basal
ganglia–forebrain circuit in both humans and songbirds is involved in motor and cognitive
processes, such as control of vocal movements and reinforcement-based learning. In
songbirds, this circuit plays a crucial role in song learning by supporting vocal
exploration with direct premotor bias in response to the vocal experience [3, 6, 27, 61], and it
also maintains learned vocalizations using auditory feedback. Variability in the sequence
and structure of syllables is reduced by the presence of a female [27]. Physiological studies have indicated that this context-dependent
change in song variability is accompanied by changes in singing-related neural activity
within cortical nucleus LMAN [27, 60]. A recent study also revealed that the basal ganglia
nucleus Area X is essential for singing-related patterned burst firing of LMAN, which is
critical for vocal plasticity and adjustment in response to auditory feedback [32]. Together, these two premotor circuits are believed
to produce vocalizations at different stages of song development. The poorly structured
subsong, akin to human babbling, is driven primarily by the basal ganglia–forebrain circuit,
whereas the adult song is highly stereotyped and is driven primarily by the motor circuit.
Transferring control of song from the basal ganglia–forebrain circuit to the motor circuit
is crucial for regulating vocal plasticity and stabilization [9].

Human speech and birdsong result from the development of specialized brain regions for
vocal learning and production, which develop through interaction between genetic and
environmental factors. However, little is known about the genetic mechanisms underlying
vocal development. Overcoming this problem requires an appropriate model system whose
genomic information has been well understood and in which genetic manipulation can be
performed. For example, FoxP2, a Forkhead box family gene that encodes a
transcription factor, has been reported as the gene underlying a human developmental
language impairment caused by structural abnormalities in the striatum, cerebellum, and
cortex [35, 69]. Similarly, in songbirds, FoxP2 is expressed strongly in the
striatum and is regulated during vocal development [15, 66]. Knockdown of
FoxP2 in the songbird striatum impairs song learning, decreases spine
density of striatal spiny neurons, and disrupts the control of vocal variability by
interfering with dopamine-dependent modulation [16,
62]. Other genes relevant to speech and other human language disorders have been reported to be differentially expressed in the song nuclei of
songbirds [18], and investigation and manipulation of
these genes has become possible following the sequencing of the zebra finch genome [72] and through use of transgenesis and viral
transfection [2, 16, 43]. In addition, targeting of viral
vectors to specific brain regions using microinjections can be used to regulate gene
expression with temporal and spatial precision in order to analyze the function of genes,
cells, and circuits [57, 58, 68].

In addition to these genetic contributors, developmental factors that influence
epigenetics, such as social interaction [34] and
nutrition [47], are also important in the development
of vocalization and the brain regions that support it. Dysfunction of motor and auditory
ability causes speech disorders, such as aphasia and stuttering. Aphasia usually results
from a stroke, brain tumor, or head injury. Studies of vocal deficits caused by lesions to song
nuclei provide us with an animal model of aphasia. For instance, an adult zebra finch
becomes unable to produce a learned vocal pattern after HVC lesions [4, 63, 67]. Stuttering is the most common disorder of speech motor control in
young children who are developing speech [52]. The
incidence of stuttering is higher in males than in females. Stuttering is resolved by
adulthood in nearly 80% of children with developmental stuttering. Twin studies have
reported substantial genetic and epigenetic effects on stuttering [10, 55]. However, the
neurobiological basis of this disorder is poorly understood despite recent progress in
uncovering its genetic roots. From a comparative point of view, song syllable repetitions of
the zebra finch resemble part-word repetitions, a common feature of stuttering [18]. Song syllable repetitions can be induced by delays
or disruptions in auditory feedback during vocalization [19, 36], similar to those that can occur in
humans [21]. Furthermore, auditory input is crucial
for the acquisition of birdsong and human speech and can influence epigenetic factors
contributing to sensorimotor learning [33, 59].
Audition for Vocal Learning and Maintenance
Audition provides important information for vocal learning, both for learning templates and
for evaluation of one’s own vocal output. Auditory feedback also plays an important role in
maintaining stable vocal output in adulthood [19, 36, 38, 46].

When songbirds are deprived of auditory input before the sensory learning phase of song,
they do not develop normal songs (Fig. 2C) [33], similar to individuals with hearing loss who have
difficulty developing normal speech patterns. However, audition-deprived songbirds can still
develop a certain degree of species-specific song [41, 54] and crystallize vocal patterns,
though they are noisy and amorphous (Fig. 2C)
[45]. In motor circuit nuclei, developmental gene
expression is found to be conserved in an age-dependent manner even in deafened birds [45], indicating audition-independent robustness of gene
expression dynamics during vocal development in the song system. Although auditory
information is crucial for song development, auditory input is not the main driver of
developmental gene expression dynamics in motor circuit nuclei.

In adult humans and songbirds, disruption of auditory feedback causes gradual deterioration
in learned vocalization (Fig. 2D) [19, 46, 71], and the rate of deterioration depends on age [38]. After loss of auditory feedback, deterioration of
vocal patterns is much more severe at a younger age and takes longer at an
older age. Delays or disruptions in auditory feedback during vocalization result in
stuttering, deletions, and distortion of syllables [19, 36]. Furthermore, birds exhibit the
capacity to adjust pitch according to perceived errors in vocal production [64], and the speed and extent of vocal error correction
decrease markedly with age [28]. The vocal
variability necessary for audition-dependent song plasticity is generated by the basal
ganglia–forebrain circuit [3, 27, 49]. However, expression of
the molecular markers of neural activity-dependent gene induction [dual specificity
phosphatase 1 (Dusp1), c-fos, and Arc] is
similar throughout development in the nuclei of the basal ganglia–forebrain circuit [45]. This suggests that molecular signaling cascades are
consistently regulated regardless of age in the basal ganglia–forebrain circuit related to
vocal learning and maintenance. Therefore, during vocal learning, inherited genetic programs
contribute to vocal development and auditory-dependent vocal plasticity, which are directly
or indirectly regulated by age or activation of vocalization (motor)-dependent epigenetic
factors.
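One simple way to put a number on the rendition-to-rendition vocal variability discussed in this review (for example, the reduction in variability reported when a female is present) is the coefficient of variation of a syllable's pitch across repeated renditions. The sketch below is only illustrative: the pitch values are invented for the example and are not data from any cited study, and the names `undirected` and `directed` are hypothetical labels for song produced alone and song produced toward a female.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unitless)."""
    return stdev(values) / mean(values)

# Invented pitch estimates (Hz) for one syllable across six renditions.
# These numbers are illustrative assumptions, not measurements.
undirected = [612.0, 598.0, 631.0, 589.0, 620.0, 604.0]  # singing alone
directed = [610.0, 607.0, 612.0, 608.0, 611.0, 609.0]    # singing to a female

cv_undirected = coefficient_of_variation(undirected)
cv_directed = coefficient_of_variation(directed)

print(f"CV undirected: {cv_undirected:.4f}")
print(f"CV directed:   {cv_directed:.4f}")
```

Because the coefficient of variation is normalized by the mean, it allows variability to be compared between syllables with different pitch. Other measures, such as the entropy of syllable transitions, would be needed to capture variability in sequence rather than in acoustic structure.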
Conclusions and Future Directions
Vocal learning is an ability shared by both songbirds and humans. It is a complex form of
sensorimotor learning that requires coordination of sensory input and motor output to guide
one’s own vocalization. Complex learned vocalization is shaped by both genetic and
environmental factors during development.

Hearing impairment and developmental disabilities lead to deficits in the acquisition and
maintenance of vocal patterns during vocal development, including speech disorders such as
aphasia and stuttering. Songbirds that have had auditory input disrupted are useful animal
models for understanding how hearing impairment affects the development of brain regions for
vocal learning and production (Fig. 4).
Fig. 4.
A schematic highlighting the use of songbirds as a research model for disorders of
vocal development and communication.
As we have described, audition-independent robustness of gene expression is present in the
songbird motor circuit, which indicates that volitional vocalization itself may have a
crucial influence on epigenetic factors that activate the genetic programs necessary for
regulating vocal plasticity and development of vocal patterns. In fact, a large set of
neural plasticity-related genes are regulated by singing in song nuclei [37, 51, 70]. Although variability in the accuracy of
syllable/word structures in human children and adults with hearing impairments has been
observed, little is known about the neural basis of this variability. Language outcomes may
vary by the overall hearing level, age of onset of hearing loss, and therapeutic
interventions, such as hearing aids or cochlear implants. In addition, vocal development may
rely not only on how well one hears but also on how much one vocalizes. This
suggests that interventions, such as hearing aids or cochlear implants, performed at an early
stage of word production may have a more positive effect on language development in children
with congenital hearing impairment.

Studies on birdsong using behavioral manipulation and genetic and neurophysiological tools
have shed light on the specialized neural networks that underlie vocal learning. Further
research is needed to understand how auditory input, motor activity, and aging affect the
development of brain areas involved in vocal learning and production.