Transcriptome analysis of a North American songbird, Melospiza melodia.

Anuj Srivastava, Kevin Winker, Timothy I Shaw, Kenneth L Jones, Travis C Glenn.

Abstract

An effective way to understand the genomics of divergence in non-model organisms is to use the transcriptome to identify genes associated with divergence. We examine the transcriptome of the song sparrow (Melospiza melodia) and contrast it with the avian models zebra finch (Taeniopygia guttata) and chicken (Gallus gallus). We aimed to (i) obtain a functional annotation of a substantial portion of the song sparrow transcriptome; (ii) compare transcript divergence; (iii) efficiently characterize single nucleotide polymorphism/indel markers possibly fixed between song sparrow subspecies; and (iv) identify the most common set of transcripts in birds using the zebra finch as a reference. Using two individuals from each of three populations, whole-body mRNA was normalized and sequenced (110 Mb total). The assembly yielded 38,539 contigs [N50 (the length-weighted median) = 482 bp]; 4574 were orthologous to both model genomes and 3680 are functionally annotated. This low-coverage scan of the song sparrow transcriptome revealed 29,982 SNPs/indels, 1402 fixed between populations and subspecies. Referencing zebra finch and chicken, we identified 43 and 5 fast-evolving genes, respectively. We also identified the most common set of transcripts present in birds with respect to zebra finch. This study provides new insight into songbird transcriptomes, and candidate markers identified here may help research in songbirds (oscine Passeriformes), a frequently studied group.

Year:  2012        PMID: 22645122      PMCID: PMC3415294          DOI: 10.1093/dnares/dss015

Source DB:  PubMed          Journal:  DNA Res        ISSN: 1340-2838            Impact factor:   4.458


Introduction

Determining the genetic underpinnings of organismal divergence and speciation will provide insight into the evolutionary generation of biodiversity, and next-generation sequencing is propelling such studies in non-model organisms.[1,2] An effective way to initiate genome-wide data sets in non-model organisms is to focus on the transcriptome, or expressed sequence, which, unlike a whole-genome approach, increases the data's focus on functional genomic attributes.[3,4] As these data become available, evolutionary biologists will be able to make contrasts within and among lineages to identify genes associated with divergence.[5-8] To gain insight into the genes associated with avian diversification, we examine the transcriptome of the song sparrow (Melospiza melodia) and contrast it with the model birds zebra finch (Taeniopygia guttata) and chicken (Gallus gallus). The song sparrow is broadly distributed across North America and exhibits pronounced morphological variation, with 25 subspecies recognized (of 52 described[9]). It has been extensively studied over the past 70 years; it is considered a model vertebrate species for field research; and it will continue to be a focus for questions about the causes of population variation in behaviour, demographics, and morphology.[10] Our goals in this study were to (i) obtain a functional annotation of a substantial portion of the song sparrow transcriptome; (ii) compare transcript divergence between the song sparrow and the two bird genomes sequenced and assembled to the highest quality thus far, zebra finch (T. guttata) and chicken (G. gallus); (iii) efficiently characterize a set of single nucleotide polymorphism (SNP)/indel markers that may be fixed between song sparrow subspecies; and (iv) identify the most common set of transcripts present in bird species using the zebra finch as a reference. Achieving these goals will establish important baseline data for a non-model organism in a speciose, frequently studied group (passerines or songbirds).

Materials and methods

Samples, cDNA library, and sequencing

Two song sparrows still undergoing growth (from embryo to just-fledged) were sampled from each of three Alaska populations (the northwesternmost distribution of the species), chosen because they span some of the most pronounced morphological diversity that occurs in the species (Fig. 1): two island populations of M. m. maxima (from Attu and Adak islands; an egg and a very young nestling from Attu Island, unvouchered; and vouchers UAM 27831 and 27832 from Adak Island) and one mainland population of M. m. caurina (from Cordova, vouchers UAM 27829 and 27830). The Attu and Adak populations of M. m. maxima are the largest in the species and also have different plumage coloration; in addition, they are non-migratory, unlike the population from Cordova, which is also smaller and darker (Fig. 1).
Figure 1.

Samples in this study came from Cordova (Melospiza melodia caurina, right in inset) and Adak and Attu islands (M. m. maxima, left in inset); grey shading indicates the species' range.

All samples were obtained in June (spring) at a very young age and only two were sexable (both females, one each from Cordova and Adak). The egg was homogenized, whereas from the others six tissues (brain, liver, heart, muscle, bone, and pancreas) were taken, minced and placed in RNAlater (Qiagen, Valencia, CA) within minutes of death and then frozen. In the laboratory, tissues were homogenized and total RNA was isolated using Trizol (Invitrogen, Carlsbad, CA) and subsequently cleaned using a Qiagen RNeasy column. Equal amounts of RNA from individuals of each population were pooled and a MINT universal cDNA kit (Evrogen, Moscow, Russia) with primers modified specifically for 454 procedures[11] was used to create cDNA libraries enriched for full-length transcripts. We then normalized the three cDNA libraries using the TRIMMER cDNA normalization kit (Evrogen) to substantially decrease the relative abundance of common transcripts. The normalized cDNA was fragmented and prepared for sequencing using standard 454 procedures, including independent molecular identifiers [MID tags: Cordova (MID 13), Attu (MID 18) and Adak (MID 19)] for each of the three populations. As each library contained a unique MID tag, libraries were pooled and sequenced as a single sample. Sequencing was performed at the University of Georgia's Georgia Genomics Facility on a Roche 454 FLX using Titanium chemistry.

Assembly, polymorphism, and ortholog identification

Bases were called from the 454-generated sff file using Pyrobayes,[12] which provides improved accuracy in the estimation of base qualities for pyrosequences. We removed MINT primer sequences, short sequences, and other contaminants using SeqClean (http://compbio.dfci.harvard.edu), and reads from all three populations were combined. We performed a combined assembly of reads using MIRA,[13] and then used GigaBayes,[14] a short-read SNP and short indel discovery program, to detect polymorphisms. To make the SNP/indel predictions more reliable, we used the more stringent criteria that the minor allele must occur at least three times and be present at ≥10% relative to the major allele frequency when >30 reads per locus were obtained (after combining all the reads for particular alleles among different subspecies; sequences with fewer reads are considered the minor allele and sequences with more reads are considered the major allele). We identified orthologous contigs (against the zebra finch and chicken genomes) using the reciprocal blast approach, because it has been found to be superior to sophisticated orthology detection algorithms.[15] A stringent cutoff of 1e−20 was used to separate paralogues from orthologues. The cDNA sequences from the zebra finch (taeGut3.2.4.60.cdna.all.fa) and chicken (WASHUC2.60.cdna.all.fa) were obtained from the Biomart database (www.biomart.org). Although the zebra finch is a passerine and thus more closely related to the song sparrow, the chicken database contains sequences from whole growing chicks, whereas that of the zebra finch emphasizes neural transcripts. To identify likely genomic positions of the song sparrow contigs, we mapped them against genomic sequences of the zebra finch (taeGut3.2.4.60.dna_rm.toplevel.fa) and chicken (WASHUC2.60.dna_rm.toplevel.fa) using BLAT[16] with default criteria. We obtained feature information for protein-coding genes and ncRNA using the Ensembl (http://uswest.ensembl.org/index.html/) Xenoref and gtf files, respectively.
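The SNP/indel acceptance criteria described above can be illustrated with a short sketch. It assumes one reading of the text: the minor allele needs at least three reads at any depth, and the 10% minor-to-major ratio applies only when a locus has more than 30 reads. The function and parameter names are hypothetical and are not part of GigaBayes.

```python
def passes_snp_filter(major_reads: int, minor_reads: int,
                      min_minor_reads: int = 3,
                      deep_locus_reads: int = 30,
                      min_minor_to_major: float = 0.10) -> bool:
    """Return True if a candidate site meets the criteria as read from the text.

    Assumption: the >= 10% ratio test is applied only at loci with more than
    `deep_locus_reads` reads in total; shallower loci need only the
    minimum minor-allele read count.
    """
    # The allele with fewer reads is treated as the minor allele.
    if minor_reads > major_reads:
        major_reads, minor_reads = minor_reads, major_reads
    if minor_reads < min_minor_reads:
        return False
    total_reads = major_reads + minor_reads
    if total_reads > deep_locus_reads:
        return minor_reads / major_reads >= min_minor_to_major
    return True


# Toy read counts (illustrative only, not data from the study).
print(passes_snp_filter(3, 3))    # shallowest locus that can pass: 6 reads
print(passes_snp_filter(10, 2))   # minor allele seen fewer than three times
print(passes_snp_filter(40, 3))   # deep locus, minor allele below 10% of major
```

Under this reading, the shallowest locus that can pass has six reads (three per allele), which is why low-coverage contigs rarely yield accepted SNPs.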

Most common set of transcripts in birds

To find the most common set of transcripts in birds with respect to zebra finch, we collected and assembled (454 GS assembler version 2.5) the transcriptome sequence of 12 bird species (publicly available sequence[5,7,8,17]). The orthologous sequence with respect to zebra finch was determined using the bidirectional blast best hit method (1e−20). Only contigs >200 bp were used in the analysis. After determining the orthologous sequences, we sorted them in decreasing order and added orthologous sequences from other species sequentially to find the most common set.
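A minimal sketch of the bidirectional best-hit step and of one possible way to accumulate a common set follows. It assumes the best hit and its e-value have already been parsed from blast output into dictionaries; the identifiers, function names, and the progressive-intersection strategy are illustrative assumptions, not the authors' exact procedure.

```python
from typing import Dict, List, Optional, Set, Tuple

Hit = Tuple[str, float]  # (identifier of the best hit, e-value)


def reciprocal_best_hits(forward: Dict[str, Hit],
                         reverse: Dict[str, Hit],
                         max_evalue: float = 1e-20) -> Dict[str, str]:
    """Pair each query with its subject only if each is the other's best hit
    and both e-values are at or below the cutoff."""
    pairs: Dict[str, str] = {}
    for query, (subject, evalue) in forward.items():
        back = reverse.get(subject)
        if back is None:
            continue
        back_query, back_evalue = back
        if back_query == query and evalue <= max_evalue and back_evalue <= max_evalue:
            pairs[query] = subject
    return pairs


def shrinking_common_set(orthologues: Dict[str, Set[str]]) -> List[Tuple[str, int]]:
    """Sort species by number of reference orthologues (largest first) and
    intersect them one at a time, recording the shared-set size at each step."""
    ordered = sorted(orthologues.items(), key=lambda kv: len(kv[1]), reverse=True)
    common: Optional[Set[str]] = None
    sizes: List[Tuple[str, int]] = []
    for species, ids in ordered:
        common = set(ids) if common is None else common & ids
        sizes.append((species, len(common)))
    return sizes


# Toy example with made-up identifiers.
forward = {"c1": ("zf1", 1e-50), "c2": ("zf2", 1e-30), "c4": ("zf4", 1e-10)}
reverse = {"zf1": ("c1", 1e-48), "zf2": ("c9", 1e-40), "zf4": ("c4", 1e-12)}
print(reciprocal_best_hits(forward, reverse))
print(shrinking_common_set({"A": {"zf1", "zf2", "zf3"}, "B": {"zf1", "zf2"}, "C": {"zf2", "zf9"}}))
```

In the toy data only c1 and zf1 are retained: c2 fails reciprocity and the c4 pair fails the e-value cutoff.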

Functional annotation of contigs

We used Blast2GO[18] (B2G) to functionally annotate the contigs. A combined graph was generated for each gene ontology (GO) category. For the molecular function division, a graph was obtained using default criteria and for the other two divisions (cellular component and biological process), seq/node filter values were changed to 4/10 to prevent overloading the graphs.

Estimation of substitution rates

Substitution rates were estimated for contigs that were orthologous to both zebra finch and chicken. Reading frames for these contigs were identified using BLASTX[19] against protein sequences of zebra finch (taeGut3.2.4.60.pep.all.fa) and chicken (WASHUC2.60.pep.all.fa) obtained from Biomart (www.biomart.org). Sequences that produced significant alignments were extracted (using their coordinates), translated, and aligned using CLUSTALW.[20] Sequences that contained frame shifts were excluded from the analysis. Corresponding codon alignments were produced using PAL2NAL,[21] and, finally, rates were estimated using a maximum likelihood method implemented in the CODEML program of the PAML package Version 4.1.[22] Pairwise maximum likelihood analyses were performed with runmode = -2. The estimated rates of non-synonymous to synonymous substitutions (Ka/Ks values) were plotted as a scatter plot in the range of 0–2.0.

Results and discussion

Sequence assembly

The pooled reads from all three populations yielded 131 Mb (458 808 sequences) of raw data, which was reduced to 110 Mb (381 474 sequences) after the use of SeqClean (Table 1). The mean raw and cleaned read lengths were 286 and 290 bp, respectively. Poor-quality reads were often very short and were purged entirely prior to assembly. Without a reference genome for the song sparrow, de novo assembly was required. Cleaned sequences were assembled into 38 539 contigs with N50 and N90 values of 482 and 317 bp, respectively (Supplementary data). There were 1417 singletons. The mean coverage per contig was 3.93× and the mean GC content per contig was 43.6%.
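For readers unfamiliar with N50 and N90, the sketch below shows one common way to compute them from a list of contig lengths: sort contigs from longest to shortest and report the length at which the running total first reaches the given percentage of all assembled bases. This is the usual definition and is assumed here; the assembler's own reporting may differ in detail. The function name and the toy lengths are illustrative.

```python
from typing import Iterable


def nx(lengths: Iterable[int], percent: int) -> int:
    """Length of the contig at which the cumulative sum of contig lengths,
    taken from longest to shortest, first reaches `percent` of all bases."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer arithmetic avoids floating-point edge cases at the threshold.
        if running * 100 >= percent * total:
            return length
    raise ValueError("no contig lengths supplied")


# Toy contig lengths in bp (not data from the study).
toy_lengths = [100, 200, 300, 400, 500]
print(nx(toy_lengths, 50))  # N50
print(nx(toy_lengths, 90))  # N90
```

With the toy lengths (1500 bp in total), the two longest contigs already hold 900 bp, so N50 is 400 bp; reaching 90% requires the four longest contigs, so N90 is 200 bp.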
Table 1.

Number of reads and assembly statistics for three song sparrow populations (SRA 048516)

Subspecies      Locality   n(a)   MID   Raw reads   Cleaned reads   Cleaned bases (Mb)
M. m. caurina   Cordova    2      13    138 439     114 098         32.5
M. m. maxima    Adak       2      19    135 588     117 166         34.7
M. m. maxima    Attu       2      18    184 781     150 210         42.8
Combined                   6            458 808     381 474         110

(a) Number of individuals pooled prior to sequencing.

We acknowledge that the amount of sequencing presented is insufficient to allow a high-quality assembly of the extremely diverse transcriptome that we have sampled. A large number of tissues were sampled, and these clearly contain a large and diverse set of transcripts (see Section 3.2). Simulations indicate that transcriptomes sequenced with 454 Titanium chemistry will quickly lead to about twice as many contigs as transcripts, and additional sequences only gradually cause the number of contigs to reach the number of transcripts (i.e. the point when contigs = transcripts; data not shown). Thus, quite large numbers of additional sequences will be necessary to fully assemble the transcripts contained in these cDNA libraries. Given the relatively high cost of 454 sequencing, it would be more economical to obtain the additional sequences as paired-end reads on Illumina or Ion Torrent platforms.

Functional annotation

B2G, which we used to functionally annotate the contigs, has three annotation steps involving (i) a blast against databases, (ii) mapping against GO resources, and (iii) annotation to generate reliable functional assignments. In our data, 12 880 of the contigs (33.46% overall, of which 8540 were unique hits) had significant matches to currently known proteins in the NCBI non-redundant protein database. Because one-third of the contigs hit the same proteins as other contigs in our data, this indicates that large transcripts were often split among multiple contigs in our assembly. Although it is possible to use the zebra finch or chicken proteins as a reference to scaffold the song sparrow contigs, we did not do this because it could make chimeras, and assembly of full-length genes was not a major goal of this work. As expected, zebra finch and chicken were identified as the top two species with the best blast hits for our song sparrow contigs (Table 2). Contigs with significant blast matches were functionally annotated. GO resource assignment was found for 3949 (10.2%) of the total contigs (with 24 363 GO terms; there can be multiple terms per contig), of which 3367 (8.7% of all contigs) were functionally annotated (Supplementary Sheet 1).
Table 2.

Species with ≥100 top hits from B2G

Species                     Hits
T. guttata                  7820
G. gallus                   2222
Homo sapiens                235
Monodelphis domestica       193
Mus musculus                187
Ailuropoda melanoleuca      177
Ornithorhynchus anatinus    149
Canis familiaris            119
M. melodia                  113
Rattus norvegicus           100
In the first GO division, ‘biological process’,[23] 22 categories were identified. Most contigs (3578 = 53.1%) were involved in ‘cellular and metabolic processes’. The second most abundant category was ‘biological regulation and localization’ (1253 = 18.6%; Supplementary Fig. S1A). Within the second division, ‘molecular function’,[23] nine major categories were identified. Most of the contigs were functionally related to ‘nucleotide binding’ (1966 = 43.9%) and ‘catalytic activity’ (1266 = 28.2%; Supplementary Fig. S1B). Finally, the last division, ‘cellular component’,[23] also had nine categories. Gene products were primarily expressed intracellularly (2322 = 41.9%) or in the membrane bound/non-membrane bound organelle (1787 = 32.3%; Supplementary Fig. S1C). All of the GO results should be viewed with caution because the depth of the available sequences ensures that most highly expressed transcripts will have been sequenced but many low-expression transcripts will not have been detected. The normalization techniques used substantially increased the number of low-expression transcripts sequenced, but the number of sequences obtained is insufficient to overcome the bias toward highly expressed transcripts.

Polymorphism detection

We detected a total of 29 982 SNPs/indels that were spread relatively evenly within, between, and among all three populations (Fig. 2, Supplementary Sheet 2). A total of 1402 SNPs/indels were fixed between populations and subspecies (Fig. 3; the sum of all pairwise comparisons is 1635 because some pairwise SNPs are found in more than one pair). Out of the 1402, there were 392 and 410 SNPs/indels between subspecies and within subspecies, respectively. This provides many SNPs/indels for further study (Supplementary Sheet 2), although given our limited sampling of individuals within populations (n = 2) many will not be true fixed differences (i.e. they are false positives because other individuals carry these variants). We also note that we have used quite stringent criteria for SNP/indel assignment. By requiring at least three reads for the minor allele, a minimum of sixfold coverage is required to call a SNP (three reads for the minor allele and at least as many for the major allele). Because our average assembly depth is only about fourfold, most polymorphic nucleotides in our contigs will not pass our criteria for SNP discovery. Because of this, we have biased the SNPs to be from the relatively highly expressed transcripts. Many additional SNPs/indels occur in song sparrows; we describe only those with a high probability of being real rather than sequencing artefacts. None of these issues limits our ability to achieve our stated goals, but we note them so that it is understood that we have made appropriately cautious interpretations of our results.
Figure 2.

Numbers of SNPs and indels that are within and shared between and among three populations of song sparrows.

Figure 3.

SNPs and indels that are fixed between and among three populations of song sparrows. There are 392 SNPs/indels that are identical in Attu and Adak, but different from Cordova. Because sample sizes are small, these figures include false positives.

Orthology with zebra finch and chicken

The reciprocal blast approach identified 4574 contigs as orthologous to both zebra finch and chicken. As expected because of phylogenetic relationships, more contigs were identified as orthologous to the zebra finch than the chicken: the set [unique song sparrow (orthologues) unique zebra finch] was [32 435 (6104) 12 493], whereas the set [unique song sparrow (orthologues) unique chicken] was [32 767 (5772) 16 518]. A substantial number of orthologous contigs (3894) were found to have the same chromosome location in the zebra finch and chicken (Supplementary Sheet 1).

Localization of contigs

The zebra finch and chicken genomes were used as references to locate the contigs. BLAT mapping of our assemblies against these genomes showed sequences that uniquely mapped to particular features of the reference genomes [5′UTR (untranslated region), 3′UTR, CDS (coding sequence), 1 kb upstream, 1 kb downstream; Fig. 4A]. Based on the zebra finch genome annotation, nearly 34% of mapped contigs (2890 of 8561) were found to be in CDS regions. Even with the use of the MINT cDNA construction kit, which is meant only to allow amplification of full-length transcripts, we still observed a substantial bias toward contigs mapping to 3′UTR and 1 kb downstream relative to 5′UTR and 1 kb upstream. The normalized distributions clearly indicate that our libraries contain relatively few transcripts that are full length (Fig. 4B). Similar patterns, although with slightly fewer hits, were obtained from mapping to the chicken genome. The localization of contigs containing SNPs/indels mapped against the zebra finch and chicken genomes showed that a major proportion of polymorphisms belongs to coding sequences (Supplementary Fig. S2A and B). Contigs with SNPs/indels had more blast hits to the zebra finch than to the chicken, reflecting the overall pattern of all contigs. A few RNA genes were also found by BLAT mapping (Supplementary Fig. S3A and B).
Figure 4.

Histogram displaying the proportion of contigs mapped to particular features of protein coding genes of zebra finch and chicken (UTR is the untranslated region, and CDS is the coding sequence). The upper panel displays the raw count and the lower panel normalized values (the proportion discovered relative to how many could be discovered within each category).

Common set of transcripts in birds

We determined the orthologous transcripts with respect to zebra finch using the bidirectional blast best hit method in 12 bird species. From the orthologous sequences, we determined the set of zebra finch transcripts present in all or most of the species. The largest set (1004 zebra finch sequences) was present in seven bird species. Smaller sets of 219 and 126 sequences were present in 10 and 12 bird species, respectively, and, finally, 19 sequences were present in all 13 species. Detailed information regarding species used and orthologous sequences is given in Supplementary Sheet 3. Further, we checked the pathways in which these common transcripts might be involved using DAVID[24,25] and found that they mainly related to oxidative phosphorylation, ribosome biogenesis, and cardiac muscle contraction. These are housekeeping genes,[26,27] which explains their frequent occurrence in all avian species. With respect to the chromosomal location of common transcripts, we did not find any significant bias related to any particular chromosome.

Estimation of Ka/Ks

Substitution rates were estimated for the 4574 contigs orthologous to both zebra finch and chicken. After filtering (based on the length of alignment and removing frame shifts), the number of contigs was reduced to 3821. We excluded contigs that were either identical or which had Ks = 0 (which made Ka/Ks incalculable). Thus, Ka/Ks was estimated for 3252 (zebra finch) and 3127 (chicken) contigs. Rate estimation with zebra finch identified 43 contigs with Ka/Ks ≥1 and 283 with values of 0.5–1.0 (Fig. 5A). Rate estimations with chicken yielded 5 and 58 contigs with Ka/Ks ≥1 and between 0.5 and 1.0, respectively (Fig. 5B). Afterwards, assuming the song sparrow contigs have the same chromosome organization as zebra finch and chicken, the calculated ratios were organized into chromosomes (Table 3); this is not an unrealistic assumption considering the high degree of chromosomal conservation among avian genomes[28,29] and the fact that such a high proportion (85.1%) of our orthologous contigs was found to have shared chromosomal locations with zebra finch and chicken.
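The exclusion and binning rules described above can be sketched as follows. The sketch assumes that pairwise Ka and Ks estimates are already available as numbers, that contigs with Ks = 0 are dropped because the ratio is undefined, and that the 0.5–1.0 bin includes 0.5 and excludes 1.0. The bin labels and function names are illustrative and the input values are made up.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple


def ka_ks_ratio(ka: float, ks: float) -> Optional[float]:
    """Return Ka/Ks, or None when Ks is zero and the ratio cannot be calculated."""
    if ks == 0:
        return None
    return ka / ks


def bin_ratios(estimates: Iterable[Tuple[float, float]]) -> Counter:
    """Count contigs per Ka/Ks bin; the bin boundaries are an assumption."""
    counts: Counter = Counter()
    for ka, ks in estimates:
        ratio = ka_ks_ratio(ka, ks)
        if ratio is None:
            counts["excluded (Ks = 0)"] += 1
        elif ratio >= 1.0:
            counts[">= 1"] += 1
        elif ratio >= 0.5:
            counts["0.5 to < 1"] += 1
        else:
            counts["< 0.5"] += 1
    return counts


# Made-up (Ka, Ks) pairs for illustration only.
toy = [(0.02, 0.10), (0.06, 0.10), (0.12, 0.10), (0.0, 0.0), (0.01, 0.0)]
print(bin_ratios(toy))
```

Identical sequences fall out naturally here, since Ka = Ks = 0 leaves the ratio undefined.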
Figure 5.

The distribution of Ka/Ks ratio for the contigs orthologous to both zebra finch (A) and chicken (B). Contigs with Ka/Ks values of 0.5–1.0 fall above the grey line and values >1.0 fall above the black line.

Table 3.

Number of contigs orthologous to particular zebra finch and chicken chromosomes, and mean Ka/Ks ratio for each chromosome, assuming the orthologous contigs have the same chromosomal location as zebra finch and chicken

Contigs: contigs orthologous to the given chromosome. Transcripts: total number of transcripts from that chromosome in the Biomart file.

       Zebra finch                                     Chicken
Chr    Contigs   Transcripts   Ka/Ks (mean ± SD)       Contigs   Transcripts   Ka/Ks (mean ± SD)
1      261       1124          0.2552 ± 0.2733         492       2994          0.1528 ± 0.1694
2      338       1345          0.2434 ± 0.2465         339       1995          0.1457 ± 0.1326
3      309       1169          0.2434 ± 0.2807         314       1672          0.1565 ± 0.1497
4      188       741           0.2258 ± 0.3347         252       1516          0.1374 ± 0.1274
5      229       936           0.2103 ± 0.2184         234       1299          0.1280 ± 0.1219
6      107       562           0.2447 ± 0.2112         106       781           0.1486 ± 0.1187
7      124       521           0.2220 ± 0.2103         120       767           0.1361 ± 0.1235
8      111       416           0.2581 ± 0.2196         127       723           0.1436 ± 0.1251
9      90        458           0.2286 ± 0.3839         86        598           0.1045 ± 0.1087
10     86        394           0.1784 ± 0.1738         90        599           0.1220 ± 0.1890
11     68        371           0.2330 ± 0.2978         61        499           0.1429 ± 0.1439
12     73        349           0.1799 ± 0.2206         68        427           0.1076 ± 0.1122
13     77        321           0.1845 ± 0.2319         83        499           0.0994 ± 0.1225
14     80        390           0.2541 ± 0.3448         79        578           0.1333 ± 0.1288
15     76        350           0.1817 ± 0.2299         73        531           0.0925 ± 0.1207
17     49        300           0.1705 ± 0.1597         46        432           0.0967 ± 0.0861
18     54        309           0.2230 ± 0.1950         55        428           0.1085 ± 0.0907
19     68        313           0.2004 ± 0.2982         66        443           0.0858 ± 0.0952
20     50        329           0.2419 ± 0.2444         51        476           0.1336 ± 0.1277
21     34        192           0.1470 ± 0.1569         44        346           0.0847 ± 0.1058
22     16        98            0.1000 ± 0.0976         11        160           0.0441 ± 0.0593
23     34        205           0.1783 ± 0.1828         33        288           0.0782 ± 0.0920
24     27        181           0.1961 ± 0.1906         24        270           0.1000 ± 0.0982
25     7         92            0.1161 ± 0.1069         6         169           0.0711 ± 0.1017
26     31        176           0.1148 ± 0.1081         29        341           0.0824 ± 0.0927
27     31        252           0.1471 ± 0.1438         28        345           0.0698 ± 0.0727
28     27        227           0.1102 ± 0.1256         23        284           0.0476 ± 0.0414
Z      149       745           0.2321 ± 0.2293         146       990           0.1381 ± 0.1174
Although Ka/Ks (sometimes calculated as dN/dS or ω) is commonly misinterpreted,[30] this ratio of rates of non-synonymous to synonymous substitutions can give some context to candidate genes and allows for subsequent hypothesis testing.[31,32] Data organized into chromosomes suggest that contigs may have undergone more selection with respect to the zebra finch than the chicken (as high Ka/Ks values are typically interpreted, though see ref. 30). The fact that Ka/Ks values were higher on average for the zebra finch than for the chicken (Table 3) is likely a methodological artefact. The zebra finch is in the same taxonomic order as the song sparrow (Passeriformes), whereas the chicken is taxonomically distant (Galliformes). Estimates of ω necessarily classify sites with differences as non-synonymous or synonymous, and errors in the estimation of either can profoundly affect the outcome of these analyses.[33] Taxonomic or lineage distance (longer branches) will affect the reconstruction of synonymous substitution rates especially (through an expected increase in repeated mutations, or multiple hits), and we consider this to be a likely source of the consistent differences in apparent molecular selection between our song-sparrow-to-zebra-finch and song-sparrow-to-chicken contrasts (Table 3; see also ref. 34).
Nevertheless, these contrasts are valuable in highlighting the chromosomal distributions (assuming chromosomal stability[28]) and relative values of ω between closer and more distant relatives of the song sparrow, providing insights into attributes of selection in the coding genome across these scales. Unfortunately, this approach is not valid within species.[35-37] Chromosomes 22 and 26 showed the greatest differences between the zebra finch and the chicken in the percentage of song sparrow contigs mapped (relative to the number of genes available in the Biomart database for the zebra finch and chicken). Both of these chromosomes had significantly different frequencies of mapped-song-sparrow versus Biomart data-available genes between the zebra finch and the chicken (Gadj = 4.4, P < 0.05, and Gadj = 6.9, P < 0.01, respectively at 1 d.f., G-test with Williams' correction; Table 3). In both cases, proportionally more contigs were mapped to the zebra finch than to the chicken given the sizes of the respective databases (Table 3).
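The G-test with Williams' correction can be sketched for a contingency table as below. The 2 × 2 layout used in the example (mapped song sparrow contigs against Biomart transcript totals, for zebra finch and chicken) is an assumption about how the test was set up, with counts read from Table 3 (chromosome 22: 16 of 98 and 11 of 160; chromosome 26: 31 of 176 and 29 of 341). Under that assumption the sketch gives adjusted G values that round to the reported 4.4 and 6.9. The function name is illustrative.

```python
import math
from typing import List, Sequence, Tuple


def g_test_williams(table: Sequence[Sequence[int]]) -> Tuple[float, int]:
    """G-test of independence with Williams' correction.

    Returns the adjusted G statistic and the degrees of freedom.
    """
    row_totals: List[int] = [sum(row) for row in table]
    col_totals: List[int] = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # G = 2 * sum(observed * ln(observed / expected)); empty cells contribute 0.
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:
                expected = row_totals[i] * col_totals[j] / n
                g += observed * math.log(observed / expected)
    g *= 2.0

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    # Williams' correction factor q for an r x c table.
    q = 1.0 + ((n * sum(1.0 / r for r in row_totals) - 1.0)
               * (n * sum(1.0 / c for c in col_totals) - 1.0)) / (6.0 * n * df)
    return g / q, df


# Rows: zebra finch, chicken. Columns: mapped contigs, Biomart transcripts.
chr22 = [[16, 98], [11, 160]]
chr26 = [[31, 176], [29, 341]]
for name, counts in (("chr22", chr22), ("chr26", chr26)):
    g_adj, df = g_test_williams(counts)
    print(name, round(g_adj, 1), df)
```

The adjusted statistic is compared against a chi-square distribution with the returned degrees of freedom; at 1 d.f. the critical values are 3.84 (P = 0.05) and 6.63 (P = 0.01).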

Chromosomal distributions of between-subspecies SNPs/indels

Two findings emerged in comparing the among-chromosome locations (mapped against the zebra finch) of the between-subspecies SNPs/indels that were mapped to chromosomes (218 SNP/indel-bearing, between-subspecies song sparrow contigs; Supplementary Sheet 2) versus all orthologous song sparrow contigs (Table 3). First, the chromosomal distribution of the candidate loci was significantly different from the distribution of all orthologous contigs (Gadj = 51.5, 27 d.f., P < 0.005), indicative of a non-random process (e.g. selection). Importantly, the chromosomal distribution of the 199 unique, mappable SNP/indel-bearing contigs between Attu and Adak islands (within the subspecies maxima), where we expected drift rather than selection to be more pronounced, was not significantly different from the chromosomal distribution of all orthologous contigs (Gadj = 35.1, 27 d.f., P > 0.1). Secondly, the greatest differences in the distribution of between-subspecies candidate loci from the distribution of all contigs occurred among chromosomes 2, 5, and Z (where proportionally fewer SNP/indel-bearing contigs occurred than expected) and chromosomes 3 and 11 (where relatively more SNP/indel-bearing contigs occurred than expected). Finally, in contrasting our between-subspecies results with those of our between-species comparisons above, we found that seven of the SNP/indel-bearing contigs between subspecies were also contigs that exhibited evidence suggestive of selection (high Ka/Ks values) when compared with the zebra finch and the chicken. Each contig has one between-subspecies SNP, and the functions of these loci are variable (Supplementary Sheet 4). Three of these seven occurred on chromosome 3 and one on chromosome 11, where the between-subspecies contrasts suggested elevated levels of SNPs/indels. These contigs and their chromosomal locations may thus be important in songbird divergence, but we do not yet know why.

Summary

In summary, our analysis identified the major categories of song sparrow genes and orthologous loci between song sparrow/zebra finch and song sparrow/chicken. Substitution rate estimation yielded the fastest evolving loci, and some of the loci that were fixed between subspecies were also highlighted as possibly under selection between the song sparrow and the zebra finch. Although additional sequencing of these libraries and validation of within-species SNPs/indels in multiple populations and lineages is required, we consider that the loci described here will include some of broad utility for studying the genomics of songbird divergence.

Supplementary data

Supplementary Data are available at www.dnaresearch.oxfordjournals.org.

Funding

This study was supported in part by resources and technical expertise from the University of Georgia, Georgia Advanced Computing Resource Center, a partnership between the Office of the Vice President for Research and the Office of the Chief Information Office.
  35 in total

1.  BLAT--the BLAST-like alignment tool.

Authors:  W James Kent
Journal:  Genome Res       Date:  2002-04       Impact factor: 9.043

Review 2.  Adaptation genomics: the next generation.

Authors:  Jessica Stapley; Julia Reger; Philine G D Feulner; Carole Smadja; Juan Galindo; Robert Ekblom; Clair Bennison; Alexander D Ball; Andrew P Beckerman; Jon Slate
Journal:  Trends Ecol Evol       Date:  2010-10-16       Impact factor: 17.712

3.  Comparative genomics based on massive parallel transcriptome sequencing reveals patterns of substitution and selection across 10 bird species.

Authors:  Axel Künstner; Jochen B W Wolf; Niclas Backström; Osceola Whitney; Christopher N Balakrishnan; Lainy Day; Scott V Edwards; Daniel E Janes; Barney A Schlinger; Richard K Wilson; Erich D Jarvis; Wesley C Warren; Hans Ellegren
Journal:  Mol Ecol       Date:  2010-03       Impact factor: 6.185

Review 4.  The molecular ecologist's guide to expressed sequence tags.

Authors:  Amy Bouck; Todd Vision
Journal:  Mol Ecol       Date:  2007-03       Impact factor: 6.185

5.  Rapidly developing functional genomics in ecological model systems via 454 transcriptome sequencing.

Authors:  Christopher W Wheat
Journal:  Genetica       Date:  2008-10-18       Impact factor: 1.082

6.  Interpopulation patterns of divergence and selection across the transcriptome of the copepod Tigriopus californicus.

Authors:  Felipe S Barreto; Gary W Moy; Ronald S Burton
Journal:  Mol Ecol       Date:  2010-12-24       Impact factor: 6.185

7.  Nucleotide divergence vs. gene expression differentiation: comparative transcriptome sequencing in natural isolates from the carrion crow and its hybrid zone with the hooded crow.

Authors:  Jochen B W Wolf; Till Bayer; Bernhard Haubold; Markus Schilhabel; Philip Rosenstiel; Diethard Tautz
Journal:  Mol Ecol       Date:  2010-03       Impact factor: 6.185

8.  Next generation sequencing and analysis of a conserved transcriptome of New Zealand's kiwi.

Authors:  Sankar Subramanian; Leon Huynen; Craig D Millar; David M Lambert
Journal:  BMC Evol Biol       Date:  2010-12-15       Impact factor: 3.260

9.  A wing expressed sequence tag resource for Bicyclus anynana butterflies, an evo-devo model.

Authors:  Patrícia Beldade; Stephen Rudd; Jonathan D Gruber; Anthony D Long
Journal:  BMC Genomics       Date:  2006-05-31       Impact factor: 3.969

10.  Statistical methods for detecting molecular adaptation.

Authors: 
Journal:  Trends Ecol Evol       Date:  2000-12-01       Impact factor: 17.712
