Inês C Conceição, Anthony D Long, Jonathan D Gruber, Patrícia Beldade.
Abstract
BACKGROUND: Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. METHODOLOGY/PRINCIPAL
Year: 2011 PMID: 21909358 PMCID: PMC3166123 DOI: 10.1371/journal.pone.0023778
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Table 1. Main characteristics of the B. anynana BACs analyzed.
| BAC ID (NCBI) | BAC ID (CUGI) | BAC ID (JGI) | Target Gene | Bany LG | Length (bp) | GC content (%) | % repeats (S+LC) | TE type | KAIKO | Validated | # Unigenes (BAC/cds) | Sequence Annotation (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AC239117 | 39A22 | 6132 | ap | Z | 98,442 | 36.1 | 0.3+1.9 | - | 35 (90) | 1(182) | 172/0 | 0.6/33.5/65.9 |
| AC239122 | 69H15 | 6133 | APC | 6 | 127,416 | 35.7 | 0.4+1.9 | - | 32 (177) | 4(678) | 163/3 | 6.5/16.9/76.6 |
| AC239115 | 19O01 | 6134 | DP | 4 | 126,420 | 34.7 | 1.2+2.6 | NonLTR/RTE-3 | 36 (183) | 10(449) | 141/23 | 10.6/33.7/55.7 |
| AC239118 | 39L19 | 6135 | dpp | 12 | 146,705 | 36.6 | 0.6+1.8 | NonLTR/DMRT1 | 52 (123) | 5(418) | 132/9 | 4.3/19.9/75.8 |
| AC239120 | 4N12 | 6136 | EcR | 10 | 97,772 | 36.3 | 0.4+2.0 | DNA/Mariner | 30 (122) | 4(401) | 180/10 | 4.9/65.9/29.2 |
| AC239116 | 23M04 | 6137 | en | 2 | 109,203 | 36.7 | 0.6+2.7 | LTR/BEL; Tc1-IS630-Pogo1 | 26 (166) | 2(1582) | 149/6 | 6.2/23.1/70.8 |
| AC239119 | 48N20 | 6138 | Dll | 2 | 122,698 | 35.4 | 0.6+2.8 | - | 45 (82) | 1(173) | 153/3 | 0.4/11.7/87.8 |
| AC239124 | 85J10 | 6139 | nkd | 5 | 121,582 | 35.3 | 0.3+2.2 | - | 28 (232) | 6(595) | 167/5 | 8.8/29.9/61.3 |
| AC239114 | 18H03 | 6140 | Sod2 | 3 | 108,600 | 36.1 | 0.6+3.0 | NonLTR/RTE-3 | 42 (147) | 11(275) | 161/16 | 7.7/32.1/60.3 |
| AC239121 | 68O14 | 6141 | v | 14 | 115,255 | 37.5 | 0.9+1.9 | NonLTR/CR1; LTR/Gypsy | 28 (225) | 8(601) | 120/22 | 12.5/42.3/45.2 |
| AC239123 | 84B11 | 6142 | Wg | 4 | 129,178 | 36.7 | 0.5+2.2 | NonLTR/RTE-1 | 44 (127) | 3(480) | 163/9 | 3.4/12.3/84.3 |
General characteristics observed for each BAC clone, including a summary of the annotation parameters discussed in the text.
BAC ID including NCBI accession number, BAC clone name from library at CUGI [27], and BAC sequence name from assembly by JGI. The CUGI and JGI names are used in the custom database [47];
Genes used to select the BACs for sequencing (see Introduction for abbreviations);
B. anynana linkage group based on mapping of the target genes (LG; [26]);
% repetitive sequence corresponding to simple repeats (S) and low complexity (LC) regions as identified by RepeatMasker [41];
type of TEs, according to CENSOR [74] classification;
predicted peptides by Kaikogaas: number and (average amino acid length);
predicted peptides after manual validation: number and (average amino acid length);
Number of B. anynana UniGenes matching each BAC, and number corresponding to the validated predicted genes in Figure 1, cf. [47];
Percent sequence annotated as corresponding to protein-coding, intronic and intergenic DNA. Details in [47] and the supplementary files.
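The table rows above pack several values into single cells (for example, length with a thousands separator and the three annotation percentages joined by slashes). The following is a minimal sketch of how one such row could be parsed, assuming the 13-cell pipe-delimited layout shown in the table. The `BacRecord` class and `parse_bac_row` function are hypothetical names introduced for illustration and are not part of the study's own tooling.

```python
from dataclasses import dataclass


@dataclass
class BacRecord:
    """Subset of one table row (hypothetical container, for illustration only)."""
    ncbi: str
    target_gene: str
    length_bp: int
    gc_percent: float
    coding_pct: float
    intronic_pct: float
    intergenic_pct: float


def parse_bac_row(row: str) -> BacRecord:
    """Parse one pipe-delimited data row of the BAC table."""
    # Drop the outer pipes, then split the remaining text into cells.
    cells = [c.strip() for c in row.strip().strip("|").split("|")]
    if len(cells) != 13:
        raise ValueError(f"expected 13 cells, got {len(cells)}")
    # Last cell holds protein-coding / intronic / intergenic percentages.
    coding, intronic, intergenic = (float(x) for x in cells[12].split("/"))
    return BacRecord(
        ncbi=cells[0],
        target_gene=cells[3],
        length_bp=int(cells[5].replace(",", "")),  # "98,442" -> 98442
        gc_percent=float(cells[6]),
        coding_pct=coding,
        intronic_pct=intronic,
        intergenic_pct=intergenic,
    )


row = "| AC239117 | 39A22 | 6132 | ap | Z | 98,442 | 36.1 | 0.3+1.9 | - | 35 (90) | 1(182) | 172/0 | 0.6/33.5/65.9 |"
record = parse_bac_row(row)
print(record.target_gene, record.length_bp, record.gc_percent)
```

A quick sanity check on parsed rows is that the three annotation percentages sum to roughly 100, allowing for rounding.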
Figure 1. Annotation of B. anynana genomic regions and fine-scale synteny with B. mori.
Each B. anynana BAC sequence is represented, with the corresponding scaffold in B. mori (including information on chromosomal location). Each putative gene is represented by a different color: B. anynana gene names in bold correspond to those on which BAC selection was based (Table 1), and B. mori gene names reflect SilkDB annotation (e.g., 010572 is SilkDB gene BGIBMGA010572). Exons are explicitly annotated for B. anynana as stripes of the same color (darker shade for duplicated exons). Arrows indicate the direction of transcription of each gene, and fine lines are used for highlighting chromosomal rearrangements. The figure contains a legend for the representation of sequence length, and for the protein-coding genes, repetitive sequence, transposable elements, and microRNA identified in this study. Details on all B. anynana predicted peptides can be found in Table S3.
Figure 2. Conservation of DNA sequence in relation to other lepidopterans.
VISTA plots of all BAC sequences against B. mori and, when available, other lepidopterans (moths Bombyx mori, Helicoverpa armigera, Spodoptera frugiperda, and butterflies Papilio dardanus, Heliconius melpomene). Regions more than 70% conserved in a 100 bp window (VISTA default settings) appear as peaks, with blue corresponding to annotated protein-coding regions and red to conserved non-coding sequence. Figure S2 shows a close-up and extended analysis of the regions around the genes wingless and Ecdysone receptor.
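The conservation criterion in this caption (more than 70% identity within a 100 bp window) can be illustrated with a simple sliding-window calculation over a pairwise alignment. This is a simplified sketch and not VISTA's actual implementation: it assumes two pre-aligned sequences of equal length with `-` as the gap character, and the function names `window_identity` and `conserved_windows` are hypothetical.

```python
from typing import List


def window_identity(aligned_a: str, aligned_b: str, window: int = 100) -> List[float]:
    """Percent identity for every full window along two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # 1 where the two sequences share the same non-gap character, else 0.
    matches = [
        int(a == b and a != "-")
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
    ]
    if len(matches) < window:
        return []
    # Running sum so each step only adds one column and drops one column.
    total = sum(matches[:window])
    identities = [100.0 * total / window]
    for i in range(window, len(matches)):
        total += matches[i] - matches[i - window]
        identities.append(100.0 * total / window)
    return identities


def conserved_windows(identities: List[float], threshold: float = 70.0) -> List[int]:
    """Start positions of windows whose identity exceeds the threshold."""
    return [i for i, value in enumerate(identities) if value > threshold]


# Tiny example with a 4-column window instead of 100 bp.
ids = window_identity("ACGTACGT", "ACGAACGT", window=4)
print(ids, conserved_windows(ids))
```

The threshold comparison is strict (`>`), matching the caption's "more than 70%"; gap columns are counted as mismatches, which is one of several reasonable conventions.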
Figure 3. Phylogenetic tree of Adh genes.
Neighbour-joining, unrooted tree reconstructed with MEGA 4 using the amino acid sequence of the putative Adh genes in B. anynana (Bany, in green), together with the corresponding paralogs from chromosome 10 in B. mori (Bmori, in red, showing SilkDB gene accessions and BLAST results) and D. melanogaster (Dmel, in blue, showing FlyBase gene accessions). Numbers are bootstrap values for 1000 replicates.
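Bootstrap values such as those reported for this tree come from rebuilding the tree many times on resampled alignments. The sketch below shows only the resampling step, in which alignment columns are drawn with replacement; tree reconstruction and clade counting are omitted, and this does not reproduce MEGA 4's internals. The function name `bootstrap_alignment` and the toy sequences are hypothetical.

```python
import random
from typing import List


def bootstrap_alignment(alignment: List[str], rng: random.Random) -> List[str]:
    """Return one bootstrap replicate by resampling columns with replacement."""
    n_cols = len(alignment[0])
    if any(len(seq) != n_cols for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    # Draw as many column indices as the alignment has columns.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    # Apply the same column sample to every sequence to keep columns intact.
    return ["".join(seq[c] for c in cols) for seq in alignment]


# Toy amino acid alignment (made-up sequences, for illustration only).
toy = ["MKVLA", "MRVLA", "LKVIA"]
rng = random.Random(0)
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
print(len(replicates), replicates[0])
```

Each replicate would then be passed to the same tree-building method, and the bootstrap value for a clade is the percentage of replicate trees that contain it.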