Yuan Ji, Liming Zhu, Zhaodong Hao, Shunde Su, Xueyan Zheng, Jisen Shi, Renhua Zheng, Jinhui Chen.
Abstract
Cunninghamia lanceolata (Lamb.) Hook is an important economic timber tree in China. However, its genome characteristics have not been extensively assessed. To better understand its genome, a bacterial artificial chromosome (BAC) library of Chinese fir was constructed. A total of 422 BAC clones were selected, divided into 10 pools, and sequenced; the average insert size was 121 kb, ranging from 97 to 145 kb. A total of 61,902,523 bp of reference sequence was sequenced and assembled, and based on an estimated genome size of 11.6 Gb for Chinese fir, the BAC library was estimated to have a total coverage of 0.53% of the genome. Bioinformatics analyses were also performed for repeated sequences, tRNAs, coding gene prediction, and functional annotation. The results of this study provide a brief insight into the structure of the Chinese fir genome and have generated gene data that will facilitate molecular investigations of the mechanisms underlying tree growth.
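The coverage figure quoted in the abstract can be reproduced from the two numbers given there. The sketch below is only an arithmetic check, not the authors' procedure; it assumes that 11.6 Gb means 11.6 × 10^9 bp, and the function and variable names are illustrative.

```python
# Values quoted in the abstract.
ASSEMBLED_BP = 61_902_523          # total assembled BAC sequence (bp)
GENOME_SIZE_BP = 11_600_000_000    # estimated genome size; assumes 1 Gb = 10^9 bp


def coverage_percent(assembled_bp: int, genome_bp: int) -> float:
    """Return the assembled sequence as a percentage of the genome size."""
    return 100.0 * assembled_bp / genome_bp


coverage = coverage_percent(ASSEMBLED_BP, GENOME_SIZE_BP)
print(f"Genome coverage: {coverage:.2f}%")  # prints "Genome coverage: 0.53%"
```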
Keywords: BAC; China; Chinese fir; Cunninghamia lanceolata (Lamb.); genome
Year: 2022 PMID: 35330626 PMCID: PMC8940289 DOI: 10.3389/fbioe.2022.854130
Source DB: PubMed Journal: Front Bioeng Biotechnol ISSN: 2296-4185
FIGURE 1 Pulsed-field gel electrophoresis of randomly selected clones from the Chinese fir BAC library.
Summary of sequenced Chinese fir BAC clones.
| BAC | Number |
|---|---|
| Number of BAC sequences | 422 |
| Mean base count (bp) | 146,688 |
| Total base count (bp) | 61,902,523 |
| Maximum BAC length (bp) | 287,835 |
| Minimum BAC length (bp) | 21,158 |
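As a quick consistency check on the summary table, the mean base count should equal the total base count divided by the number of BAC sequences. A minimal sketch using only the values in the table (variable names are illustrative):

```python
# Values taken from the summary table of sequenced BAC clones.
NUM_BAC_SEQUENCES = 422
TOTAL_BASE_COUNT = 61_902_523   # bp
REPORTED_MEAN = 146_688         # bp, as reported in the table

# Mean length per BAC sequence, rounded to the nearest base pair.
computed_mean = round(TOTAL_BASE_COUNT / NUM_BAC_SEQUENCES)

print(computed_mean)                    # prints 146688
print(computed_mean == REPORTED_MEAN)   # prints True
```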
Number and length of repeats in Chinese fir BAC.
| Class | Superfamily | No | Length | Percentage |
|---|---|---|---|---|
| SINEs | | 2 | 74 bp | 0% |
| | ALUs | 0 | 0 bp | 0% |
| | MIRs | 1 | 12 bp | 0% |
| LINEs | | 572 | 510,911 bp | 0.83% |
| | LINE1 | 358 | 332,767 bp | 0.54% |
| | LINE2 | 14 | 1,612 bp | 0% |
| | L3/CR1 | 2 | 120 bp | 0% |
| LTR elements | | 18,496 | 29,511,274 bp | 47.67% |
| | ERVL | 0 | 0 bp | 0% |
| | ERVL-MaLRs | 0 | 0 bp | 0% |
| | ERV_classI | 0 | 0 bp | 0% |
| | ERV_classII | 241 | 331,347 bp | 0.54% |
| DNA elements | | 693 | 716,829 bp | 1.16% |
| | hAT-Charlie | 2 | 13 bp | 0% |
| | TcMar-Tigger | 0 | 0 bp | 0% |
| Unclassified | | 14,951 | 9,930,497 bp | 16.04% |
| Total interspersed repeats | | | 40,669,585 bp | 65.7% |
| Small RNAs | | 7 | 4,443 bp | 0.01% |
| Satellites | | 15 | 3,028 bp | 0% |
| Simple repeats | | 7,110 | 652,511 bp | 1.05% |
| Low complexity | | 1,450 | 101,852 bp | 0.16% |
Sequences based on RepeatMasker analysis.
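The percentages in the repeat table appear to be expressed relative to the total assembled base count (61,902,523 bp), and the five top-level classes sum to the reported total for interspersed repeats. The sketch below checks both points using only numbers from the tables; it does not invoke RepeatMasker, and the dictionary layout is an illustrative assumption.

```python
# Total assembled sequence, from the BAC summary table.
TOTAL_BASE_COUNT = 61_902_523  # bp

# Lengths (bp) of the top-level interspersed repeat classes from the repeat table.
class_lengths = {
    "SINEs": 74,
    "LINEs": 510_911,
    "LTR elements": 29_511_274,
    "DNA elements": 716_829,
    "Unclassified": 9_930_497,
}

# Sum of the classes; the table reports 40,669,585 bp.
total_interspersed = sum(class_lengths.values())

# Share of the assembled sequence occupied by each class, in percent.
percentages = {
    name: 100.0 * length / TOTAL_BASE_COUNT
    for name, length in class_lengths.items()
}

for name, pct in percentages.items():
    print(f"{name}: {pct:.2f}%")

total_pct = 100.0 * total_interspersed / TOTAL_BASE_COUNT
print(f"Total interspersed repeats: {total_interspersed:,} bp ({total_pct:.1f}%)")
```

LTR elements dominate the repeat content, accounting for nearly half of the assembled sequence on their own.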
FIGURE 3 Venn diagram of the number of proteins annotated against public protein databases. A total of 445 proteins were differentially expressed, of which 336, 257, and 387 genes were functionally annotated using GO, KOG, and KEGG, respectively. Around 169 tags were shared by the three groups.
FIGURE 4 GO classification.
FIGURE 5 KOG classification.
FIGURE 6 KEGG classification.
FIGURE 7 Percentage dot plot analysis of SSR locus statistics using BAC data.
FIGURE 8 The fragment matching rate in six pools.