Jing Yang1, Guanghui Zhang2, Jing Zhang3, Hui Liu4,5, Wei Chen1,6, Xiao Wang4,5, Yahe Li7, Yang Dong1,6,8, Shengchao Yang2.
Abstract
Background: Plants in the genus Erigeron of the Compositae (Asteraceae) family are commonly called fleabanes, possibly because of the belief that certain chemicals in these plants repel fleas. In traditional Chinese medicine, Erigeron breviscapus, which is native to China, was widely used to treat cerebrovascular disease. A handful of bioactive compounds, including scutellarin, 3,5-dicaffeoylquinic acid, and 3,4-dicaffeoylquinic acid, have been isolated from the plant. To find novel medicinal compounds and understand their biosynthetic pathways, we set out to sequence the genome of E. breviscapus. We assembled the highly heterozygous E. breviscapus genome using a combination of PacBio single-molecule real-time sequencing and next-generation sequencing on the Illumina HiSeq platform. The final draft genome is approximately 1.2 Gb, with contig and scaffold N50 sizes of 18.8 kb and 31.5 kb, respectively. Further analyses predicted 37 504 protein-coding genes in the E. breviscapus genome and 8172 gene families shared among Compositae species. The E. breviscapus genome provides a valuable resource for the investigation of novel bioactive compounds in this Chinese herb.
Keywords: Erigeron breviscapus; Illumina sequencing; PacBio sequencing
Year: 2017 PMID: 28431028 PMCID: PMC5449645 DOI: 10.1093/gigascience/gix028
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Figure 1: Example of E. breviscapus (image from Shengchao Yang).
Figure 2: Assembly pipeline for the E. breviscapus genome.
Completeness statistics for the hybrid de novo genome assembly of E. breviscapus, as assessed by CEGMA.
| Group | Protein Num. (a) | Completeness (%) (b) | Total Num. (c) | Average Num. (d) | Ortholog (%) (e) |
|---|---|---|---|---|---|
| Complete | 217 | 87.50 | 633 | 2.92 | 82.95 |
| Group 1 | 58 | 87.88 | 158 | 2.72 | 77.59 |
| Group 2 | 49 | 87.50 | 126 | 2.57 | 77.55 |
| Group 3 | 53 | 86.89 | 171 | 3.23 | 96.23 |
| Group 4 | 57 | 87.69 | 178 | 3.12 | 80.70 |
| Partial | 240 | 96.77 | 856 | 3.57 | 89.58 |
| Group 1 | 63 | 95.45 | 206 | 3.27 | 85.71 |
| Group 2 | 55 | 98.21 | 185 | 3.36 | 83.64 |
| Group 3 | 59 | 96.72 | 232 | 3.93 | 98.31 |
| Group 4 | 63 | 96.92 | 233 | 3.70 | 90.48 |
(a) Protein Num.: Number of 248 ultra-conserved core eukaryotic genes (CEGs) present in the E. breviscapus genome.
(b) Completeness (%): Percentage of 248 ultra-conserved CEGs present in the E. breviscapus genome.
(c) Total Num.: Total number of CEGs including putative orthologs present in the E. breviscapus genome.
(d) Average Num.: Average number of orthologs per CEG.
(e) Ortholog (%): Percentage of detected CEGs that have more than one ortholog.
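The footnote definitions can be checked against the two summary rows of the table. The following is a minimal sketch, not part of the original analysis: it assumes that Completeness (%) is Protein Num. divided by the 248 CEGs, that Average Num. is Total Num. divided by Protein Num., and that values are rounded half-up to two decimals. The names `cegma_rows`, `round2`, and `cegma_summary` are hypothetical and introduced only for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

# Size of the ultra-conserved core eukaryotic gene set, as stated in the footnotes.
CORE_GENE_SET_SIZE = 248

# Summary rows copied from the CEGMA table above.
cegma_rows = {
    "complete": {"protein_num": 217, "total_num": 633},
    "partial": {"protein_num": 240, "total_num": 856},
}


def round2(value):
    """Round a Decimal half-up to two decimal places (assumed table convention)."""
    return value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def cegma_summary(protein_num, total_num, set_size=CORE_GENE_SET_SIZE):
    """Return (completeness %, average orthologs per detected CEG)."""
    completeness = round2(Decimal(protein_num) * 100 / Decimal(set_size))
    average = round2(Decimal(total_num) / Decimal(protein_num))
    return completeness, average


cegma_results = {
    group: cegma_summary(row["protein_num"], row["total_num"])
    for group, row in cegma_rows.items()
}

for group, (completeness, average) in cegma_results.items():
    print(f"{group}: completeness={completeness}%, average={average}")
```

Under these assumptions the computed values agree with the Complete and Partial rows of the table. The per-group rows are not recomputed here because the number of CEGs in each group is not given in the text.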
Completeness statistics for the hybrid de novo genome assembly of E. breviscapus, as assessed by BUSCO.
| BUSCO benchmark | Number | Percentage (%) |
|---|---|---|
| Total BUSCO groups searched | 1440 | – |
| Complete BUSCOs | 1161 | 80.63 |
| Complete and single-copy BUSCOs | 635 | 44.10 |
| Complete and duplicated BUSCOs | 526 | 36.53 |
| Fragmented BUSCOs | 90 | 6.25 |
| Missing BUSCOs | 189 | 13.13 |
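The percentages in the BUSCO table follow from the counts and the 1440 groups searched. The sketch below is illustrative only: it assumes each percentage is the count divided by the total, rounded half-up to two decimals, and it also checks that the categories partition the searched set. The names `busco_counts` and `percent_of_total` are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# Total BUSCO groups searched, from the first row of the table above.
TOTAL_BUSCO_GROUPS = 1440

# Counts copied from the BUSCO table above.
busco_counts = {
    "complete": 1161,
    "complete_single_copy": 635,
    "complete_duplicated": 526,
    "fragmented": 90,
    "missing": 189,
}


def percent_of_total(count, total=TOTAL_BUSCO_GROUPS):
    """Percentage of the searched groups, rounded half-up to two decimals."""
    value = Decimal(count) * 100 / Decimal(total)
    return value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


busco_percentages = {
    name: percent_of_total(count) for name, count in busco_counts.items()
}

# Single-copy and duplicated BUSCOs together make up the complete set.
single_plus_duplicated = (
    busco_counts["complete_single_copy"] + busco_counts["complete_duplicated"]
)

# Complete, fragmented, and missing BUSCOs together cover every group searched.
category_total = (
    busco_counts["complete"] + busco_counts["fragmented"] + busco_counts["missing"]
)

for name, pct in busco_percentages.items():
    print(f"{name}: {pct}%")
```

Half-up rounding is used instead of Python's built-in `round`, which rounds exact halves such as 80.625 to the nearest even digit and would not reproduce the tabulated 80.63.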
Figure 3: Venn diagram showing unique and shared gene families among four sequenced dicotyledonous species.