Rui Zhang, Fangfang Ge, Huayang Li, Yudong Chen, Ying Zhao, Ying Gao, Zhiguo Liu, Long Yang.
Abstract
Inverted repeats (IRs) serve as potential biomarkers for genomic instability, DNA replication and other genetic processes. However, little information can be found in databases to help researchers recognize potential IR nucleotides, explore junction sites and annotate related functional genes. Plant Chloroplast Inverted Repeats (PCIR) is an interactive, web-based platform containing various sequenced chloroplast genomes that enables detection, searching and visualization of large-scale detailed information on IRs. PCIR contains many datasets, including 21 433 IRs, 113 plant chloroplast genomes, 16 948 functional genes and 21 659 visual maps. The database offers an online prediction tool for detecting IRs based on DNA sequences. PCIR can also analyze phylogenetic relationships among different species using IR information and provide users with high-quality marker maps. This database will be a valuable resource for studying IR distribution patterns, related genes and architectural features.
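
To make the idea of detecting IRs from a DNA sequence concrete, the following is a minimal sketch of a perfect-match inverted repeat scan. It is not PCIR's algorithm, which this record does not describe. The names `find_inverted_repeats`, `arm_len` and `max_spacer` are hypothetical, and the default arm length of 6 is only an assumption borrowed from the smallest size listed in the table of IR sizes.

```python
from typing import Iterator, NamedTuple

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


class InvertedRepeat(NamedTuple):
    start: int   # 0-based start of the left arm
    end: int     # 0-based exclusive end of the right arm
    arm: str     # sequence of the left arm
    spacer: int  # number of bases between the two arms


def find_inverted_repeats(
    seq: str, arm_len: int = 6, max_spacer: int = 4
) -> Iterator[InvertedRepeat]:
    """Yield perfect inverted repeats with a fixed arm length.

    Hypothetical illustration: an IR is reported when a window of
    `arm_len` bases is followed, after at most `max_spacer` bases,
    by its exact reverse complement.
    """
    seq = seq.upper()
    n = len(seq)
    for i in range(n - 2 * arm_len + 1):
        left = seq[i:i + arm_len]
        if set(left) - set("ACGT"):
            continue  # skip windows containing ambiguous bases such as N
        target = reverse_complement(left)
        for spacer in range(max_spacer + 1):
            j = i + arm_len + spacer
            if j + arm_len > n:
                break
            if seq[j:j + arm_len] == target:
                yield InvertedRepeat(i, j + arm_len, left, spacer)


if __name__ == "__main__":
    # Toy sequence: arm ACGGTA, two-base spacer, then its reverse complement.
    for hit in find_inverted_repeats("AAACGGTATTTACCGTAA"):
        print(hit)
```

A production tool would typically also tolerate mismatches, merge overlapping hits and parse FASTA input; those steps are left out here to keep the sketch short.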
Year: 2019 PMID: 31696928 PMCID: PMC6835207 DOI: 10.1093/database/baz127
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Schematic overview of the IR database. (A) Data sources of PCIR. (B) Workflow of PCIR. (C) Data acquisition in the platform.
Numbers, frequencies and proportion of IRs according to size
| Size | Number | Frequency | Proportion | Size | Number | Frequency | Proportion |
|---|---|---|---|---|---|---|---|
| 6 | 7260 | 0.4214 | 0.3387 | 19 | 283 | 0.0164 | 0.0132 |
| 7 | 3468 | 0.2013 | 0.1618 | 20 | 250 | 0.0145 | 0.0117 |
| 8 | 2069 | 0.1201 | 0.0965 | 21 | 231 | 0.0134 | 0.0108 |
| 9 | 1398 | 0.0811 | 0.0652 | 22 | 216 | 0.0125 | 0.0101 |
| 10 | 1107 | 0.0643 | 0.0516 | 23 | 191 | 0.0111 | 0.0089 |
| 11 | 917 | 0.0532 | 0.0428 | 24 | 174 | 0.0101 | 0.0081 |
| 12 | 685 | 0.0398 | 0.0320 | 25 | 148 | 0.0086 | 0.0069 |
| 13 | 567 | 0.0329 | 0.0265 | 26 | 130 | 0.0075 | 0.0060 |
| 14 | 497 | 0.0288 | 0.0232 | 27 | 108 | 0.0063 | 0.0050 |
| 15 | 437 | 0.0254 | 0.0204 | 28 | 95 | 0.0055 | 0.0044 |
| 16 | 382 | 0.0222 | 0.0178 | 29 | 84 | 0.0049 | 0.0039 |
| 17 | 349 | 0.0203 | 0.0163 | 30 | 74 | 0.0043 | 0.0035 |
| 18 | 313 | 0.0182 | 0.0146 | | | | |
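
The counts in the table can be cross-checked against the total of 21 433 IRs given in the abstract. The sketch below assumes, without confirmation from the source, that the fourth value in each row is the count divided by that total; the figures are consistent with this reading. How the third value in each row is derived is not stated, so it is not reproduced here. The names `COUNTS` and `proportion` are illustrative only.

```python
# IR counts by size, copied from the table (size -> number of IRs).
COUNTS = {
    6: 7260, 7: 3468, 8: 2069, 9: 1398, 10: 1107, 11: 917, 12: 685,
    13: 567, 14: 497, 15: 437, 16: 382, 17: 349, 18: 313, 19: 283,
    20: 250, 21: 231, 22: 216, 23: 191, 24: 174, 25: 148, 26: 130,
    27: 108, 28: 95, 29: 84, 30: 74,
}

TOTAL_IRS = 21433  # total number of IRs stated in the abstract


def proportion(count: int, total: int = TOTAL_IRS) -> float:
    """Share of all IRs that fall in one size class, rounded to 4 places."""
    return round(count / total, 4)


if __name__ == "__main__":
    # The per-size counts add up to the stated total.
    print(sum(COUNTS.values()) == TOTAL_IRS)
    # Assumed reading of the fourth column: count / total.
    for size in (6, 7, 8, 30):
        print(size, proportion(COUNTS[size]))
```

Under this assumption, size 6 alone accounts for roughly a third of all IRs, and sizes 6 to 8 together for more than half.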
Figure 2. Retrieval of IR sites in the database. (A) The tool interface. Sequences in FASTA format can be entered into the text area to find IR sites. (B) Sample results from the tool. The results contain sequence name, result number, base number, motif structure, region start, region end and sequence size. (C) The search interface. IR sites can be searched by species name, marker ID, motif structure and repeat position. (D) Sample search results. The results include species, type, ID, motif, closest genes, start position and end position. (E) Detailed IR information for a sample species (e.g. A. thaliana). The results contain species, type, ID, number, position, a circular map of the complete chloroplast genome and a map of the IR regions.
Figure 3. Evolutionary tree of plants. The NJ phylogenetic tree reconstruction contains 113 species from 60 families and is based on concatenated sequences of all chloroplast genomes, using a maximum likelihood method.