Oleg Anatolyevich Zverkov, Alexandr Vladislavovich Seliverstov, Vassily Alexandrovich Lyubetsky.
Abstract
A novel algorithm and original software were used to cluster all proteins encoded in plastids of 72 species of the rhodophytic branch. The results are publicly available at http://lab6.iitp.ru/ppc/redline72/ in a database that allows fast identification of clusters (protein families) both by a fragment of an amino acid sequence and by a phylogenetic profile of a protein. No such integral clustering with the corresponding functions can be found in the public domain. The putative regulons of the plastid-encoded transcription factors Ycf28 and Ycf29 were identified using the clustering and the database. Regulation of translation initiation was proposed for the ycf24 gene in plastids of certain red algae and apicomplexans, as well as regulation of a putative gene in apicoplasts of Babesia spp. and Theileria parva. Conserved regulation of ycf24 gene expression and a change in the specificity of the transcription factor Ycf28 were shown in the plastids. A phylogenetic tree of plastids was generated for the rhodophytic branch. The hypothesis that the apicoplasts of the common ancestor of all apicomplexans originated from plastids of red algae was confirmed.
Keywords: clustering; plastid; protein; transcription factor; translation initiation
Year: 2016; PMID: 26840333; PMCID: PMC4810238; DOI: 10.3390/life6010007
Source DB: PubMed; Journal: Life (Basel); ISSN: 2075-1729
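The abstract mentions identifying clusters by the phylogenetic profile of a protein. The query semantics of the database are not described here, so the following is only a minimal sketch under an assumed definition: a profile is a set of species in which the cluster must be present plus an optional set in which it must be absent. The function name `match_profile` and the toy cluster and species labels are hypothetical.

```python
def match_profile(cluster_species, present, absent=frozenset()):
    """Return the ids of clusters whose species set fits the profile.

    cluster_species: dict mapping a cluster id to the set of species
                     that have at least one protein in that cluster.
    present:         species that must all be represented in the cluster.
    absent:          species that must not be represented in the cluster.
    The present/absent profile definition is an assumption of this sketch.
    """
    present, absent = set(present), set(absent)
    return sorted(
        cluster
        for cluster, species in cluster_species.items()
        if present <= species and not (absent & species)
    )


# Toy data (hypothetical labels, not taken from the database).
toy_clusters = {
    "c1": {"A", "B", "C"},
    "c2": {"A", "B"},
    "c3": {"C"},
}
hits = match_profile(toy_clusters, present={"A", "B"}, absent={"C"})
```

Storing each cluster as a set of species keeps the query to two set operations per cluster, which is sufficient for a collection of this size.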
Numbers of proteins (P), clusters (C), and singletons (S) per species.
| Locus | Species | P | C | S | Locus | Species | P | C | S |
|---|---|---|---|---|---|---|---|---|---|
| NC_024079.1 | | 134 | 129 | 0 | NC_024084.1 | | 132 | 130 | 0 |
| NC_024080.1 | | 145 | 138 | 1 | NC_022667.1 | | 30 | 30 | 0 |
| NC_012898.1 | | 105 | 105 | 0 | NC_024085.1 | | 138 | 129 | 0 |
| NC_012903.1 | | 110 | 110 | 0 | NC_020014.1 | | 119 | 116 | 3 |
| NC_011395.1 | | 32 | 26 | 3 | NC_022259.1 | | 125 | 123 | 0 |
| LK028575.1 | | 31 | 22 | 7 | NC_022262.1 | | 124 | 123 | 0 |
| NC_028029.1 | | 38 | 28 | 7 | NC_022263.1 | | 126 | 123 | 1 |
| NC_021075.1 | | 201 | 200 | 1 | NC_022260.1 | | 126 | 123 | 0 |
| NC_025313.1 | | 132 | 130 | 0 | NC_022261.1 | | 123 | 123 | 0 |
| NC_025310.1 | | 131 | 128 | 0 | NC_001713.1 | | 140 | 128 | 9 |
| NC_020795.1 | | 204 | 204 | 0 | NC_020371.1 | | 111 | 103 | 8 |
| NC_026522.1 | | 71 | 71 | 0 | NC_016703.2 | | 108 | 108 | 0 |
| NC_014340.2 | | 78 | 51 | 24 | NC_021637.1 | | 108 | 108 | 0 |
| NC_014345.1 | | 81 | 69 | 5 | NC_008588.1 | | 132 | 130 | 0 |
| NC_024081.1 | | 139 | 130 | 0 | NC_023293.1 | | 31 | 31 | 0 |
| NC_013703.1 | | 82 | 79 | 3 | NC_017932.1 | | 31 | 31 | 0 |
| NC_004799.1 | | 207 | 189 | 18 | NC_000925.1 | | 209 | 209 | 0 |
| NC_001840.1 | | 197 | 186 | 11 | NC_023133.1 | | 224 | 183 | 40 |
| KP866208.1 | | 28 | 27 | 1 | NC_027721.1 | | 104 | 103 | 1 |
| NC_024082.1 | | 161 | 142 | 12 | NC_021189.1 | | 211 | 210 | 1 |
| NC_024083.1 | | 130 | 128 | 0 | NC_024050.1 | | 209 | 207 | 2 |
| NC_014287.1 | | 129 | 127 | 0 | NC_007932.1 | | 209 | 206 | 3 |
| NC_013498.1 | | 148 | 143 | 1 | NC_025311.1 | | 135 | 123 | 1 |
| NC_004823.1 | | 28 | 26 | 2 | NC_009573.1 | | 146 | 145 | 1 |
| NC_007288.1 | | 119 | 112 | 7 | NC_025312.1 | | 140 | 126 | 0 |
| NC_024928.1 | | 160 | 136 | 2 | NC_018523.1 | | 139 | 139 | 0 |
| NC_015403.1 | | 135 | 130 | 1 | NC_027589.1 | | 143 | 143 | 0 |
| NC_016735.1 | | 139 | 139 | 0 | NC_014808.1 | | 142 | 126 | 1 |
| NC_024665.1 | | 182 | 181 | 1 | NC_008589.1 | | 141 | 127 | 0 |
| NC_023785.1 | | 202 | 200 | 2 | NC_025314.1 | | 141 | 127 | 0 |
| NC_006137.1 | | 203 | 201 | 2 | NC_007758.1 | | 44 | 27 | 12 |
| NC_021618.1 | | 233 | 201 | 32 | NC_001799.1 | | 26 | 21 | 5 |
| NC_000926.1 | | 147 | 142 | 5 | NC_026851.1 | | 137 | 124 | 8 |
| NC_010772.1 | | 156 | 139 | 3 | NC_027746.1 | | 141 | 135 | 4 |
| NC_014267.1 | | 139 | 132 | 6 | NC_016731.1 | | 130 | 128 | 0 |
| NC_027093.1 | | 62 | 52 | 7 | NC_026523.1 | | 192 | 191 | 1 |
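Counts of this kind can be derived from any protein-to-cluster assignment. The sketch below shows one way to tabulate P, C, and S per species. It rests on two assumptions that the table does not state: a singleton is a cluster containing exactly one protein across all species, and C counts only the non-singleton clusters in which the species is represented. The function name and the toy identifiers are hypothetical.

```python
from collections import defaultdict


def per_species_counts(protein_species, protein_cluster):
    """Return {species: (P, C, S)} for a given clustering.

    protein_species: dict mapping a protein id to its species.
    protein_cluster: dict mapping a protein id to its cluster id.
    Assumption: S counts clusters of size one, and C counts the
    remaining clusters that contain a protein of the species.
    """
    # Size of every cluster over the whole data set.
    cluster_size = defaultdict(int)
    for cluster in protein_cluster.values():
        cluster_size[cluster] += 1

    proteins = defaultdict(int)
    clusters = defaultdict(set)
    singletons = defaultdict(int)
    for protein, species in protein_species.items():
        proteins[species] += 1
        cluster = protein_cluster[protein]
        if cluster_size[cluster] == 1:
            singletons[species] += 1
        else:
            clusters[species].add(cluster)

    return {
        s: (proteins[s], len(clusters[s]), singletons[s]) for s in proteins
    }


# Toy data (hypothetical): cluster 0 spans both species,
# clusters 1 and 2 hold a single protein each.
toy_species = {"a1": "A", "a2": "A", "a3": "A", "b1": "B", "b2": "B"}
toy_cluster = {"a1": 0, "a2": 0, "b1": 0, "a3": 1, "b2": 2}
counts = per_species_counts(toy_species, toy_cluster)
```

Under these assumptions P can exceed C + S whenever a species has several proteins (for example paralogs) in the same cluster, as species "A" does in the toy data.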
Figure 1. Number of clusters as a function of the number of species represented in a cluster.
Figure 2. Tree of apicoplasts. Chromera velia and Chromerida sp. RM11 plastids were used as the outgroup.
Figure 3. Connectivity components of the sparse graph of proteins. Red dots represent proteins and lines represent bidirectional BLAST hits.
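The graph in Figure 3 can be illustrated with a small sketch: proteins are nodes, an edge joins two proteins when each is a BLAST hit of the other, and connectivity components are found with a union-find structure. This is not presented as the authors' clustering algorithm, which is not described in this record. The input format (a list of directed `(query, subject)` pairs) and all names below are assumptions made for illustration.

```python
def bidirectional_edges(hits):
    """Keep only the pairs that hit each other in both directions.

    hits: iterable of directed (query, subject) pairs (assumed format).
    """
    hit_set = set(hits)
    return {
        tuple(sorted((q, s)))
        for q, s in hit_set
        if q != s and (s, q) in hit_set
    }


def connected_components(nodes, edges):
    """Return the components as a list of sorted lists of node ids."""
    parent = {n: n for n in nodes}

    def find(x):
        # Walk to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Toy data (hypothetical): p4 -> p1 is a one-way hit and adds no edge.
toy_nodes = ["p1", "p2", "p3", "p4", "p5"]
toy_hits = [("p1", "p2"), ("p2", "p1"), ("p2", "p3"),
            ("p3", "p2"), ("p4", "p1")]
components = connected_components(toy_nodes, bidirectional_edges(toy_hits))
```

Requiring hits in both directions keeps the graph sparse, so proteins without any reciprocal hit end up as isolated single-node components.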
Ycf29 proteins encoded in the plastids of the rhodophytic branch.
| Accession | Source | Protein Description |
|---|---|---|
| YP_007878178.1 | | conserved hypothetical plastid protein |
| YP_007627336.1 | | conserved hypothetical plastid protein |
| YP_009122074.1 | | hypothetical protein |
| YP_003359295.1 | | TctD-like protein |
| NP_849011.1 | | ompR-like transcriptional regulator |
| NP_045122.1 | | regulatory component of sensory transduction system |
| YP_009051025.1 | | putative transcriptional regulator LuxR |
| YP_009019567.1 | | tctD transcriptional regulator |
| YP_063559.1 | | tctD transcriptional regulator |
| YP_008144796.1 | | putative transcriptional regulator Ycf29 |
| NP_050668.1 | | tctD homolog |
| NP_053953.1 | | ORF29 |
| YP_007947873.1 | | hypothetical chloroplast protein 29 |
| YP_009027627.1 | | hypothetical chloroplast protein 29 |
| YP_537024.1 | | hypothetical chloroplast protein 29 |
| YP_001293481.1 | | TctD-like protein |
| YP_009159161.1 | | TctD-like protein |
| YP_009122313.1 | | hypothetical protein |
Figure 4. Graph of Ycf19 and Ycf89 proteins. Colors indicate the protein annotation: red = Ycf19, green = Ycf89, gray = no name specified.
Figure 5. Sequence logo of the putative site in the 5’-untranslated region of ycf24 (sufB).
Figure 6. Tree of proteins encoded by the plastid genes located between the rpl14 and rps8 genes in Babesia spp. and Theileria parva. The plastid protein L5 from Chromera velia was used as the outgroup.
Plastid proteins in Piroplasmida with marginal similarity to the ribosomal protein L5, discussed in Sections 3.4 and 4.4.
| Accession | Source | Protein Description |
|---|---|---|
| YP_002290869.1 | | hypothetical protein |
| CDR32594.1 | | ribosomal protein L5 |
| YP_009170371.1 | | ribosomal protein L5 |
| XP_762679.1 | | hypothetical protein |