Zhipeng Su1, Jiawen Zhu1, Zhuofei Xu1, Ran Xiao1, Rui Zhou1,2, Lu Li1,2, Huanchun Chen1,2.
Abstract
Actinobacillus pleuropneumoniae is the causative agent of porcine contagious pleuropneumonia, a highly contagious respiratory disease of swine. Although the genome of A. pleuropneumoniae was sequenced several years ago, few genome-wide transcriptional analyses are available to accurately annotate its gene structures and regulatory elements. High-throughput RNA sequencing (RNA-seq) has been applied to study the transcriptional landscape of bacteria and can efficiently and accurately identify expressed regions and unknown transcriptional units, especially small non-coding RNAs (sRNAs), UTRs and regulatory regions. The aim of this study is to comprehensively analyze the transcriptome of A. pleuropneumoniae by RNA-seq in order to improve the existing genome annotation and advance our understanding of A. pleuropneumoniae gene structures and RNA-based regulation. In this study, we used RNA-seq to construct a single-nucleotide resolution transcriptome map of A. pleuropneumoniae. More than 3.8 million high-quality reads (average length ~90 bp) from a cDNA library were generated and aligned to the reference genome. We identified 32 open reading frames encoding novel proteins that were mis-annotated in the previous genome annotations. The start sites of 35 genes in the current genome annotation were corrected. Furthermore, 51 sRNAs were discovered in the A. pleuropneumoniae genome, of which 40 had not been reported in previous studies. The transcriptome map also enabled visualization of 5'- and 3'-UTR regions, which contained 11 sRNAs. In addition, 351 operons covering 1230 genes throughout the whole genome were identified. The RNA-seq-based transcriptome map validated annotated genes, corrected annotations of open reading frames in the genome, and led to the identification of many functional elements (e.g. regions encoding novel proteins, non-coding sRNAs and operon structures).
The transcriptional units described in this study provide a foundation for future studies of the gene functions and transcriptional regulatory architecture of this pathogen.
Year: 2016 PMID: 27018591 PMCID: PMC4809551 DOI: 10.1371/journal.pone.0152363
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Clusters of orthologous groups (COG) classification of the expressed A. pleuropneumoniae genes.
Bars in dark gray indicate numbers of expressed genes (n = 1,845), while bars in white indicate numbers of all protein-coding genes (n = 2,036) in A. pleuropneumoniae JL03. COG categories: J, translation; A, RNA processing and modification; K, transcription; L, DNA replication, recombination and repair; D, cell division and chromosome partitioning; V, defense mechanisms; T, signal transduction; M, cell wall/membrane biogenesis; U, intracellular trafficking, secretion and vesicular transport; O, posttranslational modification, protein turnover and chaperones; C, energy production and conversion; G, carbohydrate transport and metabolism; E, amino acid transport and metabolism; F, nucleotide transport and metabolism; H, coenzyme metabolism; I, lipid metabolism; P, inorganic ion transport and metabolism; Q, secondary metabolites biosynthesis, transport and catabolism; R, general functional prediction only; S, function-unassigned; -, unknown proteins not in the COG categories.
Fig 2. Workflow of RNA-seq data analysis.
RNA-seq reads were mapped to the genome using the alignment program BWA. Combined with the existing annotation, the mapped reads were processed with SAMtools and Perl scripts to define expressed annotated regions and expressed intergenic regions, from which novel protein-coding genes, UTRs and sRNAs were identified.
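
The step from mapped reads to expressed intergenic regions can be illustrated with a minimal Python sketch. This is not the authors' pipeline (they report using SAMtools and Perl scripts); it assumes a per-base depth list such as one could derive from `samtools depth` output, and the function names and the `min_depth` and `min_len` thresholds are hypothetical values chosen only for illustration.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive (start, end)


def expressed_regions(depth: List[int], min_depth: int = 1,
                      min_len: int = 30) -> List[Interval]:
    """Return maximal runs of positions with depth >= min_depth
    that are at least min_len nucleotides long (thresholds are assumptions)."""
    regions: List[Interval] = []
    run_start = None
    for pos, d in enumerate(depth, start=1):
        if d >= min_depth:
            if run_start is None:
                run_start = pos
        elif run_start is not None:
            # Run covers run_start .. pos-1, i.e. pos - run_start positions.
            if pos - run_start >= min_len:
                regions.append((run_start, pos - 1))
            run_start = None
    # Close a run that extends to the last position.
    if run_start is not None and len(depth) - run_start + 1 >= min_len:
        regions.append((run_start, len(depth)))
    return regions


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two inclusive intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def intergenic_only(regions: List[Interval],
                    genes: List[Interval]) -> List[Interval]:
    """Keep expressed regions that overlap no annotated gene."""
    return [r for r in regions if not any(overlaps(r, g) for g in genes)]


# Toy data: 100 positions, one annotated gene at 1..55.
toy_depth = [0] * 10 + [5] * 40 + [0] * 10 + [3] * 35 + [0] * 5
toy_genes = [(1, 55)]
candidates = intergenic_only(expressed_regions(toy_depth), toy_genes)
print(candidates)
```

Regions that survive this filter would be the candidates for novel ORFs, UTRs or sRNAs; distinguishing among those classes needs further evidence (for example BLASTX hits or Rfam matches) that the sketch does not attempt.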
Novel protein coding genes identified in the A. pleuropneumoniae JL03 genome.
| ID | Strand | Length (bp) | Start | End | Top BLASTX Hit (Protein Accession No.) | Identity (%) | Query coverage (%) | Annotation |
|---|---|---|---|---|---|---|---|---|
| APP-P01 | - | 159 | 14888 | 15046 | EFX91114.1 | 79 | 81 | hypothetical protein HMPREF0027_1817 [ |
| APP-P02 | + | 141 | 748863 | 749003 | EFM92177.1 | 98 | 97 | hypothetical protein appser6_7570 [ |
| APP-P03 | + | 123 | 126371 | 126493 | EFM86287.1 | 100 | 97 | hypothetical protein appser1_1260 [ |
| APP-P04 | + | 144 | 130000 | 130143 | WP_041603459.1 | 100 | 100 | hypothetical protein [ |
| APP-P05 | - | 96 | 165125 | 165220 | EFM86331.1 | 100 | 100 | hypothetical protein appser1_1700 [ |
| APP-P06 | - | 138 | 215622 | 215759 | ACE60849.1 | 100 | 97 | hypothetical protein APP7_0197 [ |
| APP-P07 | + | 123 | 348715 | 348837 | EFM86224.1 | 100 | 97 | hypothetical protein appser1_3330 [ |
| APP-P08 | - | 114 | 564175 | 564288 | AFU19023.1 | 100 | 94 | hypothetical protein ASU2_04425 [ |
| APP-P09 | - | 120 | 753874 | 753993 | EFM85822.1 | 97 | 95 | hypothetical protein appser1_7490 [Actinobacillus pleuropneumoniae serovar 1 str. 4074] |
| APP-P10 | + | 138 | 780517 | 780654 | AGQ24929.1 | 100 | 91 | DNA methylase [ |
| APP-P11 | - | 78 | 965335 | 965412 | EFL78484.1 | 100 | 100 | hypothetical protein APP2_1809 [ |
| APP-P12 | - | 150 | 977266 | 977415 | EFM96564.1 | 81 | 64 | Leucyl-tRNA synthetase [ |
| APP-P13 | - | 120 | 1005003 | 1005122 | EFM94226.1 | 100 | 97 | hypothetical protein appser9_9720 [ |
| APP-P14 | + | 114 | 1035429 | 1035542 | EFM85421.1 | 100 | 97 | hypothetical protein appser1_10430 [ |
| APP-P15 | + | 162 | 1051786 | 1051946 | EFM91996.1 | 96 | 98 | hypothetical protein appser6_10890 [ |
| APP-P16 | - | 123 | 1113719 | 1113841 | EFM85479.1 | 100 | 97 | hypothetical protein appser1_11010 [ |
| APP-P17 | - | 108 | 1229502 | 1229609 | ACE61803.1 | 42 | 91 | phosphoribosylformylglycinamidine cyclo-ligase [ |
| APP-P18 | - | 219 | 1457550 | 1457768 | WP_039709255.1 | 100 | 98 | hypothetical protein [ |
| APP-P19 | + | 270 | 1457617 | 1457886 | AFU19571.1 | 72 | 98 | hypothetical protein ASU2_07165 [ |
| APP-P20 | - | 84 | 1459981 | 1460064 | EFM85119.1 | 100 | 73 | hypothetical protein appser2_13220 [ |
| APP-P21 | + | 201 | 1492300 | 1492500 | EFX92265.1 | 78 | 47 | hypothetical protein HMPREF0027_0694 [ |
| APP-P22 | - | 138 | 1519116 | 1519253 | EFM85119.1 | 100 | 73 | hypothetical protein appser1_14900 [ |
| APP-P23 | - | 144 | 1627977 | 1628120 | EFM91430.1 | 100 | 97 | hypothetical protein appser6_16080 [ |
| APP-P24 | - | 111 | 1660694 | 1660804 | AEC16624.1 | 47 | 91 | putative transposase [ |
| APP-P25 | - | 123 | 1706458 | 1706580 | EFM84885.1 | 100 | 97 | hypothetical protein appser1_16470 [ |
| APP-P26 | - | 128 | 1753350 | 1753477 | EQA09464.1 | 96 | 63 | naphthoate synthase [ |
| APP-P27 | - | 138 | 1928943 | 1929080 | EFM86903.1 | 100 | 100 | hypothetical protein appser2_17520 [Actinobacillus pleuropneumoniae serovar 2 str. S1536] |
| APP-P28 | + | 162 | 1933706 | 1933867 | EDN73192.1 | 64 | 70 | hypothetical protein MHA_0204 [ |
| APP-P29 | + | 214 | 1995103 | 1995316 | AEC16624.1 | 99 | 47 | putative transposase [ |
| APP-P30 | + | 207 | 2026720 | 2026926 | KDD80553.1 | 68 | 49 | hypothetical protein HPS41_03095, partial [ |
| APP-P31 | + | 126 | 2062043 | 2062168 | EFM84509.1 | 98 | 97 | hypothetical protein appser1_19900 [ |
| APP-P32 | + | 111 | 2164127 | 2164237 | EFM84431.1 | 100 | 97 | hypothetical protein appser1_20960 [ |
Fig 3. Identification of a novel protein coding gene within the single-nucleotide resolution transcriptome map of A. pleuropneumoniae.
Genomic loci and transcriptional profiles were visualized in the Artemis genome browser. The figure shows the identification of a novel protein coding gene, "APP-P04", using RNA-seq data. A BLASTX search of the ORF shows homology (similarity 100%, sequence coverage 100%) to a hypothetical protein in A. pleuropneumoniae.
Corrections made to the existing genome annotation.
| Gene id | Previous location (Start-End) | Corrected location (Start-End) | Strand |
|---|---|---|---|
| APJL_0019 | 20445–21500 | 20262–21500 | + |
| APJL_0022 | 24103–24792 | 23998–24792 | + |
| APJL_0040 | 42108–43754 | 42069–43754 | + |
| APJL_0076 | 85484–86218 | 85484–86341 | - |
| APJL_0181 | 200430–200951 | 200430–201254 | - |
| APJL_0188 | 207972–209312 | 207972–209409 | - |
| APJL_0194 | 213857–214846 | 213857–214957 | - |
| APJL_0295 | 316686–317570 | 316566–317570 | - |
| APJL_0486 | 537367–538905 | 537148–538905 | + |
| APJL_0500 | 550347–553106 | 550233–553106 | + |
| APJL_0519 | 565329–566192 | 565197–566192 | + |
| APJL_0617 | 670415–670906 | 670415–670996 | - |
| APJL_0669 | 731972–733252 | 731972–733378 | - |
| APJL_0734 | 804664–805113 | 804541–805113 | + |
| APJL_0743 | 811177–811851 | 811177–811882 | - |
| APJL_0838 | 926340–927020 | 926340–927185 | - |
| APJL_1170 | 1291694–1292098 | 1291628–1292098 | + |
| APJL_1179 | 1302358–1303407 | 1302358–1303479 | - |
| APJL_1230 | 1351089–1353617 | 1351089–1353755 | - |
| APJL_1291 | 1419511–1420647 | 1419511–1420770 | - |
| APJL_1320 | 1447455–1447955 | 1447455–1448069 | - |
| APJL_1343 | 1473151–1474092 | 1473079–1474092 | + |
| APJL_1347 | 1481231–1481647 | 1481231–1481752 | - |
| APJL_1363 | 1501745–1502311 | 1501745–1502398 | - |
| APJL_1384 | 1519258–1520565 | 1519258–1520709 | - |
| APJL_1510 | 1660978–1661679 | 1660978–1661838 | - |
| APJL_1590 | 1748017–1749288 | 1748017–1749402 | - |
| APJL_1598 | 1756635–1758095 | 1756635–1758278 | - |
| APJL_1617 | 1774727–1775452 | 1774655–1775452 | + |
| APJL_1658 | 1811042–1811797 | 1810898–1811797 | + |
| APJL_1761 | 1916914–1917699 | 1774655–1775452 | + |
| APJL_1947 | 2099009–2099719 | 2098925–2099719 | + |
| APJL_2009 | 2160426–2161499 | 2160132–2161499 | + |
| APJL_2035 | 2180950–2181522 | 2180776–2181522 | + |
| APJL_2074 | 2218634–2219587 | 2218511–2219587 | + |
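
A start-site correction moves only the 5' end of an ORF, so the stop-side coordinate should stay fixed and the extension should be a whole number of codons. The sketch below encodes that reasoning as a consistency check, under the assumption that Start is always the lower coordinate and End the higher one regardless of strand. The function name `start_extension` is hypothetical, and the three example rows are copied from the table above.

```python
from typing import Tuple

Interval = Tuple[int, int]  # (Start, End), with Start < End on either strand


def start_extension(prev: Interval, corrected: Interval, strand: str) -> int:
    """Return how many nucleotides the ORF was extended upstream.

    Raises ValueError if the stop-side coordinate moved or the ORF shrank,
    since neither is expected for a pure start-site correction.
    """
    (p_start, p_end), (c_start, c_end) = prev, corrected
    if strand == "+":
        # Start codon sits at the lower coordinate; End (stop) must not move.
        if c_end != p_end or c_start > p_start:
            raise ValueError("unexpected coordinates for a + strand correction")
        return p_start - c_start
    if strand == "-":
        # Start codon sits at the higher coordinate; Start (stop) must not move.
        if c_start != p_start or c_end < p_end:
            raise ValueError("unexpected coordinates for a - strand correction")
        return c_end - p_end
    raise ValueError(f"unknown strand: {strand!r}")


rows = [
    ("APJL_0019", (20445, 21500), (20262, 21500), "+"),
    ("APJL_0076", (85484, 86218), (85484, 86341), "-"),
    ("APJL_1947", (2099009, 2099719), (2098925, 2099719), "+"),
]
for gene_id, prev, corrected, strand in rows:
    ext = start_extension(prev, corrected, strand)
    # An in-frame extension is a multiple of three nucleotides.
    print(gene_id, ext, "nt,", "in frame" if ext % 3 == 0 else "out of frame")
```

A row for which the function raises, or for which the extension is not a multiple of three, would be worth re-checking against the original publication before use.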
Fig 4. Correction made to the start site of a gene.
Visualization of the single-nucleotide resolution transcriptome map shows transcription upstream of the annotated start of the gene "APJL_1947". The predicted and actual start codons within the ORF are marked.
sRNAs identified in A. pleuropneumoniae JL03.
| ID | Start# | End# | Length (nt) | Promoter | Terminator | Flanking gene (left) | Flanking gene (right) | Rfam annotation | Conservation across other genomes* |
|---|---|---|---|---|---|---|---|---|---|
| APP-S1 | 43810 | 43854 | 45 | Y | - | APJL_0040(+) | APJL_0041(+) | - | A |
| APP-S2 | 69850 | 69943 | 94 | Y | - | APJL_0062(-) | APJL_0062(+) | - | A |
| APP-S3 | 69955 | 70012 | 58 | Y | - | APJL_0062(-) | APJL_0062(+) | - | B |
| APP-S4 | 73341 | 73502 | 162 | Y | - | APJL_0062(-) | APJL_0062(+) | - | A |
| APP-S5 | 128399 | 128496 | 98 | Y | Y | APJL_0108(-) | APJL_0109(-) | - | A |
| APP-S6 | 129934 | 130203 | 270 | Y | - | APJL_0110(-) | APJL_0111(-) | 6S | C |
| APP-S7 | 136144 | 136180 | 36 | - | Y | APJL_0118(+) | APJL_0119(-) | - | A |
| APP-S8 | 136390 | 136532 | 132 | Y | - | APJL_0119(-) | APJL_0120(+) | cspA | C |
| APP-S9 | 147458 | 147839 | 381 | - | Y | APJL_0131(-) | APJL_0132(+) | GcvB | B |
| APP-S10 | 184303 | 184395 | 93 | Y | - | APJL_0165(-) | APJL_0166(+) | - | A |
| APP-S11 | 184404 | 184496 | 93 | Y | - | APJL_0165(-) | APJL_0166(+) | - | A |
| APP-S12 | 193258 | 193351 | 94 | Y | Y | APJL_0174(-) | APJL_0175(+) | - | A |
| APP-S13 | 283501 | 283565 | 65 | Y | - | APJL_0261(+) | APJL_0262(-) | - | A |
| APP-S14 | 383608 | 383698 | 91 | Y | Y | APJL_0358(-) | APJL_0359(-) | - | A |
| APP-S15 | 466389 | 466478 | 90 | - | Y | APJL_0425(-) | APJL_0426(+) | - | A |
| APP-S16 | 517137 | 517215 | 79 | - | Y | APJL_0468(+) | APJL_0471(-) | - | A |
| APP-S17 | 598881 | 599063 | 182 | - | Y | APJL_0553(+) | APJL_0559(+) | t44 | B |
| APP-S18 | 703124 | 703153 | 30 | Y | Y | APJL_0644(-) | APJL_0645(-) | - | C |
| APP-S19 | 753851 | 754045 | 195 | - | Y | APJL_0686(-) | APJL_0687(-) | MOCO_RNA_motif | B |
| APP-S20 | 755351 | 755515 | 164 | - | Y | APJL_0687(-) | APJL_0688(+) | MOCO_RNA_motif | B |
| APP-S21 | 765539 | 765588 | 50 | Y | Y | APJL_0696(-) | APJL_0697(-) | - | B |
| APP-S22 | 773794 | 773833 | 40 | - | Y | APJL_0700(-) | APJL_0703(-) | - | A |
| APP-S23 | 783905 | 783980 | 76 | Y | - | APJL_0707(-) | APJL_0708(+) | - | A |
| APP-S24 | 801903 | 801952 | 50 | Y | Y | APJL_0730(+) | APJL_0731(-) | - | B |
| APP-S25 | 888250 | 888433 | 184 | Y | - | APJL_0809(-) | APJL_0810(-) | Bacteria_small_SRP | C |
| APP-S26 | 908467 | 908556 | 90 | Y | - | APJL_0825(-) | APJL_0826(+) | - | A |
| APP-S27 | 1004946 | 1005007 | 62 | Y | - | APJL_0908(+) | APJL_0909(+) | - | A |
| APP-S28 | 1144702 | 1144793 | 92 | Y | Y | APJL_1031(-) | APJL_1032(-) | - | A |
| APP-S29 | 1211704 | 1211736 | 33 | Y | Y | APJL_1093(-) | APJL_1094(-) | - | D |
| APP-S30 | 1227906 | 1227961 | 56 | Y | Y | APJL_1107(-) | APJL_1108(-) | - | A |
| APP-S31 | 1236854 | 1236954 | 101 | Y | Y | APJL_1117(+) | APJL_1118(+) | - | A |
| APP-S32 | 1239160 | 1239199 | 40 | Y | Y | APJL_1119(+) | APJL_1120(+) | - | C |
| APP-S33 | 1253148 | 1253181 | 34 | Y | Y | APJL_1129(+) | APJL_1130(+) | - | B |
| APP-S34 | 1269545 | 1269848 | 303 | Y | - | APJL_1146(-) | APJL_1147(-) | Glycine | C |
| APP-S35 | 1304170 | 1304219 | 50 | - | Y | APJL_1180(-) | APJL_1181(-) | - | A |
| APP-S36 | 1327989 | 1328023 | 35 | - | Y | APJL_1205(+) | APJL_1206(-) | - | B |
| APP-S37 | 1375766 | 1375806 | 41 | - | Y | APJL_1251(+) | APJL_1252(+) | - | B |
| APP-S38 | 1431012 | 1431043 | 32 | Y | - | APJL_1305(-) | APJL_1306(+) | - | B |
| APP-S39 | 1471055 | 1471094 | 40 | Y | - | APJL_1340(+) | APJL_1341(+) | - | A |
| APP-S40 | 1575463 | 1575524 | 62 | Y | Y | APJL_1433(+) | APJL_1434(+) | - | A |
| APP-S41 | 1591719 | 1591803 | 85 | Y | - | APJL_1447(-) | APJL_1448(+) | - | A |
| APP-S42 | 1610164 | 1610233 | 70 | Y | - | APJL_1463(-) | APJL_1464(+) | - | A |
| APP-S43 | 1742806 | 1742919 | 114 | Y | - | APJL_1585(-) | APJL_1586(+) | - | A |
| APP-S44 | 1810752 | 1810833 | 82 | Y | - | APJL_1657(-) | APJL_1658(+) | - | A |
| APP-S45 | 1861523 | 1861613 | 91 | - | Y | APJL_1704(+) | APJL_1705(+) | - | A |
| APP-S46 | 1893513 | 1893575 | 63 | Y | - | APJL_1732(-) | APJL_1733(+) | - | A |
| APP-S47 | 1968939 | 1969074 | 136 | Y | Y | APJL_1816(+) | APJL_1817(+) | Alpha_RBS | C |
| APP-S48 | 1994835 | 1994869 | 35 | Y | - | APJL_1847(-) | APJL_1848(+) | - | B |
| APP-S49 | 2032687 | 2032872 | 186 | Y | - | APJL_1880(+) | APJL_1881(-) | - | A |
| APP-S50 | 2054874 | 2054924 | 51 | Y | Y | APJL_1898(+) | APJL_1899(+) | - | A |
| APP-S51 | 2212991 | 2213203 | 212 | - | Y | APJL_2068(-) | APJL_2069(+) | His_leader | B |
# Start and End give the boundaries of the identified transcriptionally active region (TAR), which is a potential sRNA region.
* sRNA sequence conserved in: A, Actinobacillus pleuropneumoniae; B, Actinobacillus; C, Pasteurellales; D, distantly related bacterial species.
Fig 5. Identification of potential sRNAs and UTRs.
(A) Identification of a well-conserved sRNA, "APP-S17", in the 5' UTR of the gene "APJL_0559", which was classified as t44 RNA by Rfam. (B) Identification of a highly expressed sRNA, "APP-S25", annotated by Rfam as bacterial signal recognition particle RNA. (C) Identification of a novel sRNA, "APP-S4", in an intergenic region of the A. pleuropneumoniae JL03 genome. (D) Identification of the 5' UTR of the gene "APJL_0132".
Fig 6. Single-nucleotide resolution transcriptome map revealing the operon structures of A. pleuropneumoniae JL03.
The operons predicted from RNA-seq data were compared with those predicted by DOOR. (A) An operon common to both DOOR and RNA-seq. (B) An operon identified from RNA-seq data but not previously predicted by DOOR. (C) An operon identified from RNA-seq data whose size differs from that of the one predicted by DOOR.
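
The three outcomes in panels (A) to (C) can be expressed as a simple set comparison. The sketch below is an illustration only, not the authors' method: it assumes each operon is given as a tuple of gene identifiers, the function name `compare_operons` is hypothetical, and the gene names `g1` to `g7` are made-up placeholders rather than real JL03 loci.

```python
from typing import Dict, List, Tuple

Operon = Tuple[str, ...]  # ordered gene identifiers in one operon


def compare_operons(rnaseq: List[Operon],
                    door: List[Operon]) -> Dict[str, List[Operon]]:
    """Classify each RNA-seq operon relative to a DOOR-style prediction set."""
    door_sets = [frozenset(o) for o in door]
    result: Dict[str, List[Operon]] = {
        "common": [],          # identical gene membership (panel A)
        "rnaseq_only": [],     # shares no gene with any DOOR operon (panel B)
        "different_size": [],  # shares genes but membership differs (panel C)
    }
    for operon in rnaseq:
        genes = frozenset(operon)
        if genes in door_sets:
            result["common"].append(operon)
        elif any(genes & d for d in door_sets):
            result["different_size"].append(operon)
        else:
            result["rnaseq_only"].append(operon)
    return result


# Placeholder gene names, not taken from the study.
rnaseq_operons = [("g1", "g2"), ("g3", "g4", "g5"), ("g6", "g7")]
door_operons = [("g1", "g2"), ("g3", "g4")]
summary = compare_operons(rnaseq_operons, door_operons)
for label, operons in summary.items():
    print(label, operons)
```

Comparing by gene membership rather than by genomic coordinates keeps the sketch independent of how either method defines transcript boundaries.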