Alberto Rastrojo, Fernando Carrasco-Ramiro, Diana Martín, Antonio Crespillo, Rosa M Reguera, Begoña Aguado, Jose M Requena.
Abstract
BACKGROUND: Although the genome sequence of the protozoan parasite Leishmania major was determined several years ago, knowledge of its transcriptome was incomplete, regarding both the real number of genes and their primary structure.
Year: 2013 PMID: 23557257 PMCID: PMC3637525 DOI: 10.1186/1471-2164-14-223
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Transcript assembly and annotation from RNA-seq data. The figure shows a region of chromosome 7. Panel A: reads aligned in this region; a small window of the total mapped reads is shown (bottom panel), and the relative coverage (logarithmic scale) is depicted as a skyline on the panel. Each vertical dash represents a read. Reads aligned with the plus strand of the chromosome are shown in pink and those aligned with the minus strand in violet (note that the direction of the reads was assigned arbitrarily, as sequencing was not oriented). Panel B: mapping of SL-containing reads. Panel C: previously annotated L. major genome (GeneDB database). Panel D: crude transcripts as assembled by Cufflinks. Panel E: new transcript annotation after mapping of both the SL addition sites and the 3’ ends generated by polyadenylation. The images were generated after loading the RNA-seq data into the Integrative Genomics Viewer (IGV 2.1) [23].
Transcriptome of promastigotes
| 1 | 92 | | 7 | 1 |
| 2 | 93 | 2 | 19 | 9 (1) |
| 3 | 110 | | 13 | 9 |
| 4 | 140 | 1 | 11 | 14 (1) |
| 5 | 146 | 1 | 22 | 4 (2) |
| 6 | 154 | | 19 | 7 (3) |
| 7 | 158 | | 28 | 6 (7) |
| 8 | 171 | 1 | 36 | 3 (1) |
| 9 | 189 | 1 | 21 | 6 (3) |
| 10 | 175 | 4 | 32 | 5 (4) |
| 11 | 171 | 1 | 34 | 1 (2) |
| 12 | 183 | | 41 | 12 (1) |
| 13 | 207 | 1 | 38 | 7 (5) |
| 14 | 194 | | 35 | 5 (2) |
| 15 | 196 | 2 | 28 | 11 (2) |
| 16 | 213 | 1 | 37 | 8 (3) |
| 17 | 211 | 1 | 52 | 5 |
| 18 | 242 | | 70 | 5 (2) |
| 19 | 216 | | 38 | 4 (1) |
| 20 | 215 | 5 | 40 | 9 |
| 21 | 264 | | 37 | 11 (2) |
| 22 | 214 | | 45 | 7 (1) |
| 23 | 253 | 3 | 53 | 8 (3) |
| 24 | 286 | 1 | 48 | 14 (1) |
| 25 | 302 | 4 | 50 | 22 (1) |
| 26 | 330 | | 51 | 10 (2) |
| 27 | 340 | 2 | 59 | 10 (1) |
| 28 | 403 | 1 | 84 | 13 (2) |
| 29 | 372 | 3 | 77 | 10 |
| 30 | 465 | 2a | 69 | 11 (3) |
| 31 | 463 | 5 | 126 | 51 (12) |
| 32 | 509 | 6 | 85 | 25 (4) |
| 33 | 492 | 5 | 115 | 13 (4) |
| 34 | 570 | 6 | 88 | 25 (1) |
| 35 | 673 | 5 | 133 | 9 (10) |
| 36 | 873 | 9 | 143 | 40 (7) |
* The number in brackets indicates the genes that might be truncated by addition of the SL at secondary trans-splicing sites.
a Transcript LmjF.30.T1460-1470-1480-1490 is tetracistronic.
Figure 2. Mis-annotation of gene. (A) Upper panels show the mapping of RNA-seq reads (either total or SL-containing reads) in the genomic region containing the annotated LmjF04.0860 gene; the bottom panel contains the transcripts delimited in this region. Arrows indicate SL addition sites; the red arrow points to the main SL addition site. Reads aligned with the plus strand of the chromosome are shown in violet and those aligned with the minus strand in pink (note that the direction of the reads was a consequence of the sequencing process, as sequencing was not oriented). (B) Nucleotide sequence (and predicted amino acid sequence) of the LmjF04.0860 gene as annotated in the GeneDB database [25]. The main AG dinucleotide used for trans-splicing in the LmjF.04.T0860 transcripts is shaded in gray, and the AG dinucleotides representing alternative SL addition sites are underlined. The first ATG found downstream of the SL addition sites is shaded in green. (C) Alignment between the protein predicted from the LmjF.04.T0860 transcript and the Tb927.9.8290 protein annotated in the T. brucei GeneDB database [25]. Identical amino acids are shaded in gray.
The 50 most abundant transcripts in promastigotes
| Transcript | Gene a | FPKM b | Description c |
|---|---|---|---|
| LmjF.28.T2770 | LmjF28.2770 | 1357.39 ± 5.12 | heat-shock protein (HSP70; gene |
| LmjF.35.T0240 | LmjF35.0240 | 1034.68 ± 12.39 | ribosomal protein L30 |
| LmjF.28.T2780 | LmjF28.2780 | 987.24 ± 4.45 | heat-shock protein hsp70 (HSP70; gene |
| LmjF.36.T1940 | LmjF36.1940 | 952.68 ± 4.37 | inosine-guanosine transporter (NT2) |
| LmjF.31.T0900 | LmjF31.0900 | 809.07 ± 7.12 | hypothetical protein, conserved |
| LmjF.28.T2205 | LmjF28.2205 | 792.33 ± 8.16 | ribosomal protein S29 |
| LmjF.35.T2220 | LmjF35.2220 | 780.12 ± 5.99 | kinetoplastid membrane protein-11 (KMP11) |
| LmjF.19.T0983 | Non-annotated | 674.85 ± 5.26 | - |
| LmjF.35.T0600 | LmjF35.0600 | 672.00 ± 5.81 | ribosomal protein L18a |
| LmjF.06.T0010 | LmjF06.0010 | 666.85 ± 8.31 | histone H4 |
| LmjF.35.T3800 | LmjF35.3800 | 617.33 ± 7.36 | ribosomal protein L23 |
| LmjF.36.T3620 | LmjF36.3620 | 616.48 ± 5.07 | hypothetical protein, conserved |
| LmjF.35.T2210 | LmjF35.2210 | 603.26 ± 4.69 | kinetoplastid membrane protein-11 (KMP11) |
| LmjF.28.T2460 | LmjF28.2460 | 596.61 ± 6.27 | ribosomal protein S29 |
| LmjF.20.T1285 | Non-annotated | 560.05 ± 5.48 | - |
| LmjF.31.T0964 | Non-annotated | 539.44 ± 5.07 | - |
| LmjF.31.T0895 | Non-annotated | 524.19 ± 10.51 | - |
| LmjF.35.T3290 | LmjF35.3290 | 514.13 ± 5.67 | ribosomal protein L31 |
| LmjF.13.T0570 | LmjF13.0570 | 496.01 ± 5.02 | ribosomal protein S12 |
| LmjF.35.T3790 | LmjF35.3790 | 493.33 ± 8.18 | ribosomal protein L23 |
| LmjF.35.T4191 | Non-annotated | 490.94 ± 5.15 | - |
| LmjF.35.T3760 | LmjF35.3760 | 483.03 ± 7.04 | ribosomal protein L27A/L29 |
| LmjF.30.T3340 | LmjF30.3340 | 482.87 ± 5.89 | ribosomal protein L9 |
| LmjF.35.T2050 | LmjF35.2050 | 464.76 ± 5.1 | ribosomal protein L32 |
| LmjF.08.T0640 | LmjF08.0640 | 452.92 ± 3.18 | hypothetical protein |
| LmjF.14.T0850 | LmjF14.0850 | 451.18 ± 3.65 | calpain-like cysteine peptidase |
| LmjF.35.T1910 | LmjF35.1910 | 448.56 ± 5.89 | ribosomal protein L15 |
| LmjF.35.T0420 | LmjF35.0420 | 446.58 ± 5.12 | ribosomal protein S3A |
| LmjF.35.T1920 | LmjF35.1920 | 446.57 ± 9.14 | ribosomal protein L36 |
| LmjF.25.T0910 | LmjF25.0910 | 436.47 ± 3.61 | cyclophilin a |
| LmjF.35.T3780 | LmjF35.3780 | 427.63 ± 4.95 | ribosomal protein L27A/L29 |
| LmjF.28.T2740 | LmjF28.2740 | 426.82 ± 4.82 | activated protein kinase c receptor |
| LmjF.13.T0450 | LmjF13.0450 | 425.12 ± 4.7 | hypothetical protein, conserved |
| LmjF.20.T1280 | LmjF20.1280 | 424.16 ± 3.22 | small myristoylated protein 4 |
| LmjF.31.T0966 | Non-annotated d | 419.44 ± 8.52 | hypothetical protein, conserved |
| LmjF.28.T2750 | LmjF28.2750 | 414.41 ± 4.15 | activated protein kinase c receptor |
| LmjF.31.T1170 | LmjF31.1170 | 414.01 ± 3.8 | hypothetical protein |
| LmjF.35.T0410 | LmjF35.0410 | 411.95 ± 4.52 | ribosomal protein S3A |
| LmjF.15.T1240 | LmjF15.1240 | 410.71 ± 3.38 | nucleoside transporter 1 |
| LmjF.24.T2230 | LmjF24.2230 | 409.67 ± 3.11 | hypothetical predicted multi-pass transmembrane protein |
| LmjF.35.T3280 | LmjF35.3280 | 406.24 ± 5.44 | ribosomal protein L31 |
| LmjF.35.T0400 | LmjF35.0400 | 403.61 ± 4.43 | ribosomal protein S3A |
| LmjF.24.T1280 | LmjF24.1280 | 403.46 ± 3.63 | amastin-like surface protein |
| LmjF.13.T0370 | LmjF13.0370 | 403.05 ± 3.84 | alpha tubulin |
| LmjF.35.T1670 | LmjF35.1670 | 403.05 ± 6.46 | ribosomal protein L26 |
| LmjF.13.T0360 | LmjF13.0360 | 403.04 ± 3.84 | alpha tubulin |
| LmjF.13.T0350 | LmjF13.0350 | 399.2 ± 3.82 | alpha tubulin |
| LmjF.13.T0380 | LmjF13.0380 | 399.06 ± 3.82 | alpha tubulin |
| LmjF.33.T3230 | LmjF33.3230 | 396.93 ± 7.5 | ribosomal protein L44 |
| LmjF.13.T0330 | LmjF13.0330 | 396.86 ± 3.81 | alpha tubulin |
a GeneDB identification code.
b FPKM, fragments (reads) per kilobase per million mapped reads.
c Hypothetical: predicted by informatics tools; conserved: present in other trypanosomatids (i.e. in the T. brucei and T. cruzi genomes).
d This transcript shares 97% sequence identity with gene LmjF31.0900.
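Footnote b defines FPKM as fragments per kilobase per million mapped reads. The arithmetic behind that definition can be sketched as follows. This is a minimal illustration only: the function name and the input numbers are hypothetical and do not come from the study, and Cufflinks derives its FPKM estimates from its own statistical model rather than from this direct ratio.

```python
def fpkm(fragments: int, transcript_length_bp: int, total_mapped_fragments: int) -> float:
    """Basic FPKM definition: fragments / (length in kb) / (mapped fragments in millions)."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("transcript length and library size must be positive")
    length_kb = transcript_length_bp / 1_000
    mapped_millions = total_mapped_fragments / 1_000_000
    return fragments / length_kb / mapped_millions


# Hypothetical inputs: 5,000 fragments on a 2,000 bp transcript
# in a library of 10 million mapped fragments.
example = fpkm(fragments=5_000, transcript_length_bp=2_000, total_mapped_fragments=10_000_000)
print(example)  # 250.0
```

Normalising by both transcript length and library size is what allows values in the table to be compared across transcripts of different lengths.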
Figure 3. Relative expression levels of transcripts derived from loci coding for HSP70 and ribosomal protein L23. Panel A: upper, current annotation of the two types of genes found in the L. major HSP70 locus, LmjF28.2770 (also known as HSP70-II) and LmjF28.2780 (HSP70-I); bottom, transcript annotation, as defined in this study, for the L. major HSP70 locus. Panel B: upper, current annotation of the two genes coding for the ribosomal protein L23, LmjF35.3790 and LmjF35.3800; bottom, transcripts annotated in this study. Each locus is composed of two types of genes with identical ORFs (blue boxes). Transcript mapping allowed the identification of 5’-UTRs (purple boxes) and 3’-UTRs (green boxes). The number of reads mapped by Cufflinks within the different regions is shown at the bottom.
Figure 4. Nucleotide frequencies for sequences surrounding SL and polyadenylation addition sites. Panels show the compositional profiles of sequences around the main SL addition sites (n = 9530), alternative SL addition sites (n = 4531), main polyadenylation sites (n = 3178) and alternative polyadenylation sites (n = 1238).
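A compositional profile of the kind shown in Figure 4 can be obtained by counting, position by position, how often each base occurs across equal-length windows centred on the sites of interest. The sketch below assumes the windows have already been extracted; the function name and the four example sequences are made up for illustration and are not data from the study, nor necessarily the procedure the authors used.

```python
from collections import Counter


def nucleotide_frequencies(windows):
    """Return one dict per window position mapping A, C, G, T to relative frequency."""
    if not windows:
        return []
    width = len(windows[0])
    if any(len(w) != width for w in windows):
        raise ValueError("all windows must have the same length")
    profile = []
    for pos in range(width):
        # Count the bases observed at this position across all windows.
        counts = Counter(w[pos].upper() for w in windows)
        total = sum(counts.values())
        profile.append({base: counts.get(base, 0) / total for base in "ACGT"})
    return profile


# Made-up 7-nt windows, each with an AG dinucleotide at positions 3-4 (0-based).
windows = ["TTCAGAT", "CTTAGGC", "TCTAGAT", "TTCAGCA"]
profile = nucleotide_frequencies(windows)
print(profile[3]["A"], profile[4]["G"])  # 1.0 1.0
```

Plotting each base's frequency against position gives one curve per nucleotide, which is the usual way such profiles are displayed.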