Lingli Dong, Hongfang Liu, Juncheng Zhang, Shuangjuan Yang, Guanyi Kong, Jeffrey S C Chu, Nansheng Chen, Daowen Wang.
Abstract
BACKGROUND: The large and complex hexaploid genome has greatly hindered genomics studies of common wheat (Triticum aestivum, AABBDD). Here, we investigated transcripts in common wheat developing caryopses using the emerging single-molecule real-time (SMRT) sequencing technology PacBio RSII, and assessed the resultant data for improving common wheat genome annotation and grain transcriptome research.
Year: 2015 PMID: 26645802 PMCID: PMC4673716 DOI: 10.1186/s12864-015-2257-y
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Mapping FLNC reads to the draft genome sequence of Chinese Spring (CS). (a) Division of the 186,765 FLNC reads into five groups (G1 to G5) based on their genome mapping characteristics. The number of reads in each group is depicted in the pie chart. (b) An example illustrating an FLNC read mapped to two different CS contigs located on the same chromosome arm. The read is shown as a split-mapped molecule (SMM) with both exon (filled box) and intron (line between two neighboring exons) depicted. The arrow indicates the direction of alignment to the genomic sequence. The two CS contigs to which the shown FLNC read mapped were both located on the long arm of chromosome 1D (1DL). The representative transcripts annotated for the two contigs by the draft genome sequence of CS are shown below as SMMs (boxed in green). The bottom panel is the Ae. tauschii contig orthologous to the two 1DL contigs of CS. The transcript annotated for this Ae. tauschii contig is also provided as a SMM (boxed in purple). The Ae. tauschii contig was identified by mapping the exemplary FLNC read to the Dt genome sequencing database (http://aegilops.wheat.ucdavis.edu/ATGSP/data.php). The diagrams shown are not drawn to scale
Genomic loci and unique transcripts represented by 91,881 high-quality FLNC reads
| | FLNC read | Locus | Transcript |
|---|---|---|---|
| | 83,736 | 13,162 (extant) | 19,023 (13,177 extant, 5846 new) |
| | 8145 | 3026 (new) | 3745 (new) |
| Total | 91,881 | 16,188 | 22,768 |
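
The totals row can be checked against the two data rows with a few lines of arithmetic. The sketch below is only a consistency check on the numbers printed in the table; the row labels `extant_loci` and `new_loci` and all variable names are assumptions introduced here, since the table itself leaves the first column blank.

```python
# Values copied from the table above. Row labels are hypothetical:
# the first row is read as "reads at extant loci", the second as "reads at new loci".
rows = {
    "extant_loci": {"flnc_reads": 83_736, "loci": 13_162, "transcripts": 19_023},
    "new_loci": {"flnc_reads": 8_145, "loci": 3_026, "transcripts": 3_745},
}
reported_totals = {"flnc_reads": 91_881, "loci": 16_188, "transcripts": 22_768}

# Sum each column over the two rows and compare with the reported total.
computed_totals = {
    column: sum(row[column] for row in rows.values()) for column in reported_totals
}
for column, total in computed_totals.items():
    status = "matches" if total == reported_totals[column] else "differs from"
    print(f"{column}: computed {total} {status} reported {reported_totals[column]}")

# The transcripts at extant loci are themselves split into extant and new transcripts.
extant_transcripts, new_transcripts_at_extant_loci = 13_177, 5_846
split_is_consistent = (
    extant_transcripts + new_transcripts_at_extant_loci
    == rows["extant_loci"]["transcripts"]
)
print("extant-locus transcript split consistent:", split_is_consistent)
```

Under this reading, every column of the totals row equals the sum of the two rows above it.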
Fig. 2 Chromosomal distributions of 91,881 high-quality FLNC reads and the loci identified by them. The known and new loci identified by the 91,881 reads numbered 13,162 and 3026, respectively. The three values were used as backgrounds for calculating the percentages displayed along each short arm (SA) and long arm (LA)
Fig. 3 Analysis of representative transcripts spanning two or three Chinese Spring (CS) loci. (a) The transcript 2BS_5155291.1.1 and the three CS loci (Traes_2BS_D46E40C29, Traes_2BS_033FD1621 and Traes_2BS_00DF01F06) it covered. These loci are located on the CS contig 2BS_5155291, and the transcripts annotated for the three loci by the draft genome sequence are boxed in green. The bottom panel shows the B. distachyon genomic region orthologous to 2BS_5155291. A single locus (Bradi1g21372.1) and a corresponding transcript (boxed in purple) are annotated for this B. distachyon genomic region (http://www.plantgdb.org/BdGDB/). (b) The PacBio transcript 1AL_3888283.1.2 and the two CS loci (Traes_1AL_2D4B01C64 and Traes_1AL_6275047AA) it covered. The two loci reside on the CS contig 1AL_3888283, and the representative transcripts annotated for them by the draft genome sequence are boxed in green. The bottom panel is the rice genomic region orthologous to 1AL_3888283. A single locus (Os05g0345400.02) and a corresponding transcript (boxed in purple) are annotated for this rice genomic region (http://www.plantgdb.org/OsGDB/). The transcripts in (a) and (b) are all shown as SMMs with exon (filled box) and intron (line between two neighboring exons) depicted. The diagrams are not drawn to scale
Estimation of the genes expressed in unfertilized caryopses and developing grains
| Developmental stageᵃ | S1 | S2 | S3 | S4 |
|---|---|---|---|---|
| Total number of genes expressedᵇ | 50,650 | 42,444 | 44,547 | 37,369 |
| Number of genes with coverage by PacBio sequencing identified transcripts | 11,798 | 9452 | 10,626 | 8676 |
| Total number of PacBio sequencing identified transcripts detected at each stage | 17,330 | 13,938 | 15,909 | 12,943 |

ᵃ S1, unfertilized caryopses; S2-S4, developing grains at 5, 15 or 25 days after anthesis

ᵇ Judged based on RPKM (reads per kilobase per million mapped reads) > 1
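
The expression criterion in the footnote can be illustrated with a minimal sketch of the standard RPKM formula and the RPKM > 1 cutoff. This is not the authors' pipeline: the function names, gene identifiers, read counts, gene lengths, and library size below are all hypothetical toy values chosen only to show the arithmetic.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of gene length per million mapped reads.

    RPKM = read_count * 1e9 / (gene_length_bp * total_mapped_reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def expressed_genes(counts, lengths, total_mapped_reads, threshold=1.0):
    """Return the set of genes whose RPKM is strictly above the threshold."""
    return {
        gene
        for gene, count in counts.items()
        if rpkm(count, lengths[gene], total_mapped_reads) > threshold
    }


# Hypothetical toy data (not taken from the study).
toy_counts = {"geneA": 500, "geneB": 3, "geneC": 40}
toy_lengths_bp = {"geneA": 2000, "geneB": 1500, "geneC": 1000}
toy_total_mapped_reads = 20_000_000

toy_expressed = expressed_genes(toy_counts, toy_lengths_bp, toy_total_mapped_reads)
print(sorted(toy_expressed))
```

Because RPKM normalizes by both gene length and library size, the same cutoff can be applied to each of the four stages even if their sequencing depths differ.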
Fig. 4 Analysis of the two different transcript isoforms of TRAES_1DS_114C78BF4 by RT-PCR. (a) A diagram showing the exon (filled box)-intron (line in between filled boxes) patterns of the two isoforms (designated as a and b, respectively). Arrows mark the positions of the primers (FP, RP1 and RP2) used for specifically amplifying each of the two isoforms. The length of the amplicon (bp) is indicated for each isoform. (b) The result of amplifying isoforms a and b by RT-PCR in the caryopsis samples of four developmental stages (S1-S4). Amplification of the common wheat actin gene (GenBank accession AB181991) served as internal control for normalizing cDNA content prior to PCR amplification. S1, unfertilized caryopses; S2-S4, developing grains collected at 5, 15 or 25 days after anthesis (DAA). The data displayed are typical of three independent experiments