Yoji Nakamura, Naobumi Sasaki, Masahiro Kobayashi, Nobuhiko Ojima, Motoshige Yasuike, Yuya Shigenobu, Masataka Satomi, Yoshiya Fukuma, Koji Shiwaku, Atsumi Tsujimoto, Takanori Kobayashi, Ichiro Nakayama, Fuminari Ito, Kazuhiro Nakajima, Motohiko Sano, Tokio Wada, Satoru Kuhara, Kiyoshi Inouye, Takashi Gojobori, Kazuho Ikeo.
Abstract
Nori, a marine red alga, is one of the most profitable mariculture crops in the world. However, the biological properties of this macroalga are poorly understood at the molecular level. In this study, we determined the draft genome sequence of susabi-nori (Pyropia yezoensis) using next-generation sequencing platforms. For sequencing, thalli of P. yezoensis were washed to remove bacteria attached to the cell surface and enzymatically prepared as purified protoplasts. The assembled contig size of the P. yezoensis nuclear genome was approximately 43 megabases (Mb), which is an order of magnitude smaller than the previously estimated genome size. A total of 10,327 gene models were predicted. About 60% of the validated genes lack introns, and the other genes have shorter introns than those of large-genome algae, which is consistent with the compact size of the P. yezoensis genome. A sequence homology search showed that 3,611 genes (35%) are functionally unknown and that only 2,069 gene groups are shared with the unicellular red alga Cyanidioschyzon merolae. As color trait determinants of red algae, light-harvesting genes involved in the phycobilisome were predicted from the P. yezoensis nuclear genome. In particular, we found a second homolog of the phycobilisome-degradation gene, which is usually chloroplast-encoded, possibly providing a novel target for studying color fading of susabi-nori in aquaculture. These findings shed light on unexplained features of macroalgal genes and genomes, and suggest that the genome of P. yezoensis is a promising model genome of marine red algae.
Year: 2013 PMID: 23536760 PMCID: PMC3594237 DOI: 10.1371/journal.pone.0057122
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Table 1. Assembly statistics of the P. yezoensis genome.
| Total contig size (bp) | 43,483,963 |
| Number of contigs | 46,634 |
| Average contig length (bp) | 932 |
| Contig N50 (bp) | 1,669 |
| Contig coverage | 166 |
| Percentage of mapped cDNA read pairs (%) | 97.9 |
| G+C content (%) | 63.6 |
| Percentage of repetitive sequences (%) | 1.4 |
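The average contig length and contig N50 in the table above follow from simple definitions. The following is a minimal sketch, not the authors' pipeline: the function name `contig_stats` and the toy lengths are hypothetical, and only the total contig size and contig count are taken from the table.

```python
def contig_stats(lengths):
    """Return (total, count, mean, n50) for a list of contig lengths in bp.

    N50 is the length of the contig at which the running sum of lengths,
    taken from longest to shortest, first reaches half of the total.
    """
    if not lengths:
        raise ValueError("no contigs given")
    total = sum(lengths)
    running = 0
    n50 = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            n50 = length
            break
    return total, len(lengths), total / len(lengths), n50


# Toy input (hypothetical contig lengths, only to illustrate the definition).
toy_stats = contig_stats([100, 200, 300, 400, 500])

# Consistency check using two values reported in the table:
# total contig size divided by number of contigs.
reported_mean = 43_483_963 / 46_634
print(round(reported_mean))
```

The contig N50 itself cannot be recomputed from the summary table, because it depends on the full list of contig lengths.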
Figure 1. Microsatellite distribution.
Repeats with frequency ≥ 3 are shown. Locations of microsatellite repeats are classified by color: exon (green), intron (red), and intergenic (blue) regions.
Table 2. Characteristics of P. yezoensis genes.
| Species | P. yezoensis (all genes predicted) | P. yezoensis (validated genes) | | | | |
| Genome or contig size (Mb) | 43 | 43 | 16.5 | 46.2 | 121 | 195.8 |
| Gene content | 10,327 | 1,314 | 5,331 | 9,791 | 15,143 | 16,256 |
| Gene density (kb/gene) | 4.2 | – | 3.1 | 4.7 | 8.0 | 12.0 |
| Average CDS length (bp) | 849 | 1,247 | 1,552 | 1,371 | 1,335 | 1,563 |
| Average exon length (bp) | 634 | 755 | 1,540 | 170 | 190 | 242 |
| Average intron length (bp) | 304 | 300 | 248 | 209 | 373 | 704 |
| Introns per gene | 0.3 | 0.7 | 0.005 | 6.3 | 7.3 | 7.0 |
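The gene density row can be reproduced from the genome size and gene content rows. The sketch below assumes that density is simply genome (or contig) size divided by the number of predicted genes. The source does not spell out this formula, but it agrees with every reported value. The function name `gene_density_kb` is hypothetical, and columns are identified by genome size because the species labels are not legible in this record.

```python
def gene_density_kb(genome_mb, n_genes):
    """Genome size in Mb divided by gene count, expressed in kb per gene."""
    return genome_mb * 1000 / n_genes


# (genome or contig size in Mb, gene content, reported kb/gene) per column;
# the P. yezoensis "validated genes" column has no reported density.
rows = [
    (43.0, 10_327, 4.2),
    (16.5, 5_331, 3.1),
    (46.2, 9_791, 4.7),
    (121.0, 15_143, 8.0),
    (195.8, 16_256, 12.0),
]

computed = [round(gene_density_kb(mb, n), 1) for mb, n, _ in rows]
reported = [density for _, _, density in rows]
print(computed == reported)
```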
Figure 2. Distribution of intron number.
“All genes” and “Validated genes” correspond to the respective gene sets shown in Table 2.
Figure 3. Correlation between gene statistics and algal genome sizes.
The y value of P. yezoensis (plotted in red) indicates the contig size (43 Mb). The species and data that are not shown in Table 2 are summarized in Table S2. (A) Intron density and genome size; (B) Average intron length and genome size; (C) Gene content and (logarithmic-transformed) genome size.
Figure 4. BLAST top hit distribution and gene set comparison.
(A) Taxonomic distributions of BLASTP top hits of P. yezoensis genes. (left) Eukaryotes; (right) Prokaryotes; (B) A Venn diagram of gene sets among four species (P. yezoensis, C. merolae, C. reinhardtii, and A. thaliana). Numbers of gene groups are shown on the diagram. Each gene group is defined as a singleton or a cluster of paralogs; (C) GO category comparison among P. yezoensis, C. merolae and C. reinhardtii.
Figure 5. Phylogenetic relationships of METH and METE.
Nodes with bootstrap probabilities ≥ 90% (1000 replicates) are shown. (A) Phylogenetic tree for METH and (B) Phylogenetic tree for METE. The accession numbers of sequences compared are summarized in Table S3.
Table 3. Light-harvesting genes in the P. yezoensis nuclear genome.
| Gene ID | Top hit accession | Top hit species | Description | Identity (%) | E-value | G+C (%) | WoLF PSORT | ChloroP |
| g2786 | AAP59423 | | R-phycoerythrin gamma subunit | 47.3 | 4E-83 | 64.4 | ✓ | ✓ |
| g5483 | AAB37302 | | gamma 31 kDa subunit of phycoerythrin precursor | 37.5 | 4E-35 | 70.4 | ✓ | ✓ |
| g2698 | ZP_10229441 | | Phycobilisome 27.9 kDa linker polypeptide, phycoerythrin-associated, rod | 49.7 | 5E-42 | 65.5 | ✓ | ✓ |
| g2303 | AAP80724 | | phycobilisome 31.8 kDa linker polypeptide | 61.1 | 2E-107 | 64.0 | ✓ | ✓ |
| g8704 | YP_001521651 | | phycobilisome 32.1 kDa linker polypeptide | 41.6 | 4E-28 | 66.6 | ✓ | ✓ |
| g9334 | AAP80835 | | phycobilisome 7.8 kDa linker polypeptide | 64.6 | 1E-35 | 65.0 | ✓ | ✓ |
| g5407 | ZP_08427732 | | phycobilisome linker polypeptide | 46.0 | 4E-30 | 68.5 | ✓ | ✓ |
| g4291 | NP_924210 | | phycoerythrin-associated linker protein | 52.1 | 4E-40 | 65.9 | ✓ | ✓ |
| g3612 | YP_001867135 | | phycobilisome degradation protein NblA | 47.2 | 1E-07 | 65.0 | | |
| g3715 | ZP_06307232 | | chlorophyll synthetase | 65.65 | 6E-154 | 66.8 | ✓ | ✓ |
| g6733 | CBN79181 | | Protochlorophyllide reductase, putative chloroplast precursor | 54.86 | 7E-110 | 69.5 | ✓ | |
Numbered based on AUGUSTUS prediction.
Cyanobacteria.
Figure 6. Structure of the NblA locus and sequence analysis.
(A) A phylogenetic tree of NblA proteins. Nodes with bootstrap probabilities ≥ 90% (1000 replicates) are shown. Red algae are indicated in red and cyanobacteria are indicated in black. The accession numbers of sequences compared are summarized in Table S4; (B) A partial alignment of NblA proteins. More highly conserved residues are shown in deeper blue. The numbers in the corners indicate alignment start/end positions of amino acid residues in the P. yezoensis nuclear/plastid NblA homologs, respectively; (C) Predicted genomic structure and PCR amplification of the NblA locus. The positions of the forward (F) and reverse (R) primers are indicated by arrows on the predicted genomic structure. PCR amplification of the gene was performed using genomic DNA (gDNA) from protoplasts and complementary DNA (cDNA) from thalli of P. yezoensis as templates. The dotted line in the genomic structure represents undetermined nucleotide sequence.