Literature DB >> 35775942

A de novo genome assembly of Solanum verrucosum Schlechtendal, a Mexican diploid species geographically isolated from other diploid A-genome species of potato relatives.

Awie J Hosaka^1,2, Rena Sanetomo³, Kazuyoshi Hosaka³.

Abstract

There are over 100 known species of cultivated potatoes and their wild relatives. Many of these species, including cultivated potatoes, share the A genome; these species are mainly distributed in South America and are reproductively isolated from Mexican diploid species. The only diploid A-genome species distributed in Mexico is Solanum verrucosum Schlechtendal, which is also a maternal progenitor of Mexican polyploid species. In this study, we constructed a high-quality de novo assembly of the S. verrucosum genome using PacBio long-read sequencing and Hi-C scaffolding technologies. A monohaploid clone (2n = x = 12) of S. verrucosum was used to reduce assembly difficulty due to the heterozygous nature of the species. The final sequence assembly consisted of 780.2 Mb of sequence, 684.0 Mb of which were anchored to the 12 chromosomes, with a scaffold N50 of 55.2 Mb. Putative centromeres were identified using publicly available data obtained via chromatin immunoprecipitation sequencing against a centromere-specific histone 3 protein. Transposable elements accounted for approximately 61.8% (482.1 Mb) of the genome, and 46,904 genes were functionally annotated. High gene synteny and similarity were revealed among the genomes of S. verrucosum, Solanum commersonii, Solanum chacoense, Solanum phureja, Solanum tuberosum, and Solanum lycopersicum. The reference-quality S. verrucosum genome will provide new insights into the evolution of Mexican polyploid species and contribute to potato breeding programs.

Entities: Chemical

Keywords: zzm321990 Solanum verrucosumzzm321990 ; A-genome species; Mexican wild potato; genome assembly

Mesh：

Year: 2022 PMID： 35775942 PMCID： PMC9339273 DOI： 10.1093/g3journal/jkac166

Source DB: PubMed Journal: G3 (Bethesda) ISSN： 2160-1836 Impact factor: 3.542

Introduction

Potato (Solanum tuberosum L., 2n = 4x = 48) is the most important noncereal food crop in the world. High genetic diversity is observed among primitive cultivated potatoes and the over 100 wild potato species distributed from North and Central America to South America (Hawkes 1990; Spooner ). These species are classified into 2 reproductively isolated groups: (1) a group including all Mexican diploid species except for S. verrucosum Schlechtendal and (2) a group including all Mexican polyploid species, S. verrucosum, and all South American species (Hawkes 1958). Based on the meiotic chromosome pairing of interspecific hybrids, the A genome is assigned to the species in the second group (Matsubayashi 1991). Since sexual hybrids between Mexican diploid species and A-genome species are extremely difficult to obtain, their genome affinity has long been debated (Matsubayashi 1991; Pendinen ). S. verrucosum is the only diploid A-genome species from Mexico and is assumed to contribute the A genome of Mexican polyploid species (Hosaka ). Most diploid tuber-bearing Solanum species are self-incompatible (Pushkarnath 1942), whereas S. verrucosum is self-compatible (Hawkes 1990). S. verrucosum is cross-compatible with most South American species as the female parent (Eijlander ) and with some Mexican diploid species, which provides an opportunity to transfer useful traits from Mexican diploid species to cultivated potatoes as a bridging species (Hermsen and Ramanna 1976; Jansky and Hamernik 2009). The Mexican species, including S. verrucosum, are valuable sources of disease and pest resistance in potato breeding (Hein ; Chen ). The first potato genome was sequenced from the DM 1-3 516 R44 clone (hereafter referred to as DM) (Potato Genome Sequencing Consortium 2011). DM resulted from the chromosome doubling of a monoploid derived via anther culture of the cultivated diploid species Solanum phureja Juz. et Buk. (Lightbourn and Veilleux 2007). The homozygous nature of the clone facilitated genome sequencing. Since then, potato whole genomes have been sequenced mainly from cultivated potato species (Kyriakidou, Achakkagari, ; Kyriakidou, Anglin, ; van Lieshout ; Zhou ; Yan ). Recent advances such as long-read sequencing coupled with high-throughput chromosome conformation capture (Hi-C) scaffolding technologies have resulted in great improvements in the DM genome (DM v6.1; Pham ), and chromosome-scale phased assemblies have been obtained from heterozygous diploid and tetraploid potatoes (Zhou ; Yan ; Hoopes ; Sun ). However, whole-genome sequencing in wild species has been limited to Solanum commersonii Dun. (Aversano ) and Solanum chacoense Bitt. (Leisner ), both of which are distributed in the southern marginal distribution area of the South American A-genome species (Hawkes and Hjerting 1969). Only a draft genome sequence has been reported for the Mexican diploid species S. pinnatisectum Dunal (Tiwari ). In this study, we generate a chromosome-scale assembly of the genome of the Mexican diploid species S. verrucosum using PacBio long-read sequencing and Hi-C scaffolding technologies. A monohaploid S. verrucosum clone (2n = x = 12) was used to reduce complexity caused by the heterozygous nature of the species. The constructed reference-quality genome will provide new insights into the evolutionary process in Mexican polyploid species and contribute to potato breeding programs.

Materials and methods

Plant material

A monohaploid clone of S. verrucosum (11H23, available as PI 666968 from the U.S. Potato Genebank) that was derived from anther culture (Irikura and Sakaguchi 1972) and maintained in vitro in our laboratory (Sanetomo and Hosaka 2021) was used for sequencing.

DNA extraction

A plant grown in vitro was transferred to a pot filled with soil and further grown for DNA extraction. Fresh leaves were collected, frozen in liquid nitrogen, and ground into powder with a mortar and pestle. The powder was suspended in 7 ml of 2× CTAB buffer (100 mM Tris-Cl buffer pH 8.0, 20 mM EDTA pH 8.0, 1.4 M NaCl, 2% CTAB, 1% PVP-40, and 0.2% 2-mercaptoethanol) and incubated at 60°C for 30 min. The suspension was gently mixed with 5 ml of chloroform: isoamyl alcohol (24:1) and centrifuged at 10,000 rpm for 5 min at 20°C. Using a wide-bore pipet tip, the supernatant was transferred to a 50-ml tube containing 5 ml of isopropanol and mixed gently by inverting the tube. Aggregated DNA strands were hooked and drawn up using a Pasteur pipet modified by flaming the tip and bending it into a U shape, after which they were transferred to a tube containing 10 ml of 75% ethanol washed for 30 min. Then, the aggregated DNA was dissolved in 2 ml of TE buffer (10 mM Tris-Cl buffer pH 8.0 and 1 mM EDTA pH 8.0) and incubated with 5 μl of RNase (10 mg/ml) for 3 h at room temperature. After complete dissolution, 100 μl of 5 M NaCl and 700 μl of 99.9% ethanol were added, followed by mixing and incubation at 4°C overnight to precipitate polysaccharides. After centrifugation at 10,000 rpm for 5 min at 4 °C, the supernatant was collected and mixed gently with 9 ml of 75% ethanol with 10 mM ammonium acetate. The aggregated DNA strands were hooked and drawn up using a U-shaped Pasteur pipet, transferred to a tube containing 10 ml of 75% ethanol, and washed for 30 min. Then, the DNA was dried completely while hanging on the U-shaped Pasteur pipet and dissolved in 100 μl of sterile water.

Genome sequencing and assembly

The quality of the extracted DNA was measured with a Genomic DNA ScreenTape System (Agilent) and a Qubit Fluorometer (Thermo Fisher Scientific). A long-read DNA library was prepared with the SMRTbell Express Template Prep Kit 2.0 (PacBio) and sequenced using the PacBio Sequel lle system in CCS mode (PacBio). The resulting raw data were converted to FASTQ format using BAM2fastx 1.3.1 (PacBio). Reads longer than 5 kb were extracted with SeqKit 0.15.0 (Shen ) and used for genome assembly with the Hifiasm 0.15.5-r350 assembler (Cheng ).The -l 0 option was specified to disable the purge haplotigs function since the plant was monohaploid.

Hi-C sequencing and scaffolding

The Hi-C library was prepared using the Dovetail Omni-C Kit (Dovetail Genomics) following the Proximity Ligation Assay Nonmammalian Samples Protocol version 1.0. The prepared library was sequenced on the NovaSeq 6000 (Illumina) platform. The read quality was assessed using FastQC 0.11.8 (Andrews 2010) and MultiQC v1.8 (Ewels ) and then filtered using Trimmomatic 0.39 (Bolger ) with the “ILLUMINACLIP: TruSeq3-PE.fa : 2:30:10 TRAILING : 20 SLIDINGWINDOW : 4:15 HEADCROP : 10 MINLEN : 50” options. The trimmed reads were aligned to the contigs using Juicer 1.6 (Durand ). Since DNase I was used to digest fixed nucleosomes, the “-s none -y none” options were specified. The generated contact maps were then used for scaffolding with a 3D-DNA pipeline (Dudchenko ) with the default parameters. The scaffolds were manually corrected using JuiceBox 1.11.08 (https://github.com/aidenlab/Juicebox). The corrected scaffolds were aligned to the DM v6.1 reference genome using D-GENIES (Cabanettes and Klopp 2018). The identities and directions of the scaffolds were determined based on the alignment.

Identification of organelle sequences

To identify regions or contigs derived from organelle genomes, sequences of chloroplast genome of S. verrucosum (MH021593.1; Huang ) and mitochondrial genome of S. tuberosum cultivar Désirée (MN104801, MN104802, and MN104803; Varré ) were obtained from the National Center for Biotechnology Information (NCBI), and a nucleotide homology search was performed against S. verrucosum contigs using BLASTN 2.12.0 (Altschul ) with the “-outfmt 6 -evalue 0.0001” options. Regions with more than 10 kb of alignment length and with more than 90% homology were selected and reformatted to BED files. Overlapped regions were merged using BEDTools 2.30.0 (Quinlan and Hall 2010).

Identification of putative centromeres

To identify centromeres, sequence reads generated via chromatin immunoprecipitation sequencing (ChIP-seq) against a centromere-specific histone 3 (CENH3) protein that were publicly available from NCBI were obtained for S. verrucosum (SRR18548893; Zhang ) and S. phureja (SRR18548894; Gong ) and aligned to their genomes using Bowtie2 (Langmead and Salzberg 2012) in single-end mode. The resulting BAM files were converted to BigWig files using DeepTools 3.5.1 (Ramírez ) for visualization on the IGV 2.11.0 genome browser (Robinson ). Centromeric regions of chromosomes were manually identified with IGV.

Annotation

Transposable elements (TEs) were identified using EDTA 1.9.6 (Ou ), and the defined TE regions were hard masked. To evaluate assembly completeness, the long terminal repeat (LTR) assembly index (LAI) score (Ou ) was calculated using the EDTA output files. Tandem repeats were defined using Tandem Repeats Finder v4.09 (Benson 1999) with the default parameters, and the defined repeats were soft masked using BEDTools. The masked scaffolds were subjected to gene prediction using the MAKER 3.01.03 (Cantarel ) annotation pipeline by providing mRNA and protein sequences of DM v6.1 (Pham ) and pretrained AUGUSTUS (Stanke ) gene models of tomato. The functional annotation of the predicted proteins was performed using Hayai-Annotation Plants v.2 (Ghelfi ). The density of the annotated TE families, Miniature Inverted-repeat Transposable Element (MITE) derivatives, genes, and CENH3 ChIP-seq reads within every 1 Mb segment was calculated using BEDTools and visualized in a circular heatmap generated by Circos (Krzywinski ).

Genome synteny and orthologs

The genome of S. verrucosum was compared with those of S. phureja (DM v6.1; Pham ), diploid S. tuberosum (Solyntus v1.1; van Lieshout ), S. chacoense (M6; Leisner ), S. commersonii (Aversano ), and Solanum lycopersicum L. (Hosmani ). Syntenic gene pairs were searched using MCScan (python version) (Tang ) with the default parameters, and syntenic blocks containing more than 30 genes were visualized. The orthologous relationships of S. verrucosum genes were assessed using OrthoFinder (Emms and Kelly 2015, 2019). All protein-coding genes except for those encoding isoforms or sequences shorter than 10 amino acids were compared. Intersections of orthogroups were visualized with UpSetR 1.4.0 (Lex ; Conway ).

Results and discussion

Genome assembly

We obtained 46.5 Gb of HiFi reads using a PacBio Sequel IIe system with an N50 read size of 15.6 kb and an average read size of 14.9 kb. Reads longer than 5 kb were used for assembly with Hifiasm. The resulting assembly consisted of 1,437 contigs with an N50 contig size of 21.0 Mb (Table 1). The contigs were error corrected and scaffolded with 101 million Omni-C read pairs using Juicer and a 3D-DNA pipeline (Supplementary Fig. 1). The final sequence assembly consisted of 780.2 Mb, among which 684.0 Mb were anchored to the 12 chromosomes, with a scaffold N50 of 55.2 Mb (Table 1). Of the remaining 1,535 unanchored contigs (a total of 96.3 Mb in size), 688 contigs (33.3 Mb) and 102 contigs (3.6 Mb) showed high homology to the chloroplast and mitochondrial genomes, respectively (Supplementary File 1). The dot plot analysis using these contigs against the organelle genomes indicated that these contigs were fragments of the organelle genomes (Supplementary Fig. 2). This is in accordance with previous studies that most of smaller contigs from Hifiasm assembly corresponded to small portions of the chloroplast and mitochondrial genomes (Sharma ; Sun ). The other unanchored contigs (59.4 Mb) consisted mostly of TEs (88.6%) and showed homology to localized regions in the chromosomes (Supplementary File 1 and Fig. 1).

Table 1.

Assembly statistics.

	Primary contigs with PacBio reads	Final scaffolded contigs after Hi-C sequencing
Number of contigs	1,437	1,547
Total size, bp	779,910,189	780,238,689
Longest size, bp	55,137,318	84,109,000
Mean size, bp	542,735	504,356
N50 size, bp	20,992,750	55,157,000

Fig. 1.

Dot plot analysis between S. verrucosum and S. phureja using D-GENIES with the “hide noise” option.

Dot plot analysis between S. verrucosum and S. phureja using D-GENIES with the “hide noise” option. Assembly statistics.

Putative centromeres

The dot plot analysis performed between the genome of S. verrucosum and that of S. phureja DM v6.1 using D-GENIES showed significant consistency in the distal regions of each chromosome, while the central regions diverged considerably (Fig. 1). This is because centromere sequences evolve rapidly (Henikoff ) and might be distinct between S. verrucosum and S. phureja (Gong ; Zhang ). To precisely determine the centromeres and compare these structures between the 2 species, we used publicly available ChIP-seq data generated against the CENH3 protein. The sequence reads from S. verrucosum (SRR18548893) and S. phureja (SRR18548894) were aligned to their genomes. Strong signals were observed on each chromosome, and the mapping rates were high (>97%) and comparable between the genomes of the 2 species, indicating that the centromere sequences were properly assembled in both genomes (Supplementary Table 1, Supplementary Fig. 3 and Fig. 2a). It was noted that 17.2% of the mapped reads were aligned to unanchored contigs, suggesting that some centromere sequences could not be assembled into chromosomes, possibly due to their highly repetitive nature.

Fig. 2.

Putative centromeres. a) Distribution of CENH3 ChIP-seq signals in every 100 kb window in S. verrucosum and S. phureja. b) Dot plot between S. verrucosum and S. phureja for chromosome 12. ChIP-seq signals against the CENH3 proteins of S. verrucosum and S. phureja are shown on the right and at the top of the plot, respectively, and are highlighted on the plot. In accordance with a previous report (Zhang ), centromere regions showed little conservation between the sequences of S. verrucosum and S. phureja (Supplementary Fig. 4). In particular, the centromere position on chromosome 12 differed between S. verrucosum and S. phureja indicating that a massive rearrangement occurred during their speciation (Fig. 2b).

Gene prediction

Since TEs are one of the major forces driving genome evolution (Hosaka and Kakutani 2018), the quantity and diversity of TEs were analyzed using the EDTA transposon annotation pipeline. TEs accounted for approximately 61.8% (482.1 Mb) of the S. verrucosum genome, among which LTR-type retrotransposons accounted for 39.9%, and terminal inverted repeat (TIR)-type transposons accounted for 9.7% (Supplementary Table 2). MITEs, which are nonautonomous derivatives of TIR-type transposons, were identified in 20.5% of the TIR-type transposons. The most abundant TEs were Gypsy elements (24.2%), as reported previously in other Solanum species (Aversano ; Gaiero ; Hosmani ). Putative protein-coding genes were searched in the genome using the MAKER pipeline, and their functions were predicted using the Hayai-Annotation Plants v2 pipeline. As a result, 64,294 genes were predicted, and 46,904 genes were functionally annotated. Their chromosomal distributions and the correlations of their locations are shown in Fig. 3. The genes were densely distributed in telomeric and subtelomeric regions. Some class II transposons, such as Tc1_Mariner, hAT, helitron, and MITEs, were distributed in a pattern similar to that of genes. In contrast, Gypsy and unknown LTR retrotransposons were densely distributed in pericentromeric and centromeric regions. Similar distribution patterns have been reported in S. phureja DM v4.03 (Zavallo ).

Fig. 3.

Chromosomal distribution of genes and transposons (a) and the correlations of their locations (b).

Assembly completeness and quality assessment

The LAI score used to measure assembly completeness was 11.97, which was slightly lower than that of DM v6.1 but much higher than that of DM v4.04 (LAI scores of 13.56 and 7.87, respectively; Pham ). Higher LAI scores correspond to more complete genome assemblies, and genome LAI scores between 10 and 20 are considered to indicate reference genome quality (Ou ). Thus, the S. verrucosum genome was highly contiguous and is categorized as showing reference genome quality. The quality of the gene predictions was assessed using the Benchmarking Universal Single-Copy Orthologs (BUSCO) database. Among 5,950 BUSCOs that are conserved in Solanales species, the number of complete BUSCOs identified was 5,759 (96.8%) in genome mode and 5,490 (92.3%) in protein mode. The number of missing BUSCOs identified in protein mode was 272 (4.6%), which was almost equivalent to the numbers found in the genomes of the other 5 species, ranging from 2.6% in S. phureja to 15.4% in S. tuberosum (Supplementary Table 3). Thus, most of the known BUSCOs were identified in the S. verrucosum genome, demonstrating robust representation of protein-coding genes.

Synteny and phylogenetic analyses

Structural variation and gene similarity were compared between the S. verrucosum genome and the other 5 genomes. The chromosome lengths varied by species (Supplementary Table 4 and Fig. 4a). Each of the S. chacoense chromosomes except for the chromosome 7 was shorter than the corresponding chromosomes of the 5 other species, likely because S. chacoense showed the largest number of unanchored sequences (Leisner ). The highest size similarity was observed between S. verrucosum and S. phureja, with a Pearson’s correlation coefficient of 0.978.

Fig. 4.

Size and structural differences among 6 species genomes. a) Chromosome size in Mb. b) Synteny plot between S. verrucosum and the other species.

Size and structural differences among 6 species genomes. a) Chromosome size in Mb. b) Synteny plot between S. verrucosum and the other species. The gene collinearity analyses between S. verrucosum and the other 5 species showed putative inversions and interchromosomal translocations, indicating that genome rearrangements have occurred (Fig. 4b). The S. verrucosum genome was most syntenic to the S. phureja genome. Furthermore, all genomes except for that of S. commersonii showed similar gene synteny. The S. commersonii genome showed a large translocated segment on chromosome 2, which might indicate that a unique genome rearrangement occurred in this species, or this could be a result of simple misassembly. The analysis of orthologous relationships using OrthoFinder showed that 288,020 genes (90.1%) among the 319,562 genes identified in the 6 species were assigned to 38,937 orthogroups (Supplementary Table 5). Among the 38,937 orthogroups, 16,964 (43.6%) were present in all the species analyzed, while only 1,027 (2.6%) were present in S. verrucosum (Fig. 5a). These S. verrucosum-specific orthogroups included 7,103 genes, of which 77.4% lacked functional annotations and 8.9% had similarities to the genes encoded in TEs based on the Hayai-Annotation Plants v2 pipeline. S. verrucosum presented the second largest number of shared orthogoups (28,568) after S. commersonii.

Fig. 5.

Orthologous relationships among 6 species genomes. a) UpSet plot of the shared orthogroups among 38,937 orthogroups assigned from a total of 288,020 genes identified in the 6 species genomes. The top 20 orthogroup intersections are shown. b) OrthoFinder-generated phylogenetic tree constructed using 16,964 orthogroups present in all 6 species. S. lycopersicum was used as the outgroup species. The species phylogeny was inferred from the similarity of the 16,964 orthogroups present in all the 6 species using OrthoFinder with the Species Tree inference from All Genes (STAG) algorithm (Emms and Kelly 2018). S. commersonii was distantly related among tuber-bearing species (Fig. 5b). In tuber-bearing Solanum species, the interspecific crossing barrier is explained by the Endosperm Balance Number (EBN) hypothesis (Johnston ; Ehlenfeldt and Ortiz 1995). According to this hypothesis, a 2:1 ratio of maternal to paternal EBN in the endosperm is necessary for normal endosperm development (Johnston ). S. commersonii shows an EBN of 1, whereas most of the other A-genome diploid species show an EBN of 2 (Ehlenfeldt and Hanneman 1988; Hanneman 1994). An imbalanced EBN causes endosperm abortion following interspecific hybridization, which is one of the major reproductive barriers among potato species (Johnston ; Hawkes and Jackson 1992; Hanneman 1999). S. verrucosum was observed to be most closely related to S. chacoense, indicating that the 2 species have relatively similar gene sequences while frequent rearrangements by intrachromosomal inversions were observed (Fig. 4b). Interestingly, geographical distributions of the 2 species are most distant among the A-genome species (Hawkes 1990).

Conclusions

We constructed a high-quality de novo assembly of the geographically isolated A-genome species S. verrucosum with a scaffold N50 of 55.2 Mb. The evaluation of variability within the A genome, including that of S. verrucosum, encompassed the geographic range of these species from the north (S. verrucosum in Mexico) to the south (S. chacoense in Argentina), which will be useful for understanding genomic differentiation among A-genome species. Since S. verrucosum has been considered a maternal progenitor of Mexican polyploid species (Hosaka ; Spooner ; Rodríguez and Spooner 2009; Sanetomo and Hosaka 2013), this whole-genome sequence will be a valuable resource for understanding polyploid evolution. Furthermore, the whole-genome sequence of S. verrucosum will facilitate the exploration of its unique crossing behaviors, such as self-compatibility and unilateral cross-compatibility (Hawkes 1990; Eijlander ), which will help us understand its function as a bridging species (Hermsen and Ramanna 1976; Hamernik ; Dinu ; Jansky and Hamernik 2009; Bamberg ) and will promote the introgression of useful traits from reproductively isolated Mexican diploid species into cultivated potatoes.

Data availability

The raw DNA sequencing reads, genome assembly, and annotation have been deposited into the National Center for Biotechnology Information under BioProject Number PRJNA820895. Supplemental material is available at G3 online. Click here for additional data file. Click here for additional data file. Click here for additional data file. Click here for additional data file. Click here for additional data file. Click here for additional data file.

50 in total

1. Basic local alignment search tool.

Authors: S F Altschul; W Gish; W Miller; E W Myers; D J Lipman
Journal: J Mol Biol Date: 1990-10-05 Impact factor: 5.469

2. MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes.

Authors: Brandi L Cantarel; Ian Korf; Sofia M C Robb; Genis Parra; Eric Ross; Barry Moore; Carson Holt; Alejandro Sánchez Alvarado; Mark Yandell
Journal: Genome Res Date: 2007-11-19 Impact factor: 9.043

3. Taxonomic and evolutionary implications of the Endosperm Balance Number hypothesis in potatoes.

Authors: J G Hawkes; M T Jackson
Journal: Theor Appl Genet Date: 1992-06 Impact factor: 5.699

Review 4. Transposable elements, genome evolution and transgenerational epigenetic variation.

Authors: Aoi Hosaka; Tetsuji Kakutani
Journal: Curr Opin Genet Dev Date: 2018-03-08 Impact factor: 5.578

5. Novel inter-series hybrids in Solanum, section Petota.

Authors: I I Dinu; R J Hayes; R G Kynast; R L Phillips; C A Thill
Journal: Theor Appl Genet Date: 2004-10-29 Impact factor: 5.699

6. Repeatless and repeat-based centromeres in potato: implications for centromere evolution.

Authors: Zhiyun Gong; Yufeng Wu; Andrea Koblízková; Giovana A Torres; Kai Wang; Marina Iovene; Pavel Neumann; Wenli Zhang; Petr Novák; C Robin Buell; Jirí Macas; Jiming Jiang
Journal: Plant Cell Date: 2012-09-11 Impact factor: 11.277

7. Assessing genome assembly quality using the LTR Assembly Index (LAI).

Authors: Shujun Ou; Jinfeng Chen; Ning Jiang
Journal: Nucleic Acids Res Date: 2018-11-30 Impact factor: 16.971

8. Analyses of 202 plastid genomes elucidate the phylogeny of Solanum section Petota.

Authors: Binquan Huang; Holly Ruess; Qiqi Liang; Christophe Colleoni; David M Spooner
Journal: Sci Rep Date: 2019-03-14 Impact factor: 4.379

9. Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline.

Authors: Shujun Ou; Weija Su; Yi Liao; Kapeel Chougule; Jireh R A Agda; Adam J Hellinga; Carlos Santiago Blanco Lugo; Tyler A Elliott; Doreen Ware; Thomas Peterson; Ning Jiang; Candice N Hirsch; Matthew B Hufford
Journal: Genome Biol Date: 2019-12-16 Impact factor: 13.583

10. Complete Sequence, Multichromosomal Architecture and Transcriptome Analysis of the Solanum tuberosum Mitochondrial Genome.

Authors: Jean-Stéphane Varré; Nunzio D'Agostino; Pascal Touzet; Sophie Gallina; Rachele Tamburino; Concita Cantarella; Elodie Ubrig; Teodoro Cardi; Laurence Drouard; José Manuel Gualberto; Nunzia Scotti
Journal: Int J Mol Sci Date: 2019-09-26 Impact factor: 5.923