Nicotiana glauca whole-genome investigation for cT-DNA study.

Galina Khafizova, Pavel Dobrynin, Dmitrii Polev, Tatiana Matveeva.

Abstract

OBJECTIVE: Nicotiana glauca (tree tobacco) is a naturally transgenic plant containing sequences acquired from Agrobacterium rhizogenes by horizontal gene transfer. In addition, N. glauca produces a wide range of alkaloids of medical interest. DATA DESCRIPTION: We report high-depth sequencing and de novo assembly of the full N. glauca genome, together with an analysis of genome elements of bacterial origin. The draft genome assembly is 3.2 Gb, with an N50 of 31.1 kbp. Comparative analysis confirmed the presence of a single, previously described gT insertion. We found no evidence to support the idea of multiple T-DNA insertions in the N. glauca genome. Our data represent the first comprehensive de novo assembly of tree tobacco and provide valuable information for research in the pharmacological and phylogenetic fields.

Keywords:  Cellular T-DNA; Genome assembly; Nicotiana glauca; Whole genome sequencing

Year:  2018        PMID: 29329601      PMCID: PMC5766994          DOI: 10.1186/s13104-018-3127-x

Source DB:  PubMed          Journal:  BMC Res Notes        ISSN: 1756-0500


Objective

Nicotiana glauca (tree tobacco) is a member of the Solanaceae family, which includes important crops (potato, tomato, eggplant, pepper) and many medicinal plants [1]. This diploid plant is native to South America and is one of the first Nicotiana species in which Agrobacterium-derived cellular T-DNA (cT-DNA) was described [2]. Its cT-DNA is a partial, inverted repeat called gT [3]. Tree tobacco belongs to the section Noctiflorae. Sequencing of the genomes of N. tomentosiformis and N. otophora (section Tomentosae) and N. tabacum (section Nicotiana) allowed the detection of previously unknown multiple cT-DNAs [4], raising the question of whether the N. glauca genome contains other T-DNA insertions. NGS data can help answer this question. In addition, N. glauca has an alkaloid profile different from that of N. tabacum [5], and the plant is used for medicinal purposes. Comparative analysis of genomic data from phylogenetically distant tobacco species will provide valuable information on the genetic basis of various traits, especially secondary metabolism. Our data extend the list of species available for comparative genomics of Nicotiana, which opens up new opportunities for pharmacological and phylogenetic studies.

Data description

One plant isolate was sequenced on an Illumina HiSeq machine, yielding a total of 210 Gb of raw sequence data. De novo assembly resulted in 385116 scaffolds, with an N50 of 31.1 kbp and an L50 of 27293. The genome size suggested by k-mer analysis is 2 Gb, whereas the assembled genome totals 3.2 Gb. Comparative analysis of the N. glauca scaffolds against the genome assembly of the N. tabacum cultivar TN90 yielded 3.2 Gbp of aligned sequence with a median identity of 88%. T-DNA analysis revealed sequences homologous to the agrobacterial genes orf13a, orf13, orf14, rolC, rolB and mis. The T-DNA fragment recovered in the assembly is organized as an imperfect inverted repeat. The nucleotide sequences we found are 99% similar to the gT sequence previously described by Suzuki et al. [3] and 77–89% similar to Agrobacterium T-DNA. Sequences of PCR fragments amplified from the T-DNA/plant DNA junction regions match the known ones (Acc. AB071335, AB071334).
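For readers less familiar with the assembly statistics quoted above, the following minimal sketch shows how N50 and L50 are conventionally derived from a list of scaffold lengths. The function name `n50_l50` and the toy lengths are illustrative assumptions, not values from this assembly.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of scaffold lengths in bp.

    N50 is the length of the scaffold at which the running total,
    taken from the longest scaffold downwards, first reaches half of
    the assembly size. L50 is the number of scaffolds needed to get there.
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Toy example (hypothetical lengths): total is 300 bp, half is 150 bp.
toy_lengths = [100, 80, 60, 40, 20]
n50, l50 = n50_l50(toy_lengths)
print(n50, l50)  # 80 2
```

Applied to the real scaffold lengths, the same calculation would be expected to reproduce the reported N50 of 31.1 kbp and L50 of 27293.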

Methodology

Sample collection

Leaf tissue from aseptically grown N. glauca plants was used for DNA extraction with a modified version of the Doyle and Doyle protocol [6], yielding high-molecular-weight DNA at 30 ng/μl.

Library construction

Purified genomic DNA from N. glauca was used to construct both paired-end and mate pair libraries in order to generate a high-coverage de novo assembly. A paired-end library with an insert size of 350 bp was constructed following the TruSeq® Nano DNA Library Prep Reference Guide. To improve the resolution of repeats during assembly and scaffolding, one mate pair library with an insert size of 4 kbp was constructed according to the Nextera® Mate Pair Library Prep Reference Guide.

Read sequencing, quality analysis and filtering

The paired-end and mate pair libraries were sequenced on four and two lanes, respectively, of an Illumina HiSeq. The quality of the raw reads was assessed with FastQC [7], after which the raw paired-end (PE) reads were filtered and trimmed with Trim Galore [8]. Raw mate pair reads were processed and split with NextClip [9] and additionally filtered with Trim Galore [8].
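The trimming parameters are not reported here, so the following is only a sketch of what 3' quality trimming does in principle. It uses a running-sum rule of the kind documented for Cutadapt, which Trim Galore wraps. The function name `trim_3prime`, the Phred cutoff of 20 and the example read are assumptions for illustration.

```python
def trim_3prime(seq, quals, cutoff=20):
    """Trim low-quality bases from the 3' end of a read.

    seq    -- read sequence (str)
    quals  -- Phred quality scores, one per base (list of int)
    cutoff -- assumed quality threshold; not taken from the paper

    Walking from the 3' end, (quality - cutoff) is accumulated. The read
    is cut at the position where this running sum is lowest, and the
    walk stops as soon as the sum becomes positive.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    cut = len(seq)
    running = 0
    lowest = 0
    for i in range(len(quals) - 1, -1, -1):
        running += quals[i] - cutoff
        if running > 0:
            break
        if running < lowest:
            lowest = running
            cut = i
    return seq[:cut], quals[:cut]


# Hypothetical read: the last three bases are dominated by low qualities.
trimmed_seq, trimmed_quals = trim_3prime("ACGTAC", [30, 30, 30, 10, 25, 5])
print(trimmed_seq)  # ACG
```

Note that the single base with quality 25 is also removed, because it is flanked by poor bases and the running sum never recovers before the cut.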

Genome assembly

The genome was assembled with the MaSuRCA genome assembler, version 3.2.2 [10]; the configuration is provided in data file 1.

Whole genome alignment of Nicotiana glauca and Nicotiana tabacum

To identify the location of the N. glauca cT-DNA insertion relative to the N. tabacum genome, we mapped all N. glauca scaffolds to N. tabacum scaffolds downloaded from the Sol Genomics Network [11]. To increase alignment accuracy, we masked all known plant repeat classes and their homologs in the N. glauca genome. Repeats were identified with RepeatMasker [12] and the Repbase Update library released on 27 September 2017, the latest available at the time. Whole-genome alignment was performed with LAST [13].
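As an illustration of the masking step, the sketch below applies a set of repeat intervals to a sequence, either by lowercasing (soft masking) or by replacing bases with N (hard masking). The text does not state which mode was used. The function name `mask_intervals`, the 0-based half-open coordinate convention and the example intervals are assumptions.

```python
def mask_intervals(seq, intervals, soft=True):
    """Mask repeat intervals in a sequence.

    seq       -- nucleotide sequence (str)
    intervals -- iterable of (start, end), 0-based, end-exclusive
                 (an assumed convention for this sketch)
    soft      -- True: lowercase the repeat bases; False: replace with N
    """
    chars = list(seq)
    for start, end in intervals:
        # Clamp to the sequence so out-of-range annotations are ignored.
        start = max(start, 0)
        end = min(end, len(chars))
        for i in range(start, end):
            chars[i] = chars[i].lower() if soft else "N"
    return "".join(chars)


# Hypothetical scaffold with one annotated repeat at positions 2..4.
print(mask_intervals("ACGTACGT", [(2, 5)]))              # ACgtaCGT
print(mask_intervals("ACGTACGT", [(2, 5)], soft=False))  # ACNNNCGT
```

Soft masking keeps the underlying bases available to an aligner that can extend through repeats, whereas hard masking removes them from the comparison entirely.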

T-DNA analysis

LAST [13] was used to align a database of all known T-DNA-like sequences previously detected as part of cT-DNA [data file 2] to the N. glauca genome. To confirm the T-DNA/plant DNA junction regions, long PCR was carried out using the “LONG PCR enzyme Mix” (Thermo Scientific) according to the kit instructions (Table 1).
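One way to reason about an imperfect inverted repeat, such as the gT structure discussed in this note, is to compare one arm with the reverse complement of the other. The sketch below does this with a simple ungapped identity. It is not the analysis performed here, which relied on LAST alignments; the function names and the short example sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def ungapped_identity(a, b):
    """Percent identity of two equal-length sequences, without gaps."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


# Hypothetical arms: the right arm is the reverse complement of a copy
# of the left arm that carries a single substitution.
left_arm = "ATGCCGTTAGC"
right_arm = reverse_complement("ATGCCGTAAGC")

identity = ungapped_identity(left_arm, reverse_complement(right_arm))
print(round(identity, 1))  # 90.9
```

An identity below 100% between the arms is what makes the repeat imperfect. Real arms usually also differ by insertions and deletions, which an ungapped comparison like this one cannot handle.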

Table 1. Overview of data files

| Label | Name of data file | File types | Data repository and identifier | License |
|---|---|---|---|---|
| Supplementary file 1 | Methodology description | .docx file | 10.6084/m9.figshare.5732427.v1 | CC BY |
| Data file 1 | Parameters for the assembly | .txt file | 10.6084/m9.figshare.5645854.v1 | CC BY |
| Data file 2 | T-DNA database | .fa file | 10.6084/m9.figshare.5754120.v1 | CC BY |

Limitations

85% of the mate pair (MP) library proved to be PCR duplicates, which we filtered out before assembly. The resulting low coverage of MP reads led to a low N50 and a large number of contigs and scaffolds. Higher-quality and/or additional MP libraries should be used in the future to improve the assembly.
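The text does not say how duplicates were detected. As a rough illustration of the idea, the sketch below treats read pairs with identical sequences in both mates as duplicates and reports the duplicate fraction. Exact-sequence matching, the function names and the toy read pairs are all assumptions.

```python
def drop_duplicate_pairs(pairs):
    """Keep the first occurrence of each (read1, read2) sequence pair."""
    seen = set()
    unique = []
    for read1, read2 in pairs:
        key = (read1, read2)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def duplicate_fraction(pairs):
    """Fraction of read pairs removed as exact duplicates."""
    if not pairs:
        raise ValueError("no read pairs given")
    return 1 - len(drop_duplicate_pairs(pairs)) / len(pairs)


# Toy library (hypothetical reads): four pairs, two of them redundant.
toy_pairs = [("AC", "GT"), ("AC", "GT"), ("AA", "TT"), ("AC", "GT")]
print(drop_duplicate_pairs(toy_pairs))  # [('AC', 'GT'), ('AA', 'TT')]
print(duplicate_fraction(toy_pairs))    # 0.5
```

A duplicate fraction of 0.85, as reported for the MP library, means that only about one pair in seven carries independent information, which explains the limited scaffolding power.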
  7 in total

1.  Adaptive seeds tame genomic sequence comparison.

Authors:  Szymon M Kiełbasa; Raymond Wan; Kengo Sato; Paul Horton; Martin C Frith
Journal:  Genome Res       Date:  2011-01-05       Impact factor: 9.043

2.  Using RepeatMasker to identify repetitive elements in genomic sequences.

Authors:  Maja Tarailo-Graovac; Nansheng Chen
Journal:  Curr Protoc Bioinformatics       Date:  2009-03

3.  Tobacco plants were transformed by Agrobacterium rhizogenes infection during their evolution.

Authors:  Kenji Suzuki; Ichiro Yamashita; Nobukazu Tanaka
Journal:  Plant J       Date:  2002-12       Impact factor: 6.417

4.  Deep sequencing of the ancestral tobacco species Nicotiana tomentosiformis reveals multiple T-DNA inserts and a complex evolutionary history of natural transformation in the genus Nicotiana.

Authors:  Ke Chen; François Dorlhac de Borne; Ernö Szegedi; Léon Otten
Journal:  Plant J       Date:  2014-10-14       Impact factor: 6.417

5.  Hybrid assembly of the large and highly repetitive genome of Aegilops tauschii, a progenitor of bread wheat, with the MaSuRCA mega-reads algorithm.

Authors:  Aleksey V Zimin; Daniela Puiu; Ming-Cheng Luo; Tingting Zhu; Sergey Koren; Guillaume Marçais; James A Yorke; Jan Dvořák; Steven L Salzberg
Journal:  Genome Res       Date:  2017-01-27       Impact factor: 9.043

6.  NextClip: an analysis and read preparation tool for Nextera Long Mate Pair libraries.

Authors:  Richard M Leggett; Bernardo J Clavijo; Leah Clissold; Matthew D Clark; Mario Caccamo
Journal:  Bioinformatics       Date:  2013-12-02       Impact factor: 6.937

7.  Sequencing and characterization of leaf transcriptomes of six diploid Nicotiana species.

Authors:  Ni Long; Xueliang Ren; Zhidan Xiang; Wenting Wan; Yang Dong
Journal:  J Biol Res (Thessalon)       Date:  2016-04-18       Impact factor: 1.889
