Adriana Bastías, Francisco Correa, Pamela Rojas, Constanza Martin, Jorge Pérez-Diaz, Cristian Yáñez, Mara Cuevas, Ricardo Verdugo, Boris Sagredo.
Abstract
Maqui (Aristotelia chilensis [Molina] Stuntz) is a small dioecious tree belonging to the Elaeocarpaceae family. Maqui fruit has high antioxidant activity, which is due to its elevated anthocyanin and polyphenol content. Here we describe draft genome sequence data for maqui (A. chilensis). The genomic sequence datasets were obtained using the Illumina NextSeq platform. The nucleotide sequences of the raw reads and the assembled draft genome are available at NCBI's Sequence Read Archive as BioProject PRJNA544858. In addition, a total of 210,067 microsatellite, or simple sequence repeat (SSR), markers were identified.
Keywords: Aristotelia chilensis; Draft genome; Illumina NextSeq platform; Maqui; Microsatellite; SSR markers; Sequencing
Year: 2019 PMID: 31673575 PMCID: PMC6817651 DOI: 10.1016/j.dib.2019.104545
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
Dataset of maqui (A. chilensis) reads obtained by Illumina NextSeq 550 sequencing before and after filtering.
| Species | Total reads (×2), before filtering | GC (%), before filtering | Total reads (×2), after filtering | GC (%), after filtering | Reads retained after filtering (% of total) |
|---|---|---|---|---|---|
| Maqui (A. chilensis) | 187,132,040 | 36 | 179,407,345 | 35.13 | 95.87 |
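
As a quick arithmetic check, the share of reads retained after filtering can be recomputed from the two read counts in the table above. This is a minimal sketch; the function name `percent_retained` is hypothetical and is not part of the authors' pipeline.

```python
def percent_retained(reads_before: int, reads_after: int) -> float:
    """Return the percentage of reads that survive quality filtering."""
    if reads_before <= 0:
        raise ValueError("reads_before must be positive")
    return 100.0 * reads_after / reads_before


# Read counts taken from the table above.
before_filtering = 187_132_040
after_filtering = 179_407_345

print(f"{percent_retained(before_filtering, after_filtering):.2f}% of reads retained")
```

The result agrees with the tabulated 95.87% when rounded to two decimal places.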
Data on contig measurements that were assembled by MaSuRCA software with high-quality reads.
| Item | Number | Description |
|---|---|---|
| Total number of sequences | 58,451 | Counts |
| N50 | 13,213 | A + T + C + G + N (bp) |
| Max contig | 113,184 | A + T + C + G, excluding Ns (bp) |
| Min contig | 500 | A + T + C + G, excluding Ns (bp) |
| Total length of sequences | 326,414,674 | A + T + C + G + N (bp) |
| Total valid length of sequences | 326,169,547 | A + T + C + G (bp) |
| Unknown bases (Ns) in sequences | 245,127 | bp |
| Percentage of unknown bases | 0.08 | Percentage (%) |
| GC content | 35.13 | (G + C)/(A + T + C + G), excluding Ns (%) |
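
The metrics in this table follow standard definitions, which can be sketched in a few lines of Python. The code below is an illustration under stated assumptions: it uses the common N50 definition (the contig length at which the cumulative sum of lengths, sorted from longest to shortest, reaches half of the total), and the function names `n50` and `assembly_stats` are hypothetical. It does not reproduce MaSuRCA's own reporting.

```python
from typing import Dict, List


def n50(lengths: List[int]) -> int:
    """Length of the contig at which the cumulative length reaches half the total."""
    if not lengths:
        raise ValueError("no contigs given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for non-empty input


def assembly_stats(contigs: List[str]) -> Dict[str, float]:
    """Summarize contigs using the definitions given in the table."""
    seqs = [c.upper() for c in contigs]
    total_length = sum(len(s) for s in seqs)            # A + T + C + G + N
    unknown = sum(s.count("N") for s in seqs)           # Ns only
    valid_length = total_length - unknown               # A + T + C + G
    gc = sum(s.count("G") + s.count("C") for s in seqs)
    return {
        "sequences": len(seqs),
        "n50": n50([len(s) for s in seqs]),
        "total_length": total_length,
        "valid_length": valid_length,
        "unknown_bases": unknown,
        "percent_unknown": 100.0 * unknown / total_length,
        "gc_percent": 100.0 * gc / valid_length,
    }


# Toy input, not real maqui contigs.
toy = ["ATGCNN", "GGCC", "AT"]
print(assembly_stats(toy))
```

Applied to the reported totals, the same definitions are internally consistent: 326,414,674 bp minus 245,127 unknown bases gives the 326,169,547 bp of valid sequence listed in the table.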
Fig. 1. Percentage of 1375 single-copy ortholog genes from 60 plants, by BUSCO analysis.
Dataset of microsatellite (SSR) searches of maqui (A. chilensis) using PERF software.
| Item | Number | Description |
|---|---|---|
| Total number of perfect SSRs | 210,067 | Counts |
| Total length of perfect SSRs | 3,153,200 | bp |
| Average length of SSRs | 15.02 | total SSR length/total SSR count (bp) |
| SSRs per sequence | 4 | total SSR count/sequence count |
| % of sequence occupied by SSRs | 0.97 | total SSR length/total sequence length (%) |
| Relative abundance | 644.04 | total SSRs/total valid length (loci/Mb) |
| Relative density | 9667.36 | total SSR length/total valid length (bp/Mb) |
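
The derived rows of this table can be recomputed from the reported totals using the formulas in the Description column. The sketch below is only a worked check: the function name `ssr_summary` and the dictionary keys are hypothetical, and the inputs are the totals reported in this table and in the contig table.

```python
from typing import Dict


def ssr_summary(ssr_count: int, ssr_length_bp: int, n_sequences: int,
                total_length_bp: int, valid_length_bp: int) -> Dict[str, float]:
    """Derive per-genome SSR metrics from raw totals."""
    valid_mb = valid_length_bp / 1_000_000  # valid length in megabases
    return {
        "mean_length_bp": ssr_length_bp / ssr_count,
        "ssrs_per_sequence": ssr_count / n_sequences,
        "percent_occupied": 100.0 * ssr_length_bp / total_length_bp,
        "relative_abundance": ssr_count / valid_mb,     # loci/Mb
        "relative_density": ssr_length_bp / valid_mb,   # bp/Mb
    }


summary = ssr_summary(
    ssr_count=210_067,
    ssr_length_bp=3_153_200,
    n_sequences=58_451,
    total_length_bp=326_414_674,
    valid_length_bp=326_169_547,
)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With these inputs, relative abundance, relative density and the percentage of sequence occupied match the tabulated 644.04 loci/Mb, 9667.36 bp/Mb and 0.97%. The mean length comes out at about 15.01 bp, slightly below the tabulated 15.02 bp, and SSRs per sequence at about 3.59, which the table appears to report rounded to 4.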
Distribution of microsatellite di- to hexanucleotide motifs in the assembled genomic DNA of maqui (A. chilensis).
| Type | Counts | Length (bp) | Percent (%) | Relative Abundance (loci/Mb) | Relative Density (bp/Mb) |
|---|---|---|---|---|---|
| Di | 115,254 | 1,765,324 | 54.87 | 353.36 | 5412.29 |
| Tri | 32,972 | 480,600 | 15.7 | 101.09 | 1473.47 |
| Tetra | 37,247 | 481,296 | 17.73 | 114.2 | 1475.6 |
| Penta | 15,190 | 242,440 | 7.23 | 46.57 | 743.29 |
| Hexa | 9,404 | 183,540 | 4.48 | 28.83 | 562.71 |
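
To illustrate what a perfect di- to hexanucleotide SSR count involves, the following sketch scans a sequence for uninterrupted tandem repeats and tallies them by motif class. It is not PERF's algorithm or interface: the function names are hypothetical, the 12 bp minimum length is an assumed threshold chosen for the example, and only complete motif copies are counted.

```python
from collections import Counter
from typing import Dict, List, Tuple

MOTIF_CLASSES = {2: "Di", 3: "Tri", 4: "Tetra", 5: "Penta", 6: "Hexa"}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repeat of a shorter unit (e.g. ATAT)."""
    k = len(motif)
    for d in range(1, k):
        if k % d == 0 and motif == motif[:d] * (k // d):
            return False
    return True


def find_perfect_ssrs(seq: str, min_len: int = 12) -> List[Tuple[int, int, str]]:
    """Return (start, end, motif) for perfect repeats of at least min_len bp."""
    seq = seq.upper()
    hits = []
    for k in MOTIF_CLASSES:
        i = 0
        while i + k <= len(seq):
            motif = seq[i:i + k]
            if set(motif) <= set("ACGT") and is_primitive(motif):
                j = i + k
                # Extend while the next k bases repeat the motif exactly.
                while seq[j:j + k] == motif:
                    j += k
                if j - i >= min_len:
                    hits.append((i, j, motif))
                    i = j
                    continue
            i += 1
    return hits


def count_by_class(hits: List[Tuple[int, int, str]]) -> Dict[str, int]:
    """Tally SSRs by motif length class."""
    counts = Counter(MOTIF_CLASSES[len(motif)] for _, _, motif in hits)
    return {name: counts.get(name, 0) for name in MOTIF_CLASSES.values()}


# Toy sequence, not maqui data: one (AT)7 repeat, one (AGC)4 repeat, one poly-A run.
toy = "GGC" + "AT" * 7 + "CCG" + "AGC" * 4 + "TTG" + "A" * 14
hits = find_perfect_ssrs(toy)
print(hits)
print(count_by_class(hits))
```

The primitivity check keeps a dinucleotide run such as ATATAT from being counted again as a tetra- or hexanucleotide repeat, and it also excludes mononucleotide runs, which are outside the di- to hexanucleotide classes tabulated here.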
Fig. 2. Distribution of di- to hexanucleotide SSRs from maqui (A. chilensis) by repeat number. The graph is based on a total of 210,067 SSRs detected in non-redundant genomic maqui DNA. Di, tri, tetra, penta and hexa refer to dinucleotides, trinucleotides, tetranucleotides, pentanucleotides and hexanucleotides, respectively.
Specifications table
| Subject area | |
| More specific subject area | |
| Type of data | |
| How data was acquired | Paired-end tag DNA sequencing was performed using the Illumina NextSeq 550 platform. |
| Data format | Raw and analyzed data of draft genome assembly; SSR table |
| Experimental factors | Leaves of maqui, DNA extraction and |
| Experimental features | Genomic DNA was extracted from leaves of maqui ( |
| Data source location | Rengo, Chile, INIA-Rayentue (Avda. Salamanca s/n, Km 105 ruta 5 sur, sector Los Choapinos). Latitude 34°19′16.1″S and longitude 70°50′03.6″W. |
| Data accessibility | The nucleotide sequences of raw reads and assembled draft genome are available at NCBI's Sequence Read Archive as BioProject PRJNA544858 ( |
| Related research article | Bastías, A., Correa, F., Rojas, P., Almada, R., Muñoz, C., Sagredo, B., 2016. Identification and Characterization of Microsatellite |
Data of raw sequence reads and assembled draft genome of maqui ( Draft genome data can facilitate the identification of the molecular mechanisms that underlie the properties of maqui products, and thereby contribute to improving them by classical and/or biotechnological approaches. The draft genome data will accelerate functional genomics research in this species. The newly developed SSR marker dataset for maqui should be a useful tool for assessing its genetic diversity and understanding its genetic structure, facilitating the implementation of an effective conservation system for its natural populations.