Louise Brousseau, Alexandra Tinaut, Caroline Duret, Tiange Lang, Pauline Garnier-Gere, Ivan Scotti.
Abstract
BACKGROUND: The Amazonian rainforest is predicted to suffer from ongoing environmental changes. Despite the need to evaluate the impact of such changes on tree genetic diversity, we almost entirely lack genomic resources.
Year: 2014 PMID: 24673733 PMCID: PMC3986928 DOI: 10.1186/1471-2164-15-238
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Species description: distribution range, ecological properties relative to light (successional status) and soil, spatial population structure and seed dispersal properties
| Species name | Range | Ecology - light | Ecology - soil | Spatial population structure | Seed dispersal |
|---|---|---|---|---|---|
| Carapa guianensis | Neotropics | Light-responsive | Indifferent | Non-aggregated | Gravity, rodents |
| Eperua falcata | Guiana shield | Shade tolerant | Mostly seasonally flooded | Aggregated | Gravity |
| Symphonia globulifera | Neotropics, paleotropics | Shade tolerant | Seasonally flooded | Non-aggregated | Gravity, vertebrates |
| Virola surinamensis | Neotropics | Light-responsive | Seasonally flooded | Non-aggregated | Large vertebrates |
Figure 1. Bioinformatics flowchart.
Partitioning of reads among different organs (leaves, stems, roots) in each species cDNA library (Carapa guianensis, Eperua falcata, Symphonia globulifera and Virola surinamensis), with percentages in parentheses
| Number of reads | C. guianensis | E. falcata | S. globulifera | V. surinamensis |
|---|---|---|---|---|
| From leaves [MID1] | 63016 [43334 (28%)] | 17421 [11417 (9%)] | 49894 [32190 (30%)] | 31526 [22077 (11%)] |
| From stems [MID2] | 47100 [29720 (20%)] | 28362 [18088 (14%)] | 110373 [66874 (66%)] | 41435 [28284 (14%)] |
| From roots [MID3] | 132030 [77052 (50%)] | 175551 [100909 (76%)] | 7 [2 (0%)] | 141948 [89918 (72%)] |
| Without tag | 5999 [3435 (2%)] | 3260 [1799 (1%)] | 6866 [4367 (4%)] | 4314 [2691 (2%)] |
Numbers of assembled reads are shown in brackets.
Assembly results: number of assembled reads, number of contigs, total transcriptome coverage, average length per contig, and average number of reads per contig
| | C. guianensis | E. falcata | S. globulifera | V. surinamensis |
|---|---|---|---|---|
| Number of reads | 248145 | 224554 | 167140 | 219223 |
| Number of assembled reads | 153551 (61.9%) | 132213 (58.9%) | 103433 (61.9%) | 142970 (65.2%) |
| Number of contigs | 21770 | 23390 | 17103 | 21070 |
| Total length (bp) | 11393209 | 9688583 | 7743116 | 9725915 |
| Average length per contig (bp) | 523 | 414 | 453 | 462 |
| N50 | 558 | 441 | 486 | 506 |
| Average number of reads per contig | 7 | 6 | 6 | 7 |
| Proportion of contigs with 10 reads or fewer | 89% | 92% | 91% | 88% |
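
The N50 row summarises the contig length distribution. The source does not say how its N50 values were computed, so the following is only a minimal sketch of the usual definition: the length of the contig at which the running sum of lengths, taken from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the toy lengths are hypothetical and are not taken from the assemblies.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Assumes the common definition: sort contigs from longest to shortest and
    return the length of the contig at which the cumulative length first
    reaches at least half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy lengths, for illustration only.
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))  # 400: 500 + 400 = 900 >= 750 (half of 1500)
```

An N50 close to the average contig length, as in the table, is what one would expect when most contigs are short and of similar size.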
Figure 2. Number of contigs associated with each organ (leaves, stems, roots) (Note: sequencing from S. globulifera roots failed). Carapa = Carapa guianensis; Eperua = Eperua falcata; Symphonia = Symphonia globulifera; Virola = Virola surinamensis. L, S and R indicate contigs specific to Leaf, Stem and Root, respectively; combinations of symbols correspond to contigs occurring in multiple organs.
BlastX statistics per species, performed on consensus sequences obtained from the MIRA assemblies
| | C. guianensis | E. falcata | S. globulifera | V. surinamensis |
|---|---|---|---|---|
| No of unigenes that did not return any blast result | 4586 (21.1%) | 7231 (30.9%) | 4463 (26.1%) | 6384 (30.3%) |
| No of blasted unigenes | 17184 (78.9%) | 16159 (69.1%) | 12640 (73.9%) | 14686 (69.7%) |
| [No unigenes after contaminant removal] | [16912] | [15664] | [12603] | [14545] |
| No of mapped unigenes | 15879 (72.9%) | 13629 (56.3%) | 11639 (68.1%) | 13000 (61.7%) |
| No of annotated unigenes | 13962 (64.1%) | 11240 (48.1%) | 10164 (59.4%) | 11073 (52.6%) |
| Total assembly length without contaminant (bp) | 11266552 | 9501561 | 7728777 | 9666680 |
| [Total length of blasted unigenes after removal of contaminant and unigenes with e-values > 10⁻²⁵] | [7746737] | [4789056] | [4748202] | [5887279] |
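
The percentages in the BlastX table appear to be expressed relative to the number of contigs in each assembly. As a hedged check of that reading, the sketch below recomputes three percentages for the first data column, using 21770 as the denominator (the contig count reported for that column in the assembly results). The helper name `percent_of` is hypothetical.

```python
def percent_of(count, total, digits=1):
    """Return count as a percentage of total, rounded to `digits` decimals."""
    return round(100 * count / total, digits)


# Values copied from the first data column of the tables.
# Assumption: the denominator is the number of contigs in the assembly.
n_contigs = 21770
no_blast_result = 4586
blasted = 17184
annotated = 13962

# The two classes partition the contigs, which supports the assumption.
assert no_blast_result + blasted == n_contigs

print(percent_of(no_blast_result, n_contigs))  # 21.1
print(percent_of(blasted, n_contigs))          # 78.9
print(percent_of(annotated, n_contigs))        # 64.1
```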
Figure 3. Sharing of GO terms (level 3) across species. Only non-contaminant contigs with an e-value lower than or equal to 10⁻²⁵ were retained for the analysis. Cg: Carapa guianensis; Ef: Eperua falcata; Sg: Symphonia globulifera; Vs: Virola surinamensis.
Figure 4. Box-plot of permuted values of differences between observed and randomised for individual GO terms in each organ/species. Only biological processes showing a positive difference (i.e. having a bootstrap interval that does not overlap zero, indicating higher expression levels than average) are shown. For detailed names of the biological processes shown, see Additional file 15. (A) C. guianensis; (B) E. falcata; (C) S. globulifera; (D) V. surinamensis (Note: sequencing from S. globulifera roots failed).
Mismatch identification
| | C. guianensis | E. falcata | S. globulifera | V. surinamensis |
|---|---|---|---|---|
| Total length with depth ≥ 8X after assembly cleaning (bases) | 956876 | 603897 | 499694 | 862357 |
| | | | | |
| N mismatches | | | | |
| N variant-containing contigs | 1716 (7.88%) | 1299 (5.55%) | 987 (5.77%) | 1752 (8.32%) |
| mismatch density (/100 bp) | 1.11 | 1.17 | 1.09 | 1.26 |
| N mismatches with 2 variants | 10420 (98.16%) | 6968 (98.36%) | 5362 (98.44%) | 10757 (98.72%) |
| N transitions | 2655 | 1875 | 1090 | 2182 |
| N transversions | 1699 | 1155 | 779 | 1474 |
| Ti/Tv | 1.56 | 1.62 | 1.40 | 1.48 |
| N indel | 6066 | 3938 | 3493 | 7101 |
| N mismatches > 2 variants | 195 (1.84%) | 116 (1.64%) | 85 (1.56%) | 140 (1.28%) |
| | | | | |
| N mismatches | | | | |
| N variant-containing contigs | 1706 (7.83%) | 1283 (5.5%) | 979 (5.72%) | 1746 (8.29%) |
| mismatch density (/100 bp) | 0.90 | 0.95 | 0.89 | 1.05 |
| N mismatches with 2 variants | 8534 (95.70%) | 5649 (98.89%) | 4388 (98.96%) | 8981 (98.95%) |
| N transitions | 2380 | 1657 | 989 | 1989 |
| N transversions | 1488 | 1000 | 681 | 1310 |
| Ti/Tv | 1.60 | 1.66 | 1.45 | 1.52 |
| N indel | 4666 | 2992 | 2718 | 5682 |
| N mismatches > 2 variants | 112 (1.13%) | 64 (1.12%) | 46 (1.04%) | 95 (1.05%) |
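
The Ti/Tv and mismatch density rows can be recomputed from the counts in the table. The sketch below is illustrative only: the function names are hypothetical, and the total number of mismatches is assumed to be the sum of transitions, transversions, indels and mismatches with more than two variants, since the N mismatches cells are not legible here. The transition rule itself is standard: a substitution is a transition when both bases are purines (A, G) or both are pyrimidines (C, T), and a transversion otherwise.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    pair = {ref.upper(), alt.upper()}
    if len(pair) != 2:
        raise ValueError("ref and alt must be two different bases")
    return pair <= PURINES or pair <= PYRIMIDINES


def ti_tv_ratio(n_transitions, n_transversions):
    """Ratio of transition to transversion counts."""
    return n_transitions / n_transversions


def mismatch_density(n_mismatches, covered_length_bp):
    """Mismatches per 100 bp of sequence with sufficient depth."""
    return 100 * n_mismatches / covered_length_bp


# Counts from the first data column of the first block of the table.
ti, tv, indel, multi_variant = 2655, 1699, 6066, 195
covered_bp = 956876  # total length with depth >= 8X

# Assumption: the total is the sum of the four classes listed above.
total_mismatches = ti + tv + indel + multi_variant

print(round(ti_tv_ratio(ti, tv), 2))                         # 1.56
print(round(mismatch_density(total_mismatches, covered_bp), 2))  # 1.11
```

Under that assumption, both results match the values reported in the same column (Ti/Tv of 1.56 and a density of 1.11 per 100 bp).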
Figure 5. Mismatches represented according to their allelic pattern.