| Literature DB >> 20067627 |
Luca Santuari, Sylvain Pradervand, Amelia-Maria Amiguet-Vercher, Jerôme Thomas, Eavan Dorcey, Keith Harshman, Ioannis Xenarios, Thomas E Juenger, Christian S Hardtke.
Abstract
Identification of small polymorphisms from next-generation sequencing short read data is relatively easy, but detection of larger deletions is less straightforward. Here, we analyzed four divergent Arabidopsis accessions and found that intersection of absent short read coverage with weak tiling array hybridization signal reliably flags deletions. Interestingly, individual deletions were frequently observed in two or more of the accessions examined, suggesting that variation in gene content partly reflects a common history of deletion events.
Year: 2010 PMID: 20067627 PMCID: PMC2847716 DOI: 10.1186/gb-2010-11-1-r4
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. UHTS and tiling array statistics for the investigated accessions. (a) Total number of short reads (35 bp for paired end runs; 36 bp for single end runs) obtained for each accession after quality filtering and calculated raw coverage (single end runs were performed for Eil-0, Lc-0 and Sav-0; additional paired end runs for Eil-0 and Lc-0; Tsu-1 and Col-0 reads from single end runs were obtained from published data). For Tsu-1, a subset of reads was retrieved (Tsu-1red) for comparative purposes. (b) Average coverage after MAQ mapping of the short reads onto the Col-0 reference genome and number of base-pairs in the reference genome with zero coverage. (c) Genomic tiling array statistics. Left: number of unique tiles with relative hybridization signal ratio <-1.0 (log2) calculated from the averages of two array hybridizations with divergent DNA versus two array hybridizations with the reference DNA. Middle: number of unique tiles with no UHTS coverage across all 25 bp of the tile. Right: intersection between the two groups of tiles. (d) Example plot of tiling array signal ratio (top panel) versus UHTS coverage (bottom panel). The entire gene (At1g31100) appears to be deleted in Eil-0, but appears to be intact in Lc-0. Please refer to Figure 3c for detailed plot labels.
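The tile-level criteria in panel (c) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the tile record layout (`id`, `divergent`, `reference`, `coverage`) and the function names are hypothetical, and the sketch assumes linear-scale hybridization intensities, so that the ratio of the two replicate averages is log2-transformed before comparison with the -1.0 threshold.

```python
import math

def log2_ratio(divergent, reference):
    """log2 of (mean divergent signal / mean reference signal) for one tile.

    Assumes linear-scale intensities from replicate hybridizations.
    """
    mean_div = sum(divergent) / len(divergent)
    mean_ref = sum(reference) / len(reference)
    return math.log2(mean_div / mean_ref)

def flag_tiles(tiles, threshold=-1.0):
    """Return (weak-signal tiles, zero-coverage tiles, their intersection).

    Each tile is a dict with hypothetical keys:
      id        - tile identifier
      divergent - intensities from the divergent-accession arrays
      reference - intensities from the reference (Col-0) arrays
      coverage  - per-base short read coverage across the tile
    """
    weak = {t["id"] for t in tiles
            if log2_ratio(t["divergent"], t["reference"]) < threshold}
    uncovered = {t["id"] for t in tiles
                 if all(c == 0 for c in t["coverage"])}
    return weak, uncovered, weak & uncovered
```

The three returned sets correspond to the left, middle and right counts described in panel (c).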
Figure 2. Genome-wide distribution and size of deletions within genes. Deletions were called by intersecting a tiling array signal ratio <-1.0 (log2) of individual 25-bp tiles with no coverage of the entire tiles by UHTS short reads, for at least 100 bp. Tiles within a gene that fulfilled these criteria were added up, taking into account the gaps between tiles (typically 10 bp; maximum 39 bp), to calculate the approximate proportion of a gene affected by a deletion(s) (y axis). Each dot represents a gene: red dots represent transposable element genes (which cluster around the centromeres); black dots represent protein coding genes. The genes are plotted along the five Arabidopsis chromosomes (chr.), drawn to scale (x axis).
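One possible reading of this calling rule is sketched below in Python. It is an interpretation, not the published implementation: it assumes tiles are given as half-open `(start, end, flagged)` intervals, that "for at least 100 bp" refers to the span of a run of consecutive flagged tiles, and that gaps of up to 39 bp between neighbouring tiles are bridged. The function and parameter names are hypothetical.

```python
def call_deletions(tiles, min_span=100, max_gap=39):
    """Merge consecutive flagged tiles into candidate deletions.

    tiles: iterable of (start, end, flagged) with half-open coordinates.
    A run is extended while the next tile is flagged and lies within
    max_gap bp of the run end; an unflagged tile ends the run.
    Only runs spanning at least min_span bp are returned.
    """
    runs = []
    current = None
    for start, end, flagged in sorted(tiles):
        if flagged and current is not None and start - current[1] <= max_gap:
            current[1] = end          # extend the open run across the gap
        else:
            if current is not None:
                runs.append(tuple(current))
            current = [start, end] if flagged else None
    if current is not None:
        runs.append(tuple(current))
    return [(s, e) for s, e in runs if e - s >= min_span]

def deleted_fraction(gene_start, gene_end, deletions):
    """Approximate proportion of a gene overlapped by called deletions."""
    overlap = sum(max(0, min(e, gene_end) - max(s, gene_start))
                  for s, e in deletions)
    return overlap / (gene_end - gene_start)
```

With four flagged 25-bp tiles separated by 10-bp gaps, the run spans 130 bp and passes the 100-bp cutoff, whereas a run of only two such tiles (60 bp) would be discarded.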
Figure 3. Overlap of deletions between two or more of the four accessions examined. (a) Venn diagram of the overlap between transposable element genes for which deletions (that is, tiling array signal ratio <-1.0 (log2) and no short read coverage for at least 100 bp) could be detected in the different accessions. (b) Same as (a), for protein coding genes. (c) Example plot of tiling array signal ratio versus UHTS short read coverage for a gene (At1g09840) in all four accessions. Top panels: tiling array signal ratio (log2), with the -1.0 threshold indicated by a red line. Bottom panel: corresponding short read coverage after MAQ mapping. A major deletion shared by two accessions (Eil-0 and Lc-0) and another shared by all four accessions are highlighted.
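The region counts of a Venn diagram such as those in panels (a) and (b) could be tabulated along the following lines. This is an illustrative Python sketch only: the function name is hypothetical, and the input is assumed to be a mapping from accession name to the set of gene identifiers with a called deletion.

```python
from collections import Counter

def venn_regions(genes_by_accession):
    """Count genes per exclusive Venn region.

    genes_by_accession: dict mapping accession name -> set of gene IDs
    with a called deletion. Returns a Counter keyed by the frozenset of
    accessions that share a gene, so each gene is counted exactly once.
    """
    membership = {}
    for accession, genes in genes_by_accession.items():
        for gene in genes:
            membership.setdefault(gene, set()).add(accession)
    return Counter(frozenset(accs) for accs in membership.values())
```

A gene deleted in both Eil-0 and Lc-0 but not in the other accessions would be counted under the key `frozenset({"Eil-0", "Lc-0"})`, and a gene deleted in all four under the key containing all four accession names.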