William Chow, Kim Brugger, Mario Caccamo, Ian Sealy, James Torrance, Kerstin Howe.
Abstract
MOTIVATION: For most research approaches, genome analyses depend on the existence of a high-quality genome reference assembly. However, the local accuracy of an assembly remains difficult to assess and improve. The gEVAL browser allows the user to interrogate an assembly in any region of the genome by comparing it to different datasets and evaluating the concordance. These analyses include a wide variety of sequence alignments, comparative analyses of multiple genome assemblies, and consistency with optical and other physical maps. gEVAL highlights allelic variations, regions of low complexity, abnormal coverage, and potential sequence and assembly errors, and offers strategies for improvement. Although gEVAL focuses primarily on sequence integrity, it can also display arbitrary annotation, including annotation from Ensembl or TrackHub sources. We provide gEVAL web sites for many human, mouse, zebrafish and chicken assemblies to support the Genome Reference Consortium, and gEVAL is also downloadable to enable its use for any organism and assembly.
Year: 2016 PMID: 27153597 PMCID: PMC4978925 DOI: 10.1093/bioinformatics/btw159
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Region on GRCh38 Chromosome 11 with variation and missing sequence. (A) Purple clone end pair mappings indicate that the same end is repeated, while red mappings indicate incorrect orientation of paired ends. (B) Two clone components are used to build this region of the assembly. The green box indicates a reliable overlap region (red would indicate high variation). (C) Orange indicates an incomplete transcript mapping. (D) Six single-molecule genome maps (orange/red) can be compared to the in silico digest (purple). Red regions indicate discordance. In this case, a ∼7.5 kb block variation is shared between three maps and the reference, whilst the three other maps share two fragments. Furthermore, in the ∼39 kb digest block, all six maps indicate a size of ∼45–47 kb, giving evidence of missing sequence (∼7–8 kb). (E) Comparative analysis between the HuRef and YH2 assemblies reveals this missing sequence (dotted box) as well as the region of variation (Supplementary Figure S1).
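
The size comparison described in panel (D) can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not gEVAL's own implementation: the function name `missing_sequence_estimate`, the `tolerance_kb` threshold, and the six individual map sizes are hypothetical, chosen to fall within the ∼45–47 kb range quoted in the caption.

```python
from statistics import median


def missing_sequence_estimate(digest_kb, map_kb, tolerance_kb=2.0):
    """Compare one in silico digest fragment with the matching map fragments.

    Returns (estimate_kb, consistent), where estimate_kb is the median amount
    by which the maps exceed the digest fragment, and consistent is True only
    if every map exceeds the digest by more than tolerance_kb.
    The tolerance is an assumed value, not one taken from the paper.
    """
    diffs = [m - digest_kb for m in map_kb]
    consistent = all(d > tolerance_kb for d in diffs)
    return median(diffs), consistent


# Illustrative values only: a ~39 kb digest block and six hypothetical
# map fragment sizes within the ~45-47 kb range given in the caption.
map_sizes = [45.0, 46.0, 46.0, 47.0, 47.0, 46.5]
estimate, consistent = missing_sequence_estimate(39.0, map_sizes)
print(f"median excess: {estimate} kb, all maps agree: {consistent}")
```

When every map fragment is larger than the digest fragment by a similar amount, as here, the discrepancy is more plausibly missing reference sequence than noise in a single map.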