Nikolai Hecker1,2,3, Michael Hiller1,2,3.
Abstract
BACKGROUND: Multiple alignments of mammalian genomes have been the basis of many comparative genomic studies aiming at annotating genes, detecting regions under evolutionary constraint, and studying genome evolution. A key factor that affects the power of comparative analyses is the number of species included in a genome alignment.
Keywords: comparative gene annotation; enhancers; genome alignment; mammals; ultraconserved elements
Year: 2020 PMID: 31899510 PMCID: PMC6941714 DOI: 10.1093/gigascience/giz159
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Figure 1: Phylogeny of 120 mammals included in our alignment and number of annotated genes. Bars visualize the number of human genes for which we projected ≥1 intact exon. Major groups of mammals are indicated. The 57 Laurasiatheria species are shown on the right side for space reasons.
Figure 2: Example of a sequence alignment of 120 mammals showing an 88-bp region inside a UCE. This UCE is located in an intron of the DACH1 gene, which encodes a transcription factor important for development. Dots in the 120-mammal alignment denote bases that are identical to those in the human genome. For space reasons, 25 primates, 13 carnivora, and 10 bats whose sequences are all identical to human are not shown. Green and blue fonts indicate species of different clades. The alignment of this ultraconserved region shows that most columns are identical across all 120 mammals but also reveals a few substitutions. Some of these substitutions are species-specific and may be attributed to base errors in the assembly. Other substitutions are shared among independently sequenced genomes of related species (red boxes), which makes base errors very unlikely. We used shared substitutions to calculate a lower bound for the percentage of UCE positions that can vary across placental mammals. We used both shared and species-specific substitutions to calculate an upper bound for this percentage.
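The lower and upper bounds described in the Figure 2 caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function name `variable_fraction_bounds`, the input layout, and the clade labels are hypothetical, and a "shared" substitution is approximated here as the same non-human base occurring in at least two species of the same clade, which is a simplifying assumption rather than a definition taken from the paper.

```python
from collections import Counter


def variable_fraction_bounds(human, others, clade_of):
    """Return (lower, upper) fractions of variable positions in one UCE.

    human    : human reference sequence (string of A/C/G/T)
    others   : dict mapping species name -> aligned sequence of the same
               length, where '.' means "identical to human" and '-' a gap
    clade_of : dict mapping species name -> clade label (hypothetical input)

    Assumption: a substitution counts as "shared" if the same non-human base
    is seen in at least two species of one clade.
    """
    n_variable_any = 0      # columns with any substitution (upper bound)
    n_variable_shared = 0   # columns with a shared substitution (lower bound)

    for i, ref_base in enumerate(human):
        # Count each substituted base per clade at this alignment column.
        counts = Counter()
        for species, seq in others.items():
            base = seq[i]
            if base in "ACGT" and base != ref_base:
                counts[(clade_of[species], base)] += 1
        if counts:
            n_variable_any += 1
            if any(c >= 2 for c in counts.values()):
                n_variable_shared += 1

    length = len(human)
    return n_variable_shared / length, n_variable_any / length


# Toy example with made-up sequences (not real data).
human = "ACGTACGT"
others = {
    "mouse": "....T...",
    "rat":   "....T...",
    "dog":   ".G......",
    "cat":   "........",
}
clade_of = {"mouse": "rodents", "rat": "rodents",
            "dog": "carnivora", "cat": "carnivora"}
lower, upper = variable_fraction_bounds(human, others, clade_of)
```

In this toy input, one column carries a substitution shared by two rodents and one column carries a dog-specific substitution, so only the former counts toward the lower bound while both count toward the upper bound.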
Figure 3: Variability of UCEs across placental mammals. For each alignment position in the 441 UCEs for which ≥110 placental mammals had aligning sequence in our genome alignment, we examined whether the position is identical or was substituted at least once across the 115 non-human placental mammals. (A) Violin and box plots show the distribution of the fraction of variable positions per UCE across placental mammals. The white box spans the first to the third quartile; the middle line indicates the median. In addition to considering all 115 non-human placental mammals, we also determined the fraction of variable positions per UCE considering only 60 non-human placental mammals. This illustrates that analyzing fewer species would underestimate UCE variability. (B) Bar plots show the number of substitutions observed in UCEs with respect to their relative position in UCEs. UCEs were divided into 100 equally sized bins. Both upper and lower bounds show that UCEs are more variable at their flanks than in their center.
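The binning by relative position in panel B can be sketched as follows. This is an illustrative assumption about how such a tally might be computed, not the authors' code. The function name `substitutions_per_bin` and the input format (UCE length plus 0-based variable positions) are hypothetical.

```python
def substitutions_per_bin(uces, n_bins=100):
    """Tally variable positions by their relative location within each UCE.

    uces   : iterable of (length, positions) pairs, where positions are
             0-based indices of variable columns within that UCE
    n_bins : number of equally sized relative-position bins

    Each position p in a UCE of the given length falls into bin
    floor(p * n_bins / length), so bin 0 is the 5' flank and bin
    n_bins - 1 the 3' flank of the element.
    """
    counts = [0] * n_bins
    for length, positions in uces:
        for p in positions:
            if not 0 <= p < length:
                raise ValueError(f"position {p} outside UCE of length {length}")
            counts[p * n_bins // length] += 1
    return counts


# Toy example with made-up positions (not real data).
toy_uces = [
    (200, [0, 1, 199]),    # substitutions near both flanks
    (400, [0, 200, 399]),  # flanks plus one central substitution
]
bin_counts = substitutions_per_bin(toy_uces)
```

Normalizing positions by element length lets UCEs of different lengths contribute to the same flank and center bins, which is what makes a pooled comparison of flanks versus center possible.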