Emily J Wood, Kwanrutai Chin-Inmanu, Hui Jia, Leonard Lipovich.
Abstract
Previous efforts to characterize conservation between the human and mouse genomes focused largely on sequence comparisons. These studies are inherently limited because they do not account for gene structure differences, which may exist despite genomic sequence conservation. Recent high-throughput transcriptome studies have revealed widespread and extensive overlaps between genes, and transcripts, encoded on both strands of the genomic sequence. This overlapping gene organization, which produces sense-antisense (SAS) gene pairs, is capable of effecting regulatory cascades through established mechanisms. We present an evolutionary conservation assessment of SAS pairs, on three levels: genomic, transcriptomic, and structural. From a genome-wide dataset of human SAS pairs, we first identified orthologous loci in the mouse genome, then assessed their transcription in the mouse, and finally compared the genomic structures of SAS pairs expressed in both species. We found that approximately half of human SAS loci have single orthologous locations in the mouse genome; however, only half of those orthologous locations have SAS transcriptional activity in the mouse. This suggests that high human-mouse gene conservation overlooks widespread distinctions in SAS pair incidence and expression. We compared gene structures at orthologous SAS loci, finding frequent differences in gene structure between human and orthologous mouse SAS pair members. Our categorization of human SAS pairs with respect to mouse conservation of expression as well as structure points to limitations of mouse models. Gene structure differences, including at SAS loci, may account for some of the phenotypic distinctions between primates and rodents. Genes in non-conserved SAS pairs may contribute to evolutionary lineage-specific regulatory outcomes.
Keywords: bidirectional promoters; complex loci; evolution; expressed sequence tags (ESTs); long non-coding RNA (lncRNA); sense-antisense; transcriptome
Year: 2013 PMID: 24133500 PMCID: PMC3783845 DOI: 10.3389/fgene.2013.00183
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Figure 1. Three major types of sense-antisense pairs, and one possible type of a gene chain.
Figure 2. Conservation of SAS gene pairs and their member genes between human and mouse.
Human-mouse comparative analysis of genomic and transcriptomic orthology at sense-antisense loci.
| Input source | Input | Analysis | Output |
| --- | --- | --- | --- |
| See “Construction of the human SAS dataset,” Methods | 9000 human SAS pair member genes | UCSC LiftOver from Hg19 to Mm9 | 2227 pairs with genomic orthology |
| 2227 pairs with genomic orthology (above at right) | Lists of genes and pairs with genomic 1:1 orthology in Hg19 and Mm9 | Pair non-conservation analysis, see Methods | 66 human gene pairs with one gene transcriptionally silent in mouse |
| Conservation Analysis (above at right) | 66 human gene pairs | EST interrogation, see Methods | 37 transcriptionally active human gene pairs with one member silent in mouse |
| EST interrogation (above at right) | 37 human gene pairs | Manual annotation of sequence and structure conservation in mouse | 3 Complete conservation; 7 Positional equivalents; 7 Complete non-conservation; 20 Other |
Each row corresponds to a sequential stage in our analysis pipeline. The results (output) from the last column of each row serve as input into the first column of the next row.
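The chaining described above can be checked mechanically. The following is a minimal sketch that copies the pair counts from the table and confirms that each stage's output matches the next stage's input; the names `STAGES`, `MANUAL_CATEGORIES`, and `chain_is_consistent` are illustrative and do not come from the paper.

```python
# Pair counts copied from the pipeline table: (analysis step, input pairs, output pairs).
# The first stage takes 9000 member genes (not pairs) as input, so its
# pair-level input is left as None.
MANUAL_CATEGORIES = {
    "Complete conservation": 3,
    "Positional equivalents": 7,
    "Complete non-conservation": 7,
    "Other": 20,
}

STAGES = [
    ("UCSC LiftOver from Hg19 to Mm9", None, 2227),
    ("Pair non-conservation analysis", 2227, 66),
    ("EST interrogation", 66, 37),
    ("Manual annotation", 37, sum(MANUAL_CATEGORIES.values())),
]


def chain_is_consistent(stages):
    """Return True if every stage's output count equals the next stage's input count."""
    return all(prev[2] == nxt[1] for prev, nxt in zip(stages, stages[1:]))


if __name__ == "__main__":
    print("chain consistent:", chain_is_consistent(STAGES))
    print("manually annotated pairs:", sum(MANUAL_CATEGORIES.values()))
```

The four manual-annotation categories account for all 37 pairs that enter the final stage.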
Figure 3. Genomic structure conservation at 986 SAS pairs putatively orthologous between human and mouse.
Extent of mouse gene structure conservation for 37 manually annotated human sense-antisense gene pairs.
| Human locus structure | Pairs | Mouse orthologous locus structure | Pairs |
| --- | --- | --- | --- |
| | | No genes at orthologous locus | 7 |
| | | Single gene at orthologous locus | 13 |
| SAS pair | 19 | SAS pair at orthologous locus | 12 |
| 3-gene chains | 13 | 3-gene chains at orthologous locus | 3 |
| 4-gene chains | 4 | 4-gene chains at orthologous locus | 1 |
| 5-gene chains | 1 | 5-gene chains at orthologous locus | 1 |
| No BDPs | 23 | No BDPs at orthologous locus | 31 |
| 1 BDP | 12 | 1 BDP at orthologous locus | 6 |
| 2 BDPs | 2 | 2 BDPs at orthologous locus | 0 |
SAS, sense-antisense; BDP, bidirectional promoter.
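As a reading aid, the counts in this table can be tallied to confirm that the human and mouse columns each describe the same 37 loci, and to summarize how often an overlapping arrangement or a bidirectional promoter is retained in mouse. This is a minimal sketch using only the numbers above; the dictionary names and the `share` helper are illustrative, and treating gene chains as retaining a sense-antisense overlap is an assumption made here, not a statement from the paper.

```python
# Counts copied from the table; keys are shortened labels chosen for this sketch.
HUMAN_STRUCTURE = {"SAS pair": 19, "3-gene chain": 13, "4-gene chain": 4, "5-gene chain": 1}
MOUSE_STRUCTURE = {
    "no genes": 7,
    "single gene": 13,
    "SAS pair": 12,
    "3-gene chain": 3,
    "4-gene chain": 1,
    "5-gene chain": 1,
}
HUMAN_BDP = {0: 23, 1: 12, 2: 2}  # number of BDPs at the locus -> number of loci
MOUSE_BDP = {0: 31, 1: 6, 2: 0}


def share(counts, keys):
    """Fraction of all loci in `counts` that fall into the categories listed in `keys`."""
    return sum(counts[k] for k in keys) / sum(counts.values())


# Assumption: a gene chain still contains at least one sense-antisense overlap.
OVERLAP_KEYS = ["SAS pair", "3-gene chain", "4-gene chain", "5-gene chain"]

if __name__ == "__main__":
    for name, table in [("human structure", HUMAN_STRUCTURE), ("mouse structure", MOUSE_STRUCTURE),
                        ("human BDP", HUMAN_BDP), ("mouse BDP", MOUSE_BDP)]:
        print(f"{name}: {sum(table.values())} loci")
    print(f"mouse loci with an overlapping arrangement: {share(MOUSE_STRUCTURE, OVERLAP_KEYS):.2f}")
    print(f"human loci with at least one BDP: {share(HUMAN_BDP, [1, 2]):.2f}")
    print(f"mouse loci with at least one BDP: {share(MOUSE_BDP, [1, 2]):.2f}")
```

Each of the four columns sums to 37, which is what fixes the placement of the two mouse-only rows ("No genes" and "Single gene") on the mouse side of the table.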
Figure 4. Manual annotation of selected orthologous loci with human-mouse gene structure distinctions. Positive-strand transcription, relative to the genome assembly, is in red. Negative-strand transcription, relative to the genome assembly, is in blue. Beige boxes delineate bidirectional promoters (BDP) and sense-antisense overlaps (SAS). A 5′/5′ SAS is an overlap of two genes at their 5′ ends (a divergent overlap). A 3′/3′ SAS is an overlap of two genes at their 3′ ends (a convergent overlap). (A) Two protein-coding genes have orthologs: PITX and H2AFY. H2AFY has a positionally equivalent [see Babak et al. (2007) for definition] SAS lncRNA at its 3′ end (AK026965 in human) in both species, suggesting a sequence-independent requirement for SAS pairing of H2AFY. In human, H2AFY shares a bidirectional promoter with another lncRNA (AK092789) for which no genomic or transcriptional conservation exists in mouse. (Supplementary Dataset 5: rows 62–63.) (B) The human protein-coding gene AY358799 has a mouse ortholog, “Ncrna00085” (encoding a 339-aa protein, despite its misleading name, which arose from incorrect public “lincRNA” annotations loaded into the UCSC Genome Database). The same cluster of three conserved microRNAs is observed immediately upstream of this gene in both species. However, this protein-coding gene has a SAS lncRNA, AK125996, only in human. Despite the more comprehensive mouse cDNA/EST coverage by the FANTOM3 data, no antisense cDNAs or ESTs are found at the orthologous mouse locus. (Supplementary Dataset 5: rows 14–15.) (C) The human TSSC4 gene overlaps a SAS lncRNA, AK095568, at its 5′ end. The human TSSC4 and TRPM5 genes are clearly separated along the genome, with no intervening transcription. Mouse Tssc4 is SAS to an extended 3′-end isoform of Trpm5, and also lacks any cDNA or EST evidence of a 5′-end SAS transcript. The nearby CD81 gene has a conserved SAS lncRNA in human and mouse. (Supplementary Dataset 5: rows 68–69.)
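The 5′/5′ and 3′/3′ definitions in the Figure 4 caption translate directly into coordinate comparisons. The sketch below is an illustration under stated assumptions, not the annotation procedure used in the paper: genes are reduced to a single half-open interval on the assembly (exon structure is ignored), and the `Gene` class, `classify_sas` function, and example coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # 0-based start on the assembly, inclusive
    end: int     # end on the assembly, exclusive
    strand: str  # "+" or "-"


def classify_sas(a: Gene, b: Gene) -> str:
    """Classify the overlap between two genes as described in the Figure 4 caption.

    On the plus strand the 5' end is `start`; on the minus strand the 5' end is `end`.
    """
    if a.strand == b.strand:
        return "same strand"
    plus, minus = (a, b) if a.strand == "+" else (b, a)
    if plus.end <= minus.start or minus.end <= plus.start:
        return "no overlap"
    # Minus-strand gene extends to the left: the shared region holds both 5' ends.
    if minus.start < plus.start and minus.end < plus.end:
        return "5'/5' (divergent)"
    # Minus-strand gene extends to the right: the shared region holds both 3' ends.
    if plus.start < minus.start and plus.end < minus.end:
        return "3'/3' (convergent)"
    # Remaining case: one gene interval lies entirely within the other.
    return "other (one gene contained in the other)"


if __name__ == "__main__":
    sense = Gene("sense", 100, 200, "+")  # hypothetical coordinates
    for antisense in (Gene("as1", 50, 120, "-"), Gene("as2", 180, 300, "-"),
                      Gene("as3", 120, 150, "-"), Gene("as4", 300, 400, "-")):
        print(antisense.name, "->", classify_sas(sense, antisense))
```

Working with whole-gene intervals keeps the example short; a transcript-level comparison would need exon coordinates, which this sketch does not model.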
Figure 5. Transcriptional activity at the 2227 mouse orthologs of human sense-antisense loci.