Daniel J Fairbanks, Aaron D Fairbanks, T Heath Ogden, Glendon J Parker, Peter J Maughan.
Abstract
NANOGP8 is a human (Homo sapiens) retrogene, expressed predominantly in cancer cells where its protein product is tumorigenic. It arose through retrotransposition from its parent gene, NANOG, which is expressed predominantly in embryonic stem cells. Based on identification of fixed and polymorphic variants in a genetically diverse set of human NANOG and NANOGP8 sequences, we estimated the evolutionary origin of NANOGP8 at approximately 0.9 to 2.5 million years ago, more recent than previously estimated. We also discovered that NANOGP8 arose from a derived variant allele of NANOG containing a 22-nucleotide pair deletion in the 3' UTR, which has remained polymorphic in modern humans. Evidence from our experiments indicates that NANOGP8 is fixed in modern humans even though its parent allele is polymorphic. The presence of NANOGP8-specific sequences in Neanderthal reads provided definitive evidence that NANOGP8 is also present in the Neanderthal genome. Some variants between the reference sequences of NANOG and NANOGP8 utilized in cancer research to distinguish RT-PCR products are polymorphic within NANOG or NANOGP8 and thus are not universally reliable as distinguishing features. NANOGP8 was inserted in reverse orientation into the LTR region of an SVA retroelement that arose in a human-chimpanzee-gorilla common ancestor after divergence of the orangutan ancestral lineage. Transcription factor binding sites within and beyond this LTR may promote expression of NANOGP8 in cancer cells, although current evidence is inferential. The fact that NANOGP8 is a human-specific retro-oncogene may partially explain the higher genetic predisposition for cancer in humans compared with other primates.
Keywords: SVA element; cancer; human diversity; pseudogene evolution; retroelement
Year: 2012 PMID: 23173096 PMCID: PMC3484675 DOI: 10.1534/g3.112.004366
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Primers for PCR amplification and sequencing, primer binding sites, primer pairings, and amplified fragment sizes and sites
| Primer | Binding Site | |
|---|---|---|
| F1: 5′ | c.–212 to c.–193 (5′ insertion site of | |
| F2: 5′ | c.473 to c.498 (exon 3 in | |
| F3: 5′ | c.502–64C to c.502–37C (intron 3 in | |
| F4: 5′ | c.477 to c.496 (splice site for exons 3 and 4 in | |
| F5: 5′ | c.189 to c.212 (second exon in | |
| R1: 5′ | c.*512 to c.*543 (3′ boundary of | |
| R2: 5′ | c.*546 to c.*551 and c.*574 to c.*593 (22 nucleotide-pair deletion in 3′ UTR of | |
| R3: 5′ | c.*550 to c.*573 (ancestral, non-deletion site in | |
| R4: 5′ | c.*140 to c.*160 (3′ UTR in | |

| Primer Pairing | Amplified Fragment Size (nucleotide pairs) | Amplified Site |
|---|---|---|
| F1/R1 | 1681 | c.–212 to c.*543 from |
| F2/R1 | 1132 | c.473 to c.*543 from |
| | 997 | c.473 to c.*543 from |
| F2/R2 | 1157 | c.473 to c.*593 from |
| | 1025 | c.473 to c.*593 from |
| F2/R3 | 1154 | c.473 to c.*573 from NANOG allele without deletion |
| F3/R1 | 1029 | c.502–64C to c.*543 from |
| F4/R1 | 990 | c.477 to c.*543 from |
Nucleotides are numbered in accordance with Nomenclature for the Description of Sequence Variants of the Human Genome Variation Society (http://www.hgvs.org/mutnomen), as follows: The symbol “c.” refers to coding sequence. Numbering in the reading frame begins at the first nucleotide of the initiation codon and ends with the final nucleotide of the termination codon. Nucleotides in the 5′ UTR are numbered in reverse, denoted with a negative sign (–), with the nucleotide preceding the first nucleotide of the reading frame designated as –1. Nucleotides in the 3′ UTR are numbered consecutively, denoted with an asterisk (*), with the first nucleotide beyond the final nucleotide of the reading frame designated as *1.
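The numbering rule in this footnote can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `hgvs_c_position` and the transcript coordinates in the usage lines are hypothetical and are not taken from the NANOG or NANOGP8 reference sequences. Intronic positions such as c.502–64C are not handled, and the ASCII hyphen stands in for the minus sign.

```python
def hgvs_c_position(pos: int, cds_start: int, cds_end: int) -> str:
    """Convert a 1-based transcript position to a 'c.' coordinate.

    cds_start: transcript position of the first nucleotide of the initiation codon.
    cds_end:   transcript position of the final nucleotide of the termination codon.
    All three arguments are hypothetical inputs used only for illustration.
    """
    if pos < 1 or cds_start < 1 or cds_end < cds_start:
        raise ValueError("positions must be 1-based with cds_start <= cds_end")
    if pos < cds_start:
        # 5' UTR: counted backward, the nucleotide just before the start is -1
        return f"c.-{cds_start - pos}"
    if pos <= cds_end:
        # Reading frame: the first nucleotide of the initiation codon is 1
        return f"c.{pos - cds_start + 1}"
    # 3' UTR: the first nucleotide after the termination codon is *1
    return f"c.*{pos - cds_end}"


# Hypothetical transcript: reading frame occupies positions 101 through 400
for p in (100, 101, 400, 401):
    print(p, hgvs_c_position(p, cds_start=101, cds_end=400))
```

Note that there is no position 0 in this scheme: the coordinate jumps directly from c.-1 to c.1.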
Figure 1 Comparison of NANOGP8 with its parent gene NANOG. 3′ UTR, 3′ untranslated region; 5′ UTR, 5′ untranslated region; Alu, Alu element in 3′ UTR; RF, reading frame; TSD, target site duplication.
Variants between NANOG and NANOGP8 sequences detected by comparison of current primary and alternate reference assemblies and sequences we obtained experimentally
| Variant in Coding DNA | Variant in Protein | ||
|---|---|---|---|
| — | T | T/C | |
| C | C/A | ||
| = | T | T/C | |
| = | T/C | T | |
| G | G/T | ||
| T/G | T | ||
| = | G/A | G | |
| = | C/T | C | |
| = | C/T | C | |
| = | A | A/T | |
| C | C/T | ||
| A | A/C | ||
| = | C/T | C | |
| TG | TG/del | ||
| — | G | G/A | |
| — | G | G/A | |
| — | poly(T) | poly(T) | |
| — | poly(T) | poly(T) | |
| — | G/A | G | |
| — | T/C | T | |
| — | C | C/G | |
| — | C | C/T | |
| — | G/A | G | |
| — | G | G/A | |
| — | T/del | T | |
| — | G | G/A | |
| — | =/del | del | |
| — | |||
| — | A/G | A | |
| — | C/T | C | |
| — | C/T | C | |
| — | G/A | A | |
| — | C/A | C | |
| — | A/G | A |
Polymorphic variants are indicated with a forward slash separating the ancestral and derived variants, with the ancestral variant indicated first. All variants are polymorphic in either NANOG or NANOGP8 except for three, 144G > A, 759G > C, and , indicated in boldface.
In accordance with human genetic nomenclature guidelines for designating DNA variants in genes (http://www.hgvs.org/mutnomen), nucleotides in the reading frame are numbered relative to the first nucleotide in the ATG initiation codon; positions preceded by a negative (–) sign are in the 5′ UTR and are numbered in reverse relative to the first nucleotide in the ATG initiation codon; and positions indicated with an asterisk (*) are in the 3′ UTR relative to the first nucleotide beyond the termination codon. The symbol “=” denotes that a nucleotide substitution has no effect on the protein (i.e. a synonymous variant in the reading frame).
Variants at these sites were not present in the comparison of primary and alternate reference assemblies of NANOGP8 or in other publications. However, we observed polymorphism in at least two individuals for each of these sites in the sequences we obtained.
Distribution of the ancestral (=) and deletion (c.*552_*573del) alleles in the 3′ UTR of NANOG among 119 geographically diverse individuals
| Population | Number of Individuals Tested | Homozygous for Ancestral (=) Allele | Heterozygous | Homozygous for Deletion (c.*552_*573del) Allele |
|---|---|---|---|---|
| Africans South of the Sahara | 9 | 1 | 6 | 2 |
| Biaka Pygmy | 4 | 1 | 2 | 1 |
| Mbuti Pygmy | 5 | 2 | 1 | 2 |
| African-American | 14 | 2 | 8 | 4 |
| Druze | 5 | 5 | 0 | 0 |
| Indo-Pakistani | 3 | 3 | 0 | 0 |
| Russian Krasnodar | 3 | 2 | 1 | 0 |
| Ami | 5 | 0 | 2 | 3 |
| Chinese | 5 | 1 | 4 | 0 |
| Japanese | 4 | 0 | 3 | 1 |
| Southeast Asian | 3 | 0 | 2 | 1 |
| Pacific | 4 | 3 | 1 | 0 |
| South American | 7 | 4 | 2 | 1 |
| Mexican | 8 | 4 | 4 | 0 |
| Mexican-American | 9 | 2 | 4 | 3 |
| Puerto Rican | 8 | 1 | 2 | 5 |
| CEPH-Utah | 22 | 12 | 10 | 0 |
| Unidentified | 1 | 0 | 1 | 0 |
| Total | 119 | 43 | 53 | 23 |
Frequency of the ancestral (=) sequence = 0.5840.
Frequency of the deletion (c.*552_*573del) = 0.4160.
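The two frequencies can be reproduced from the Total row of the table by simple gene counting: each homozygote contributes two copies of its allele and each heterozygote one copy of each. The sketch below assumes this is how the values were derived; the function name `allele_frequencies` is illustrative and not from the source.

```python
def allele_frequencies(hom_ancestral: int, het: int, hom_deletion: int):
    """Return (ancestral, deletion) allele frequencies from genotype counts."""
    individuals = hom_ancestral + het + hom_deletion
    if individuals == 0:
        raise ValueError("at least one individual is required")
    total_alleles = 2 * individuals            # diploid: two alleles per person
    ancestral = 2 * hom_ancestral + het        # copies of the ancestral (=) allele
    deletion = 2 * hom_deletion + het          # copies of the deletion allele
    return ancestral / total_alleles, deletion / total_alleles


# Genotype counts from the Total row: 43 homozygous ancestral,
# 53 heterozygous, 23 homozygous deletion (119 individuals, 238 alleles)
p, q = allele_frequencies(43, 53, 23)
print(f"ancestral = {p:.4f}, deletion = {q:.4f}")
# prints "ancestral = 0.5840, deletion = 0.4160"
```

The same function can be applied to any single population row, although with as few as three to nine individuals per population those estimates are very coarse.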
Figure 2 The evolutionary relationship of NANOGP8 with the two major haplotypes of NANOG, as distinguished by evidently fixed reading-frame variants c.144G > A and c.759G > C, and the 22-nucleotide 3′ UTR deletion c.*552_*573del.
Figure 3 Insertion of NANOGP8 into the LTR region of an SVA_A retroelement. env, envelope gene of a HERV-K endogenous retrovirus; LTR, long terminal repeat of a HERV-K endogenous retrovirus; poly(A), poly(A) tail; TSD, target site duplication; VNTR, variable nucleotide tandem repeat region.