Abstract
The origin of genes has been a major topic of research for many years, although in some cases the process has been difficult to elucidate. A recent publication is insightful in that it shows experimentally how one gene, linc-UR-UB, was born. This gene is regulated in a complex manner in male germ cells during spermatogenesis and is believed to participate in regulating the levels of ubiquitin specific peptidase 18 (USP18) mRNA. The process of formation of linc-UR-UB appears relatively simple: it involves a transcription read-through from an upstream gene to a downstream functional element, the USP18 3' UTR sequence. This small element also shares the same sequence as the 3' ends of the lincRNA FAM247 family genes. In addition to linc-UR-UB, it is possible that other genes formed in a similar fashion, through read-through of a genomic sequence to a functional element.
Keywords: 3' UTR; USP18; evolution; gene birth; gene structure; lincRNA
Year: 2021 PMID: 33995491 PMCID: PMC8120154 DOI: 10.3389/fgene.2021.661425
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Figure 1. Comparison of LOC102725072 and linc-UR-B1. Top diagram: Annotated genes that neighbor LOC102725072 in human chr22, with chromosomal coordinates of the genomic segment shown. The schematic is from the NLM/NCBI website: https://www.ncbi.nlm.nih.gov/gene/?term=LOC102725072 (O’Leary et al., 2016). Bottom highlighted diagrams: The BCRP7 gene is a counter transcript within the LOC102725072 gene. LOC102725072 consists of the sequence of the paralogous gene POM121L8P and the POM121L1 (LOC102724151) sequence at its 5' end that is carried by POM121L8P, and, in addition, a segment of the BCRP2 gene that carries a portion of the 3' end of the BCR gene (Supplementary Figure S1). The BCRP2 sequence extends from BCRP7 to the 3' end of LOC102725072 (and beyond) and is contiguous with the BCRP7 sequence. The lower diagrams represent the linc-UR-B1 gene and the signature gene/sequence motifs that LOC102725072 and linc-UR-B1 contain. Also shown are the bp lengths of the genes. The BCRP2 gene sequence (of which 2,124 bp resides beyond the 3' end of LOC102725072) forms part of linc-UR-B1. In addition, 1,225 bp of a sequence homologous to FAM247 is also in linc-UR-B1; this sequence contains the USP18 exon 11 3' UTR sequence as well as additional sequences of FAM247. POM121L1 is the putative POM121 transmembrane nucleoporin like 1 protein gene LOC102724151, but it has not yet been characterized outside of computational methods. The diagrams are not drawn to scale.
Figure 2. A diagrammatic comparison of non-coding chromosomal loci with linc-UR-B1 and genes that display similar gene/sequence motifs. The numbers next to FAM247 are the bp positions of the FAM247 sequences that are present in each non-coding region and each gene; the segments of the FAM247 sequence that are present in the genes/non-coding regions differ in each example shown. In chr 20, 4,425 bp of FAM247 includes 3,305 bp present in and contiguous with the immunoglobulin lambda (IGL) sequence; in chr 13, the 7,442 bp shown includes 3,298 bp of FAM247 that are also in the IGL sequence. The genes highlighted in yellow flank the non-coding regions and represent guideposts. Supplementary Figures S3A–C provide an analysis of these regions. The POM121L9P and BCRP3 gene structures are from Delihas (2020).