Cédric Meersseman, Rabia Letaief, Véronique Léjard, Emmanuelle Rebours, Gabriel Guillocheau, Diane Esquerré, Anis Djari, Amanda Chamberlain, Christy Vander Jagt, Christophe Klopp, Mekki Boussaha, Gilles Renand, Abderrahman Maftah, Daniel Petit, Dominique Rocha.
Abstract
Bidirectional promoters are regulatory regions that co-regulate the expression of two neighbouring genes organized in a head-to-head orientation. In recent years, these regulatory regions have been studied in many organisms; however, no study to date has analysed genetic variation in the activity of this type of promoter region. In our study, we first identified bidirectional promoters shared by genes expressed in bovine Longissimus thoracis and then searched for genetic variants affecting the activity of some of these bidirectional promoters. Combining bovine gene information with expression data obtained using RNA-Seq, we identified 120 putative bidirectional promoters active in bovine muscle. We experimentally validated 16 of these bidirectional promoters in vitro. Finally, using gene expression and whole-genome genotyping data, we explored the variability of the activity of the identified bidirectional promoters in muscle and discovered genetic variants affecting their activity. We found that the expression level of 77 genes is correlated with the activity of 12 bidirectional promoters. We also identified 57 single nucleotide polymorphisms associated with the activity of 5 bidirectional promoters. To our knowledge, our study is the first analysis in any species of the genetic variability of the activity of bidirectional promoters.
Keywords: bidirectional promoter; cattle genome; genetic variability; muscle
Year: 2017 PMID: 28338730 PMCID: PMC5499805 DOI: 10.1093/dnares/dsx004
Source DB: PubMed Journal: DNA Res ISSN: 1340-2838 Impact factor: 4.458
Top five transcripts with the most assigned reads
| Ensembl gene ID | Ensembl transcript ID | Description | Gene symbol | % total number of reads |
|---|---|---|---|---|
| ENSBTAG00000026986 | ENSBTAT00000061449 | Titin | TTN | 4.46 |
| ENSBTAG00000018204 | ENSBTAT00000009327 | Myosin heavy chain 1 | MYH1 | 4.30 |
| ENSBTAG00000043561 | ENSBTAT00000060569 | Cytochrome c oxidase subunit I | COX1 | 3.61 |
| ENSBTAG00000007090 | ENSBTAT00000012797 | Myosin heavy chain 2 | MYH2 | 2.36 |
| ENSBTAG00000004965 | ENSBTAT00000006534 | Nucleoporin 133 kDa | NUP133 | 1.99 |
Figure 1. Chromosomal distribution of the 120 putative bovine bidirectional promoters active in muscle.
Information on the 16 tested predicted bovine bidirectional promoters
| Ensembl gene ID | BTA | Gene start | Gene end | Strand | Description | Gene name | Distance | BiP # |
|---|---|---|---|---|---|---|---|---|
| ENSBTAG00000016337 | 23 | 32835596 | 32849025 | −1 | Acyl-CoA thioesterase 13 | THEM2 | 293 | |
| ENSBTAG00000000365 | 23 | 32849318 | 32861445 | 1 | tyrosyl-DNA phosphodiesterase 2 | TDP2 | ||
| ENSBTAG00000014646 | 17 | 74392831 | 74397072 | −1 | Mitotic spindle organizing protein 2B | MZT2 | 143 | |
| ENSBTAG00000002130 | 17 | 74397215 | 74412807 | 1 | sphingomyelin phosphodiesterase 4, neutral membrane (neutral sphingomyelinase-3) | SMPD4 | ||
| ENSBTAG00000003550 | 29 | 1032168 | 1056464 | −1 | Chromosome 11 open reading frame 54 | C11orf54 | 163 | |
| ENSBTAG00000003545 | 29 | 1056627 | 1062580 | 1 | TATA box binding protein (TBP)-associated factor, RNA polymerase I, D, 41kDa | TAF1D | ||
| ENSBTAG00000004295 | 11 | 93011815 | 93029730 | −1 | NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, 8, 19kDa | NDUFA8 | 144 | |
| ENSBTAG00000004296 | 11 | 93029874 | 93064357 | 1 | MORN repeat containing 5 | MORN5 | ||
| ENSBTAG00000004991 | 8 | 100163708 | 100212829 | −1 | Inhibitor of kappa light polypeptide gene enhancer in B-cells, kinase complex-associated protein | IKBKAP | 324 | |
| ENSBTAG00000004996 | 8 | 100213153 | 100219346 | 1 | family with sequence similarity 206, member A | FAM206A | ||
| ENSBTAG00000006195 | 29 | 44691682 | 44693901 | −1 | Chromosome 29 open reading frame, human C11orf68 | C29H11orf68 | 270 | |
| ENSBTAG00000006199 | 29 | 44694171 | 44696295 | 1 | DR1-associated protein 1 (negative cofactor 2 alpha) | DRAP1 | ||
| ENSBTAG00000023018 | 28 | 44283041 | 44397693 | −1 | Poly(ADP-ribose) glycohydrolase | PARG | 121 | |
| ENSBTAG00000011694 | 28 | 44397814 | 44418582 | 1 | translocase of inner mitochondrial membrane 23 homolog (yeast) | TIMM23 | ||
| ENSBTAG00000011930 | 17 | 63448841 | 63466323 | −1 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 54 | DDX54 | 327 | |
| ENSBTAG00000011937 | 17 | 63466650 | 63472165 | 1 | chromosome 17 open reading frame, human C12orf52 (C17H12orf52) | RITA | ||
| ENSBTAG00000012586 | 2 | 86438979 | 86449372 | −1 | Heat shock 60kDa protein 1 (chaperonin) | HSPD1 | 153 | |
| ENSBTAG00000012589 | 2 | 86449525 | 86451564 | 1 | heat shock 10kDa protein 1 (chaperonin 10) | HSPE1 | ||
| ENSBTAG00000015667 | 11 | 2324293 | 2335178 | −1 | Transmembrane protein 127 | TMEM127 | 204 | |
| ENSBTAG00000015659 | 11 | 2335382 | 2342308 | 1 | cytosolic iron-sulfur protein assembly 1 | CIAO1 | ||
| ENSBTAG00000016559 | 29 | 43146856 | 43149460 | −1 | Nudix (nucleoside diphosphate linked moiety X)-type motif 22 | NUDT22 | 150 | |
| ENSBTAG00000016555 | 29 | 43149610 | 43152143 | 1 | tRNA phosphotransferase 1 | TRPT1 | ||
| ENSBTAG00000016558 | 25 | 1354553 | 1360122 | −1 | splA/ryanodine receptor domain and SOCS box containing 3 | SPSB3 | 457 | |
| ENSBTAG00000016561 | 25 | 1360579 | 1365802 | 1 | nucleotide binding protein 2 | NUBP2 | ||
| ENSBTAG00000025028 | 12 | 47768049 | 47781910 | −1 | Mitotic spindle organizing protein 1 | MZT1 | 205 | |
| ENSBTAG00000019886 | 12 | 47782115 | 47809879 | 1 | bora, aurora kinase A activator | BORA | ||
| ENSBTAG00000021547 | 3 | 103353700 | 103422206 | −1 | WD repeat domain 65 | WDR65 | 399 | |
| ENSBTAG00000021544 | 3 | 103422605 | 103429520 | 1 | EBNA1 binding protein 2 | EBNA1BP2 | ||
| ENSBTAG00000016679 | 17 | 41262754 | 41307126 | −1 | Electron-transferring-flavoprotein dehydrogenase | ETFDH | 252 | |
| ENSBTAG00000033486 | 17 | 41307378 | 41310601 | 1 | chromosome 17 open reading frame, human C4orf46 | C17H4orf46 | ||
| ENSBTAG00000000489 | 11 | 10186102 | 10189962 | −1 | WD repeat domain 54 | WDR54 | 90 | |
| ENSBTAG00000040226 | 11 | 10190052 | 10195932 | 1 | chromosome 11 open reading frame, human C2orf81 | C11H2orf81 |
Distance, distance between transcription start sites, in bp.
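The Distance column can be reproduced from the coordinates in the table. The sketch below is illustrative only: it assumes that the transcription start site of a gene on the −1 strand is its gene end coordinate and that of a gene on the +1 strand is its gene start coordinate, which matches the tabulated values for the pairs shown. The `Gene` class and the function names are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """One row of the table above (coordinates in bp)."""
    name: str
    chrom: str
    start: int
    end: int
    strand: int  # -1 or 1


def tss(gene: Gene) -> int:
    # Assumption: the TSS is the gene start on the + strand
    # and the gene end on the - strand.
    return gene.start if gene.strand == 1 else gene.end


def head_to_head_distance(minus_gene: Gene, plus_gene: Gene) -> int:
    """Distance in bp between the TSSs of a head-to-head gene pair."""
    if minus_gene.chrom != plus_gene.chrom:
        raise ValueError("genes are on different chromosomes")
    if minus_gene.strand != -1 or plus_gene.strand != 1:
        raise ValueError("expected a (-1, +1) strand pair")
    distance = tss(plus_gene) - tss(minus_gene)
    if distance <= 0:
        raise ValueError("genes are not in a head-to-head orientation")
    return distance


# Coordinates copied from the first two pairs in the table.
them2 = Gene("THEM2", "23", 32835596, 32849025, -1)
tdp2 = Gene("TDP2", "23", 32849318, 32861445, 1)
mzt2 = Gene("MZT2", "17", 74392831, 74397072, -1)
smpd4 = Gene("SMPD4", "17", 74397215, 74412807, 1)

print(head_to_head_distance(them2, tdp2))   # 293
print(head_to_head_distance(mzt2, smpd4))   # 143
```

Both printed values agree with the Distance entries listed for these two pairs.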
Figure 2. Fluorescence microscopy analysis of bidirectional promoter BiP100 in C2C12 cells. Images were taken at ×20 magnification, 36 h after transfection.
List of SNPs from the Illumina BovineSNP50 Beadchip with genotypes highly correlated to the activity of some bidirectional promoters
| SNP ID | BTA | Position | Alleles | Consequence | Ensembl gene ID | Ensembl transcript ID | Gene symbol | BiP # | Correlation score | P-value |
|---|---|---|---|---|---|---|---|---|---|---|
| Hapmap42177-BTA-31679 | 1 | 61,125,554 | A/G | intergenic_variant | | | | 118 | 0.99 | 1.78E-24 |
| Hapmap42933-BTA-40454 | 1 | 85,731,453 | A/G | intergenic_variant | | | | 101 | −0.99 | 1.95E-20 |
| ARS-BFGL-NGS-108514 | 2 | 105,521,127 | A/G | intergenic_variant | | | | 118 | −0.99 | 1.78E-24 |
| Hapmap54229-rs29017613 | 4 | 30,200,987 | A/G | intergenic_variant | | | | 118 | 0.99 | 1.78E-24 |
| ARS-BFGL-NGS-35869 | 5 | 21,707,809 | A/G | intergenic_variant | | | | 118 | −0.99 | 1.78E-24 |
| ARS-BFGL-NGS-93953 | 5 | 22,737,219 | A/G | intergenic_variant | | | | 118 | 0.99 | 1.78E-24 |
| Hapmap49852-BTA-107572 | 6 | 29,790,005 | A/G | intergenic_variant | | | | 104 | −0.99 | 1.78E-24 |
| Hapmap25168-BTC-033275 | 6 | 33,713,818 | A/G | intergenic_variant | | | | 104 | 0.99 | 1.78E-24 |
| Hapmap36813-SCAFFOLD50174_9004 | 6 | 33,768,128 | A/G | intergenic_variant | | | | 104 | 0.99 | 1.78E-24 |
| Hapmap23923-BTC-066021 | 6 | 39,721,727 | A/C | intergenic_variant | | | | 104 | −0.99 | 1.78E-24 |
| Hapmap36286-SCAFFOLD260285_24265 | 7 | 111,161,115 | A/C | intergenic_variant | | | | 104 | −0.99 | 1.78E-24 |
| ARS-BFGL-NGS-99031 | 8 | 101,251,865 | A/C | intergenic_variant | | | | 118 | 0.99 | 1.78E-24 |
| ARS-BFGL-NGS-60678 | 9 | 85,454,475 | A/G | intergenic_variant | | | | 104 | 0.99 | 1.78E-24 |
| ARS-BFGL-NGS-4488 | 10 | 6,476,252 | A/G | intron_variant | ENSBTAG00000024878 | ENSBTAT00000048409 | ANKRD31 | 104 | −1 | 0 |
| ARS-BFGL-NGS-5819 | 10 | 10,428,184 | A/G | intron_variant | ENSBTAG00000025853 | ENSBTAT00000017647 | HOMER1 | 118 | 0.99 | 1.78E-24 |
| ARS-BFGL-NGS-27341 | 11 | 91,438,914 | A/G | intergenic_variant | | | | 118 | −0.99 | 1.78E-24 |
| ARS-BFGL-NGS-115889 | 12 | 79,814,959 | A/C | intron_variant | ENSBTAG00000010395 | ENSBTAT00000013726 | DOCK9 | 64 | 0.99 | 1.95E-20 |
| ARS-BFGL-NGS-23509 | 14 | 70,636,087 | A/G | intergenic_variant | | | | 118 | −0.99 | 1.78E-24 |
| ARS-BFGL-NGS-859 | 14 | 73,919,098 | A/G | intergenic_variant | | | | 104 | −0.99 | 1.78E-24 |
| ARS-BFGL-NGS-99802 | 16 | 74,999,809 | A/C | 5_prime_UTR_variant | ENSBTAG00000010850 | ENSBTAT00000014402 | SERTAD4 | 104 | 0.99 | 1.78E-24 |
| Hapmap52466-rs29015577 | 19 | 3,617,183 | A/C | intergenic_variant | | | | 118 | −0.99 | 1.78E-24 |
| Hapmap47625-BTA-44726 | 19 | 21,878,635 | A/G | intergenic_variant | | | | 104 | −0.99 | 1.78E-24 |
| ARS-BFGL-NGS-14187 | 19 | 25,165,920 | A/G | intron_variant | ENSBTAG00000014806 | ENSBTAT00000019703 | ATP2A3 | 118 | −0.99 | 1.78E-24 |
| ARS-BFGL-NGS-109291 | 23 | 13,517,193 | T/A | intron_variant | ENSBTAG00000027197 | ENSBTAT00000064399 | KIF6 | 64 | −0.99 | 1.95E-20 |
| ARS-BFGL-NGS-57958 | 26 | 23,000,155 | A/G | intergenic_variant | | | | 104 | −0.99 | 1.78E-24 |
| ARS-BFGL-NGS-42033 | 28 | 14,243,381 | C/G | intergenic_variant | | | | 118 | −0.99 | 1.78E-24 |
| ARS-BFGL-NGS-77028 | 28 | 26,689,199 | A/G | intron_variant | ENSBTAG00000005666 | ENSBTAT00000007444 | LRRC20 | 118 | 0.99 | 1.78E-24 |
List of coding SNPs with genotypes highly correlated to the activity of some bidirectional promoters
| cSNP ID | BTA | Position | dbSNP ID | Alleles | Consequence | Ensembl gene ID | Ensembl transcript ID | Gene symbol | BiP # | Correlation score | P-value |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ENSBTAT00000002157_515 | 9 | 61,291,956 | | G/A | intron_variant | ENSBTAG00000001644 | ENSBTAT00000002157 | MDN1 | 104 | −0.99 | 1.78E-24 |
| ENSBTAT00000003078_106 | 15 | 82,337,162 | | G/A | intron_variant | ENSBTAG00000002381 | ENSBTAT00000003078 | ZDHHC5 | 118 | 0.99 | 1.78E-24 |
| ENSBTAT00000003826_479 | 21 | 22,218,555 | | C/T | upstream_gene_variant | ENSBTAG00000002939 | ENSBTAT00000003826 | FURIN | 104 | −0.99 | 1.78E-24 |
| ENSBTAT00000005923_585 | 25 | 2,977,744 | | T/C | upstream_gene_variant | ENSBTAG00000004509 | ENSBTAT00000005923 | SLX4 | 118 | −0.99 | 1.78E-24 |
| ENSBTAT00000008129_240 | 16 | 19,500,611 | | G/C | intron_variant | ENSBTAG00000006186 | ENSBTAT00000008129 | KCTD3 | 118 | 0.99 | 1.78E-24 |
| ENSBTAT00000008507_123 | 15 | 82,265,194 | | T/C | intron_variant | ENSBTAG00000006493 | ENSBTAT00000008507 | CLP1 | 118 | −0.99 | 1.78E-24 |
| ENSBTAT00000008728_747 | 12 | 18,197,212 | | C/T | intron_variant | ENSBTAG00000006640 | ENSBTAT00000008728 | RB1 | 118 | 0.99 | 1.78E-24 |
| ENSBTAT00000018163_665 | 21 | 66,866,347 | | T/G | upstream_gene_variant | ENSBTAG00000013666 | ENSBTAT00000018163 | SLC25A29 | 104 | −0.99 | 1.78E-24 |
| ENSBTAT00000020493_130 | 7 | 62,925,278 | | A/G | intron_variant | ENSBTAG00000015419 | ENSBTAT00000020493 | ARHGEF37 | 64 | −0.99 | 1.95E-20 |
| ENSBTAT00000025492_402 | 19 | 55,675,666 | | T/C | intron_variant | ENSBTAG00000019153 | ENSBTAT00000025492 | JMJD6 | 24 | −1 | 6.38E-137 |
| ENSBTAT00000028060_520 | 21 | 16,720,681 | | C/T | intron_variant | ENSBTAG00000037383 | ENSBTAT00000028060 | AKAP13 | 118 | −0.99 | 1.78E-24 |
| ENSBTAT00000028277_365 | 15 | 76,757,181 | | T/G | intron_variant | ENSBTAG00000021223 | ENSBTAT00000028277 | CRY2 | 118 | −0.99 | 1.78E-24 |
| ENSBTAT00000028656_569 | 14 | 75,708,914 | | T/G | intron_variant | ENSBTAG00000039968 | ENSBTAT00000028656 | TMEM55A | 104 | 0.99 | 1.78E-24 |
| ENSBTAT00000028865_103 | 25 | 39,432,034 | | G/A | intron_variant | ENSBTAG00000037400 | ENSBTAT00000028865 | TNRC18 | 118 | 0.99 | 1.78E-24 |
| ENSBTAT00000029400_195 | 17 | 73,658,214 | | C/T | upstream_gene_variant | ENSBTAG00000021656 | ENSBTAT00000029400 | SPECC1L | 104 | 0.99 | 1.78E-24 |
| ENSBTAT00000029403_203 | 18 | 62,789,937 | | A/G | intron_variant | ENSBTAG00000030393 | ENSBTAT00000029403 | RDH13 | 104 | −0.99 | 1.78E-24 |
| ENSBTAT00000033704_233 | 6 | 113,701,647 | rs442770236 | G/T | upstream_gene_variant | ENSBTAG00000004316 | ENSBTAT00000033704 | BOD1L | 104 | 0.99 | 1.78E-24 |
| ENSBTAT00000035362_466 | 20 | 10,305,279 | | G/A | downstream_gene_variant | ENSBTAG00000027980 | ENSBTAT00000035362 | TAF9 | 118 | −0.99 | 1.78E-24 |
| ENSBTAT00000037465_138 | 25 | 10,053,612 | | T/C | intron_variant | ENSBTAG00000026375 | ENSBTAT00000037465 | RMI2 | 118 | −0.99 | 1.78E-24 |
| ENSBTAT00000043122_702 | 10 | 86,384,804 | | A/G | upstream_gene_variant | ENSBTAG00000020379 | ENSBTAT00000043122 | AREL1 | 118 | 0.99 | 1.78E-24 |
| ENSBTAT00000043260_900 | 10 | 80,246,761 | | A/G | upstream_gene_variant | ENSBTAG00000014334 | ENSBTAT00000043260 | ZFYVE26 | 118 | 0.99 | 1.78E-24 |
| ENSBTAT00000043778_165 | 15 | 82,285,586 | | T/C | intron_variant | ENSBTAG00000002411 | ENSBTAT00000043778 | CTNND1 | 118 | 0.99 | 1.78E-24 |
| ENSBTAT00000043778_492 | 15 | 82,284,426 | | G/A | intron_variant | ENSBTAG00000002411 | ENSBTAT00000043778 | CTNND1 | 118 | −0.99 | 1.78E-24 |
| ENSBTAT00000043778_760 | 15 | 82,284,694 | | C/T | intron_variant | ENSBTAG00000002411 | ENSBTAT00000043778 | CTNND1 | 118 | −0.99 | 1.78E-24 |
| ENSBTAT00000054096_509 | 8 | 103,426,705 | | G/A | upstream_gene_variant | ENSBTAG00000038335 | ENSBTAT00000054096 | IGBP1 | 118 | 0.99 | 1.78E-24 |
| ENSBTAT00000056520_352 | 23 | 27,378,739 | | C/T | upstream_gene_variant | ENSBTAG00000039620 | ENSBTAT00000056520 | MGC151586 | 104 | 0.99 | 1.78E-24 |
| ENSBTAT00000061199_185 | 14 | 20,989,316 | | A/G | upstream_gene_variant | ENSBTAG00000044106 | ENSBTAT00000061199 | SPIDR | 118 | −0.99 | 1.78E-24 |
| ENSBTAT00000061451_269 | 23 | 3,434,098 | | G/A | intron_variant | ENSBTAG00000021237 | ENSBTAT00000061451 | DST | 118 | −0.99 | 1.78E-24 |
| ENSBTAT00000061451_704 | 23 | 3,432,111 | | G/T | intron_variant | ENSBTAG00000021237 | ENSBTAT00000061451 | DST | 118 | 0.99 | 1.78E-24 |
| ENSBTAT00000034195_104 | 8 | 53,736,890 | | G/A | intron_variant | ENSBTAG00000017734 | ENSBTAT00000034195 | VPS13A | 118 | 0.99 | 1.78E-24 |
Correlation scores between cSNPs and the expression level of the corresponding gene
| cSNP ID | BTA | Position | dbSNP ID | Alleles | Consequence | Ensembl transcript ID | Gene symbol | Correlation score | P-value |
|---|---|---|---|---|---|---|---|---|---|
| ENSBTAT00000003826_4794 | 21 | 22,218,555 | | C/T | upstream_gene_variant | ENSBTAT00000003826 | FURIN | −0.52 | 0.02 |
| ENSBTAT00000018163_665 | 21 | 66,866,347 | | T/G | upstream_gene_variant | ENSBTAT00000018163 | SLC25A29 | −0.49 | 0.03 |
| ENSBTAT00000043122_702 | 10 | 86,384,804 | | A/G | upstream_gene_variant | ENSBTAT00000043122 | AREL1 | 0.49 | 0.03 |
| ENSBTAT00000043260_900 | 10 | 80,246,761 | | A/G | upstream_gene_variant | ENSBTAT00000043260 | ZFYVE26 | 0.46 | 0.04 |
| ENSBTAT00000005923_585 | 25 | 2,977,744 | | T/C | upstream_gene_variant | ENSBTAT00000005923 | SLX4 | −0.43 | 0.06 |
| ENSBTAT00000035362_4667 | 20 | 10,305,279 | | G/A | downstream_gene_variant | ENSBTAT00000035362 | TAF9 | −0.43 | 0.06 |
| ENSBTAT00000029400_195 | 17 | 73,658,214 | | C/T | upstream_gene_variant | ENSBTAT00000029400 | SPECC1L | 0.35 | 0.14 |
| ENSBTAT00000061199_1851 | 14 | 20,989,316 | | A/G | upstream_gene_variant | ENSBTAT00000061199 | SPIDR | −0.29 | 0.22 |
| ENSBTAT00000033704_233 | 6 | 113,701,647 | rs442770236 | G/T | upstream_gene_variant | ENSBTAT00000033704 | BOD1L | 0.17 | 0.46 |
| ENSBTAT00000054096_509 | 8 | 103,426,705 | | G/A | upstream_gene_variant | ENSBTAT00000054096 | IGBP1 | 0.14 | 0.54 |
| ENSBTAT00000056520_3528 | 23 | 27,378,739 | | C/T | upstream_gene_variant | ENSBTAT00000056520 | MGC151586 | −0.12 | 0.63 |