Cas Simons, Igor V Makunin, Michael Pheasant, John S Mattick.
Abstract
BACKGROUND: We recently reported the existence of large numbers of regions up to 80 kb long that lack transposon insertions in the human, mouse and opossum genomes. These regions are significantly associated with loci involved in developmental and transcriptional regulation.
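The idea of a transposon-free region (TFR) can be illustrated as a gap between annotated transposon intervals that meets a minimum length. The following is only a minimal sketch and not the authors' pipeline: the function name `transposon_free_regions`, the 0-based half-open coordinates, and the choice to count chromosome ends as possible TFR boundaries are all assumptions made for illustration.

```python
def transposon_free_regions(chrom_length, transposons, min_length=10_000):
    """Return gaps of at least min_length bp between transposon annotations.

    Assumptions (not from the source): transposons is an iterable of
    (start, end) pairs in 0-based, half-open coordinates on one chromosome,
    and gaps touching the chromosome ends are reported like any other gap.
    """
    regions = []
    cursor = 0  # end of the right-most transposon seen so far
    for start, end in sorted(transposons):
        # A gap exists only if this transposon starts beyond the cursor.
        if start - cursor >= min_length:
            regions.append((cursor, start))
        # max() handles overlapping or nested annotations.
        cursor = max(cursor, end)
    if chrom_length - cursor >= min_length:
        regions.append((cursor, chrom_length))
    return regions


# Toy annotations on a hypothetical 50 kb sequence.
toy_transposons = [(12_000, 12_500), (12_400, 13_000), (30_000, 30_100)]
toy_tfrs = transposon_free_regions(50_000, toy_transposons, min_length=10_000)
```

Sorting the intervals and tracking the right-most end seen so far keeps the scan linear after the sort and avoids reporting spurious gaps inside overlapping annotations.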
Year: 2007 PMID: 18093339 PMCID: PMC2241635 DOI: 10.1186/1471-2164-8-470
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Counts of TFRs in four vertebrate genomes
| Species | Number (≥ 10 kb) | Size in bp (≥ 10 kb) | Number (≥ 5 kb) | Size in bp (≥ 5 kb) |
| Zebrafish | 470 | 6,202,556 | 4,891 | 34,677,352 |
| Human | 856 | 12,090,440 | 9,203 | 65,097,113 |
| Mouse | 1,112 | 15,136,372 | 14,154 | 97,707,658 |
| Opossum | 396 | 5,334,925 | 4,818 | 33,312,949 |
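As a quick arithmetic check on the table, the mean TFR length in each class is the total size divided by the number of regions. The sketch below copies the table values; the dictionary layout and the helper name `mean_tfr_size` are illustrative choices, not part of the source.

```python
# (number of TFRs, total size in bp), copied from the table above.
tfr_counts = {
    "Zebrafish": {"10kb": (470, 6_202_556), "5kb": (4_891, 34_677_352)},
    "Human": {"10kb": (856, 12_090_440), "5kb": (9_203, 65_097_113)},
    "Mouse": {"10kb": (1_112, 15_136_372), "5kb": (14_154, 97_707_658)},
    "Opossum": {"10kb": (396, 5_334_925), "5kb": (4_818, 33_312_949)},
}


def mean_tfr_size(species, size_class):
    """Mean TFR length in bp for one species and size class ('10kb' or '5kb')."""
    number, total_bp = tfr_counts[species][size_class]
    return total_bp / number


for species in tfr_counts:
    print(
        f"{species}: "
        f"{mean_tfr_size(species, '10kb'):,.0f} bp mean (>= 10 kb), "
        f"{mean_tfr_size(species, '5kb'):,.0f} bp mean (>= 5 kb)"
    )
```

Each mean necessarily lies above its class threshold, which gives a simple sanity check that the counts and sizes are paired with the correct columns.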
The fifteen longest TFRs in zebrafish
| TFR ID | Genomic position | Size (bp) | Overlapping genesa | TFR in humanb |
| dr23.144 | chr23:35,638,710-35,705,409 | 66,700 | | Yes |
| dr3.77 | chr3:22,940,664-22,984,418 | 43,755 | | Yes |
| dr19.54 | chr19:13,924,211-13,955,334 | 31,124 | | Yes |
| dr4.159 | chr4:16,685,114-16,714,306 | 29,193 | | No |
| dr5.225 | chr5:57,548,679-57,577,252 | 28,574 | | Yes |
| dr15.69 | chr15:25,034,169-25,061,004 | 26,836 | - | |
| dr3.89 | chr3:24,406,349-24,432,207 | 25,859 | IGF2BP3 | Yes |
| dr3.78 | chr3:22,984,979-23,010,566 | 25,588 | | Yes |
| dr4.190 | chr4:18,669,203-18,694,678 | 25,476 | | Yes |
| dr6.213 | chr6:59,099,468-59,123,108 | 23,641 | | Yes |
| dr2.134 | chr2:31,040,372-31,063,335 | 22,964 | | Yes |
| dr24.53 | chr24:17,570,229-17,592,632 | 22,404 | | Yes |
| dr23.116 | chr23:31,417,962-31,439,642 | 21,681 | MLL2 | No |
| dr9.78 | chr9:21,302,333-21,323,748 | 21,416 | ZFHX1B | Yes |
| dr15.179 | chr15:41,033,477-41,054,729 | 21,253 | | Yes |
a Zebrafish genes are listed in lower case italic; human proteins mapped to corresponding loci in the zebrafish genome are given in upper case.
b The orthologous gene in human overlaps a TFR ≥ 10 kb.
Figure 1. Summary of the number of zebrafish TFRs that are orthologous to TFRs in three mammalian species. (A) Proportions of zebrafish TFRs that have an orthologous TFR in three, two or one mammalian species. The bar on the left relates to zebrafish TFRs ≥ 10 kb with orthologous mammalian TFRs ≥ 10 kb. The center bar relates to zebrafish TFRs ≥ 10 kb with orthologous mammalian TFRs ≥ 5 kb. The bar on the right relates to zebrafish TFRs ≥ 5 kb with orthologous mammalian TFRs ≥ 5 kb. (B) Venn diagram of zebrafish TFRs ≥ 10 kb with orthologs ≥ 10 kb in one or more mammalian species. The numbers on the graph represent the count of zebrafish TFRs in each category.
Figure 2. Orthologous human and zebrafish TFRs that contain the miRNA mir-129-2. (A) 20 kb of the human genome (chr11:43,548,001–43,568,000) including the non-genic 13 kb TFR hs11.145 (red bar). Thick blue bars indicate blocks of sequence that are alignable to the orthologous zebrafish TFR dr25.92. The small purple bar indicates the position of the human miRNA mir-129-2. (B) A close-up view of 130 bp around mir-129-2. The thick purple bar indicates the mature miRNA and the thin purple line indicates the pre-miRNA hairpin. The blue conservation plot is based on the alignment of 17 vertebrate species, and the green plot is based on the pairwise alignment of human and zebrafish; it shows a conservation profile consistent with the presence of a miRNA conserved in each species [38]. (C) Syntenic region of the zebrafish genome (20 kb, chr25:31,421,001–31,441,000) including the TFR dr25.92. Thick blue bars indicate blocks of sequence that are alignable to the orthologous human TFR hs11.145. Although there are currently no genes annotated in this region, the conservation profile suggests that an ortholog of mir-129-2 resides within the TFR. All images are modified screenshots taken from the UCSC genome browser [31].
Figure 3. Comparison of the GC content of TFRs in zebrafish and human. (A) Histogram of the GC content of zebrafish TFRs. The area in blue describes the subset of TFRs that have an orthologous TFR in human larger than 10 kb; the area in red describes the subset that have an orthologous TFR in human larger than 5 kb. (B) Histogram of the GC content of human TFRs. The area in blue describes the subset of TFRs that have an orthologous TFR in zebrafish larger than 10 kb; the area in red describes the subset that have an orthologous TFR in zebrafish larger than 5 kb. (C) Scatter plot of the GC content of orthologous pairs of zebrafish and human TFRs ≥ 10 kb in both species. Points in red indicate TFR pairs whose GC content differs by more than 20 percentage points.
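The comparison in panel (C) can be sketched as follows: compute the GC percentage of each sequence, then flag orthologous pairs whose absolute difference exceeds 20 percentage points. This is a minimal illustration only. The function names, the decision to ignore non-ACGT characters such as N, and the toy identifiers and values are assumptions, not data or methods from the study.

```python
def gc_percent(sequence):
    """GC content as a percentage of unambiguous (A, C, G, T) bases.

    Assumption: ambiguous characters such as N are excluded from the total.
    """
    seq = sequence.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def flag_divergent_pairs(pairs, threshold=20.0):
    """Return ID pairs whose GC percentages differ by more than threshold.

    pairs: iterable of (id_a, gc_a, id_b, gc_b), with GC given in percent.
    """
    return [
        (id_a, id_b)
        for id_a, gc_a, id_b, gc_b in pairs
        if abs(gc_a - gc_b) > threshold
    ]


# Hypothetical toy pairs; the identifiers and GC values are invented.
toy_pairs = [
    ("zf_toy_1", 36.0, "hs_toy_1", 40.5),
    ("zf_toy_2", 35.0, "hs_toy_2", 61.0),
]
divergent = flag_divergent_pairs(toy_pairs)
```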
Figure 4. Zebrafish transposon-free region dr3.89 and the orthologous regions of four vertebrate species. Each panel shows a modified screenshot displaying a 60 kb region from the UCSC genome browser. Horizontal red bars indicate TFRs and brown ticks indicate transposons. (A) Zebrafish (chr3:24,391-24,451 kb, March 2006) including the 25.9 kb TFR dr3.89. Human proteins mapped to the zebrafish genome by chained tBLASTn are indicated in blue. (B) Human (chr7:23,450-23,510 kb, March 2006) including the 13.7 kb TFR hs7.101. Human RefSeq genes are indicated in blue. (C) Mouse (chr6:49,114-49,174 kb, February 2006) including the 9.3 kb TFR mm6.309. Mouse RefSeq genes are indicated in blue. (D) Opossum (chr8:296,183-296,243 kb, January 2006) including the 16.4 kb TFR md8.376. Human RefSeq genes mapped to the opossum genome with BLAT are indicated in blue. (E) Frog (scaffold_56:3,208-3,268 kb, August 2005) including a 14 kb region that contains no transposons (red box). Human proteins mapped to the frog genome with tBLASTn are indicated in blue.