Ivan Merelli, Alessandro Guffanti, Marco Fabbri, Andrea Cocito, Laura Furia, Ursula Grazini, Raoul J Bonnal, Luciano Milanesi, Fraser McBlane.
Abstract
Recombination signal sequences (RSSs) flanking V, D and J gene segments are recognized and cut by the VDJ recombinase during development of B and T lymphocytes. All RSSs are composed of seven conserved nucleotides (the heptamer), followed by a spacer (containing either 12 ± 1 or 23 ± 1 poorly conserved nucleotides) and a conserved nonamer. Errors in V(D)J recombination, including cleavage of cryptic RSSs outside the immunoglobulin and T cell receptor (TCR) loci, are associated with oncogenic translocations observed in some lymphoid malignancies. We present in this paper the RSSsite web server, which is available at http://www.itb.cnr.it/rss. RSSsite consists of a web-accessible database, RSSdb, for the identification of pre-computed potential RSSs, and of the related search tool, DnaGrab, which scores potential RSSs in user-supplied sequences. DnaGrab uses probability models that were developed from specific reference sets of RSSs and that take into account correlations between groups of positions in a sequence; these models can be recast as Bayesian networks. In validation laboratory experiments, we selected 33 predicted cryptic RSSs (cRSSs) from 11 chromosomal regions outside the immunoglobulin and TCR loci for functional testing.
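The heptamer, spacer and nonamer layout described in the abstract can be illustrated with a minimal Python sketch. This is not the DnaGrab algorithm and it does not compute RIC scores: it is a plain consensus scan that counts mismatches. It assumes the widely cited consensus heptamer CACAGTG and nonamer ACAAAAACC, requires the first three heptamer bases (CAC), and searches the forward strand only. The names `scan_rss`, `Candidate` and `max_mismatches` are hypothetical and chosen for this example.

```python
from typing import Iterator, NamedTuple

# Assumption: commonly cited consensus motifs, not taken from the abstract.
HEPTAMER = "CACAGTG"
NONAMER = "ACAAAAACC"


class Candidate(NamedTuple):
    start: int        # 0-based index of the first heptamer base
    spacer_len: int   # observed spacer length
    heptamer: str
    nonamer: str
    mismatches: int   # total mismatches against both consensus motifs


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def scan_rss(seq: str, spacer: int, max_mismatches: int = 3) -> Iterator[Candidate]:
    """Yield forward-strand candidates with a spacer of `spacer` +/- 1 bases."""
    seq = seq.upper()
    for start in range(len(seq) - len(HEPTAMER) + 1):
        hept = seq[start:start + len(HEPTAMER)]
        if not hept.startswith("CAC"):
            continue  # simplification: insist on the first three heptamer bases
        for sp in (spacer - 1, spacer, spacer + 1):
            non_start = start + len(HEPTAMER) + sp
            non = seq[non_start:non_start + len(NONAMER)]
            if len(non) < len(NONAMER):
                continue  # nonamer would run past the end of the sequence
            mm = hamming(hept, HEPTAMER) + hamming(non, NONAMER)
            if mm <= max_mismatches:
                yield Candidate(start, sp, hept, non, mm)


if __name__ == "__main__":
    # Toy sequence: 2 flanking bases, heptamer, 12-base spacer, nonamer, 2 flanking bases.
    toy = "TT" + "CACAGTG" + "GTCTGAGTCTGA" + "ACAAAAACC" + "GG"
    for hit in scan_rss(toy, spacer=12, max_mismatches=0):
        print(hit)
```

A mismatch count treats every position as independent and equally important. The models described in the abstract instead account for correlations between groups of positions, so this scan is only a rough pre-filter.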
Year: 2010 PMID: 20478831 PMCID: PMC2896083 DOI: 10.1093/nar/gkq391
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Number and density of HUMAN putative cRSSs for intragenic and extragenic regions
| Chr. | RSS12 count | RSS12 intrag. dens. | RSS12 extrag. dens. | RSS23 count | RSS23 intrag. dens. | RSS23 extrag. dens. |
|---|---|---|---|---|---|---|
| chr1 | 248 570 | 890.41 | 1002.74 | 261 241 | 872.17 | 954.10 |
| chr2 | 251 136 | 858.59 | 968.40 | 258 916 | 924.03 | 939.30 |
| chr3 | 214 721 | 872.26 | 922.23 | 205 105 | 939.34 | 965.47 |
| chr4 | 205 738 | 875.10 | 929.12 | 183 816 | 1001.06 | 1039.92 |
| chr5 | 197 938 | 864.89 | 914.01 | 186 426 | 940.04 | 970.44 |
| chr6 | 183 662 | 860.50 | 931.68 | 175 554 | 951.26 | 974.71 |
| chr7 | 169 932 | 858.72 | 936.48 | 174 965 | 876.46 | 909.55 |
| chr8 | 155 781 | 863.59 | 939.55 | 153 573 | 930.73 | 953.06 |
| chr9 | 126 912 | 874.46 | 1112.69 | 137 272 | 873.30 | 1028.71 |
| chr10 | 144 072 | 856.05 | 940.74 | 154 524 | 868.69 | 877.11 |
| chr11 | 141 666 | 874.92 | 952.99 | 148 507 | 876.23 | 909.09 |
| chr12 | 139 185 | 881.05 | 961.68 | 146 332 | 876.15 | 914.71 |
| chr13 | 99 259 | 865.34 | 1160.30 | 96 400 | 972.04 | 1194.71 |
| chr14 | 89 376 | 859.31 | 1201.10 | 99 218 | 873.30 | 1081.96 |
| chr15 | 81 216 | 882.69 | 1262.45 | 96 756 | 864.90 | 1059.69 |
| chr16 | 78 857 | 895.37 | 1145.81 | 106 180 | 727.97 | 850.96 |
| chr17 | 73 289 | 922.98 | 1107.88 | 108 155 | 726.65 | 750.73 |
| chr18 | 75 549 | 844.88 | 1033.47 | 80 168 | 890.12 | 973.92 |
| chr19 | 71 216 | 915.51 | 830.28 | 88 168 | 645.67 | 670.64 |
| chr20 | 61 670 | 871.31 | 1021.98 | 77 213 | 781.58 | 816.26 |
| chr21 | 41 415 | 837.22 | 1162.14 | 39 510 | 784.12 | 1218.17 |
| chr22 | 40 042 | 900.60 | 1281.27 | 53 014 | 664.33 | 967.76 |
| chrX | 164 182 | 833.67 | 945.72 | 159 189 | 949.14 | 975.38 |
| chrY | 33 924 | 774.60 | 750.19 | 28 450 | 908.62 | 986.94 |
| TOT | 3 089 308 | | | 3 218 664 | | |
Number and density of MOUSE putative cRSSs for intragenic and extragenic regions
| Chr. | RSS12 count | RSS12 intrag. dens. | RSS12 extrag. dens. | RSS23 count | RSS23 intrag. dens. | RSS23 extrag. dens. |
|---|---|---|---|---|---|---|
| chr1 | 136 509 | 1426.42 | 1444.56 | 154 144 | 1175.16 | 1279.29 |
| chr2 | 127 417 | 1407.53 | 1426.40 | 148 619 | 1148.64 | 1222.91 |
| chr3 | 112 623 | 1408.35 | 1417.12 | 121 963 | 1213.85 | 1308.59 |
| chr4 | 108 463 | 1456.20 | 1434.87 | 124 708 | 1168.66 | 1247.96 |
| chr5 | 108 471 | 1495.78 | 1406.25 | 125 747 | 1126.75 | 1213.05 |
| chr6 | 105 388 | 1433.10 | 1418.73 | 120 445 | 1155.72 | 1241.37 |
| chr7 | 102 874 | 1449.90 | 1482.63 | 118 938 | 1123.63 | 1282.39 |
| chr8 | 92 561 | 1392.15 | 1423.27 | 105 324 | 1109.16 | 1250.80 |
| chr9 | 90 497 | 1375.52 | 1371.05 | 104 061 | 1112.28 | 1192.34 |
| chr10 | 95 577 | 1380.14 | 1360.09 | 102 615 | 1187.03 | 1266.81 |
| chr11 | 89 231 | 1310.35 | 1365.49 | 104 088 | 1103.98 | 1170.59 |
| chr12 | 87 244 | 1323.81 | 1389.87 | 96 912 | 1180.01 | 1251.21 |
| chr13 | 87 463 | 1396.70 | 1375.26 | 95 933 | 1175.18 | 1253.84 |
| chr14 | 79 458 | 1403.24 | 1575.61 | 98 595 | 1141.77 | 1269.79 |
| chr15 | 65 347 | 1440.15 | 1583.78 | 82 324 | 1150.06 | 1257.17 |
| chr16 | 62 698 | 1417.32 | 1568.14 | 77 027 | 1174.76 | 1276.42 |
| chr17 | 61 659 | 1408.09 | 1545.15 | 78 012 | 1093.06 | 1221.26 |
| chr18 | 58 031 | 1405.72 | 1564.20 | 71 989 | 1166.83 | 1260.92 |
| chr19 | 48 079 | 1429.58 | 1275.87 | 48 447 | 1158.45 | 1266.18 |
| chrX | 111 850 | 1472.87 | 1489.94 | 109 886 | 1414.72 | 1516.57 |
| chrY | 1906 | 1448.93 | 1343.42 | 1784 | 1596.39 | 1513.99 |
| TOT | 1 833 319 | | | 2 091 561 | | |
Figure 1. Sample tabular output from a cRSS ‘Chromosomal search’ database query on a region of human chr8 (nt. 128 817 497–128 822 855). Shown are the chromosome range, the organism, the type of cRSS requested and the putative signal sequences (7 in this example), each with its sequence, location, RIC score and a clickable link to UCSC.
Figure 2. Sample graphical output from a cRSS ‘Gene search’ query for the mouse gene Gnb2l1. Shown are the locations of 12-spacer (red) and 23-spacer (black) cRSSs that pass the RIC score threshold and map within a window encompassing the specified gene ± 1000 bp.
Figure 3. Sample tabular output of the ‘Analyse your own sequence’ section of RSSsite. The sequence was analysed with both the human and murine models, searching for cRSSs with 12- and 23-nucleotide spacers. The output reports the start and end positions, the strand and the RIC score. The last column indicates, according to the RIC thresholds, whether the sequence is a likely DNA break point.
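The last column described for Figure 3 amounts to comparing each RIC score with a threshold specific to the spacer type. A minimal Python sketch of that comparison follows. It assumes that higher (less negative) RIC scores indicate better candidates and that a hit passes when its score is at or above the threshold. The threshold values are placeholders for illustration only, not the values used by RSSsite, and the names `Hit`, `passes` and `annotate` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Mapping, Tuple

# Placeholder thresholds keyed by spacer type; NOT the values used by RSSsite.
EXAMPLE_THRESHOLDS = {12: -40.0, 23: -60.0}


@dataclass(frozen=True)
class Hit:
    start: int    # start position in the submitted sequence
    end: int      # end position in the submitted sequence
    strand: str   # "+" or "-"
    spacer: int   # 12 or 23
    ric: float    # RIC score (assumed: higher is better)


def passes(hit: Hit, thresholds: Mapping[int, float]) -> bool:
    """True if the hit's RIC score reaches the threshold for its spacer type."""
    return hit.ric >= thresholds[hit.spacer]


def annotate(hits: Iterable[Hit], thresholds: Mapping[int, float]) -> List[Tuple[Hit, str]]:
    """Pair each hit with a 'pass' or 'fail' label, like a final output column."""
    return [(h, "pass" if passes(h, thresholds) else "fail") for h in hits]


if __name__ == "__main__":
    # Made-up hits, used only to show the labelling.
    demo = [Hit(10, 37, "+", 12, -35.2), Hit(50, 88, "-", 23, -71.0)]
    for hit, label in annotate(demo, EXAMPLE_THRESHOLDS):
        print(hit.start, hit.end, hit.strand, hit.spacer, hit.ric, label)
```

Keeping the thresholds in a mapping passed as an argument makes it straightforward to substitute the values actually published for each model.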
Functionally tested sequences: putative cRSSs predicted, according to their RIC score, by the DnaGrab algorithm available at RSSsite
RSS12 sequences are in blue, RSS23 sequences are in pink.
CUT* sequences are those for which LMPCR revealed a cleavage product at the 5′-end of the RSS.