Shiao-Wei Huang, You-Yu Lin, En-Min You, Tze-Tze Liu, Hung-Yu Shu, Keh-Ming Wu, Shih-Feng Tsai, Chu-Fang Lo, Guang-Hsiung Kou, Gwo-Chin Ma, Ming Chen, Dongying Wu, Takashi Aoki, Ikuo Hirono, Hon-Tsen Yu.
Abstract
BACKGROUND: The black tiger shrimp (Penaeus monodon) is one of the most important aquaculture species in the world and represents the crustacean lineage, which has the greatest species diversity among marine invertebrates. Yet almost nothing is known about its genomic structure. To understand the organization and evolution of the P. monodon genome, a fosmid library consisting of 288,000 colonies was constructed, equivalent to 5.3-fold coverage of the 2.17 Gb genome. Approximately 11.1 Mb of fosmid end sequences (FESs) from 20,926 non-redundant reads, representing 0.45% of the P. monodon genome, were obtained for repetitive and protein-coding sequence analyses.
Year: 2011 PMID: 21575266 PMCID: PMC3124438 DOI: 10.1186/1471-2164-12-242
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
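The library figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes the usual definition of fold coverage (colonies × mean insert size ÷ genome size), which the record does not state explicitly; the exact FES total of 11,114,786 bp is the figure given in the notes to the microsatellite table.

```python
# Figures quoted in the abstract and the microsatellite table notes.
colonies = 288_000
genome_bp = 2.17e9          # 2.17 Gb genome
coverage = 5.3              # reported fold coverage
fes_total_bp = 11_114_786   # total FES length analyzed
fes_reads = 20_926          # non-redundant reads

# Assumption: coverage = colonies * mean insert size / genome size,
# so the mean insert size implied by the reported numbers is:
implied_insert_bp = coverage * genome_bp / colonies

# Mean length of a fosmid end sequence.
mean_read_bp = fes_total_bp / fes_reads

print(f"implied mean insert: {implied_insert_bp / 1000:.1f} kb")
print(f"mean FES read length: {mean_read_bp:.0f} bp")
```

Under that assumption the reported coverage implies a mean insert of roughly 40 kb and a mean end-sequence read of roughly 530 bp.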
Characterization of repeat types by RepeatMasker*
| Type | # Hits | Length (bp) | % Bases | # Hits | Length (bp) | % Bases |
|---|---|---|---|---|---|---|
| | 380 | 45,036 | 0.405% | 870 | 94,532 | 0.851% |
| Non-LTR elements | 154 | 14,584 | 0.131% | 151 | 16,570 | 0.149% |
| L2/CR1/Rex | 1 | 67 | 0.001% | 28 | 1,817 | 0.016% |
| R1/LOA/Jockey | 55 | 4,375 | 0.039% | 122 | 14,723 | 0.132% |
| RTE/Bov-B | - | - | - | 1 | 30 | 0.000% |
| LTR elements | 226 | 30,452 | 0.274% | 719 | 77,962 | 0.701% |
| Ty1/Copia | 3 | 238 | 0.002% | - | - | - |
| Gypsy/DIRS1 | 181 | 27,142 | 0.244% | 718 | 77,743 | 0.699% |
| | 53 | 4,302 | 0.039% | 154 | 14,817 | 0.133% |
| hobo-Activator | 3 | 363 | 0.003% | - | - | - |
| Others | 40 | 3,364 | 0.030% | 40 | 2,324 | 0.021% |
| | 48 | 4,816 | 0.043% | 5 | 309 | 0.003% |
| | 1,351 | 422,537 | 3.802% | 1,274 | 414,435 | 3.729% |
| | 10,282 | 858,898 | 7.728% | 9,756 | 807,927 | 7.269% |
| | 4,258 | 386,684 | 3.479% | 4,257 | 387,159 | 3.483% |
*The repeat databases of D. melanogaster and A. gambiae were used.
Characterization of microsatellites in the P. monodon genome
| Motif | Unit size (bp) | Counts | # FES | % FES | Bases (bp) | Bases (%) | Max length (bp) | Mean length (bp) | STD (bp) | Mean repeat number | RA (%) | RF (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | 1 | 159 | 158 | 0.76 | 7,467 | 0.07 | 376 | 46.96 | 35.63 | 46.96 | 0.81 | 1.88 |
| AC | 2 | 1153 | 1095 | 5.23 | 118,430 | 1.07 | 783 | 102.71 | 93.67 | 51.36 | 12.85 | 13.66 |
| AG | 2 | 1424 | 1395 | 6.67 | 198,119 | 1.78 | 763 | 139.13 | 104.88 | 69.56 | 21.50 | 16.87 |
| AT | 2 | 912 | 896 | 4.28 | 87,832 | 0.79 | 604 | 96.31 | 70.28 | 48.15 | 9.53 | 10.80 |
| CG | 2 | 91 | 91 | 0.43 | 3,995 | 0.04 | 142 | 43.90 | 10.94 | 21.95 | 0.43 | 1.08 |
| AAC | 3 | 125 | 114 | 0.54 | 10,390 | 0.09 | 457 | 83.12 | 65.03 | 27.71 | 1.13 | 1.48 |
| AAG | 3 | 265 | 255 | 1.22 | 27,823 | 0.25 | 720 | 104.99 | 111.32 | 35.00 | 3.02 | 3.14 |
| AAT | 3 | 915 | 881 | 4.21 | 130,973 | 1.18 | 751 | 143.14 | 121.98 | 47.71 | 14.21 | 10.84 |
| ACC | 3 | 111 | 108 | 0.52 | 7,814 | 0.07 | 584 | 70.40 | 77.28 | 23.47 | 0.85 | 1.32 |
| ACG | 3 | 69 | 67 | 0.32 | 5,163 | 0.05 | 184 | 74.83 | 32.98 | 24.94 | 0.56 | 0.82 |
| ACT | 3 | 182 | 181 | 0.86 | 13,809 | 0.12 | 659 | 75.87 | 72.91 | 25.29 | 1.50 | 2.16 |
| AGC | 3 | 169 | 163 | 0.78 | 8,098 | 0.07 | 159 | 47.92 | 19.98 | 15.97 | 0.88 | 2.00 |
| AGG | 3 | 333 | 321 | 1.53 | 25,140 | 0.23 | 422 | 75.50 | 68.45 | 25.17 | 2.73 | 3.95 |
| ATC | 3 | 545 | 536 | 2.56 | 50,116 | 0.45 | 596 | 91.96 | 66.43 | 30.65 | 5.44 | 6.46 |
| CCG | 3 | 68 | 55 | 0.26 | 5,940 | 0.05 | 564 | 87.35 | 140.92 | 29.12 | 0.64 | 0.81 |
| AAAC | 4 | 17 | 17 | 0.08 | 996 | 0.01 | 120 | 58.59 | 33.20 | 14.65 | 0.11 | 0.20 |
| AAAG | 4 | 89 | 88 | 0.42 | 7,128 | 0.06 | 249 | 80.09 | 31.49 | 20.02 | 0.77 | 1.05 |
| AAAT | 4 | 95 | 93 | 0.44 | 7,572 | 0.07 | 372 | 79.71 | 60.36 | 19.93 | 0.82 | 1.13 |
| AAGC | 4 | 9 | 9 | 0.04 | 707 | 0.01 | 348 | 78.56 | 101.09 | 19.64 | 0.08 | 0.11 |
| AAGG | 4 | 33 | 31 | 0.15 | 4,890 | 0.04 | 642 | 148.18 | 169.76 | 37.05 | 0.53 | 0.39 |
| AATG | 4 | 11 | 11 | 0.05 | 923 | 0.01 | 131 | 83.91 | 31.85 | 20.98 | 0.10 | 0.13 |
| ACAG | 4 | 46 | 46 | 0.22 | 7,787 | 0.07 | 639 | 169.28 | 155.00 | 42.32 | 0.85 | 0.55 |
| ACAT | 4 | 82 | 79 | 0.38 | 10,505 | 0.10 | 659 | 128.11 | 133.48 | 32.03 | 1.14 | 0.97 |
| ACGC | 4 | 18 | 18 | 0.09 | 1,250 | 0.01 | 140 | 69.44 | 33.17 | 17.36 | 0.14 | 0.21 |
| ACGG | 4 | 89 | 89 | 0.43 | 3,789 | 0.03 | 116 | 42.57 | 9.40 | 10.64 | 0.41 | 1.05 |
| ACTC | 4 | 34 | 34 | 0.16 | 5,470 | 0.05 | 727 | 160.88 | 157.78 | 40.22 | 0.59 | 0.40 |
| AGAT | 4 | 76 | 75 | 0.36 | 10,076 | 0.09 | 743 | 132.58 | 114.86 | 33.14 | 1.09 | 0.90 |
| AGCC | 4 | 134 | 134 | 0.64 | 17,294 | 0.16 | 330 | 129.06 | 43.55 | 32.26 | 1.88 | 1.59 |
| AGCG | 4 | 63 | 62 | 0.30 | 2,054 | 0.02 | 36 | 32.60 | 2.82 | 8.15 | 0.22 | 0.75 |
| AGGC | 4 | 62 | 62 | 0.30 | 5,970 | 0.05 | 415 | 96.29 | 71.97 | 24.07 | 0.65 | 0.74 |
| AGGG | 4 | 97 | 94 | 0.45 | 13,231 | 0.12 | 618 | 136.40 | 130.46 | 34.10 | 1.44 | 1.15 |
| ATCC | 4 | 6 | 6 | 0.03 | 432 | 0.00 | 170 | 72.00 | 57.96 | 18.00 | 0.05 | 0.07 |
| CCCG | 4 | 29 | 29 | 0.14 | 1,915 | 0.02 | 82 | 66.03 | 8.26 | 16.51 | 0.21 | 0.34 |
| AAAAC | 5 | 6 | 6 | 0.03 | 388 | 0.00 | 96 | 64.67 | 24.05 | 12.93 | 0.04 | 0.07 |
| AAAAG | 5 | 14 | 14 | 0.07 | 927 | 0.01 | 115 | 66.21 | 28.21 | 13.24 | 0.10 | 0.17 |
| AAAAT | 5 | 9 | 9 | 0.04 | 435 | 0.00 | 60 | 48.33 | 10.05 | 9.67 | 0.05 | 0.11 |
| AACCT | 5 | 131 | 126 | 0.60 | 41,414 | 0.37 | 705 | 316.14 | 205.30 | 63.23 | 4.49 | 1.55 |
| AAGAG | 5 | 24 | 24 | 0.11 | 3,946 | 0.04 | 676 | 164.42 | 167.47 | 32.88 | 0.43 | 0.28 |
| AAGGG | 5 | 12 | 12 | 0.06 | 1,930 | 0.02 | 414 | 160.83 | 143.66 | 32.17 | 0.21 | 0.14 |
| AATAT | 5 | 14 | 14 | 0.07 | 1,236 | 0.01 | 258 | 88.29 | 58.22 | 17.66 | 0.13 | 0.17 |
| AGAGG | 5 | 16 | 15 | 0.07 | 1,873 | 0.02 | 456 | 117.06 | 116.10 | 23.41 | 0.20 | 0.19 |
| AGGCG | 5 | 25 | 25 | 0.12 | 922 | 0.01 | 37 | 36.88 | 0.60 | 7.38 | 0.10 | 0.30 |
| AGGGG | 5 | 20 | 20 | 0.10 | 1,884 | 0.02 | 215 | 94.20 | 53.84 | 18.84 | 0.20 | 0.24 |
| AAAAAG | 6 | 13 | 13 | 0.06 | 1,703 | 0.02 | 437 | 131.00 | 119.28 | 21.83 | 0.19 | 0.15 |
| AAAAAT | 6 | 9 | 9 | 0.04 | 452 | 0.00 | 62 | 50.22 | 16.36 | 8.37 | 0.05 | 0.11 |
| AAACAC | 6 | 6 | 6 | 0.03 | 425 | 0.00 | 133 | 70.83 | 32.20 | 11.81 | 0.05 | 0.07 |
| AAAGAG | 6 | 26 | 26 | 0.12 | 4,633 | 0.04 | 705 | 178.19 | 156.46 | 29.70 | 0.50 | 0.31 |
| AAATAT | 6 | 6 | 4 | 0.02 | 184 | 0.00 | 59 | 30.67 | 16.68 | 5.11 | 0.02 | 0.07 |
| AAATGG | 6 | 6 | 3 | 0.01 | 218 | 0.00 | 79 | 36.33 | 23.55 | 6.06 | 0.02 | 0.07 |
| AACAAT | 6 | 7 | 7 | 0.03 | 625 | 0.01 | 119 | 89.29 | 22.26 | 14.88 | 0.07 | 0.08 |
| AACAGC | 6 | 25 | 25 | 0.12 | 1,052 | 0.01 | 55 | 42.08 | 5.61 | 7.01 | 0.11 | 0.30 |
| AAGAGC | 6 | 8 | 8 | 0.04 | 352 | 0.00 | 44 | 44.00 | 0.00 | 7.33 | 0.04 | 0.10 |
| AAGAGG | 6 | 30 | 30 | 0.14 | 4,223 | 0.04 | 554 | 140.77 | 124.39 | 23.46 | 0.46 | 0.36 |
| AAGGAG | 6 | 28 | 28 | 0.13 | 3,377 | 0.03 | 318 | 120.61 | 85.98 | 20.10 | 0.37 | 0.33 |
| AAGGGG | 6 | 23 | 23 | 0.11 | 2,123 | 0.02 | 195 | 92.30 | 55.98 | 15.38 | 0.23 | 0.27 |
| AATATT | 6 | 7 | 7 | 0.03 | 370 | 0.00 | 98 | 52.86 | 30.56 | 8.81 | 0.04 | 0.08 |
| AATGAT | 6 | 45 | 41 | 0.20 | 5,325 | 0.05 | 453 | 118.33 | 107.96 | 19.72 | 0.58 | 0.53 |
| ACACAT | 6 | 16 | 16 | 0.08 | 2,716 | 0.02 | 714 | 169.75 | 190.21 | 28.29 | 0.30 | 0.19 |
| ACACGC | 6 | 14 | 14 | 0.07 | 1,056 | 0.01 | 136 | 75.43 | 32.56 | 12.57 | 0.12 | 0.17 |
| ACACTC | 6 | 10 | 10 | 0.05 | 972 | 0.01 | 172 | 97.20 | 46.96 | 16.20 | 0.11 | 0.12 |
| ACAGAG | 6 | 13 | 13 | 0.06 | 2,486 | 0.02 | 733 | 191.23 | 196.79 | 31.87 | 0.27 | 0.15 |
| ACATAT | 6 | 15 | 13 | 0.06 | 1,737 | 0.02 | 367 | 115.80 | 104.02 | 19.30 | 0.19 | 0.18 |
| ACCATC | 6 | 11 | 11 | 0.05 | 852 | 0.01 | 203 | 77.45 | 66.27 | 12.91 | 0.09 | 0.13 |
| ACCTCC | 6 | 8 | 8 | 0.04 | 587 | 0.01 | 150 | 73.38 | 50.83 | 12.23 | 0.06 | 0.10 |
| ACGATG | 6 | 9 | 9 | 0.04 | 424 | 0.00 | 86 | 47.11 | 22.13 | 7.85 | 0.05 | 0.11 |
| ACTCTC | 6 | 6 | 6 | 0.03 | 752 | 0.01 | 229 | 125.33 | 84.80 | 20.89 | 0.08 | 0.07 |
| AGAGGG | 6 | 49 | 49 | 0.23 | 7,856 | 0.07 | 702 | 160.33 | 138.90 | 26.72 | 0.85 | 0.58 |
| AGCCGC | 6 | 10 | 10 | 0.05 | 642 | 0.01 | 175 | 64.20 | 43.00 | 10.70 | 0.07 | 0.12 |
| AGGATG | 6 | 6 | 6 | 0.03 | 929 | 0.01 | 509 | 154.83 | 181.98 | 25.81 | 0.10 | 0.07 |
| AGGGGG | 6 | 19 | 19 | 0.09 | 2,064 | 0.02 | 321 | 108.63 | 73.56 | 18.11 | 0.22 | 0.23 |
| CCCCCG | 6 | 66 | 25 | 0.12 | 838 | 0.01 | 58 | 12.70 | 5.66 | 2.12 | 0.09 | 0.78 |
Microsatellites in the P. monodon FESs are characterized by the number of hits (Counts), the number of FESs in which they occur (# FES), the percentage of the total FESs analyzed (% FES), the amount of repeat sequence as nucleotide length and as a percentage of the total sequence analyzed (11,114,786 bp), the maximum length, the mean length and its standard deviation, the mean repeat number, the relative abundance, and the relative frequency. In total, 71 microsatellite types have ≥6 hits in the 20,926 FESs; the remaining 91 microsatellite types, with 1-5 hits each, are omitted.
The relative abundance (RA) is the base-pairs comprised by each repeat class [Bases (bp)] divided by the total length of all microsatellites (921,720 bp) ×100.
The relative frequency (RF) is the hits of each repeat class [Counts] divided by the total hits of all microsatellites (8,441 hits) ×100.
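The derived columns of the microsatellite table follow directly from these definitions. The sketch below recomputes them for one row; the function name `summarize` is illustrative only, and the mean repeat number is assumed to be the mean length divided by the motif unit size, which matches the tabulated values.

```python
TOTAL_SSR_BP = 921_720    # total length of all microsatellites (bp)
TOTAL_SSR_HITS = 8_441    # total hits of all microsatellites

def summarize(unit_bp: int, counts: int, bases_bp: int) -> dict:
    """Recompute the derived columns for one microsatellite motif."""
    mean_length = bases_bp / counts
    return {
        "mean_length_bp": mean_length,
        # Assumption: mean repeat number = mean length / unit size.
        "mean_repeat_number": mean_length / unit_bp,
        "RA": bases_bp / TOTAL_SSR_BP * 100,    # relative abundance (%)
        "RF": counts / TOTAL_SSR_HITS * 100,    # relative frequency (%)
    }

# Row AC: unit 2 bp, 1153 hits, 118,430 bp.
ac = summarize(unit_bp=2, counts=1153, bases_bp=118_430)
print({name: round(value, 2) for name, value in ac.items()})
```

For the AC row this gives a mean length of 102.71 bp, a mean repeat number of 51.36, an RA of 12.85 and an RF of 13.66, in agreement with the table.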
Figure 1. Relative abundance by base-pair (a) and relative frequencies by loci (b) of the top 20 microsatellite classes present in the total fosmid end sequences. 142 SSRs: the remaining 142 simple sequence repeat motifs.
Survey of microsatellite distribution and mean lengths in various genomes
| Species | % genome (bp) | Density | Mean length (bp) | References |
|---|---|---|---|---|
| P. monodon | 8.30 | 1 per 1.32 kb | 109.2 | This study |
| | 0.31 | 1 per 9.56 kb | 29.6 | [ |
| Arthropoda | 0.54 | -- | -- | [ |
| | 1.30 | 1 per 1.876 kb | 25.6 | [ |
| Human | 1.0-3.0 | 1 per 6 kb | 19.0 | [ |
Data mostly from Drosophila melanogaster and 159 other Drosophila sp.
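The values reported for this study (8.30%, 1 per 1.32 kb, 109.2 bp) can be approximately recovered from the totals given with the microsatellite table. This is a sketch that assumes density is the total sequence analyzed divided by the number of microsatellite hits; the survey table does not spell out that definition.

```python
total_fes_bp = 11_114_786   # total sequence analyzed
total_ssr_bp = 921_720      # total length of all microsatellites
total_ssr_hits = 8_441      # total hits of all microsatellites

# Share of the analyzed sequence occupied by microsatellites.
pct_of_sequence = total_ssr_bp / total_fes_bp * 100

# Assumption: density = analyzed sequence per microsatellite locus.
kb_per_locus = total_fes_bp / total_ssr_hits / 1000

# Mean microsatellite length.
mean_length_bp = total_ssr_bp / total_ssr_hits

print(f"{pct_of_sequence:.2f}% of sequence")
print(f"1 per {kb_per_locus:.2f} kb")
print(f"mean length {mean_length_bp:.1f} bp")
```

The density and mean length match the table exactly; the percentage comes out within 0.01 of the reported 8.30, a difference that is presumably due to rounding.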
Figure 2. Length distribution of the 20 most frequently occurring microsatellite classes. L1 type: 12-20 bp; L2: 21-40 bp; L3: 41-60 bp; L4: 61-80 bp; L5: 81-100 bp; L6: 101-120 bp; L7: 121-140 bp; L8: 141-160 bp; L9: 161-180 bp; L10: 181-200 bp; L11: > 201 bp.
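The length classes in the Figure 2 caption amount to a simple binning rule. The sketch below is one way to express it; the function name is illustrative, and it assumes that a length of exactly 201 bp belongs to L11, since the caption leaves that boundary value unassigned.

```python
from collections import Counter
from typing import Optional

def length_class(length_bp: int) -> Optional[str]:
    """Map a microsatellite length to the L1..L11 classes of the caption."""
    if length_bp < 12:
        return None          # below the shortest class (12-20 bp)
    if length_bp <= 20:
        return "L1"
    if length_bp > 200:
        return "L11"         # assumption: 201 bp and longer
    # L2..L10 are 20 bp wide: 21-40 -> L2, ..., 181-200 -> L10.
    return f"L{(length_bp - 1) // 20 + 1}"

# Made-up lengths, used only to show the binning.
lengths = [12, 20, 21, 109, 200, 201, 783]
histogram = Counter(length_class(n) for n in lengths)
print(sorted(histogram.items()))
```

Counting classes with a `Counter` gives the per-class totals that a length distribution plot of this kind would display.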
Figure 3. Comparison between the relative abundance (%) of dinucleotide repeats (a) and trinucleotide repeats (b) in genomic DNA (FESs) and in transcribed regions (ESTs).
Examples of shrimp genes known to contain a very long stretch of microsatellites
| Gene [species] | Gene function | Sequence | Position | Role of microsatellites | Reference |
|---|---|---|---|---|---|
| | Anti-virus | | 5'-promoter | Negative regulatory element | [ |
| Prophenoloxidase- | Innate immunity | | 3'UTR | n.d. | [ |
| Prophenoloxidase- | Innate immunity | | 3'UTR | n.d. | [ |
| Heat shock cognate 70 [ | Molecular chaperon; | | 5'-promoter | n.d. | [ |
| 5-HT1 receptor [ | Serotonin receptor; G-protein coupled | | Coding region | poly(G) tract | [ |
n.d.: not determined
Summary of the 103 novel repetitive elements in the P. monodon genome
| (i) WSSV-related (33 PREs) | |||||||||
|---|---|---|---|---|---|---|---|---|---|
| FAM9_15-44 | 470 | 24.000 | 45.88 | 274.988 | 4 | 21 bp-repeat ×19.4, | wsv134 [WSSV] | 3.0E-13 | 31% (44/140) |
| | | | | | | | wsv115 [WSSV] | 5.0E-126 | 30% (266/876) |
| | | | | | | | wsv119 [WSSV] | 4.0E-25 | 22% (162/717) |
| | | | | | | | wsv216 [WSSV] | 1.0E-41 | 23% (291/1261) |
| | | | | | | | wsv220 [WSSV] | 1.0E-37 | 23% (178/747) |
| | | | | | | | Hypothetical protein [ | 5.0E-19 | 34% (103/297) |
| FAM152 | 461 | 22.978 | 44.47 | 266.056 | 4 | 93 bp-repeat ×2.0 | wsv447 [WSSV] | 3.0E-138 | 28% (415/1434) |
| | | | | | | | wsv332 [WSSV] | 2.0E-59 | 25% (212/820) |
| | | | | | | | wsv327 [WSSV] | 2.0E-38 | 24% (217/902) |
| | | | | | | | wsv282 [WSSV] | 7.0E-63 | 43% (168/389) |
| | | | | | | | wsv285 [WSSV] | 3.0E-38 | 22% (186/829) |
| FAM2 | 434 | 19.895 | 47.82 | 255.227 | 4 | 97 bp-repeat ×2.0, | wsv360 [WSSV] | 0 | 30% (899/2911) |
| FAM1 | 413 | 17.987 | 40.84 | 237.935 | 3 | 60 bp-repeat ×2.1; | wsv306 [WSSV] | 1.0E-50 | 32% (121/376) |
| | | | | | | | wsv269 [WSSV] | 2.0E-31 | 25% (99/388) |
| FAM31&207 | 364 | 24.078 | 44.71 | 206.653 | 4 | 333 bp-repeat ×1.9, | wsv343 [WSSV] | 2.0E-81 | 28% (187/660) |
| | | | | | | | Inhibitor of Apoptosis Protein [ | 1.0E-115 | 61% (207/337) |
| | | | | | | | Innexin 3 [ | 3.0E-55 | 40% (104/258) |
| FAM87 | 242 | 12.067 | 43.76 | 139.605 | 4 | | wsv514 [WSSV] | 0 | 37% (702/1881) |
| FAM24 | 209 | 9.553 | 47.93 | 112.279 | 2 | | wsv192 [WSSV] | 2.0E-108 | 29% (310/1042) |
| | | | | | | | wsv209 [WSSV] | 0 | 36% (595/1629) |
| FAM5 | 200 | 8.457 | 46.61 | 113.975 | 1 | | wsv440 [WSSV] | 2.0E-54 | 28% (180/627) |
| | | | | | | | wsv433 [WSSV] | 6.0E-163 | 39% (365/928) |
| | | | | | | | wsv427 [WSSV] | 3.0E-20 | 27% (91/332) |
| FAM28 | 131 | 6.481 | 43.94 | 74.195 | 0 | | wsv325 [WSSV] | 6.0E-60 | 33% (165/493) |
| | | | | | | | wsv271 [WSSV] | 5.0E-13 | 19% (120/608) |
| FAM43 | 127 | 7.919 | 47.99 | 70.954 | 2 | | wsv011 [WSSV] | 3.0E-86 | 27% (332/1190) |
| FAM255 | 119 | 8.153 | 41.42 | 66.930 | 0 | | wsv514 [WSSV] | 1.0E-77 | 27% (294/1071) |
| FAM124 | 116 | 4.961 | 48.90 | 63.455 | 0 | | wsv026 [WSSV] | 4.0E-90 | 39% (231/592) |
| FAM179 | 88 | 4.819 | 52.25 | 51.657 | 0 | | wsv035 [WSSV] | 1.0E-134 | 35% (221/616) |
| | | | | | | | wsv037 [WSSV] | 1.0E-94 | 35% (204/581) |
| FAM361* | 84 | 5.845 | 43.92 | 47.611 | 2 | | wsv209 [WSSV] | 3.0E-80 | 30% (213/697) |
| FAM259* | 82 | 7.353 | 42.46 | 45.564 | 7 | | wsv306 [WSSV] | 1.0E-33 | 27% (119/427) |
| | | | | | | | wsv332 [WSSV] | 2.0E-12 | 22% (82/362) |
| FAM137 | 71 | 2.639 | 43.08 | 39.233 | 2 | | wsv303 [WSSV] | 4.0E-97 | 32% (266/819) |
| FAM158 | 68 | 3.912 | 49.85 | 34.857 | 2 | | wsv289 [WSSV] | 2.0E-19 | 22% (144/634) |
| FAM29 | 67 | 2.661 | 43.22 | 35.159 | 0 | | wsv423 [WSSV] | 4.0E-81 | 32% (195/594) |
| FAM197 | 52 | 3.181 | 44.39 | 27.452 | 0 | | wsv289 [WSSV] | 1.0E-20 | 31% (59/188) |
| FAM483 | 41 | 3.624 | 44.40 | 21.374 | 0 | | wsv360 [WSSV] | 7.0E-30 | 25% (129/511) |
| FAM224&1875 | 41 | 2.462 | 42.12 | 19.675 | 0 | | wsv360 [WSSV] | 4.0E-40 | 32% (119/368) |
| FAM411 | 41 | 1.739 | 49.17 | 22.235 | 0 | | wsv037 [WSSV] | 1.0E-56 | 35% (151/425) |
| FAM541 | 38 | 3.175 | 37.76 | 21.559 | 0 | | wsv343 [WSSV] | 4.0E-36 | 25% (126/490) |
| FAM209 | 34 | 3.429 | 46.60 | 18.964 | 0 | | wsv035 [WSSV] | 1.0E-95 | 31% (309/986) |
| FAM346 | 34 | 3.115 | 44.37 | 18.965 | 0 | | wsv011 [WSSV] | 1.0E-42 | 32% (113/345), |
| FAM138 | 33 | 2.153 | 36.79 | 16.305 | 0 | | wsv433 [WSSV] | 4.0E-71 | 31% (205/655) |
| FAM177 | 31 | 2.105 | 37.62 | 13.970 | 0 | | wsv433 [WSSV] | 2.0E-41 | 31% (147/472) |
| FAM56 | 26 | 4.241 | 39.85 | 14.764 | 0 | 15 bp-repeat ×2.5, 18 bp-repeat ×2.2 | wsv115 [WSSV] | 2.0E-25 | 29% (117/402) |
| FAM472 | 26 | 1.930 | 45.70 | 13.534 | 0 | | wsv447 [WSSV] | 6.0E-42 | 25% (153/610) |
| FAM156_3,4 | 26 | 1.752 | 41.44 | 14.790 | 0 | | wsv139 [WSSV] | 1.0E-38 | 30% (120/388) |
| FAM838 | 26 | 1.616 | 41.89 | 11.572 | 0 | | wsv360 [WSSV] | 4.0E-52 | 31% (169/538) |
| FAM574 | 25 | 3.916 | 40.70 | 14.908 | 0 | | wsv026 [WSSV] | 6.0E-50 | 34% (178/522) |
| FAM139 | 24 | 2.431 | 40.93 | 13.449 | 2 | | wsv360 [WSSV] | 2.0E-14 | 22% (68/297) |
| FAM9_1-14 | 392 | 6.853 | 43.02 | 190.147 | 69 | 177 bp-repeat ×1.9 | LINE/I | 2.6E-144 | Including a previously described retrotransposon (Contig T; GB# EE724330) demonstrated to be down-regulated under hypoxic and hyperthermic stress [ |
| FAM185 | 350 | 5.837 | 40.86 | 280.277 | 32 | | LINE/I | 3.8E-51 | Including a sex-linked AFLP marker (E03M60M72.8) [ |
| FAM309 | 222 | 7.612 | 40.95 | 126.973 | 1 | | Penelope | 4.1E-40 | |
| FAM189_7-16 | 201 | 6.250 | 44.43 | 112.035 | 2 | | Penelope | 1.3E-43 | |
| FAM75_17-25, 35-36,39-40 | 163 | 4.207 | 50.8 | 82.963 | 11 | | LINE/Jockey | 4.2E-46 | Including a previously described retrotransposon (Contig X; GB# EE724334) which is differentially expressed under various stress [ |
| FAM18* | 143 | 4.621 | 36.57 | 76.777 | 1 | | Penelope | 1.3E-47 | |
| FAM75_1-9, 11-16 | 118 | 4.148 | 57.11 | 51.196 | 23 | 30 bp-repeat ×2.1 | LTR/Gypsy | 1.8E-25 | |
| FAM9_45-54 | 109 | 4.024 | 52.81 | 45.069 | 41 | | LINE/RTE-BovB | 1.4E-92 | Including 2 previously described retrotransposons- (1) GB# DQ228358 [ |
| FAM189_1-6 | 64 | 5.041 | 43.42 | 34.129 | 0 | 16 bp-repeat ×2.1 | Penelope | 2.0E-41 | |
| FAM380 | 58 | 3.330 | 40.33 | 28.528 | 0 | 30 bp-repeat ×2.5 | Penelope | 1.2E-39 | |
| FAM393 | 52 | 3.321 | 44.11 | 28.283 | 19 | | LTR/Gypsy | 1.7E-14 | |
| FAM1285 | 21 | 2.233 | 40.17 | 10.649 | 0 | | LINE/Jockey | 1.2E-30 | |
| FAM498 | 26 | 1.928 | 48.34 | 10.537 | 0 | | LINE/Jockey | 4.0E-10 | Containing a |
| FAM1106 | 24 | 0.828 | 38.29 | 7.719 | 12 | | LINE/RTE-BovB | 2.7E-30 | Including a previously described retrotransposon (ED255; GB# EE724266) which is up-regulated under hyperthermic stress [ |
| FAM145* | 50 | 3.792 | 37.61 | 29.848 | 0 | 21 bp-repeat ×2.0, | dUTPase isoform 1 [ | 5.0E-53 | 69% (96/139) |
| FAM327 | 54 | 6.411 | 37.65 | 26.625 | 1 | 29 bp-repeat ×1.9 | Heat Shock Protein 70 [ | 4.0E-71 | 75% (136/180) |
| FAM142 | 31 | 2.598 | 42.73 | 17.365 | 0 | 40 bp-repeat ×2.3 | Heat Shock Protein 70 [ | 2.0E-125 | 56% (251/441) |
| FAM46* | 26 | 2.712 | 37.21 | 12.028 | 1 | | Inhibitor of Apoptosis Protein [ | 9.0E-31 | 34% (100/294) |
| FAM575 | 35 | 1.351 | 42.78 | 11.594 | 2 | 48 bp-repeat ×11.4 | hCG1645741 [ | 3.0E-15 | 30% (100/328) |
| FAM42 | 419 | 0.611 | 54.83 | 168.921 | 0 | 15 bp-repeat ×2.0 | |||
| FAM72 | 309 | 1.306 | 61.72 | 130.598 | 2 | ||||
| FAM121 | 153 | 1.368 | 56.07 | 90.081 | 0 | 472 bp-repeat ×2.3 | |||
| FAM80 | 158 | 0.618 | 73.79 | 56.266 | 0 | ||||
| FAM75_26-34, 37-38 | 88 | 2.565 | 53.33 | 43.871 | 0 | 190 bp-repeat ×2.6; (GGAGAGAGGGGA) ×2.3 | |||
| FAM198 | 84 | 7.608 | 39.34 | 47.826 | 0 | 192 bp-repeat ×1.8 | |||
| FAM205 | 83 | 0.400 | 49.00 | 16.051 | 0 | ||||
| FAM345 | 80 | 0.284 | 47.89 | 17.102 | 0 | (GTGTTGGTTTGTGT) ×2.2 | |||
| FAM67&606&707 | 75 | 3.578 | 42.51 | 38.079 | 3 | 24 bp-repeat ×2.0; 273 bp-repeat ×1.9; 15 bp-repeat ×2.0 | |||
| FAM328 | 66 | 2.892 | 39.42 | 37.000 | 0 | 21 bp-repeat ×2.1 | |||
| FAM79 | 63 | 0.123 | 73.17 | 7.475 | 0 | ||||
| FAM19 | 55 | 4.948 | 47.15 | 32.320 | 0 | 13 bp-repeat ×3.1 | |||
| FAM6 | 57 | 2.133 | 46.51 | 28.644 | 0 | ||||
| FAM64* | 57 | 2.958 | 39.69 | 27.477 | 0 | ||||
| FAM199 | 51 | 2.362 | 41.19 | 24.646 | 0 | ||||
| FAM156_1,2,5 | 50 | 2.180 | 43.58 | 26.492 | 0 | ||||
| FAM392 | 50 | 0.904 | 31.42 | 14.760 | 0 | ||||
| FAM188 | 49 | 0.990 | 48.59 | 12.318 | 17 | 302 bp-repeat ×1.9 | |||
| FAM348 | 46 | 0.244 | 72.54 | 9.579 | 0 | ||||
| FAM165 | 45 | 0.944 | 34.32 | 25.341 | 0 | 129 bp-repeat ×6.7 | |||
| FAM382* | 44 | 1.995 | 41.60 | 24.050 | 0 | ||||
| FAM153 | 36 | 1.895 | 42.90 | 16.403 | 0 | ||||
| FAM245 | 36 | 1.626 | 47.72 | 17.019 | 0 | ||||
| FAM578 | 35 | 3.552 | 45.02 | 20.560 | 0 | ||||
| FAM57 | 33 | 3.366 | 39.51 | 18.344 | 4 | ||||
| FAM453 | 33 | 0.802 | 43.14 | 13.902 | 0 | 59 bp-repeat ×2.0 | |||
| FAM172 | 32 | 4.035 | 40.57 | 18.414 | 0 | ||||
| FAM632 | 32 | 3.077 | 38.12 | 17.055 | 1 | ||||
| FAM839 | 32 | 1.896 | 30.96 | 20.315 | 2 | 18 bp-repeat ×1.9; 61 bp-repeat ×2.1; 215 bp-repeat ×2.1 | |||
| FAM120 | 31 | 2.782 | 38.75 | 18.864 | 0 | (GCCTGA) ×5.8, (CTGCGG) ×4.2 | |||
| FAM390 | 31 | 0.563 | 38.54 | 11.570 | 0 | ||||
| FAM369 | 30 | 1.579 | 41.36 | 15.870 | 0 | ||||
| FAM816 | 27 | 3.056 | 36.81 | 15.590 | 0 | 87 bp-repeat ×6.7 | |||
| FAM22 | 27 | 3.767 | 42.39 | 14.516 | 0 | 39 bp-repeat ×2.2 | |||
| FAM708 | 26 | 3.297 | 37.03 | 17.053 | 0 | 37 bp-repeat ×2.1; 34 bp-repeat ×1.9; 17 bp-repeat ×1.9 × 2 sites | |||
| FAM244 | 25 | 1.269 | 39.48 | 11.190 | 2 | ||||
| FAM405 | 25 | 0.963 | 42.37 | 11.828 | 0 | ||||
| FAM580 | 24 | 2.246 | 37.62 | 13.474 | 0 | 183 bp-repeat ×3.1; 34 bp-repeat ×1.9 | |||
| FAM266 | 24 | 0.486 | 51.23 | 5.216 | 0 | ||||
| FAM440 | 23 | 2.100 | 39.71 | 12.938 | 0 | 27 bp-repeat ×2.9; 162 bp-repeat ×2.6 | |||
| FAM1203 | 23 | 1.298 | 41.68 | 11.308 | 0 | ||||
| FAM1696 | 23 | 0.495 | 37.37 | 7.108 | 0 | ||||
| FAM569 | 22 | 2.189 | 39.10 | 11.903 | 3 | ||||
| FAM296 | 22 | 2.169 | 39.47 | 11.409 | 0 | ||||
| FAM1318 | 21 | 2.597 | 34.39 | 11.053 | 0 | ||||
| FAM146 | 21 | 2.557 | 44.66 | 12.539 | 0 | ||||
| FAM278 | 21 | 2.324 | 38.60 | 11.618 | 0 | ||||
| FAM282 | 21 | 1.875 | 48.05 | 10.369 | 0 | ||||
| FAM277 | 21 | 1.705 | 36.60 | 9.807 | 0 | ||||
| FAM740 | 20 | 1.293 | 31.01 | 7.472 | 0 | ||||
| FAM696 | 20 | 0.755 | 29.27 | 11.730 | 1 | 86 bp-repeat ×8.6 | |||
The consensus sequences of the 103 PREs can be downloaded from the Penaeus Genome Database http://sysbio.iis.sinica.edu.tw/page/others.php?news=0.
The seven PREs (out of 103) that have homologous sequences in the Marsupenaeus japonicus genome are marked with asterisks (*).
BlastN search against the P. monodon EST dataset (PmTwN) in the Penaeus Genome Database; cut-off value: 1E-40; matched length: 200 bp; identity: 85%
BlastX search against the nr database; cut-off value: 1E-10
Types of transposable element defined by protein-based RepeatMasker; cut-off value: 1E-10
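The cut-offs listed in these notes can be read as simple filters on search hits. The sketch below is illustrative only: the `Hit` record and function names are hypothetical, the hit values are made up, and it assumes that the E-value is an upper bound while matched length and identity are lower bounds, all inclusive.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    """Hypothetical minimal record of one search hit."""
    subject: str
    evalue: float
    aligned_bp: int
    identity_pct: float

def passes_est_filter(hit: Hit) -> bool:
    """BlastN vs. EST dataset: E-value 1E-40, matched length 200 bp, identity 85%."""
    return hit.evalue <= 1e-40 and hit.aligned_bp >= 200 and hit.identity_pct >= 85.0

def passes_protein_filter(hit: Hit) -> bool:
    """BlastX vs. nr and protein-based RepeatMasker: E-value 1E-10."""
    return hit.evalue <= 1e-10

# Made-up hits, for illustration only.
good = Hit("est_0001", 1e-60, 350, 92.0)
short = Hit("est_0002", 1e-60, 150, 99.0)
print(passes_est_filter(good), passes_est_filter(short))
```

A hit with a strong E-value but a matched length below 200 bp, like the second example, would be rejected under this reading of the EST criteria.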
Summary of repetitive sequences in the P. monodon genome
| Type | RepeatMasker length (bp) | RECON length (bp) | Total length (bp, %) |
|---|---|---|---|
| Non-LTRs | 16,570 | 627,361 | 643,931 (5.79%) |
| Penelope | 0 | 378,442 | 378,442 (3.40%) |
| LTRs | 77,962 | 79,479 | 157,441 (1.42%) |
| | 14,817 | 0 | 14,817 (0.13%) |
| | 0 | 2,399,849 | 2,399,849 (21.59%) |
| | 309 | 1,285,334 | 1,285,643 (11.57%) |
| | 807,927 | 0 | 807,927 (7.27%) |
| Total | | | 5,688,050 |
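The totals in this summary can be checked by adding the RepeatMasker and RECON lengths for each row and expressing the sum as a percentage of the sequence analyzed. The sketch below assumes the percentages are relative to the 11,114,786 bp of FESs; rows whose type label did not survive in this record are given placeholder names.

```python
TOTAL_FES_BP = 11_114_786   # total sequence analyzed

# (type, RepeatMasker bp, RECON bp); "unlabeled" rows have no type label in the table.
rows = [
    ("Non-LTRs", 16_570, 627_361),
    ("Penelope", 0, 378_442),
    ("LTRs", 77_962, 79_479),
    ("unlabeled 1", 14_817, 0),
    ("unlabeled 2", 0, 2_399_849),
    ("unlabeled 3", 309, 1_285_334),
    ("unlabeled 4", 807_927, 0),
]

# Per-row total and its share of the analyzed sequence.
totals = {name: rm + recon for name, rm, recon in rows}
percent = {name: bp / TOTAL_FES_BP * 100 for name, bp in totals.items()}
grand_total = sum(totals.values())

for name in totals:
    print(f"{name}: {totals[name]:,} bp ({percent[name]:.2f}%)")
print(f"grand total: {grand_total:,} bp")
```

The row sums and the grand total of 5,688,050 bp agree with the table, and that total corresponds to about half of the sequence analyzed.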