Hirohide Uenishi, Takeya Morozumi, Daisuke Toki, Tomoko Eguchi-Ogawa, Lauretta A Rund, Lawrence B Schook.
Abstract
BACKGROUND: Along with the draft sequencing of the pig genome, which has been completed by an international consortium, collection of the nucleotide sequences of genes expressed in various tissues and determination of entire cDNA sequences are necessary for investigations of gene function. The sequences of expressed genes are also useful for genome annotation, which is important for isolating the genes responsible for particular traits.
Year: 2012 PMID: 23150988 PMCID: PMC3499286 DOI: 10.1186/1471-2164-13-581
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Pig cDNA libraries, ESTs, and completely sequenced cDNA clones
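| Library | Method | Tissue | Origin a | Vector | ESTs in contigs | ESTs in singlets | Total ESTs | cDNA clones completely sequenced | cDNA clones mapped on genome | Loci | |
|---|---|---|---|---|---|---|---|---|---|---|---|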
| ADR01 | Oligo-capping | Adrenal gland | LWD (LWD2) | pCMVFL3 | 6490 | 3188 | 9678 | 901 | 861 | 782 | |
| AMP01 | SMART | Alveolar macrophage | LWD (LWD7) | pDNR-LIB | 5459 | 3704 | 9163 | 1416 | 1345 | 930 | |
| BFLT1 | Oligo-capping | Brain (frontal lobe) | Duroc (2-14C) | pME18S | 8104 | 5678 | 13,782 | 991 | 953 | 859 | |
| BKFL1 | SMART | Backfat | Landrace (L2) | pDNR-LIB | 1557 | 7920 | 9477 | 453 | 418 | 404 | |
| BMWN1 | Vector-capping | Bone marrow | NIBS miniature | pGCAP10 | 5981 | 2492 | 8473 | 148 | 133 | 124 | |
| CBLT1 | Vector-capping | Cerebellum | Duroc (2-14C) | pGCAP10 | 6082 | 3054 | 9136 | 0 | 0 | 0 | |
| CLNT1 | Oligo-capping | Colon | Duroc (2-14C) | pME18S | 6580 | 6782 | 13,362 | 537 | 512 | 365 | |
| DCI01 | SMART | Immature dendritic cells | Landrace (L1) | pDNR-LIB | 5953 | 4486 | 10,439 | 887 | 841 | 748 | |
| HTMT1 | Vector-capping | Hypothalamus | Duroc (2-14C) | pGCAP10 | 8063 | 5278 | 13,341 | 1663 | 1474 | 1264 | |
| ILNT1 | Vector-capping | Inguinal lymph node | Duroc (2-14C) | pGCAP10 | 6321 | 2658 | 8979 | 0 | 0 | 0 | |
| ITT01 | Oligo-capping | Intestine | LWD (LWD2) | pCMVFL3 | 7272 | 2475 | 9747 | 1265 | 1206 | 1037 | |
| KDN01 | Oligo-capping | Kidney | LWD (LWD8) | pME18S | 5873 | 3235 | 9108 | 748 | 723 | 663 | |
| LNG01 | Oligo-capping | Lung | LWD (LWD3) | pCMVFL3 | 5186 | 3859 | 9045 | 1331 | 1250 | 1061 | |
| LVR01 | Oligo-capping | Liver | LWD (LWD4) | pCMVFL3 | 7199 | 1815 | 9014 | 779 | 741 | 653 | |
| LVRM1 | Oligo-capping | Liver | Meishan | pCMVFL3 | 13,881 | 5051 | 18,932 | 1844 | 1760 | 1372 | |
| MLN01 | Oligo-capping | Mesenteric lymph node | LWD (LWD2) | pCMVFL3 | 6443 | 3250 | 9693 | 1176 | 1099 | 902 | |
| MLTL1 | SMART | Longissimus muscle | Landrace (L2) | pDNR-LIB | 3577 | 4892 | 8469 | 413 | 388 | 292 | |
| OVR01 | Oligo-capping | Ovary | LWD (LWD1) | pCMVFL3 | 6537 | 2828 | 9365 | 1416 | 1356 | 1226 | |
| OVRM1 | Oligo-capping | Ovary | Meishan | pCMVFL3 | 12,471 | 7071 | 19,542 | 3309 | 3149 | 2665 | |
| OVRT1 | Oligo-capping | Ovary | Duroc (2-14C) | pME18S | 9516 | 4340 | 13,856 | 819 | 790 | 728 | |
| PBL01 | Oligo-capping | Peripheral blood lymphocytes | LWD (LWD5) | pCMVFL3 | 6652 | 3262 | 9914 | 957 | 908 | 732 | |
| PCT01 | Oligo-capping | Placenta | LWD (LWD9) | pME18S | 2175 | 1115 | 3290 | 161 | 150 | 142 | |
| PST01 | Oligo-capping | Prostate | LWD (LWD10) | pME18S | 6813 | 2329 | 9142 | 691 | 654 | 596 | |
| PTG01 | Oligo-capping | Pituitary gland | LWD (LWD4) | pCMVFL3 | 4281 | 5628 | 9909 | 864 | 826 | 790 | |
| SKNB1 | Oligo-capping | Skin | Berkshire | pME18S | 4894 | 3363 | 8257 | 687 | 630 | 534 | |
| SMG01 | Oligo-capping | Submaxillary gland | LWD (LWD2) | pCMVFL3 | 6944 | 2680 | 9624 | 458 | 430 | 361 | |
| SPL01 | Oligo-capping | Spleen | LWD (LWD1) | pCMVFL3 | 6793 | 2811 | 9604 | 1457 | 1397 | 1207 | |
| SPLT1 | Vector-capping | Spleen | Duroc (2-14C) | pGCAP10 | 6037 | 2734 | 8771 | 0 | 0 | 0 | |
| TCH01 | Oligo-capping | Trachea | LWD (LWD3) | pCMVFL3 | 5151 | 3658 | 8809 | 1412 | 1345 | 1087 | |
| TES01 | Oligo-capping | Testis | LWD (LWD6) | pME18S | 7112 | 2962 | 10,074 | 697 | 669 | 466 | |
| THY01 | Oligo-capping | Thymus | LWD (LWD1) | pCMVFL3 | 7586 | 3704 | 11,290 | 2158 | 2066 | 1620 | |
| UTR01 | Oligo-capping | Uterus | LWD (LWD1) | pCMVFL3 | 6796 | 2626 | 9422 | 1441 | 1356 | 1180 | |
| Total | | | | | 209,779 | 120,928 | 330,707 | 31,079 | 29,430 | 13,894 | (2993) b |
Thirty-two cDNA libraries constructed from pig tissues and cell populations were used to generate expressed sequence tags (ESTs). Twenty-three libraries were constructed by using the oligo-capping method [21], five by using the vector-capping method [20], and four by using the SMART method (Clontech, Palo Alto, CA, USA) [22]. ESTs in contigs are shown, as are ESTs that had a ≥100-bp stretch of Phred quality value ≥20 and were not included in contigs (“In singlets”). Numbers of cDNA clones that were completely sequenced (31,079 in total) are also shown. Completely sequenced cDNA clones derived from pigs cloned from a female subjected to genome sequencing by the International Swine Genome Sequencing Consortium are classified by library.
a Origins are indicated as pig breeds or lines. Different animals in the same breed are designated as described in parentheses. LWD is (Landrace × Large White) × Duroc. Duroc (2-14C) designates a Duroc individual cloned from a female pig subjected to draft genome sequencing [5-7].
b Loci repeated among clones derived from different libraries were removed.
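The singlet criterion in the table legend (a stretch of at least 100 bp with Phred quality value of at least 20) can be illustrated with a minimal sketch. The function name and the assumption that the stretch must be uninterrupted are ours; the legend does not say how the stretch was computed in the original pipeline.

```python
def has_high_quality_stretch(quals, min_len=100, min_q=20):
    """Return True if `quals` holds a run of at least `min_len`
    consecutive Phred values that are all >= `min_q`.

    Assumption: the stretch must be contiguous (not stated in the source).
    """
    run = 0
    for q in quals:
        # Extend the current run on a good base, reset it on a poor one.
        run = run + 1 if q >= min_q else 0
        if run >= min_len:
            return True
    return False


# A read with exactly 100 good bases passes; one low-quality base
# in the middle of 199 bases breaks the run into two 99-base pieces.
good_read = [30] * 100
broken_read = [30] * 99 + [10] + [30] * 99
print(has_high_quality_stretch(good_read), has_high_quality_stretch(broken_read))
```

A single pass with a running counter is enough here, because only the existence of one qualifying stretch matters, not its position.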
Figure 1. Distribution of EST reads in contigs. Contigs are ordered by the number of expressed sequence tags (ESTs) they contain. There were 6740 contigs carrying two ESTs and 1084 contigs carrying more than 30 ESTs. There were 15,432 contigs consisting of fewer than 20 ESTs.
Correspondence to mammalian genes and estimated efficiencies of cloning of start codons of EST assemblies
| Species | Gene IDs (without HomoloGene ID) | HomoloGene IDs | Assemblies matching protein sequences | Contigs | Singlets | Assemblies estimated to contain start codons | Contigs | Singlets |
|---|---|---|---|---|---|---|---|---|
| Human | 13,691 (754) | 12,911 | 64,011 | 12,056 | 51,955 | 47,229 | 9,635 | 37,594 |
| Mouse | 12,955 (730) | 12,137 | 63,444 | 12,028 | 51,416 | 45,539 | 9,588 | 35,951 |
| Cattle | 13,445 (1935) | 11,341 | 63,718 | 12,035 | 51,683 | 47,118 | 9,634 | 37,484 |
| Dog | 12,293 (763) | 11,410 | 62,815 | 11,871 | 50,944 | 37,193 | 8,090 | 29,103 |
| Pig | 14,275 | | 63,169 | 11,917 | 51,252 | 46,063 | 9,396 | 36,667 |
Numbers of genes that had unique NCBI Gene IDs and corresponded to contigs and singlets generated by assembly of expressed sequence tags (ESTs) are indicated. Also shown are the numbers that had unique Gene IDs in the NCBI HomoloGene database (a database of orthologs among species) and corresponded to the contigs and singlets generated. Numbers in parentheses indicate numbers of gene IDs that had no corresponding HomoloGene IDs. HomoloGene IDs in pigs are not indicated, because there is no HomoloGene ID database for pig genes.
EST assemblies were estimated to contain start codons if the length upstream of the matches (BLAST score >50) in the assemblies was greater than that between the start base of the coding sequence and the matched region of the corresponding gene. Numbers of assemblies (contigs and singlets) corresponding to protein sequences in humans, mice, cattle, dogs, and pigs are also shown.
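The start-codon rule described above can be written as a single comparison. The following is a minimal sketch, assuming 0-based coordinates and that both lengths are expressed in the same unit (nucleotides); the function and parameter names are hypothetical and not taken from the original analysis.

```python
def likely_contains_start_codon(assembly_match_start: int,
                                gene_match_start: int,
                                gene_cds_start: int) -> bool:
    """Apply the legend's rule to one assembly/gene alignment.

    assembly_match_start: offset of the matched region in the assembly,
        i.e. the length of assembly sequence upstream of the match.
    gene_match_start: offset of the matched region in the reference gene.
    gene_cds_start: offset of the first base of the coding sequence
        in the reference gene.
    All offsets are assumed to be 0-based and in nucleotides.
    """
    upstream_in_assembly = assembly_match_start
    cds_start_to_match = gene_match_start - gene_cds_start
    # The legend requires "greater than", so equality does not qualify.
    return upstream_in_assembly > cds_start_to_match


# The match begins 100 bases into the CDS; 150 upstream bases are enough,
# 50 are not.
print(likely_contains_start_codon(150, 300, 200))
print(likely_contains_start_codon(50, 300, 200))
```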
Figure 2. Classification of ESTs according to Gene Ontology. Proportions of expressed sequence tags (ESTs) classified according to Gene Ontology terms under the root namespaces (molecular function (A), cellular component (B), and biological process (C)) are indicated for each cDNA library. Classification according to Gene Ontology was conducted by using the similarity of the EST assemblies to human genes and the correspondence between genes and the Gene Ontology terms provided in NCBI Gene (ftp://ftp.ncbi.nih.gov/gene/DATA/; [24]). ESTs classified under more than one term in a single namespace are counted redundantly under the respective terms. ESTs not classified under any terms are omitted from this figure.
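The redundant counting described in this legend can be sketched as follows. The input layout (a mapping from an EST identifier to the terms assigned to it within one namespace) and the term names in the usage example are illustrative assumptions, not data from the study.

```python
from collections import Counter


def go_term_counts(est_terms):
    """Count ESTs per Gene Ontology term within a single namespace.

    est_terms maps an EST identifier to the terms assigned to it.
    An EST with several terms is counted once under each of them
    (redundant counting); an EST with no terms contributes nothing.
    """
    counts = Counter()
    for terms in est_terms.values():
        # set() guards against the same term being listed twice for one EST.
        for term in set(terms):
            counts[term] += 1
    return counts


# Hypothetical example: est1 is counted under two terms, est3 is omitted.
example = {
    "est1": ["binding", "catalytic activity"],
    "est2": ["binding"],
    "est3": [],
}
print(go_term_counts(example))
```

Because of the redundant counting, the per-term counts can sum to more than the number of classified ESTs.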
Figure 3. Correspondence of EST assemblies and cDNA clones to locations on the draft sequence of the pig genome and Gene Ontology. Shown are the numbers of loci that are located on each pig chromosome in the draft sequence of the pig genome (Sscrofa10.2) and that correspond to the EST assemblies (A) and pig cDNA clones (B) sequenced in this study. Orientations of loci are shown by closed bars (pter to qter) and open bars (qter to pter). EST assemblies (C) and pig cDNA clones that were completely sequenced (D) were classified according to the Gene Ontology terms shown under each root namespace (i.e., molecular function, cellular component, and biological process) by using the ontology file as of 31 October 2011. Classification according to Gene Ontology was conducted by using the similarity of cDNA clones to the mRNA sequences of human genes in NCBI RefSeq (release 49) and the correspondence between genes and Gene Ontology terms provided in NCBI Gene (ftp://ftp.ncbi.nih.gov/gene/DATA/ as of 2 November 2011; [24]). The numbers of EST assemblies classified into the three namespaces (molecular function, cellular component, and biological process) were 56,703, 59,098, and 54,610, respectively. The numbers of cDNA clones classified into the three namespaces were 21,332, 22,306, and 20,538, respectively. Assemblies and clones classified under more than one term in a single namespace are counted redundantly under the respective terms. Terms including fewer than 1000 assemblies and clones are indicated as “Others” in the aggregates.
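The “Others” aggregation mentioned at the end of this legend amounts to pooling every term whose count falls below the threshold. A minimal sketch, with a hypothetical function name and made-up counts in the usage example:

```python
def collapse_small_terms(counts, threshold=1000):
    """Pool terms with fewer than `threshold` members into "Others".

    counts maps a Gene Ontology term to the number of assemblies or
    clones classified under it. Terms at or above the threshold are kept.
    """
    kept = {}
    others = 0
    for term, n in counts.items():
        if n < threshold:
            others += n
        else:
            kept[term] = n
    if others:
        kept["Others"] = others
    return kept


# Hypothetical counts: the two small terms are merged into one bucket.
print(collapse_small_terms({"binding": 5200, "transporter activity": 640,
                            "antioxidant activity": 85}))
```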
Mapping of pig EST assemblies on pig chromosomes
| 1 | 580 | 3343 | 1799 | 594 | 3346 | 1798 | 1174 | 6689 | 3597 |
| 2 | 639 | 2983 | 1452 | 646 | 2855 | 1365 | 1285 | 5838 | 2817 |
| 3 | 415 | 2248 | 1132 | 472 | 2574 | 1299 | 887 | 4822 | 2431 |
| 4 | 435 | 2049 | 1013 | 398 | 2112 | 1025 | 833 | 4161 | 2038 |
| 5 | 373 | 1774 | 870 | 375 | 2007 | 846 | 748 | 3781 | 1716 |
| 6 | 667 | 3188 | 1503 | 567 | 2826 | 1415 | 1234 | 6014 | 2918 |
| 7 | 526 | 2590 | 1211 | 631 | 3007 | 1039 | 1157 | 5597 | 2250 |
| 8 | 301 | 1637 | 851 | 215 | 1434 | 754 | 516 | 3071 | 1605 |
| 9 | 357 | 1966 | 1011 | 371 | 2082 | 1007 | 728 | 4048 | 2018 |
| 10 | 207 | 1118 | 555 | 200 | 1182 | 598 | 407 | 2300 | 1153 |
| 11 | 128 | 762 | 449 | 121 | 855 | 496 | 249 | 1617 | 945 |
| 12 | 388 | 2133 | 869 | 361 | 1759 | 896 | 749 | 3892 | 1765 |
| 13 | 489 | 2425 | 1293 | 515 | 2467 | 1321 | 1004 | 4892 | 2614 |
| 14 | 508 | 2693 | 1226 | 523 | 2716 | 1175 | 1031 | 5409 | 2401 |
| 15 | 284 | 1852 | 893 | 288 | 1828 | 797 | 572 | 3680 | 1690 |
| 16 | 147 | 817 | 498 | 139 | 894 | 509 | 286 | 1711 | 1007 |
| 17 | 348 | 1630 | 585 | 215 | 905 | 474 | 563 | 2535 | 1059 |
| 18 | 192 | 964 | 479 | 176 | 1127 | 479 | 368 | 2091 | 958 |
| X | 220 | 1018 | 591 | 216 | 1165 | 618 | 436 | 2183 | 1209 |
| Y | 4 | 14 | 6 | 1 | 6 | 6 | 5 | 20 | 12 |
| Unplaced scaffolds | 1077 | 7726 | 2251 | 812 | 4387 | 2212 | 1889 | 12,113 | 4463 |
| Total | 8285 | 44,930 | 20,537 | 7836 | 41,534 | 20,129 | 16,121 | 86,464 | 40,666 |
Numbers of EST (expressed sequence tag) assemblies and the corresponding independent loci mapped on the pig genome are shown for each chromosome. “Forward” and “Reverse” indicate that the assemblies were aligned on the chromosomes in the orientation from pter to qter or from qter to pter, respectively.
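As a consistency check on the table above, each row lists three values for the forward orientation, three for the reverse orientation, and three totals, and every total should equal the sum of the two orientations. The sketch below copies three rows from the table; treating the nine values as three aligned triplets is our reading of the layout.

```python
# Rows copied from the table: chromosome 1, chromosome Y, and the Total row.
rows = {
    "1": (580, 3343, 1799, 594, 3346, 1798, 1174, 6689, 3597),
    "Y": (4, 14, 6, 1, 6, 6, 5, 20, 12),
    "Total": (8285, 44930, 20537, 7836, 41534, 20129, 16121, 86464, 40666),
}


def orientation_sums_match(values):
    """Check that forward + reverse equals total for each of the
    three value types in a nine-value row."""
    forward, reverse, total = values[0:3], values[3:6], values[6:9]
    return all(f + r == t for f, r, t in zip(forward, reverse, total))


for name, values in rows.items():
    print(name, orientation_sums_match(values))
```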
Distributions of lengths of cDNA clones completely sequenced by the primer walking and transposon shotgun sequencing methods
| –1000 | 1174 | 1956 | 93 | 2049 | 156 |
| 1001–1500 | 928 | 9062 | 913 | 9975 | 519 |
| 1501–2000 | 67 | 6201 | 1843 | 8044 | 1058 |
| 2001–2500 | 0 | 2817 | 731 | 3548 | 766 |
| 2501–3000 | 0 | 919 | 428 | 1347 | 450 |
| 3001–3500 | 0 | 48 | 294 | 342 | 286 |
| 3501–4000 | 0 | 1 | 85 | 86 | 165 |
| 4001– | 0 | 0 | 19 | 19 | 100 |
| Total | 2169 | 21,004 | 4406 | 25,410 | 3500 |
| Average (bp) | 907.6 | 1548.3 | 1959.6 | 1620.0 | 2163.3 |
Lengths of cDNA clones are shown according to the sequencing method used. Clones sequenced by using the primer walking (PW) method are shown separately according to the number of primer-walking steps required. Clones sequenced by using transposon shotgun sequencing (TPS) include those that were finished by the TPS method after being sequenced by the PW method.
a Clones that were completely sequenced just by using universal primers (forward and reverse primers aligned to the vector sequence, and T25V primer).
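The per-column averages in the table can be combined into a pooled mean by weighting each average by its clone count. The sketch below assumes that the first, fourth, and fifth columns are disjoint groups of clones; this is supported by their totals (2169 + 25,410 + 3500 = 31,079) but is our reading of the layout, and the function name is hypothetical.

```python
def pooled_mean(groups):
    """Weighted mean of group averages.

    groups is a sequence of (clone_count, average_length_bp) pairs.
    """
    total_n = sum(n for n, _ in groups)
    total_bp = sum(n * avg for n, avg in groups)
    return total_bp / total_n


# (count, average length in bp) taken from the Total and Average rows.
all_groups = [(2169, 907.6), (25410, 1620.0), (3500, 2163.3)]
pw_subgroups = [(21004, 1548.3), (4406, 1959.6)]

print(round(pooled_mean(all_groups), 1))    # average over all 31,079 clones
print(round(pooled_mean(pw_subgroups), 1))  # compare with the reported 1620.0
```

The second value reproduces the reported combined average only to within rounding, because the inputs are themselves rounded to one decimal place.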
Mammalian genes corresponding to sequenced cDNA clones
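| Species | Gene IDs | (without HomoloGene ID) | HomoloGene IDs | | Gene IDs, cDNA unmapped | (without HomoloGene ID) | HomoloGene IDs, cDNA unmapped | | cDNA clones with full-length CDS | | Gene IDs | | Loci with Gene ID | Gene ID with cDNA unmapped on pig genome a |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|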
| Human | 11,298 | (397) | 10,889 | | 604 | (31) | 572 | | 12,498 | (11,904) | 5966 | (5509) | 5749 | 276 |
| Mouse | 10,688 | (444) | 10,197 | | 553 | (37) | 513 | | 12,354 | (11,762) | 5597 | (5365) | 5567 | 269 |
| Cattle | 10,881 | (1329) | 9487 | | 568 | (88) | 479 | | 12,622 | (12,045) | 5667 | (5435) | 5622 | 270 |
| Dog | 10,082 | (469) | 9556 | | 522 | (28) | 492 | | 9642 | (9150) | 4188 | (4011) | 4200 | 208 |
| Pig | 10,752 | | | | 342 | | | | 11,873 | (11,420) | 5244 | (5116) | 5343 | 182 |
| Totalb | 14,616 | (13,962) | 6466 | |||||||||||
Numbers of genes that had unique NCBI Gene IDs and corresponded to the sequences of the pig cDNA clones are shown at the left side of the table. Also shown are the numbers that had unique Gene IDs in the NCBI HomoloGene database (a database of orthologs among species) and corresponded to the sequences of pig cDNA clones. The numbers of unique Gene IDs corresponding to sequences of pig cDNA clones that were not mapped at any locations on the draft pig genome sequence (Sscrofa10.2) are also shown. In addition, the numbers of genes that had unique Gene IDs in the NCBI HomoloGene database and corresponded to the sequences of the unmapped pig cDNA clones are indicated. Numbers in parentheses indicate numbers of unique Gene IDs that had no corresponding HomoloGene IDs. Numbers of HomoloGene IDs in pigs are not indicated, because there is no HomoloGene ID database for pig genes. At the right side of the table, the numbers of cDNA clones estimated to contain full-length coding sequences (CDSs) of pig genes by alignment against the protein sequences of humans, mice, cattle, and dogs in the NCBI RefSeq database are shown in the first column. Pig protein sequences registered in RefSeq were also used in this analysis. The numbers of unique NCBI Gene IDs corresponding to the cDNA clones and the numbers of loci corresponding to the cDNA clones examined using the draft sequence of the pig genome (Sscrofa10.2) are also indicated (“Loci with Gene ID”).
a We counted the unique NCBI Gene IDs corresponding to cDNA clones that were not mapped on the draft sequence of the pig genome. If the Gene IDs corresponded to cDNA clones both mapped on, and not mapped on, the draft genome sequence, they were counted in both the “Loci with Gene ID” column and the “Gene ID with cDNA unmapped on pig genome” column.
b Redundant cDNA clones corresponding to genes of more than one species were counted without repetition.
Mapping of the pig cDNA clones sequenced in this study on pig chromosomes
| Chromosome | Forward: cDNA clones | Forward: loci | Reverse: cDNA clones | Reverse: loci | Total: cDNA clones | Total: loci |
|---|---|---|---|---|---|---|
| 1 | 1128 (164) | 560 (134) | 1197 (133) | 594 (112) | 2325 (297) | 1154 (246) |
| 2 | 1144 (138) | 564 (123) | 1180 (161) | 563 (134) | 2324 (299) | 1127 (257) |
| 3 | 744 (97) | 380 (87) | 871 (121) | 468 (99) | 1615 (218) | 848 (186) |
| 4 | 748 (88) | 366 (71) | 793 (96) | 385 (81) | 1541 (184) | 751 (152) |
| 5 | 710 (86) | 310 (63) | 685 (77) | 313 (65) | 1395 (163) | 623 (128) |
| 6 | 1184 (174) | 552 (133) | 1040 (130) | 539 (115) | 2224 (304) | 1091 (248) |
| 7 | 1013 (117) | 437 (98) | 1118 (125) | 397 (97) | 2131 (242) | 834 (195) |
| 8 | 563 (69) | 267 (58) | 393 (50) | 209 (44) | 956 (119) | 476 (102) |
| 9 | 699 (94) | 339 (72) | 673 (85) | 339 (75) | 1372 (179) | 678 (147) |
| 10 | 409 (48) | 167 (37) | 362 (37) | 169 (32) | 771 (85) | 336 (69) |
| 11 | 225 (26) | 119 (25) | 237 (22) | 129 (20) | 462 (48) | 248 (45) |
| 12 | 723 (89) | 334 (76) | 667 (68) | 328 (60) | 1390 (157) | 662 (136) |
| 13 | 922 (124) | 431 (101) | 890 (93) | 467 (80) | 1812 (217) | 898 (181) |
| 14 | 894 (101) | 405 (80) | 920 (116) | 424 (90) | 1814 (217) | 829 (170) |
| 15 | 568 (79) | 281 (64) | 518 (61) | 247 (55) | 1086 (140) | 528 (119) |
| 16 | 266 (35) | 137 (26) | 252 (39) | 123 (35) | 518 (74) | 260 (61) |
| 17 | 607 (70) | 212 (59) | 365 (47) | 190 (38) | 972 (117) | 402 (97) |
| 18 | 325 (32) | 144 (29) | 354 (38) | 151 (30) | 679 (70) | 295 (59) |
| X | 364 (56) | 195 (49) | 402 (73) | 203 (55) | 766 (129) | 398 (104) |
| Y | 5 (0) | 2 (0) | 1 (0) | 1 (0) | 6 (0) | 3 (0) |
| Unplaced scaffolds | 1854 (308) | 747 (156) | 1417 (162) | 706 (135) | 3271 (470) | 1453 (291) |
| Total | 15,095 (1995) | 6949 (1541) | 14,335 (1734) | 6945 (1452) | 29,430 (3729) | 13,894 (2993) |
Numbers of cDNA clones and the corresponding independent loci mapped on the pig genome are shown for each chromosome. “Forward” and “Reverse” indicate that the cDNA clones are aligned on the chromosomes in the orientation from pter to qter and from qter to pter, respectively. Numbers in parentheses show clones derived from pigs cloned from a female subjected to genome sequencing by the International Swine Genome Sequencing Consortium (Duroc 2–14).
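Cells in the table above pair a count with a parenthesized subset (the clones or loci from the cloned Duroc animals). A minimal sketch for parsing such cells and checking that forward plus reverse equals the total is shown below; the regular expression and helper names are ours, and the three cells are copied from the Total row.

```python
import re

# Matches cells such as "15,095 (1995)": a count followed by a
# parenthesized subset count, either of which may contain commas.
CELL = re.compile(r"^\s*([\d,]+)\s*\(([\d,]+)\)\s*$")


def parse_cell(cell):
    """Return (count, subset) as integers for a 'count (subset)' cell."""
    match = CELL.match(cell)
    if match is None:
        raise ValueError(f"unrecognized cell: {cell!r}")
    return tuple(int(group.replace(",", "")) for group in match.groups())


def cells_add_up(forward, reverse, total):
    """Check forward + reverse == total for both count and subset."""
    f, r, t = parse_cell(forward), parse_cell(reverse), parse_cell(total)
    return (f[0] + r[0], f[1] + r[1]) == t


# cDNA clone cells from the Total row of the table.
print(cells_add_up("15,095 (1995)", "14,335 (1734)", "29,430 (3729)"))
```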