Xi Fu, Fengjun Zhang, Shugo Watabe, Shuichi Asakawa.
Abstract
Here, we report a genome-wide survey of immunoglobulin light chain (IGL) genes of torafugu (Takifugu rubripes) revealing multi-clusters spanning three separate chromosomes (v5 assembly) and 45 scaffolds (v4 assembly). Conventional sequence similarity searches and motif scanning approaches based on recombination signal sequence (RSS) motifs were used. We found that three IGL isotypes (L1, L2, and L3) exist in torafugu and that several loci for each isotype are present. The transcriptional orientations of the variable IGL (VL) segments were found to be either the same (in the L2 isotype) or opposite (in the L1 and L3 isotypes) to the IGL joining (JL) and constant (CL) segments, suggesting they can undergo rearrangement by deletion or inversion when expressed. Alignments of expressed sequence tags (ESTs) to corresponding germline gene segments revealed expression of the three IGL isotypes in torafugu. Taken together, our findings provide a genomic framework for torafugu IGL genes and show that the IG diversity of this species could be attributed to at least three distinct chromosomal regions.
Year: 2017 PMID: 28098239 PMCID: PMC5241823 DOI: 10.1038/srep40416
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
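The RSS-based motif scanning mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the exact-heptamer requirement, and the mismatch tolerance on the nonamer are assumptions made for illustration. Only the consensus heptamer (CACAGTG), the 12-nt spacer length, and the consensus nonamer (ACAAAAACC) are taken from the V-gene table in this record.

```python
import re

# Consensus V-type RSS values from the V-gene table in this record.
V_HEPTAMER = "CACAGTG"
V_NONAMER = "ACAAAAACC"
SPACER_LEN = 12

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def scan_v_rss(seq, max_nonamer_mismatches=2):
    """Find candidate heptamer / 12-nt spacer / nonamer sites on both strands.

    Assumption (hypothetical scoring): the heptamer must match exactly and
    the nonamer may differ from the consensus at a few positions.
    Returned start offsets are relative to the strand that was scanned,
    so "-" hits are in reverse-complement coordinates.
    """
    seq = seq.upper()
    # A lookahead lets overlapping candidate sites be reported as well.
    pattern = re.compile(
        r"(?=(%s)([ACGT]{%d})([ACGT]{9}))" % (V_HEPTAMER, SPACER_LEN)
    )
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in pattern.finditer(s):
            heptamer, spacer, nonamer = m.groups()
            if hamming(nonamer, V_NONAMER) <= max_nonamer_mismatches:
                hits.append((m.start(), strand, heptamer, spacer, nonamer))
    return hits


# Synthetic demo sequence (not torafugu data).
demo = "TTT" + V_HEPTAMER + "A" * SPACER_LEN + V_NONAMER + "GG"
print(scan_v_rss(demo))
```

Scanning both strands matters here because the abstract reports VL segments in either orientation relative to JL and CL.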
Genomic features of the torafugu V genes.
| IGLV family | VL gene | Fct | Promoter: Octamer | Promoter: (nt) | Promoter: TATA | Promoter: (nt) | Gene structure: L-PART1 (nt) | Gene structure: Intron | Gene structure: V-exon | RSS: 7mer | RSS: Spacer (nt) | RSS: 9mer |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| IGLV1 | V1a^ | Pa | — | — | — | — | — | — | 114 | CACAGTG | 12 | ACAAAAACC |
| | V1b^ | Pb | ATTTGCAT | 27 | TTTAAA | 64 | 40 | 137 | 249 | — | — | — |
| | V1c^ | ORFl | ATTTGCAT | 27 | TTTAAA | 64 | 40 | 137 | 285 | CACAGTG | 12 | ACAAAAACC |
| | V1d^ | Pa | — | — | — | — | — | — | 267 | CACAGTG | 12 | ACAAAAACC |
| | V1e^ | ORFl | ATTTGCAT | 27 | TTTAAA | 64 | 40 | 137 | 291 | CACAGTG | 12 | ACAAAAACC |
| | V1f | ORFl | — | — | TTTAAA | 64 | 40 | 137 | 300 | CACAGTG | 12 | ACAAAAACC |
| | V1g^ | Pa | — | — | — | — | — | — | 300 | CACAGTG | 12 | ACAAAAACC |
| | V1h^ | ORFl | ATTTGCAT | 27 | TTTAAA | 64 | 40 | 137 | 300 | CACAGTG | 12 | ACAAAAACC |
| | V1i | Pa | — | — | — | — | — | — | 300 | CACAGTG | 12 | ACAAAAACC |
| | V1j | ORFl | ATTTGCAT | 27 | TTTAAA | 64 | 40 | 137 | 285 | CACAGTG | 12 | ACAAAAACC |
| | V1k^ | ORFl | — | — | — | — | 40 | 137 | 291 | CACAGTG | 12 | ACAAAAACC |
| | V1l^ | ORFl | ATTTGCAT | 27 | TTTAAA | — | 40 | 133 | 291 | CACAGTG | 12 | ACAAAAACC |
| | V1m^ | ORFl | ATTTGCAT | 27 | TTTAAA | 64 | 40 | 140 | 285 | CACAGTG | 12 | ACAAAAACC |
| | V1n | ORFl | ATTTGCAT | 27 | TTTAAA | 64 | 40 | 137 | 300 | CACAGTG | 12 | ACAAAAACC |
| | V1o | ORFl | ATTTGCAT | 27 | TTTAAA | 62 | 40 | 137 | 285 | CACAGTG | 12 | ACAAAAACC |
| | V1p^ | ORFl | — | — | TTTAAA | 64 | 40 | 137 | 288 | CACAGTG | 12 | ACAAAAACC |
| | V1q^ | ORFl | ATTTGCAT | 27 | TTTAAA | 62 | 40 | 137 | 300 | CACAGTG | 12 | ACAAAAACC |
| | V1r^ | Pb | ATTTGCAT | 27 | TTTAAA | 64 | 40 | 133 | 168 | — | — | — |
| | V1s^ | ORFl | — | — | — | — | 40 | 137 | 291 | CACAGTG | 12 | ACAAAAACC |
| | V1t^ | ORFl | ATTTGCAT | 27 | TTTAAA | 64 | 40 | 137 | 291 | CACAGTG | 12 | ACAAAAACC |
| IGLV2 | V2a^ | Pf | ATGTAAAT | 107 | TATTAA | 97 | 40 | 92 | 326 | CACAGTG | 12 | ACAAAAACC |
| | V2b^ | Pb | ATGTAAAT | 107 | TATTAA | 97 | 40 | 92 | 206 | — | — | — |
| | V2c^ | Ph | ATGTAAAT | 107 | TATTAA | 97 | 40 | 92 | 323 | CACAGTG | 12 | ACAAAAACC |
| | V2d^ | F | ATGCAAAT | 101 | TATTAA | 97 | 40 | 92 | 302 | CACAGTG | 12 | ACAAAAACC |
| | V2e^ | Pk | ATGTAAAT | 107 | TATTAA | 97 | 40 | 92 | 326 | CACAGTG | 12 | ACAAAAACC |
| | V2f^ | Pa | ATGTAAAT | — | — | — | — | — | 329 | CACAGTG | 12 | ACAAAAACC |
| | V2g^ | F | TTGAAAAT | 88 | TATTAA | 97 | 40 | 92 | 326 | CACAGTG | 12 | ACAAAAACC |
| | V2h^ | Pb | ATGTAAAT | 107 | TATTAA | 97 | 40 | 92 | 215 | — | — | — |
| | V2i^ | F | ATGCAAAT | 101 | TATTAA | 97 | 40 | 92 | 329 | CACAGTG | 12 | ACAAAAACC |
| | V2j^ | Pd | ATGTAAAT | 107 | TATTAA | 97 | 40 | 195 | 225 | CACAGTG | 12 | ACAAAAACC |
| | V2k^ | F | ATGTAAAT | 107 | TATTAA | 97 | 40 | 217 | 182 | CACAGTG | 12 | ACAAAAACC |
| | V2l | Pb | ATGTAAAT | 107 | TATTAA | 97 | 40 | 92 | 230 | — | — | — |
| | V2m | Pc | ATGTAAAT | 107 | TATTAA | 97 | 40 | 92 | 329 | CACAGTG | 12 | ACAAAAACC |
| | V2n | Pe | ATGTAAAT | 107 | TATTAA | 97 | 40 | 92 | 326 | CACAGTG | 12 | — |
| | V2o | Pg | — | — | TTAAAT | 97 | 40 | 92 | 326 | CACAGTG | 12 | ACAAAAACC |
| | V2p | Pi | — | — | — | — | 40 | 92 | 326 | CACAGTG | 12 | ACAAAAACC |
| | V2q | Pb | ATGTAAAT | 107 | TATTAA | 97 | 40 | 92 | 302 | — | — | — |
| | V2r | Pj | — | — | TATTAA | 97 | 40 | 92 | 329 | CACAGTG | 12 | ACAAAAACC |
| | V2s | Pa | — | — | — | — | — | 92 | 326 | CACAGTG | 12 | ACAAAACCT |
| | V2t | F | ATGTAAAT | 107 | TATTAA | 97 | 40 | 89 | 323 | CACAGTG | 12 | ACAAAAACC |
| | V2u | Pa | — | — | — | — | — | — | 329 | CACAGTG | 12 | ACAAAAACC |
| | V2v | F | — | — | — | — | 40 | 92 | 329 | CACAGTG | 12 | ACAAAAACC |
| IGLV3 | V3a^ | F | ATTTCCAT | 38 | TTTATA | 65 | 52 | 84 | 303 | CACAGTG | 12 | ACAAACCCT |
| | V3b^ | F | ATTTCCAT | 38 | TTTATA | 65 | 52 | 85 | 314 | CACAGTG | 12 | ACAAAAACT |
| | V3c^ | F | ATTTCCAT | 38 | TTTATA | 65 | 52 | 85 | 317 | CACAGTG | 12 | ACAAAAACC |
| | V3d^ | F | ATTTCCAT | 38 | TTTATA | 65 | 52 | 84 | 315 | CACAGTG | 12 | ACAAAAACC |
| | V3e | Pa | — | — | — | — | — | — | 306 | CACAGTG | 12 | ACAAAAACC |
| | V3f | Pa | — | — | — | — | — | — | 315 | CACAGTG | 12 | ACAAAAACC |
Fct, functionality; F, functional; P, pseudogene; ORF, open reading frame; R, reverse strand.
^ VL gene segments depicted in the schematic diagrams of the genomic loci.
Lowercase letters appended to the Fct values (for example Pa or ORFl) refer to the following notes:
a: L-PART1 is missing;
b: 3′ truncation;
c: 1 nt deletion and frameshift at position 659 R; 2 nt deletion and frameshift from 637 R;
d: 1 nt deletion and frameshift at position 5176 R; 2 nt deletion and frameshift from 5154 R;
e: 1 nt deletion and frameshift at position 4359; 2 nt deletion and frameshift from 4381;
f: 1 nt deletion and frameshift at position 4666; 2 nt deletion and frameshift from 4678;
g: 1 nt deletion and frameshift at position 3685 R; 2 nt deletion and frameshift from 3673 R;
h: 1 nt insertion and frameshift at position 540; 1 nt deletion and frameshift at position 586; 1 nt deletion and frameshift at position 608;
i: 6 nt deletion and frameshift from 1896; 1 nt deletion and frameshift at position 1936; 2 nt deletion and frameshift from 1955;
j: 1 nt insertion and frameshift at position 439 R; 4 nt deletion and frameshift from 456 R;
k: 2 nt deletions in CDR1-IMGT and CDR2-IMGT regions and frameshift mutations at 1418 and 1487; 4 nt deletion and frameshift from 1429; 1 nt deletion and frameshift at position 1462;
l: 1st-CYS replaced by Ala.
Torafugu JL nucleotide and AA sequences with associated RSS.
| JL gene | Fct | J-Nonamer | Spacer | J-Heptamer | J region nt and AA sequences |
|---|---|---|---|---|---|
| J1a | F | GGTTTTTGT | ACGACCACTTGATGAGTTTGTAT | CACTGTG | |
| J1b | F | GGTTTTTGT | ACGACCACTTGATGAGTTTGTAT | CACTGTG | |
| J1c | F | GGTTTTTGT | ACGACCACTTGATGAGTTTGTAT | CACTGTG | |
| J2a | F | GGTTTTTGT | ACAGCTGTGTGTACAAACTGAAT | CACTGTG | |
| J2b | F | GGTTTTTGT | ACAGCTGTGTGTACAAACTGAAT | CACTGTG | |
| J2c | F | GGTTTTTGT | ACAGCTGTGTGTACAAACTGAAT | CACTGTG | |
| J2d | F | GGTTTTTGT | ACAGCTGTGTGTACAATCTGAAT | CACTGTG | |
| J2e | F | — | — | CACTGTG | |
| J2f | F | GGTTTTTGT | ACAGCTGTGTGTACAAACTGAAT | CACTGTG | |
| J2g | F | GGTTTTTGT | ACAGCTGTGTGTACAAACTGAAT | CACTGTG | |
| J2h | F | GGTTTTTGT | ACAGCTGTGTGTACAAACTGAAT | CACTGTG | |
| J3a | F | GGTTTTTGT | ACGACCACTTGATGAGTTTGTAT | CACTGTG | |
| J3b | F | GGTTTTTGT | ACGACCACTTGATGAGTTTGTAT | CACTGTG |
Fct functionality, F functional.
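The two tables together suggest a simple consistency check: VL segments carry a 12-nt spacer and the JL segments listed here carry a 23-nt spacer, and the J heptamer and nonamer are the reverse complements of the V consensus motifs. The sketch below, under stated assumptions, checks exactly that for a few rows copied from the tables. The function name `is_compatible` and the strict reverse-complement criterion are illustrative choices, not a rule stated in the source.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


# Consensus V-type RSS from the V-gene table in this record.
V_RSS = {"heptamer": "CACAGTG", "spacer_len": 12, "nonamer": "ACAAAAACC"}

# (nonamer, spacer, heptamer) for selected rows of the JL table.
J_RSS = {
    "J1a": ("GGTTTTTGT", "ACGACCACTTGATGAGTTTGTAT", "CACTGTG"),
    "J2a": ("GGTTTTTGT", "ACAGCTGTGTGTACAAACTGAAT", "CACTGTG"),
    "J2d": ("GGTTTTTGT", "ACAGCTGTGTGTACAATCTGAAT", "CACTGTG"),
    "J3a": ("GGTTTTTGT", "ACGACCACTTGATGAGTTTGTAT", "CACTGTG"),
}


def is_compatible(v_rss, j_nonamer, j_spacer, j_heptamer):
    """Check a V/J RSS pair against the 12/23 spacer pairing.

    Assumption (illustrative, stricter than biology requires): the J motifs
    must be exact reverse complements of the V consensus motifs.
    """
    spacer_lengths = {v_rss["spacer_len"], len(j_spacer)}
    return (
        spacer_lengths == {12, 23}
        and reverse_complement(v_rss["heptamer"]) == j_heptamer
        and reverse_complement(v_rss["nonamer"]) == j_nonamer
    )


for name, (nonamer, spacer, heptamer) in J_RSS.items():
    print(name, len(spacer), is_compatible(V_RSS, nonamer, spacer, heptamer))
```

Rows with a missing nonamer or spacer, such as J2e, would need separate handling and are left out of the dictionary.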
Figure 1. IMGT protein display of in-frame torafugu CL representative amino acid sequences.
The protein display is shown using the IMGT header (IMGT Repertoire, http://www.imgt.org).
Figure 2. Overall organization of representative type 3 IGL genes.
Scaffolds 2422 (14,667 bp), 2488 (13,611 bp), and 3698 (3784 bp) are shown to scale, with exon sizes exaggerated. Transcriptional polarity is indicated by overhead arrows. Each gene is labeled, and an asterisk denotes an incomplete coding sequence. VP/ORF denotes a pseudogene (P) or ORF sequence.
Figure 3. Inversion rearrangements on scaffold 2422.
The transcriptional polarity of the rearranged VJ (right) is indicated by arrowheads above VJ-C. The JL-RSS is shown as a white triangle and the VL-RSS as a black triangle.
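The relationship between segment orientation and rearrangement mode described in the abstract (same orientation implies deletion, opposite orientation implies inversion) can be written as a small helper. This is a minimal sketch: the function name and the `+`/`-` strand encoding are hypothetical, and the strand signs assigned to each isotype below are placeholders that only preserve the same-versus-opposite relationship reported in the abstract.

```python
def rearrangement_mode(v_strand, jc_strand):
    """Return the expected V-J joining mode from relative orientation.

    Same transcriptional orientation -> deletion of the intervening DNA;
    opposite orientation -> inversion.
    """
    for strand in (v_strand, jc_strand):
        if strand not in ("+", "-"):
            raise ValueError("strand must be '+' or '-', got %r" % (strand,))
    return "deletion" if v_strand == jc_strand else "inversion"


# Placeholder strands: only same vs. opposite is taken from the abstract
# (L2: same orientation; L1 and L3: opposite orientation).
ISOTYPE_ORIENTATION = {
    "L1": ("-", "+"),
    "L2": ("+", "+"),
    "L3": ("-", "+"),
}

for isotype, (v_strand, jc_strand) in ISOTYPE_ORIENTATION.items():
    print(isotype, rearrangement_mode(v_strand, jc_strand))
```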
Figure 4. Overall organization of representative type 2 IGL genes.
Transcriptional polarity is indicated by overhead arrows. An asterisk denotes an incomplete coding sequence. VP/ORF denotes a pseudogene or ORF sequence.
Figure 5. Overall organization of representative type 1 IGL genes.
Transcriptional polarity is indicated by overhead arrows. An asterisk denotes an incomplete coding sequence. VP/ORF denotes a pseudogene or ORF sequence. Scaffolds 158 and 10 were assigned to chromosome 2.
Figure 6. Southern blot of genomic DNA from torafugu sperm probed with torafugu IGLC.
Restriction endonucleases are indicated at the bottom: EcoRI (E), HindIII (H), BamHI (B), and PstI (P). Figures are cropped; the original blot images are available in the Additional File.
Figure 7. Jalview overview window of the MAFFT alignment of representative teleost VL amino acid sequences.
Hyphens denote gaps. FR and CDR regions are labeled according to the Kabat delineation [42]. The conserved tryptophan (Trp, W) in the FR2 region is indicated by an asterisk. Cysteines (Cys, C) that are expected to form intra-chain disulfide bridges are indicated by solid black triangles, with the exception of torafugu IGLV1 group sequences (in which Cys is replaced by Ala).
Figure 8. Phylogenetic analysis of representative VL from various vertebrates.
The NJ tree was constructed using MEGA 7 with 1000 bootstrap replications. GenBank accession numbers are: zebrafish L1 (AF246185); carp L1a (AB073328); carp L3 (AB073335); zebrafish L3 (AF246193); salmon (Salmo salar) L1 (AF273012); trout (Oncorhynchus mykiss) L1 (X65260); catfish G (L25533); carp L1b (AB073332); human kappa (S46371); mouse kappa (MUSIGKACN); salmon L3 (AF406956); catfish F (U25705); X. laevis (Xenopus laevis) rho (XELIGLVAA); horn shark (Heterodontus francisci) TIII (L25561); nurse shark (Ginglymostoma cirratum) NS4 (A49633); X. laevis TIII (L76575); mouse lambda (AY648665); chicken lambda (M24403); human lambda (AAA59013); catfish lambda (EU925383); nurse shark NS5 (AAV34678); skate (Leucoraja erinacea) TI (L25568); horn shark TI (X15315); horn shark sigma (EF114760); nurse shark sigma (EF114765); X. laevis sigma (S78544); carp L2 (AB091113); trout L2 (AAB41310); zebrafish L2 (AF246162); catfish sigma (EU872021).
Figure 9. Phylogenetic analysis of CL from various vertebrates.
GenBank accession numbers are as follows: wolffish L1b (AF137398); rockcod L1b (DQ842622); seabass L1b (AJ400216); zebrafish L1 (AF246185); salmon L3 (AF406956); yellowtail L1a (AB062619); rockcod L1a (EF114784); stickleback L1a (AY278356); catfish G (L25533); carp L1a (AB035728); carp L1b (AB035729); salmon L1 (AF273012); trout L1 (X65260); chicken lambda (M24403); human lambda (AAH07782); mouse lambda (J00592); X. laevis sigma (S78544); carp L2 (AB103558); zebrafish L2 (AF246162); catfish sigma (EU872021); trout L2 (AAB41310); rockcod L2 (EF114785); pufferfish (Tetraodon nigroviridis) sigma (AJ575637); horn shark TI (X15315); nurse shark NS5 (AAV34681); skate TI (L25568); horn shark sigma (EF114760); nurse shark sigma (EF114765); sandbar shark (Carcharhinus plumbeus) TII (M81314); skate TII (L25566); mouse kappa (AB048524); X. laevis rho (XELIGLVAA); human kappa (M11937); carp L3 (AB035730); zebrafish L3 (AF246193); catfish F (U25705); rockcod L3 (DQ842626).