Katherine J L Jackson, Justin T Kos, William Lees, William S Gibson, Melissa Laird Smith, Ayelet Peres, Gur Yaari, Martin Corcoran, Christian E Busse, Mats Ohlin, Corey T Watson, Andrew M Collins.
Abstract
The immunoglobulin genes of inbred mouse strains that are commonly used in models of antibody-mediated human diseases are poorly characterized. This compromises data analysis. To infer the immunoglobulin genes of BALB/c mice, we used long-read SMRT sequencing to amplify VDJ-C sequences from F1 (BALB/c x C57BL/6) hybrid animals. Strain variations were identified in the Ighm and Ighg2b genes, and analysis of VDJ rearrangements led to the inference of 278 germline IGHV alleles. 169 alleles are not present in the C57BL/6 genome reference sequence. To establish a set of expressed BALB/c IGHV germline gene sequences, we computationally retrieved IGHV haplotypes from the IgM dataset. Haplotyping led to the confirmation of 162 BALB/c IGHV gene sequences. A musIGHV398 pseudogene variant also appears to be present in the BALB/cByJ substrain, while a functional musIGHV398 gene is highly expressed in the BALB/cJ substrain. Only four of the BALB/c alleles were also observed in the C57BL/6 haplotype. The full set of inferred BALB/c sequences has been used to establish a BALB/c IGHV reference set, hosted at https://ogrdb.airr-community.org. We assessed whether assemblies from the Mouse Genome Project (MGP) are suitable for the determination of the genes of the IGH loci. Only 37 (43.5%) of the 85 confirmed IMGT-named BALB/c IGHV and 33 (42.9%) of the 77 confirmed non-IMGT IGHV were found in a search of the MGP BALB/cJ genome assembly. This suggests that current MGP assemblies are unsuitable for the comprehensive documentation of germline IGHVs and more efforts will be needed to establish strain-specific reference sets.
Keywords: BALB/c; IGHV; SMRT sequencing; haplotyping; substrains
Year: 2022 PMID: 35720344 PMCID: PMC9205180 DOI: 10.3389/fimmu.2022.888555
Source DB: PubMed Journal: Front Immunol ISSN: 1664-3224 Impact factor: 8.786
New BALB/c IGHV sequences identified in this study.
| Gene Name | Gene Sequence |
|---|---|
| balbIGHV036* | caggttactctgaaagagtctggccctgggatattgcagccatcacagacgcttagcctggcctgtac |
| balbIGHV037* | caagttactctaaaagagtctggccctgggatattgaagccctcacagaccctcagtctgacttgttctt |
| balbIGHV038** | aaggtccagctgcagcagtctggagctgagctggtgaaacccggggcatcagtgaagctgtcctgc |
| balbIGHV039*** | gaggtccagctgcaacagtctggacctgagctggtgaagcctggagcttcaatgaagatatcctgca |
| b6IGHV040# | |
| balbIGHV041## | |
*Previously reported by Corcoran and colleagues (22).
**The sequence matches musIGHV662 but includes an 8 nucleotide 3' extension.
***The sequence matches musIGHV398, but for 7 ‘missing’ nucleotides that are shown as ‘---------’.
#This sequence is identical to musIGHV269 / IGHV1-2*01, but includes a 27 nucleotide 5' extension and a 3 nucleotide 3' extension (underlined).
##The sequence is a variant of b6IGHV040 (C100T as indicated in upper case).
Sequences of inferred BALB/c constant region gene exons used in conjunction with previously reported sequences for C gene-based haplotyping of IGHV genes.
| Label | Nucleotide sequence |
|---|---|
| IGHM*rs29176517g_CH1 | gagagtcagtccttcccaaatgtcttccccctcgtctcctgcgagagccc |
| IGHM*rs29176517a_CH1 | Gagagtcagtccttcccaaatgtcttccccctcgtctcctgcgagagcccc |
| IGHG2B*rs45969375c_BALB_CH1 | gccaaaacaacacccccatcagtctatccactggcccctgggtgtggaga |
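The idea behind C gene-based haplotyping can be illustrated with a minimal sketch. In an F1 animal each VDJ-C read carries both an IGHV allele and a constant region allele, so an IGHV allele can be assigned to the chromosome whose constant region it is consistently linked to. The code below is an illustration only, not the authors' pipeline: the function name, the allele names `V_a` to `V_d`, the read counts, and both thresholds are hypothetical, while the chromosome labels `IgM*BALB` and `IgM*B6` follow the labels used in Figure 1.

```python
from collections import Counter, defaultdict


def assign_haplotypes(reads, min_reads=5, min_fraction=0.9):
    """Assign each IGHV allele to a haplotype from linked constant region calls.

    reads: iterable of (v_allele, c_allele) pairs, one pair per VDJ-C read.
    min_reads and min_fraction are hypothetical thresholds for this sketch.
    Returns {v_allele: c_allele}, or "both" when no single constant region
    allele accounts for at least min_fraction of the reads.
    """
    counts = defaultdict(Counter)
    for v_allele, c_allele in reads:
        counts[v_allele][c_allele] += 1

    calls = {}
    for v_allele, by_c in counts.items():
        total = sum(by_c.values())
        if total < min_reads:
            continue  # too few reads to support a call
        top_c, top_n = by_c.most_common(1)[0]
        calls[v_allele] = top_c if top_n / total >= min_fraction else "both"
    return calls


# Toy data (not from the study).
toy_reads = (
    [("V_a", "IgM*BALB")] * 19 + [("V_a", "IgM*B6")] * 1
    + [("V_b", "IgM*B6")] * 8
    + [("V_c", "IgM*BALB")] * 4 + [("V_c", "IgM*B6")] * 4
    + [("V_d", "IgM*BALB")] * 2
)
calls = assign_haplotypes(toy_reads)
for allele in sorted(calls):
    print(allele, calls[allele])
```

In this toy input `V_c` is linked equally to both constant region alleles, which is the pattern expected for an allele present on both chromosomes, and `V_d` receives no call because it falls below the read-count threshold.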
Figure 1. IGHV8 subgroup haplotype plots for BALB/c and C57BL/6. The C57BL/6 chromosome is shown as IgM*B6 and the BALB/c chromosome as IgM*BALB.
Figure 2. IGHV gene utilization of F1 haplotypes compared to parental strains. Each data point is an IGHV gene; orange filled points are genes shared by the C57BL/6 and BALB/c haplotypes. The red line plots y = x; genes with equal utilization in the F1 haplotype and the homozygous parent strain would fall on this line. (A) IGHV gene utilization of C57BL/6 haplotype genes from the F1 dataset compared to utilization in homozygous C57BL/6 mice. (B) IGHV gene utilization of BALB/c haplotype genes from the F1 dataset compared to utilization in homozygous BALB/c mice.
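The comparison in this figure can be sketched as follows, under stated assumptions: utilization is taken here as each gene's percentage of reads within a dataset, and the deviation from the y = x line is reported as the F1 value minus the parental value. Which dataset is plotted on which axis is not stated in the caption, and the gene names and counts below are toy values, not data from the study.

```python
from collections import Counter


def utilization(gene_calls):
    """Return each gene's share of reads as a percentage of all reads."""
    counts = Counter(gene_calls)
    total = sum(counts.values())
    return {gene: 100.0 * n / total for gene, n in counts.items()}


def compare_usage(f1_usage, parent_usage):
    """Pair utilization values for genes seen in both datasets.

    Returns (gene, parent %, F1 %, F1 - parent) tuples. A difference of 0
    places the gene on the y = x line.
    """
    shared = sorted(set(f1_usage) & set(parent_usage))
    return [
        (g, parent_usage[g], f1_usage[g], f1_usage[g] - parent_usage[g])
        for g in shared
    ]


# Toy gene calls (hypothetical names and counts).
f1_calls = ["g1"] * 50 + ["g2"] * 30 + ["g3"] * 20
parent_calls = ["g1"] * 50 + ["g2"] * 20 + ["g3"] * 30

rows = compare_usage(utilization(f1_calls), utilization(parent_calls))
for gene, parent_pct, f1_pct, diff in rows:
    print(f"{gene}: parent={parent_pct:.1f}% F1={f1_pct:.1f}% diff={diff:+.1f}")
```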
Figure 3. Performance of different IGHV reference sets for the alignment of BALB/c VDJ sequences. IgM reads from BALB/c mice were aligned against different reference directories and the number of mismatches to the closest germline gene was calculated. The percentage of reads with 0 to 30 mismatches is plotted. (A) Aligned to the OGRDB reference set without the inclusion of musIGHV398. (B) Aligned to the OGRDB reference set with the inclusion of musIGHV398. (C) Aligned to the IMGT Reference Directory.
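The mismatch profile described in this caption can be sketched in a few lines. The alignment tool used in the study is not named here, so this sketch swaps in a plain position-by-position comparison and assumes the reads and germline sequences are already aligned and of comparable length. All function names and sequences below are hypothetical toy values.

```python
from collections import Counter


def mismatches(read, germline):
    """Count differing positions over the shared length of two
    pre-aligned sequences (case-insensitive)."""
    return sum(1 for a, b in zip(read.lower(), germline.lower()) if a != b)


def closest_germline(read, reference):
    """Return (name, mismatch count) for the best-matching germline.

    reference: dict mapping germline name to sequence.
    """
    scored = ((name, mismatches(read, seq)) for name, seq in reference.items())
    return min(scored, key=lambda pair: pair[1])


def mismatch_profile(reads, reference, max_mismatches=30):
    """Percentage of reads at each mismatch count from 0 to max_mismatches."""
    if not reads:
        return {k: 0.0 for k in range(max_mismatches + 1)}
    counts = Counter(closest_germline(r, reference)[1] for r in reads)
    return {
        k: 100.0 * counts.get(k, 0) / len(reads)
        for k in range(max_mismatches + 1)
    }


# Toy reference set and reads (not real IGHV sequences).
toy_reference = {"refA": "acgtacgtac", "refB": "acgtttttac"}
toy_reads = ["acgtacgtac", "acgtacgtaa", "acgtttttac", "acgttttaac"]

profile = mismatch_profile(toy_reads, toy_reference)
print({k: v for k, v in profile.items() if v > 0})
```

With a profile like this, reads derived from a gene that is absent from the reference set would be expected to shift toward higher mismatch counts, which is the kind of difference that a comparison between reference sets is meant to expose.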