Claire Harris, Weichen Xu, Luigi Grassi, Chunlei Wang, Abigail Markle, Colin Hardman, Richard Stevens, Guillermo Miro-Quesada, Diane Hatton, Jihong Wang.
Abstract
Protein primary structure is a potential critical quality attribute for biotherapeutics. Identifying and characterizing any sequence variants present is essential for product development. A sequence variant ~11 kDa larger than the expected IgG mass was observed by size-exclusion chromatography and two-dimensional liquid chromatography coupled with online mass spectrometry. Further characterization indicated that the 11 kDa addition was located on the heavy chain (HC) Fc domain. Despite the relatively large mass addition, only one unknown peptide was detected by peptide mapping. To decipher the sequence, the transcriptome of the manufacturing cell line was characterized by Illumina RNA-seq. Transcriptome reconstruction detected an aberrant fusion transcript, in which the light chain (LC) constant domain sequence was fused to the 3' end of the HC transcript. Translation of this fusion transcript generated an extended peptide sequence at the HC C-terminus corresponding to the observed 11 kDa mass addition. Nanopore-based genome sequencing showed that multiple copies of the plasmid had integrated in tandem, with one copy missing the 5' end of the plasmid, deleting the LC variable domain. The fusion transcript was due to read-through of the HC terminator sequence into the adjacent partial LC gene and an unexpected splicing event between a cryptic splice-donor site at the 3' end of the HC and the splice acceptor site at the 5' end of the LC constant domain. Our study demonstrates that combining protein physicochemical characterization with genomic and transcriptomic analysis of the manufacturing cell line greatly improves the identification of sequence variants and understanding of the underlying molecular mechanisms.
Keywords: Fc-extension; LC/MS; RT-PCR; aberrant fusion protein; alternative splicing; expression vector; high throughput sequencing; monoclonal antibody; nanopore sequencing; sequence variant; splice variants
Year: 2019 PMID: 31570042 PMCID: PMC6816433 DOI: 10.1080/19420862.2019.1667740
Source DB: PubMed Journal: MAbs ISSN: 1942-0862 Impact factor: 5.857
Figure 1. (a): HPSEC profile of mAb-A. Inset is a zoomed view. (b): HPSEC profiles of mAb-A shoulder (red) and monomer (black) fractions. Peaks at 11.5 to 12 min in the shoulder fraction are system peaks.
Figure 2. Intact mass analysis. Deconvoluted masses for the HPSEC monomer fraction, (a): intact mass, (b): reduced light chain, (c): reduced heavy chain; and for the shoulder fraction, (d): intact mass, (e): reduced light chain, (f): reduced heavy chain.
Figure 3. CE-SDS analysis for the HPSEC monomer fraction, (a): non-reducing, (b): reducing; and for the shoulder fraction, (c): non-reducing, (d): reducing.
Figure 4. RP-HPLC analysis of the IdeS-digested mAb-A HPSEC shoulder peak. (a): RP-HPLC UV chromatogram. (b): Deconvoluted masses for the 12–14 min peaks in (a) (black is Fc’, the 12.4 min peak in (a); red is Fc’+11 kDa, the 13.4 min peak in (a)). (c): Deconvoluted masses for the 16 min peak in (a) (F(ab’)2 and intact missing Fc’ or Fc’/2). F(ab’)2 = (LC + HC(1–236))2.
Figure 5. Reducing Lys-C mapping UV overlay of the HPSEC shoulder (red) and monomer (black) fractions. (a): The 37.67 min peak is light chain peptide Y176-K190 (YAASSYLSLTPEQWK), with a mass of 1742.852 Da. (b): The 62.95 min peak is light chain peptide A134-K153 (ATLVCLISDFYPGAVTVAWK), with a mass of 2153.123 Da. * marks the heavy chain peptide with oxidized Met-428.
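The two peptide masses quoted in the Figure 5 caption can be checked from the sequences alone. The following is a minimal sketch, not taken from the paper: it assumes standard monoisotopic residue masses, unmodified residues (including a free, non-alkylated cysteine), and a neutral peptide. The helper name `peptide_mass` is hypothetical.

```python
# Monoisotopic residue masses in Da (standard reference values; assumed, not from the paper).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # one H2O is added for the free N- and C-termini


def peptide_mass(sequence: str) -> float:
    """Neutral monoisotopic mass of an unmodified linear peptide (hypothetical helper)."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


# Sequences and reported masses as given in the Figure 5 caption.
REPORTED = {
    "YAASSYLSLTPEQWK": 1742.852,
    "ATLVCLISDFYPGAVTVAWK": 2153.123,
}

for seq, reported in REPORTED.items():
    calculated = peptide_mass(seq)
    print(f"{seq}: calculated {calculated:.3f} Da, "
          f"reported {reported:.3f} Da, "
          f"difference {calculated - reported:+.4f} Da")
```

Under these assumptions both calculated values agree with the caption to within a few thousandths of a dalton, which is consistent with the peptides carrying no modifications.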
Figure 6. Molecular mechanism of the heavy chain +11 kDa sequence variant. (a): Annotation track of the transgenes in the plasmid region. The upper track reports two of the plasmid copies integrated into the host genome. The lower track reports the non-mature transcripts of heavy chain (HC), light chain (LC) and heavy chain +11 kDa sequence variant (HC−Fc_ext). Within the exons, the thick bars indicate coding sequence and the thin bars non-coding sequence (5ʹ UTR and 3ʹ UTR). (b): Integration site analysis from MinION genomic sequencing identified the position of the light chain constant domain (CL) next to the end of the heavy chain (HC). (c): Sequence of the 3ʹ end of HC and the 5ʹ end of the CL domain that contribute to the fusion transcript. Mis-splicing causes the sequence highlighted in blue to be spliced out. The fusion transcript sequence after splicing is shown covering the area of the fusion site as identified in the RNA-seq data. The amino acid translation is shown above.
Figure 7. Confirmation of the Fc extension signature peptide SLSLSPGQPK by non-reducing peptide mapping. (a): Peptide mapping UV chromatograms of the HPSEC monomer and front shoulder fractions, zoomed in to show the peaks corresponding to the expected heavy chain C-terminal peptide and the signature peptide from the 11 kDa Fc extension. (b): Mass spectra of the Fc extension variant signature peptide SLSLSPGQPK and the Fc C-terminal peptide SLSLSPG (Lys removed). (c): MS/MS spectrum confirming the sequence of the Fc extension signature peptide SLSLSPGQPK.
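As an illustration of how an MS/MS spectrum can confirm a sequence such as SLSLSPGQPK, the sketch below lists the theoretical singly charged b- and y-ion m/z values for that peptide. It is not the authors' analysis: it assumes standard monoisotopic residue masses, no modifications, and charge state 1+, and the helper name `fragment_ions` is hypothetical.

```python
# Monoisotopic residue masses in Da for the residues in SLSLSPGQPK
# (standard reference values; assumed, not from the paper).
RESIDUE_MASS = {
    "S": 87.03203, "L": 113.08406, "P": 97.05276,
    "G": 57.02146, "Q": 128.05858, "K": 128.09496,
}
WATER = 18.010565
PROTON = 1.007276


def fragment_ions(sequence: str):
    """Return singly charged b- and y-ion m/z lists (b1..b(n-1), y1..y(n-1))."""
    masses = [RESIDUE_MASS[aa] for aa in sequence]
    n = len(masses)
    # b ions: N-terminal fragments, residue sum plus one proton.
    b = [sum(masses[:i]) + PROTON for i in range(1, n)]
    # y ions: C-terminal fragments, residue sum plus water plus one proton.
    y = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    return b, y


SIGNATURE = "SLSLSPGQPK"  # signature peptide named in the Figure 7 caption
b_ions, y_ions = fragment_ions(SIGNATURE)

for i, (b, y) in enumerate(zip(b_ions, y_ions), start=1):
    print(f"b{i:<2d} {SIGNATURE[:i]:<10s} {b:10.4f}    "
          f"y{i:<2d} {SIGNATURE[-i:]:<10s} {y:10.4f}")
```

In this ladder, y1 to y3 cover the C-terminal residues K, PK and QPK, which are absent from the truncated Fc C-terminal peptide SLSLSPG, so matching those ions is what would distinguish the extended sequence.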