Prateek Singh1, Rajat Ujjainiya1, Satyartha Prakash2, Salwa Naushin1, Viren Sardana1, Nitin Bhatheja2, Ajay Pratap Singh1, Joydeb Barman2, Kartik Kumar2, Saurabh Gayali2, Raju Khan3, Birendra Singh Rawat4, Karthik Bharadwaj Tallapaka5, Mahesh Anumalla5, Amit Lahiri6, Susanta Kar6, Vivek Bhosale6, Mrigank Srivastava6, Madhav Nilakanth Mugale6, C P Pandey6, Shaziya Khan6, Shivani Katiyar6, Desh Raj6, Sharmeen Ishteyaque6, Sonu Khanka6, Ankita Rani6, Jyotsna Sharma6, Anuradha Seth6, Mukul Dutta6, Nishant Saurabh7, Murugan Veerapandian8, Ganesh Venkatachalam8, Deepak Bansal9, Dinesh Gupta10, Prakash M Halami11, Muthukumar Serva Peddha11, Ravindra P Veeranna11, Anirban Pal12, Ranvijay Kumar Singh13, Suresh Kumar Anandasadagopan14, Parimala Karuppanan15, Syed Nasar Rahman14, Gopika Selvakumar15, Subramanian Venkatesan14, Malay Kumar Karmakar16, Harish Kumar Sardana17, Anamika Kothari18, Devendra Singh Parihar17, Anupma Thakur17, Anas Saifi17, Naman Gupta17, Yogita Singh17, Ritu Reddu17, Rizul Gautam17, Anuj Mishra17, Avinash Mishra19, Iranna Gogeri20, Geethavani Rayasam21, Yogendra Padwad22, Vikram Patial22, Vipin Hallan22, Damanpreet Singh22, Narendra Tirpude22, Partha Chakrabarti23, Sujay Krishna Maity23, Dipyaman Ganguly23, Ramakrishna Sistla24, Narender Kumar Balthu25, Kiran Kumar A25, Siva Ranjith25, B Vijay Kumar25, Piyush Singh Jamwal26, Anshu Wali26, Sajad Ahmed26, Rekha Chouhan26, Sumit G Gandhi27, Nancy Sharma27, Garima Rai27, Faisal Irshad27, Vijay Lakshmi Jamwal27, Masroor Ahmad Paddar27, Sameer Ullah Khan27, Fayaz Malik27, Debashish Ghosh28, Ghanshyam Thakkar29, S K Barik30, Prabhanshu Tripathi31, Yatendra Kumar Satija32, Sneha Mohanty31, Md Tauseef Khan31, Umakanta Subudhi33, Pradip Sen34, Rashmi Kumar34, Anshu Bhardwaj34, Pawan Gupta34, Deepak Sharma34, Amit Tuli34, Saumya Ray Chaudhuri34, Srinivasan Krishnamurthi34, L Prakash35, Ch V Rao36, B N Singh36, Arvindkumar Chaurasiya37, Meera Chaurasiyar37, Mayuri Bhadange37, Bhagyashree Likhitkar37, Sharada Mohite37, Yogita Patil37, Mahesh Kulkarni37, Rakesh Joshi37, Vaibhav Pandya38, Sachin Mahajan38, Amita Patil38, Rachel Samson37, Tejas Vare37, Mahesh Dharne37, Ashok Giri37, Sachin Mahajan38, Shilpa Paranjape39, G Narahari Sastry40, Jatin Kalita40, Tridip Phukan40, Prasenjit Manna40, Wahengbam Romi40, Pankaj Bharali40, Dibyajyoti Ozah40, Ravi Kumar Sahu40, Prachurjya Dutta40, Moirangthem Goutam Singh41, Gayatri Gogoi41, Yasmin Begam Tapadar41, Elapavalooru Vssk Babu42, Rajeev K Sukumaran43, Aishwarya R Nair44, Anoop Puthiyamadam43, Prajeesh Kooloth Valappil43, Adrash Velayudhan Pillai Prasannakumari43, Kalpana Chodankar45, Samir Damare45, Ved Varun Agrawal46, Kumardeep Chaudhary1, Anurag Agrawal1, Shantanu Sengupta47, Debasis Dash48.
Abstract
Data science has been an invaluable part of the COVID-19 pandemic response, with applications ranging from tracking viral evolution to understanding vaccine effectiveness. Asymptomatic breakthrough infections have been a major problem in assessing vaccine effectiveness in populations globally. Serological discrimination of vaccine response from infection has so far been limited to Spike protein vaccines, since whole virion vaccines generate antibodies against all the viral proteins. Here, we show how a statistical and machine learning (ML) based approach can be used to discriminate between SARS-CoV-2 infection and the immune response to an inactivated whole virion vaccine (BBV152, Covaxin). For this, we assessed serial data on antibodies against Spike and Nucleocapsid antigens, along with age, sex, number of doses taken, and days since last dose, for 1823 Covaxin recipients. An ensemble ML model, incorporating a consensus clustering approach alongside a support vector machine (SVM) model, was built on 1063 samples where reliable qualifying data existed, and then applied to the entire dataset. Of 1448 self-reported negative subjects, our ensemble ML model classified 724 as infected. For method validation, we determined the relative ability of a random subset of samples to neutralize the Delta versus the wild-type strain using a surrogate neutralization assay. We worked on the premise that antibodies generated by a whole virion vaccine would neutralize the wild-type strain more efficiently than the Delta strain. In 100 of 156 samples where the ML prediction differed from the self-reported uninfected status, neutralization against the Delta strain was more effective, indicating infection. We found 71.8% of subjects predicted to be infected during the surge, which is concordant with the percentage of sequences classified as Delta (75.6%-80.2%) over the same period. Our approach will help in real-world vaccine effectiveness assessments where whole virion vaccines are commonly used.
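The counts reported in the abstract can be turned into proportions with a few lines of arithmetic. The sketch below is only an illustration: the helper name `share` is hypothetical, and the percentages it yields are derived here from the stated counts rather than quoted from the paper.

```python
def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole, 1)


# Counts taken from the abstract.
# Self-reported negatives that the ensemble model classified as infected.
reclassified = share(724, 1448)
# Discordant samples in which Delta neutralization was more effective.
delta_favoured = share(100, 156)

print(reclassified, delta_favoured)
```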
Keywords: BBV152; COVID-19; Covaxin; Ensemble methods; Infection; Machine learning; SARS-CoV-2
Year: 2022 PMID: 35483225 PMCID: PMC9040372 DOI: 10.1016/j.compbiomed.2022.105419
Source DB: PubMed Journal: Comput Biol Med ISSN: 0010-4825 Impact factor: 6.698
Fig. 1. Workflow of the study to identify COVID-19 infection status. Using a consensus of supervised (machine learning) and unsupervised (clustering) approaches, COVID-19 infection status was ascertained in 1063 individuals who provided samples in Phase 3 (P3) and also in Phase 1 or Phase 2 (P1/P2). The final ensemble model was used to predict the COVID-19 infection status for all Covaxin-administered individuals in P3.
Fig. 2. Data structure and antibody level distribution. (A) Sample distribution and overlap among the three phases [P1 (June–November 2020), P2 (December 2020–April 2021), P3 (May–August 2021)] of the CSIR cohort of Covaxin-administered individuals (N = 1823). (B) Distribution of antibodies to Nucleocapsid (COI) and Spike (U/mL), shown as density histograms for the 1823 individuals. (C) PCA plot of the 1823 Covaxin-administered individuals based on six features: COI, U/mL, age, gender, days since last vaccination, and number of doses. Self-reported COVID-19 infection is depicted in red. (D) Sample distribution stratified by self-reported COVID-19 infection status and doses taken (N = 1823). Density-based contours indicate the presence of two subgroups among self-reported uninfected individuals in both the 1-dose and 2-dose groups. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
Fig. 3. Development and validation of prediction models. (A) Consensus clustering with the k-prototype and VarSelLCM methods (N = 1063). Light brown and blue represent concordance between the two clustering approaches for Cluster 1 and Cluster 2, respectively. Black represents discordance between the two methods, and hence an indeterminate assignment. (B) Supervised machine learning (SVM) based prediction of infection status (N = 1063), further stratified by self-reported COVID-19 infection status and the number of vaccine doses. (C) Ensemble ML model-based prediction of COVID-19 infection in all individuals (N = 1823), further stratified by self-reported infection status and the number of vaccine doses. (D) Phase 2 seronegative subjects who gave samples in Phase 3 and were predicted to be infected by the ensemble model, analyzed using a surrogate virus neutralization test (sVNT) (N = 39). Of the samples predicted to be infected by the ensemble model, 71.8% were found to be Delta infected using a variant-specific sVNT assay. Samples were labelled "Delta infected" when Delta inhibition % exceeded WT inhibition % by a margin based on standard error. Samples were labelled "Delta Not infected" when, processed without dilution, they showed less than 30% inhibition. All other data points were labelled "Delta Uninfected". (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
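The labelling rules described for panels A and D can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the `"indeterminate"` string, and the numeric `margin` argument are hypothetical (the caption says only that the margin is based on standard error), and the sketch assumes the cluster labels from the two methods have already been aligned so that the same number refers to the same cluster.

```python
from typing import List, Union

INDETERMINATE = "indeterminate"  # hypothetical marker for discordant samples


def consensus_labels(a: List[int], b: List[int]) -> List[Union[int, str]]:
    """Keep a cluster label where both methods agree; otherwise mark it indeterminate.

    `a` and `b` hold per-sample cluster labels from two clustering methods
    (for example k-prototype and VarSelLCM), assumed to be pre-aligned.
    """
    if len(a) != len(b):
        raise ValueError("label lists must have equal length")
    return [x if x == y else INDETERMINATE for x, y in zip(a, b)]


def delta_favoured(delta_inh: float, wt_inh: float, margin: float) -> bool:
    """Return True when Delta inhibition % exceeds WT inhibition % by more than `margin`.

    Assumption: "with a margin" is read as delta_inh > wt_inh + margin.
    """
    if margin < 0:
        raise ValueError("margin must be non-negative")
    return delta_inh > wt_inh + margin


# Illustrative values only; these are not data from the study.
print(consensus_labels([1, 2, 1, 2], [1, 2, 2, 2]))
print(delta_favoured(80.0, 60.0, 5.0))
```

Samples marked indeterminate would be excluded from whatever label set is later used for supervised training; how the study handled them beyond the figure legend is not specified here.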