Kyle Tretina1, Roger Pelle2, Joshua Orvis1, Hanzel T Gotia1, Olukemi O Ifeonu1, Priti Kumari1, Nicholas C Palmateer1, Shaikh B A Iqbal1, Lindsay M Fry3,4, Vishvanath M Nene5, Claudia A Daubenberger6,7, Richard P Bishop4, Joana C Silva8,9.
1. Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, 21201, USA.
2. Biosciences Eastern and Central Africa, International Livestock Research Institute, Nairobi, Kenya.
3. Animal Disease Research Unit, Agricultural Research Service, USDA, Pullman, WA, 99164, USA.
4. Department of Veterinary Microbiology & Pathology, Washington State University, Pullman, WA, 99164, USA.
5. International Livestock Research Institute, Nairobi, Kenya.
6. Swiss Tropical and Public Health Institute, Basel, Switzerland.
7. University of Basel, Basel, Switzerland.
8. Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, 21201, USA. jcsilva@som.umaryland.edu.
9. Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD, 21201, USA. jcsilva@som.umaryland.edu.
Abstract
BACKGROUND: The apicomplexan parasite Theileria parva causes a livestock disease called East coast fever (ECF), with millions of animals at risk in sub-Saharan East and Southern Africa, the geographic distribution of T. parva. Over a million bovines die each year of ECF, with a tremendous economic burden to pastoralists in endemic countries. Comprehensive, accurate parasite genome annotation can facilitate the discovery of novel chemotherapeutic targets for disease treatment, as well as elucidate the biology of the parasite. However, genome annotation remains a significant challenge because of limitations in the quality and quantity of the data used to inform the location and function of protein-coding genes and, when RNA data are used, the underlying biological complexity of the processes involved in gene expression. Here, we apply our recently published RNAseq dataset derived from the schizont life-cycle stage of T. parva to update structural and functional gene annotations across the entire nuclear genome.
RESULTS: The re-annotation effort led to evidence-supported updates in over half of all protein-coding sequence (CDS) predictions, including exon changes, gene merges and gene splitting, an increase in average CDS length of approximately 50 base pairs, and the identification of 128 new genes. Among the new genes identified were those involved in N-glycosylation, a process previously thought not to exist in this organism and a potentially new chemotherapeutic target pathway for treating ECF. Alternatively spliced genes were identified, and antisense and multi-gene family transcription were extensively characterized.
CONCLUSIONS: The process of re-annotation led to novel insights into the organization and expression profiles of protein-coding sequences in this parasite, and uncovered a minimal N-glycosylation pathway that changes our current understanding of the evolution of this post-translational modification in apicomplexan parasites.
Keywords:
East coast fever; Genome; N-glycosylation; Re-annotation; Theileria