
Banana bunchy top virus genetic diversity in Pakistan and association of diversity with recombination in its genomes.

Sana Bashir1, Syed Muhammad Saqlan Naqvi2, Aish Muhammad3, Iqbal Hussain3, Kazim Ali3, Muhammad Ramzan Khan3, Sumaira Farrakh1, Tayyaba Yasmin1, Muhammad Zeeshan Hyder1.   

Abstract

Banana bunchy top virus (BBTV) is a multipartite circular single-stranded DNA virus that belongs to the genus Babuvirus in the family Nanoviridae. It causes significant crop losses worldwide, including in Pakistan. BBTV has been present in Pakistan since 1988; however, only about twenty sequences of genomic components have been reported from the country so far. To gain insight into the current genetic diversity in Pakistan, fifty-seven genomic components, including five complete genomes (each comprising the DNA-R, -U3, -S, -M, -C and -N components), were sequenced in this study. Genetic diversity analysis of the populations from Pakistan showed that DNA-R is the most conserved component, followed by DNA-N, whereas DNA-U3 is the most diverse and carries the most diverse Common Region Stem-loop (CR-SL) in the BBTV genome, a functional region that has previously been reported to have undergone recombination in the Pakistani population. A Maximum Likelihood (ML) phylogenetic analysis of entire genomes, in which the sequences of all components of an isolate were concatenated and compared with genomes reported from around the world, gave deeper insight into the origin of the disease in Pakistan. A comparison of the genetic diversity of the Pakistani population with that of BBTV populations worldwide indicates a correlation between genetic diversity and recombination. Population genetics analysis indicated that the degree of selection pressure differs depending on the geographic area and the genomic component. A detailed analysis of recombination across the components and functional regions suggested that recombination is closely associated with the functional parts of the BBTV genome that show high genetic diversity. Both the genetic diversity and the recombination analyses suggest that the CR-SL is a recombination hotspot in all BBTV genomes and that, among the six components, DNA-U3 is the only one that has undergone extensive inter- and intragenomic recombination. Diversity analysis of the recombinant regions showed that recombination increases diversity by one and a half fold on average and by up to four-fold in some cases. These results suggest that recombination contributes significantly to the genetic diversity of BBTV populations around the world.

Year:  2022        PMID: 35255085      PMCID: PMC8901069          DOI: 10.1371/journal.pone.0263875

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Banana bunchy top disease (BBTD) is the most common and devastating viral disease of banana, found predominantly in the Pacific and Asian regions. It is considered one of the most important plant viral diseases in the world [1]. In Pakistan, BBTD was first observed in 1988 in Sindh province and was later identified, in 1991, on the basis of symptomatology [2]. BBTD is caused by Banana bunchy top virus (BBTV), which belongs to the genus Babuvirus in the family Nanoviridae [3]. BBTV is transmitted persistently by the aphid Pentalonia nigronervosa Coq, its sole known vector, and infects only members of the Musaceae [4]. BBTV is a multipartite virus whose genome consists of six integral circular single-stranded (css) DNA components, each about 1 kb in size [5, 6]. DNA-R, -C, -M, -S and -N encode the master replication initiation protein (M-Rep), cell cycle link protein (Clink), movement protein (MP), capsid protein (CP) and nuclear-shuttle protein (NSP), respectively, while DNA-U3 encodes a protein to which no function has been assigned so far [5, 7–10]. Two notable conserved functional regions exist in all six DNA components of BBTV. The first is the common region stem-loop (CR-SL), which is about 70 nucleotides long and contains a 31-nucleotide origin of replication (ori) for the proposed rolling-circle replication [11, 12]. The second is the common region major (CR-M), which is about 90 nucleotides long and more than 70% identical among the six components [5, 13]. Small ssDNA primers bind to the CR-M and initiate complementary-strand DNA synthesis after the virus enters a host cell [14], and the region is undergoing concerted evolution [15, 16]. Genetic variation due to recombination is a well-established phenomenon in BBTV [17].
Recombination is of particular significance in the evolution of viruses because natural recombination events can lead to extensive viral diversity [17]. In geminivirus evolution, the importance of recombination is well recognized [18-20], as it is the most probable mechanism responsible for the genetic diversification of agriculturally important begomovirus species [21, 22]. In addition to generating descendants with increased fitness, recombination is a source of increased genetic diversity in begomoviruses [17]. In viruses, identifying recombination breakpoints is a useful way to detect circulating recombinant forms and to infer the underlying recombination mechanisms, such as intra- and intergenomic recombination [23, 24]. Although sequence analyses of individual BBTV components [9, 10, 15, 16, 25–30] and of complete BBTV genomes [31-36] have revealed evidence of intra- and intergenomic recombination, no study has so far examined the contribution of these recombination mechanisms to genetic diversity in BBTV. Previous molecular analysis of BBTV from Pakistan has been very limited: only one partial genome [37] and a few DNA components [16, 25, 37] have been sequenced, from different districts of Sindh province. Earlier phylogenetic analyses of BBTV within the country therefore used sequences of individual components rather than full genomes. In the current study, we report the complete genome sequences of five BBTV isolates originating from different districts of Sindh, Pakistan. Genetic diversity and putative recombination events were studied first for the Pakistani isolates and then for the sequences available in GenBank. The contribution of recombination to genetic diversity, recombination hotspots and population genetics were also studied for BBTV genomes to better understand the evolution of this virus in various geographic regions around the globe.

Material and methods

DNA extraction, PCR amplification, cloning and sequencing

BBTV-infected plant material showing typical BBTD symptoms was collected from Tandojam district, Sindh province, in 2006 for the P.TJ1 isolate (for which DNA-R was previously reported by Hyder and colleagues (2007)). In 2007, material was collected for the isolates P.BS1 & 2, P.GH1 & 2, P.HD1 & 2, P.JS1, P.KP1 & 2, P.MT1 & 2, P.NS1 and P.TA1 & 2 from the Bhitshah, Ghotki, Hyderabad, Jamshoro, Khairpur, Matiari, Nawabshah and Tandoadam districts of Sindh, respectively (the DNA-U3 sequences of P.GH1, P.JS1 and P.HD1 were previously reported by Hyder and colleagues (2011)). P.TJ3, P.NARC, and P.Sakrand & P.TJ4 were isolated from Tandojam district in 2011, 2017 and 2018, respectively. Total genomic DNA was extracted using the CTAB method as described by Hyder et al. (2007). For each genomic component of BBTV, PCR amplification and DNA sequencing were performed with two pairs of adjacent, outwardly extending, abutting primers, each pair specific for a different location in the component (Table 1). This approach generated multiple sequencing reads for every component, and the sites where one primer pair binds were sequenced using the second primer pair and vice versa.
Table 1

Details of primers used to sequence BBTV genomic components from Pakistan.

BBTV genomic component | Set | Primer name | Sequence (5’–3’) | Nucleotide coordinates (nt)
DNA-R | A | DNA-R AF | ATGGCGCGATATGTGGTATG | 102–121
DNA-R | A | DNA-R AR | TCTGTCGTCGATGATGATCTTG | 102–80
DNA-R | B | DNA-R BF | CCAAATGGAGGAGAAGGAAAG | 642–662
DNA-R | B | DNA-R BR | GCCATAGACCCAAATTATTCTCCG | 641–618
DNA-U3 | A | DNA-U3 AF | TTGTGCTGAGGCGGAAGAT | 313–331
DNA-U3 | A | DNA-U3 AR | CCACCTTCACAGAAGAGAG | 312–294
DNA-U3 | B | DNA-U3 BF | CAGATTAATTCCTTAGCGAC | 837–856
DNA-U3 | B | DNA-U3 BR | GACCGTTCATTCAACTTGAC | 836–817
DNA-S | A | DNA-S AF | GTATCCGAAGAAATCCATC | 236–254
DNA-S | A | DNA-S AR | CTAGCCATTTGTTGTCTG | 235–218
DNA-S | B | DNA-S BF | GGAAGAATGTAACGGAGGTCG | 643–663
DNA-S | B | DNA-S BR | TCAACACGGTTGTCTTCCTCA | 642–622
DNA-M | A | DNA-M AF | ATGGCATTAACAACAGAGCG | 282–301
DNA-M | A | DNA-M AR | TTAGCAGGGTCCTATTTATAGG | 281–260
DNA-M | B | DNA-M BF | GGATGATCAAGGAAGACG | 593–610
DNA-M | B | DNA-M BR | CTTCTATTTGGTTGAGAAGG | 592–573
DNA-C | A | DNA-C AF | GAATCGTCTGCTATGCCTG | 252–270
DNA-C | A | DNA-C AR | CCAGAACTCCATTTCTCTTC | 251–232
DNA-C | B | DNA-C BF | GTTCTCTCTTCTTCATCG | 585–602
DNA-C | B | DNA-C BR | CTCATCACAATAGAGATCTTG | 584–564
DNA-N | A | DNA-N AF | GATGGATTGGGCGGAATCA | 276–284
DNA-N | A | DNA-N AR | GCTTCTGCTTTGCTTTCGC | 275–257
DNA-N | B | DNA-N BF | GAGCAGAGACATGGAAGTTAG | 642–662
DNA-N | B | DNA-N BR | CAATCTATTCCTGGCGCAAC | 641–622
M13 universal primers | - | M13 F | TGTAAAACGACGGCCAGT | -
M13 universal primers | - | M13 R | CAGGAAACAGCTATGACC | -

Note: The nucleotide coordinates are according to the P.TJ1 isolate genomic components’ sequences.

The PCR amplifications were carried out using either the GoTaq® PCR Core System I (Promega Corp., Madison, WI, USA) or Taq DNA Polymerase (recombinant) (Fermentas UAB, Lithuania), according to the manufacturers’ instructions. A typical PCR reaction contained about 50 ng of DNA template, the respective Taq buffer, 1.5 mM MgCl2, 200 μM of each dNTP, 2.5 units of Taq DNA polymerase and 50 pM of each primer. The thermal profile for both primer sets consisted of an initial denaturation at 94°C for 3 minutes, followed by 35 cycles of denaturation at 94°C for 30 seconds, annealing at 52°C for 30 seconds and extension at 72°C for 45 seconds, and a final extension at 72°C for 20 minutes. The PCR products of P.TJ1 were ligated into pTZ57R (InsTA Cloning Kit, Fermentas UAB, Lithuania) according to the manufacturer’s instructions and transformed into Escherichia coli DH5α (supE44, ΔlacU169 (Φ80lacZΔM15), hsdR17, recA1, endA1, gyrA96, thi-1, relA1) cells by electroporation. For the isolates sampled during 2007, the PCR products were ligated directly into the pGEM®-T Easy Vector (Promega, Madison, WI) and cloned in E. coli DH5α cells. Clones were selected using 50 μg/ml ampicillin and screened for white colonies, which result from insertional inactivation of the β-galactosidase gene. In non-recombinant bacterial cells, expression of the functional gene is induced by IPTG (isopropyl β-D-1-thiogalactopyranoside), and the enzyme converts the chromogenic substrate X-GAL (5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside) into a blue product. Plasmid DNA was extracted using a minipreparation protocol according to Sambrook and Russell [38] and confirmed by digestion with appropriate restriction endonucleases. For every component of an isolate, both strands of two individual clones, originating from the two different primer sets of the respective component, were sequenced.
The genomic components of the P.TJ1 isolate were sequenced using the CEQ Dideoxy Dye Terminator Cycle Sequencing Kit (Beckman Coulter, USA) and the CEQ 8000 Genetic Analysis System (Beckman Coulter, USA) with universal M13 and insert-specific primers to obtain the full-length sequence of each strand. The isolates sampled during 2007 were processed in a similar way and sequenced at the sequencing facility of Iowa State University, USA. For the isolates sampled in 2011, 2017 and 2018, both strands of the PCR products of the two different primer sets specific for the same genomic component were sequenced directly at the commercial DNA sequencing facility of Macrogen (Macrogen, Inc., South Korea). The sequence data of both strands of a clone or PCR product were first assembled using DNA Dragon Contig Sequence Assembly Software (version 1.5.0; SequentiX Digital DNA Processing, Germany) (https://www.sequentix.de/software_dnadragon.php). A consensus of the sequences originating from the two different primer sets was then derived for each component of an isolate, and the consensus sequence of each sequenced component was submitted to GenBank. The accession numbers were (MK140625, MK140626, MK140627, FJ859727, FJ859728, FJ859722, FJ859733, FJ859734, FJ859732, FJ859723, FJ859724, FJ859729, FJ859730, FJ859731, FJ859725, FJ859726, JX170762 for DNA-R), (MK140628, MK140629, MK140630, JX1700764 for DNA-U3), (MK140619, MK140620, MK140621, EF593169, FJ859740, FJ859741, FJ859735, FJ859746, FJ859747, FJ859745, FJ859736, FJ859737, FJ859742, FJ859743, FJ859744, FJ859738, FJ859739, JX170763 for DNA-S), (MK140616, MK140617, MK140618, EU095948, JX467685, JX467686, JX46768, JX170760 for DNA-M), (MK140613, MK140614, MK140615, EF520722, JX170759 for DNA-C) and (MK140624, MK140625, MK140626, EF529519, JX170761 for DNA-N).

Sequence retrieval and genetic diversity

The full-length sequences of all components of the BBTV genome available up to April 5, 2020 were obtained from NCBI GenBank using the Taxonomy Browser and given abbreviated names for convenience (S1 Table). The coding and regulatory regions, including CR-SL and CR-M, were identified as described by Burns and colleagues (1995). For alignment with MAFFT version 6.864 [40], each sequence was set to begin at the start of the major ORF identified by Herrera-Valencia and colleagues (2007) [39] in the respective component. The genetic diversity values of each BBTV component and subgenomic region were determined on the aligned sequences, first for the Pakistani isolates and then worldwide. The nucleotide identity range was determined using MatGAT [41], while the pair-wise average nucleotide diversity per 100 sites, π (Pi), and the Watterson estimator, θw (Theta-w), of the population mutation rate per 100 sites were determined using DnaSP version 6.12.03 [42]. To study the relationship of the BBTV isolates from Pakistan with BBTV isolates from other parts of the world, a Maximum Likelihood (ML) phylogenetic tree with 1000 bootstrap replicates was constructed from the concatenated nucleotide sequences of all genomic components (i.e. DNA-R, -U3, -S, -M, -C and -N joined into a single sequence per isolate) of those isolates for which full genomic sequences were available in GenBank. In addition, ML phylogenetic analyses of the full-length sequences of each component separately were performed in the Molecular Evolutionary Genetics Analysis program (MEGA), version X [43], with 1000 bootstrap replicates. The phylogenetic trees were visualized using FigTree version 1.4.3 [44] (http://tree.bio.ed.ac.uk/software/).
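The two diversity statistics named above can be illustrated with a minimal Python sketch on a toy alignment. This is not the DnaSP implementation: the function names, the exclusion of gapped columns, the use of the number of segregating sites for θw, and the toy sequences are all assumptions made for the example.

```python
from itertools import combinations


def gap_free_columns(seqs):
    """Return the alignment columns that contain no gap character (assumed handling)."""
    return [col for col in zip(*seqs) if "-" not in col]


def nucleotide_diversity(seqs):
    """Pi: average number of pairwise nucleotide differences per site."""
    cols = gap_free_columns(seqs)
    pairs = list(combinations(range(len(seqs)), 2))
    diffs = sum(col[i] != col[j] for col in cols for i, j in pairs)
    return diffs / len(pairs) / len(cols)


def watterson_theta(seqs):
    """Theta-w per site: S / a1 / L, where a1 = sum of 1/i for i = 1..n-1."""
    cols = gap_free_columns(seqs)
    segregating = sum(len(set(col)) > 1 for col in cols)
    a1 = sum(1.0 / i for i in range(1, len(seqs)))
    return segregating / a1 / len(cols)


# Hypothetical toy alignment: 4 sequences, 10 sites, 2 segregating sites.
toy_alignment = [
    "ATGCATGCAT",
    "ATGCATGCAT",
    "ATGCTTGCAT",
    "ATGAATGCAT",
]

pi_per_100 = 100 * nucleotide_diversity(toy_alignment)
theta_per_100 = 100 * watterson_theta(toy_alignment)
print(f"pi per 100 sites: {pi_per_100:.2f}")
print(f"theta-w per 100 sites: {theta_per_100:.2f}")
```

Multiplying by 100 matches the per-100-sites reporting used in the diversity tables; standard deviations, which DnaSP also reports, are not computed in this sketch.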

Selection pressure analysis

Selection pressure was estimated from the dN/dS ratio, where dN is the average number of nonsynonymous substitutions per nonsynonymous site and dS is the average number of synonymous substitutions per synonymous site. The dN and dS values were estimated separately in MEGA X [43] using the Nei-Gojobori method (Jukes-Cantor). A gene is under neutral selection when the dN/dS ratio = 1, positive (or diversifying) selection when the dN/dS ratio is > 1, and negative (or purifying) selection when the dN/dS ratio is < 1.
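The interpretation of the ratio can be sketched in a few lines of Python. The sketch assumes that the counts of synonymous and nonsynonymous differences and sites have already been obtained (the codon-by-codon counting step of the Nei-Gojobori method is not reproduced here) and applies the Jukes-Cantor correction d = -(3/4) ln(1 - 4p/3) to each proportion. The function names and the example counts are hypothetical.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for a proportion p of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def dn_ds(syn_diffs, syn_sites, nonsyn_diffs, nonsyn_sites):
    """Return dN/dS from difference and site counts, or None if dS is zero."""
    d_s = jukes_cantor(syn_diffs / syn_sites)
    d_n = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    if d_s == 0:
        return None  # ratio undefined without synonymous change
    return d_n / d_s


def classify(ratio, tol=1e-9):
    """Map a dN/dS ratio to the selection regime described in the text."""
    if ratio is None:
        return "undefined"
    if abs(ratio - 1) <= tol:
        return "neutral"
    return "positive" if ratio > 1 else "purifying"


# Hypothetical counts: 10 synonymous differences over 100 synonymous sites,
# 5 nonsynonymous differences over 300 nonsynonymous sites.
ratio = dn_ds(10, 100, 5, 300)
print(f"dN/dS = {ratio:.3f} ({classify(ratio)} selection)")
```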

Neutrality test

DnaSP version 6.12.03 [42] was used to calculate Tajima’s D [45], Fu and Li’s D* and F* [46], the number of haplotypes (H), haplotype diversity (Hd) and nucleotide diversity (π). Tajima’s D measures the departure from neutrality for all mutations in a genomic region and is based on the difference between the number of segregating sites and the average number of nucleotide differences. Fu and Li’s D* test is based on the difference between the total number of mutations and the number of singletons (mutations appearing only once among the sequences). Fu and Li’s F* test is based on the difference between the number of singletons and the average number of nucleotide differences between every pair of sequences. Haplotype diversity reflects the number and frequency of haplotypes in the population, while nucleotide diversity estimates the average pairwise difference among sequences.
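As a worked illustration of the first of these statistics, the sketch below computes Tajima’s D from the sample size n, the number of segregating sites S and the average number of pairwise differences k, using the standard constants from Tajima’s 1989 formulation. It is an illustrative sketch rather than the DnaSP code, and the function name and example values are assumptions.

```python
import math


def tajimas_d(n, seg_sites, mean_pairwise_diffs):
    """Tajima's D for n sequences, S segregating sites and k mean pairwise differences."""
    if n < 4 or seg_sites == 0:
        return None  # statistic not meaningful for tiny samples or no variation
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n ** 2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (mean_pairwise_diffs - seg_sites / a1) / math.sqrt(variance)


# Hypothetical sample: 4 sequences, 2 segregating sites (both singletons),
# giving an average of 1.0 pairwise differences.
d_value = tajimas_d(4, 2, 1.0)
print(f"Tajima's D = {d_value:.3f}")
```

A negative value, as in this toy case where every variant is a singleton, indicates an excess of rare variants relative to the neutral expectation; a positive value indicates an excess of intermediate-frequency variants.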

Recombination analysis

The recombination analysis was performed using Recombination Detection Program (RDP) version 4 Beta14 [47]. Intergenomic (intracomponent) recombination was determined using MAFFT-aligned sequences of each component separately, while intragenomic (intercomponent) recombination was determined using MAFFT-aligned sequences of all the components together. The analysis used the maximum χ2 (MaxChi) [48], bootscan [49], GENECONV [50], LARD [51], Distance Plot (SimPlot) [52], sister scanning (SiSscan) [53], TOPAL [54], chimaera [55] and reticulate (compatibility matrix) [56] methods implemented in RDP with the default values (P value = 0.05, Multiple Comparison Correction = Bonferroni Correction, Number of permutations = 0), except that the sequence type was set to circular. The contribution of recombinants to the genetic diversity of their respective populations was determined as follows. For each recombination event identified by RDP, the recombined genomic region was selected in the alignment of the respective component, and the recombinant population from the respective geographic location was selected in the ML tree. Only the recombinant regions were then aligned using MAFFT, and the genetic diversity range, π (Pi) and Watterson estimator θw (Theta-w) values were calculated as described earlier for the respective population with and without the recombinant sequences.
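The with-and-without comparison described above can be sketched as a fold change in π. The example below is a simplified stand-in for the DnaSP calculation, assuming a gap-free toy alignment of one recombinant region; the function names, the sequences and the choice of which sequence is the recombinant are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Pi: average number of pairwise differences per site (gap-free alignment assumed)."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(a != b for s1, s2 in pairs for a, b in zip(s1, s2))
    return diffs / len(pairs) / len(seqs[0])


def fold_increase(non_recombinants, recombinants):
    """Ratio of pi with recombinants included to pi without them, or None if the baseline is 0."""
    baseline = nucleotide_diversity(non_recombinants)
    if baseline == 0:
        return None  # fold change undefined when the baseline shows no diversity
    combined = nucleotide_diversity(non_recombinants + recombinants)
    return combined / baseline


# Hypothetical recombinant region: three non-recombinant sequences and one recombinant.
non_recombinants = ["ATGCATGCAT", "ATGCATGCAT", "ATGCTTGCAT"]
recombinants = ["TTGAATGGAT"]

fold = fold_increase(non_recombinants, recombinants)
print(f"pi without recombinants: {nucleotide_diversity(non_recombinants):.4f}")
print(f"pi with recombinants:    {nucleotide_diversity(non_recombinants + recombinants):.4f}")
print(f"fold increase:           {fold:.2f}")
```

The same ratio could be formed for θw or for the identity range; π is used here only because it is the simplest of the three to compute by hand.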

Results

Genetic diversity of BBTV population

In total, fifty-seven genomic components, including five complete genomes of BBTV, were sequenced from Pakistan between 2006 and 2018 (accession numbers given in Table 2). Genetic diversity analysis of these components, together with other reported isolates of BBTV from Pakistan [16, 25, 37], showed that DNA-R is the most conserved component, followed by DNA-N, whereas DNA-U3 is the most diverse, followed by DNA-C. Similarly, for the coding regions, DNA-R is the most conserved, followed by DNA-M, while DNA-C is the most diverse, followed by DNA-S (Table 3). However, when the functional regions of the BBTV genome, CR-M and CR-SL, are compared with the full-length sequences and coding regions, CR-M is generally highly conserved across BBTV components, while CR-SL is highly diverse in DNA-U3. In contrast to the pattern for full-length components and coding regions, the CR-M of DNA-R is the most diverse, followed by that of DNA-C, while DNA-U3 has the most conserved CR-M region. The sequence diversity in the CR-SL region of the Pakistani BBTV population is notable: DNA-R, -S, -M and -C have identical CR-SL regions with no divergence, whereas DNA-U3 and DNA-N have diverse CR-SL regions, with lower diversity in DNA-N than in DNA-U3 (Table 3).
Table 2

Description of Banana bunchy top virus isolates sequenced from Pakistan for this study.

Isolate abbreviation | Region | Country | Subgroup | DNA-R | DNA-U3 | DNA-S | DNA-M | DNA-C | DNA-N | Isolation year | Reference
P.TJ1 | Tandojam | Pakistan | PIO | - | - | EF593169 | EU095948 | EF520722 | EF529519 | 2005 | This study
P.BS1 | Bhitshah | Pakistan | PIO | FJ859727 | - | FJ859740 | JX467685 | - | - | 2007 | This study
P.BS2 | Bhitshah | Pakistan | PIO | FJ859728 | - | FJ859741 | - | - | - | 2007 | This study
P.GH1 | Ghotki | Pakistan | PIO | FJ859722 | - | FJ859735 | - | - | - | 2007 | This study
P.HD1 | Hyderabad | Pakistan | PIO | FJ859733 | - | FJ859746 | JX467686 | - | - | 2007 | This study
P.HD2 | Hyderabad | Pakistan | PIO | FJ859734 | - | FJ859747 | - | - | - | 2007 | This study
P.JS1 | Jamshoro | Pakistan | PIO | FJ859732 | - | FJ859745 | - | - | - | 2007 | This study
P.KP1 | Khairpur | Pakistan | PIO | FJ859723 | - | FJ859736 | - | - | - | 2007 | This study
P.KP2 | Khairpur | Pakistan | PIO | FJ859724 | - | FJ859737 | - | - | - | 2007 | This study
P.MT1 | Matiari | Pakistan | PIO | FJ859729 | - | FJ859742 | - | - | - | 2007 | This study
P.MT2 | Matiari | Pakistan | PIO | FJ859730 | - | FJ859743 | - | - | - | 2007 | This study
P.NS1 | Nawabshah | Pakistan | PIO | FJ859731 | - | FJ859744 | JX467687 | - | - | 2007 | This study
P.TA1 | Tandoadam | Pakistan | PIO | FJ859725 | - | FJ859738 | - | - | - | 2007 | This study
P.TA2 | Tandoadam | Pakistan | PIO | FJ859726 | - | FJ859739 | - | - | - | 2007 | This study
P.TJ3 | Tandojam | Pakistan | PIO | JX170762 | JX170764 | JX170763 | JX170760 | JX170759 | JX170761 | 2011 | This study
P.NARC | Tandojam | Pakistan | PIO | MK140625 | MK140628 | MK140619 | MK140616 | MK140613 | MK140622 | 2017 | This study
P.Sakrand | Tandojam | Pakistan | PIO | MK140627 | MK140630 | MK140621 | MK140618 | MK140615 | MK140624 | 2018 | This study
P.TJ4 | Tandojam | Pakistan | PIO | MK140626 | MK140629 | MK140620 | MK140617 | MK140614 | MK140623 | 2018 | This study

Note: *For P.TJ1, the DNA-R (accession # AY996562) was previously reported by Hyder et al. (2007) and the DNA-U3 (accession # GQ214699) by Hyder et al. (2011); with the remaining components reported here, its genome is completely reported in this study. Similarly, the DNA-U3 of P.GH1, P.HD1 and P.JS1 (accession numbers FJ859748, FJ859750 and FJ859749, respectively) was reported earlier by Hyder et al. (2011), and the remaining components are reported in this study. The isolates sampled in 2006 and 2007 were part of a Ph.D. thesis (Hyder, 2009) and have not been reported elsewhere. In total, 57 components belonging to 18 isolates from different geographic locations were sequenced for this study.

Table 3

Genetic diversity of Banana bunchy top virus population in Pakistan.

Component | Population | Full-length: percent identity range | Full-length: π | Full-length: θw | Coding region: percent identity range | Coding region: π | Coding region: θw | CR-M: percent identity range | CR-M: π | CR-M: θw | CR-SL: percent identity range | CR-SL: π | CR-SL: θw
DNA-R | Total | 99.1–100 | 0.26 ± 0.04 | 0.48 ± 0.18 | 99.5–100 | 0.13 ± 0.03 | 0.31 ± 0.13 | 95.8–100 | 2.41 ± 0.24 | 2.40 ± 1.15 | 100 | 0.00 | 0.00
DNA-U3 | Total | 98.4–99.9 | 0.69 ± 0.15 | 0.90 ± 0.40 | 98.3–100 | 0.45 ± 0.15 | 0.62 ± 0.38 | 99.0–100 | 0.24 ± 0.18 | 0.40 ± 0.40 | 90.2–100 | 3.18 ± 1.37 | 3.24 ± 1.40
DNA-S | Total | 99.1–100 | 0.48 ± 0.10 | 0.99 ± 0.36 | 97.7–100 | 0.51 ± 0.17 | 0.88 ± 0.35 | 97.0–100 | 0.40 ± 0.21 | 1.23 ± 0.70 | 100 | 0.00 | 0.00
DNA-M | Total | 98.5–100 | 0.33 ± 0.14 | 0.49 ± 0.23 | 99.4–100 | 0.18 ± 0.07 | 0.31 ± 0.20 | 99.0–100 | 0.63 ± 0.10 | 0.42 ± 0.42 | 100 | 0.00 | 0.00
DNA-C | Total | 98.6–99.8 | 0.67 ± 0.15 | 0.80 ± 0.39 | 97.4–100 | 0.58 ± 0.13 | 0.67 ± 0.36 | 95.0–100 | 1.46 ± 0.50 | 1.79 ± 1.13 | 100 | 0.00 | 0.00
DNA-N | Total | 98.4–99.8 | 0.31 ± 0.06 | 0.36 ± 0.20 | 97.4–100 | 0.25 ± 0.11 | 0.28 ± 0.19 | 99.0–100 | 0.36 ± 0.23 | 0.48 ± 0.48 | 98.8–100 | 0.55 ± 0.35 | 0.73 ± 0.73

Note: The percent identity range gives the minimum and maximum percent nucleotide identity values obtained from pair-wise comparison of all isolates in the population. π (Pi) denotes the pair-wise average nucleotide diversity per 100 sites with its standard deviation, while θw (Theta-w) denotes the Watterson estimator of the population mutation rate per 100 sites with its standard deviation. The population size n for each component is as follows: DNA-R [n(total) = 24], DNA-U3 [n(total) = 9], DNA-S [n(total) = 22], DNA-M [n(total) = 9], DNA-C [n(total) = 7] and DNA-N [n(total) = 6], where ‘total’ means all the sequences of isolates for the particular genomic component.

The BBTV population in Pakistan is considered to have a monophyletic origin and is thought to have originated from a single introduction of BBTV into the country [25, 37] that took place before 1988, when BBTV was first observed in Sindh [2]. The current diversity analysis (Table 3) suggests that BBTV components in Pakistan are not evolving at a similar rate: some components are quite conserved (i.e. DNA-R and DNA-N), while others are more diversified (i.e. DNA-U3). Interestingly, the divergence within a component is not uniform. The most conserved component, DNA-R, has the most diverse CR-M, while the CR-SL of DNA-U3 is the most diverse functional region in the entire BBTV genome. Following these insights into the genetic diversity of the BBTV population in Pakistan, the genetic diversity of BBTV populations around the world was calculated in a similar way. The full-length sequences (1425 full-length components) from various BBTV isolates were obtained from GenBank using the Taxonomy Browser (S1 Table), and their diversity was calculated for the entire world population and for the two major groups of the BBTV population, i.e. the South Pacific and Asian groups [26], now referred to as the Pacific Indian Ocean (PIO) and Southeast Asian (SEA) groups, respectively [57]. The analysis (Table 4) revealed that the PIO group is more conserved than the SEA group, a finding consistent with previous studies [7, 9, 26, 33, 58]. Among the components, DNA-R is the most conserved and DNA-U3 the most diverse (Table 4), and the coding regions follow the same trend. However, in contrast to the diversity analysis of the Pakistani population, DNA-U3 has the most diverse CR-M in the entire BBTV population, while DNA-C has the most conserved CR-M region. DNA-N has the most conserved CR-SL region in the entire population, while, as in the Pakistani population, DNA-U3 has the most diverse CR-SL region. These analyses suggest that the different components of the BBTV genome and their functional parts harbor different degrees of genetic diversity, and that this diversity is related to broad geographic regions such as PIO and SEA or to a country such as Pakistan.
Table 4

Genetic diversity of Banana bunchy top virus population in the world.

ComponentsPopulationFull-lengthCoding regionCR-MCR-SL
Percent Identity RangeπθwPercent Identity RangeπθwPercent Identity RangeπθwPercent Identity Rangeπθw
DNA-R Total86.6–1004.46± 0.176.53±1.3190.2–1003.95±0.145.87 ± 1.1959.3–1009.40± 0.4011.49±2.8470.8–1001.21± 0.157.54± 2.01
PIO92–1002.18± 0.765.28±1.1194.6–1002.00±0.064.70±1.0074.6–1004.65± 0.2710.84±2.7782.2–1001.39± 0.165.89±1.66
SEA89.8–1002.81± 0.304.50±1.1691.9–1002.65±0.294.00+1.0481.0–1004.60± 0.965.52±3.0471.4–1000.92± 0.425.39±1.88
DNA-U3 Total66.4–99.98.45±0.4611.11±2.4167.9–1007.83±0.4911.58 ± 2.643.0–10013.38±0.8114.59±3.6638.2–1008.41± 0.4712.08±3.60
PIO70.9–1005.99±0.339.53± 2.1475.3–1004.97± 0.4110.78± 2.5547.3–1008.70± 0.6613.60±3.5754.1–1007.58± 0.326.57± 2.16
SEA67.6–99.76.41± 0.828.38± 2.7475.8–1006.19 ± 0.798.14± 2.5266.3–1007.57±1.6611.20±3.6631.4–10010.66± 2.3514.42±5.19
DNA-S Total79.9–99.65.41± 0.268.37 ±1.7386.9–99.64.01 ± 0.156.79 ± 1.4460.4–10013.37 ± 0.8510.86 ± 2.7875.6–1002.22 ± 0.217.08 ± 1.99
PIO87.2–1002.59 ± 0.167.41 ± 1.5987.5–1002.44 ± 0.116.28 ± 1.3979.2–1003.05± 0.2910.82 ± 2.7181.7–1002.25 ± 0.216.50 ± 1.92
SEA84.7–1004.58 ± 0.646.24 ± 1.6891.3–1002.82 ± 0.323.81 ± 1.0761.4–1006.17± 2.139.80 ± 3.1376.8–1002.10 ± 0.673.90 ± 1.52
DNA-M Total72.2–1008.20± 0.419.04± 1.9680.7–1006.73± 0.259.05± 2.0263.0–10010.83± 0.918.67± 2.2472.4–1002.26± 0.395.45±1.70
PIO85.2–1004.34± 0.107.30± 1.6582.6–1004.55± 0.147.99±1.8685.9–1002.4 ± 0.156.13± 1.7271.1–1001.58± 0.214.66±1.48
SEA79.4–1005.44± 0.797.87± 2.2789.5–1004.04± 0.455.39± 1.6462.4–1001.72± 0.322.41± 1.0379.3–1000.85± 0.404.46±1.13
DNA-C Total83.6–1004.80± 0.387.43 ± 1.6383–1004.71 ± 0.367.91 ± 1.7767.3–1007.94 ± 0.837.86 ± 2.1380.4–1002.87 ± 0.365.07 ± 1.64
PIO92.4–1001.58± 0.085.69 ± 1.3090.1–1001.66 ± 0.126.44 ± 1.5092.1–1000.64 ± 0.104.22 ± 1.3379.3–1002.41 ± 0.234.29 ± 1.44
SEA88–1004.04± 0.605.51 ± 1.6388.9–1003.46 ± 0.605.09 ± 1.5547.5–1003.58 ± 0.443.83 ± 1.4872–1006.35 ± 1.277.71 ± 2.81
DNA-N Total76.2–99.47.25 ± 0.397.26 ±1.5887.3–99.84.72 ± 0.306.04 ± 1.3866.3–1008.88± 0.8910.37 ± 2.6895.1–1001.11 ± 0.104.11 ± 1.38
PIO76.6–1003.83 ± 0.216.66 ± 1.5278.6–1002.10 ± 0.165.56 ± 1.3394–1002.65 ± 0.679.22 ± 2.4890.2–1000.92 ± 0.092.76 ± 1.08
SEA84.6–1005.86 ± 0.865.90 ± 1.6989.7–1004.44± 0.544.20± 1.2563.4–1004.31 ± 0.513.32± 1.3396.3–1001.41 ± 0.262.32 ± 1.12

Note: The percent identity range gives the minimum and maximum percent nucleotide identity values obtained from pair-wise comparison of isolates in a given population. π (Pi) denotes the pair-wise average nucleotide diversity per 100 sites with its standard deviation, while θw (Theta-w) denotes the Watterson estimator of the population mutation rate per 100 sites with its standard deviation. The population size n for each component and subgroup is as follows: DNA-R [n(total) = 342 {n(PIO) = 263, n(SEA) = 79}], DNA-U3 [n(total) = 200 {n(PIO) = 162, n(SEA) = 38}], DNA-S [n(total) = 274 {n(PIO) = 214, n(SEA) = 60}], DNA-M [n(total) = 205 {n(PIO) = 164, n(SEA) = 41}], DNA-C [n(total) = 194 {n(PIO) = 156, n(SEA) = 38}] and DNA-N [n(total) = 198 {n(PIO) = 154, n(SEA) = 44}], where ‘total’ means all the sequences of isolates for the particular genomic component.


Phylogenetic analysis

The phylogenetic relationships of BBTV isolates from different banana-growing regions were analyzed to determine how the isolates from Pakistan relate to those from the rest of the world. For this purpose, an ML phylogenetic tree based on concatenated genomes was generated. The tree includes a total of 132 BBTV genomes, of which 107 were from the PIO group and 25 from the SEA group. The phylogenetic analysis showed close homology among the Pakistani isolates, which form a single clade within PIO that is closely related to isolates from Congo, Sri Lanka, Rwanda, Malawi, Burundi and India, with a bootstrap support of 78% (Fig 1). In addition, multiple sequence alignments were generated with MAFFT for each component separately: 345 full-length DNA-R sequences (266 from PIO and 79 from SEA), 208 full-length DNA-U3 sequences (166 from PIO and 42 from SEA), 274 full-length DNA-S sequences (214 from PIO and 60 from SEA), 206 full-length DNA-M sequences (164 from PIO and 42 from SEA), 194 full-length DNA-C sequences (156 from PIO and 38 from SEA) and 198 full-length DNA-N sequences (154 from PIO and 44 from SEA). Phylogenetic relationships were inferred using the Maximum Likelihood (ML) method with 1000 bootstrap replicates in MEGA X, and Newick format files were generated and annotated in FigTree version 1.4.3 [44] (Fig 2, S1–S5 Figs). This analysis revealed a clear partitioning of BBTV isolates into two major clusters (the PIO and SEA groups) [57]. Interestingly, although the Pakistani BBTV population belongs phylogenetically to PIO for all genomic components, its genetic diversity pattern (Table 3) is quite distinct from that of the PIO group (Table 4). Notably, the most conserved component in the Pakistani population is DNA-R, while in PIO it is DNA-C, and the most conserved functional region in the Pakistani population is CR-SL (identical in DNA-R, -S, -M and -C), while in PIO it is the CR-M of DNA-C.
Fig 1

Maximum Likelihood phylogenetic analysis of Banana bunchy top virus entire genomes, showing the relationship of newly characterized BBTV genomes from Pakistan.

Maximum Likelihood phylogenetic analysis based on the concatenated nucleotide sequences of the BBTV genomic components (R, U3, S, M, C and N together) shows the evolutionary relationships of the Pakistani isolates with isolates from other parts of the world. The analysis involved 132 complete genome sequences. Each sequence is labelled with letters representing its geographic origin: Aus, Australia; B, Burundi; C, China; Co, Congo; E, Egypt; I, India; M, Malawi; U, USA; S, Samoa; RW, Rwanda; Si, Sri Lanka; P, Pakistan; Ph, Philippines; T, Taiwan; To, Tonga; In, Indonesia; TH, Thailand. Pakistani isolates are shown in bold. The analysis showed two geographic groups of BBTV, i.e., Pacific Indian Ocean (PIO) and Southeast Asia (SEA). Within the PIO group, the Pakistani isolates showed homology with isolates from India, Samoa, Tonga, Egypt, Australia and the USA. The SEA group includes isolates from China, Taiwan, the Philippines, Thailand and Indonesia, and a reassorted isolate from India.

Fig 2

Phylogenetic analysis of BBTV DNA-U3 component illustrating intra and intergenomic recombination events at different nodes.

Phylogenetic analysis of DNA-U3 based on nucleotide sequences of BBTV isolates from Pakistan along with previously reported PIO and SEA isolates of BBTV. The evolutionary history was inferred using the Maximum Likelihood (ML) method. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) is shown next to the branches. Each sequence is labelled with the GenBank accession number followed by the origin and isolate name. The phylogenetic trees were further annotated in FigTree software v1.4.3 (Rambaut, 2010). Light green colour in the circle represents the SEA population, while yellow colour represents the PIO population. Similar analyses of DNA-R, -S, -M, -C and -N were performed and are provided as supplementary figures (S1–S5 Figs). Intergenomic recombination events are marked with a pink arc, while intragenomic recombination events are marked with a blue arc. Isolates involved in both types of event are coloured brown. E represents the event number, and a, b, c, d etc. denote different populations within one event detected in RDP. An asterisk (*) indicates that only one isolate was involved in recombination at the given node. Arrows on both sides indicate that all the isolates were involved in the respective recombination event. Blue arrows on both sides indicate that all the isolates of the PIO group were involved in the respective recombination event. The recombination events described in S2 and S3 Tables are marked. Because of recombination, different isolates show intra- and intergenomic events at different nodes.

Population genetics and selection pressure

To understand the evolutionary selection pressure on the various BBTV genomic components, the rates of nonsynonymous (dN) and synonymous (dS) substitutions per site were calculated for the coding region of each component (Table 5). For all the components (DNA-R, -S, -M, -C and -N, and DNA-U3 in many geographic regions), the dN values for the coding region (0.001–0.062) were smaller than the dS values (0.002–0.352) and the dN/dS ratios (0.031–0.612) were <1, indicating negative (purifying) selection. The exception was DNA-U3, for which the dN/dS ratio differed between geographic populations (Table 5). In the Pakistani, Taiwanese, Egyptian and Tongan populations, DNA-U3 showed negative (purifying) selection (dN/dS <1), whereas the Australian, Congolese and USA populations showed positive (diversifying) selection (dN/dS >1) and the populations from India and Samoa showed neutral selection pressure (dN/dS ≈ 1). The selection pressure data indicate that the coding regions of all components of the BBTV genome except DNA-U3 are under purifying selection in the various geographic populations. Purifying selection is known to preserve biological function and to help remove deleterious mutations [59, 60], indicating that the proteins encoded by the BBTV genomic components are essential and that deleterious mutations occurring in their genes are eliminated from these geographic populations. DNA-U3, however, is an interesting exception to this general observation for the BBTV genome. In Congo it is under strong positive selection, with a dN/dS value of 3.828, whereas it is under moderate positive selection in the USA and Australia (dN/dS values of 1.773 and 1.496, respectively) and under neutral selection pressure in India and Samoa (dN/dS values of 1.070 and 1.038, respectively) (Table 5).
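The interpretation of the dN/dS ratio described above can be sketched in a few lines of Python. The dN and dS values are the DNA-U3 entries listed in Table 5; the function name and the tolerance of 0.1 around a ratio of 1 used to call a population "neutral" are assumptions chosen only so that the example reproduces the classification given in the text, not a threshold stated by the authors.

```python
def classify_selection(dn, ds, tolerance=0.1):
    """Return (dN/dS, label); the tolerance around 1 is an illustrative assumption."""
    if ds == 0:
        return None, "undefined"  # the ratio cannot be computed when dS is zero
    ratio = dn / ds
    if ratio > 1 + tolerance:
        label = "positive"
    elif ratio < 1 - tolerance:
        label = "purifying"
    else:
        label = "neutral"
    return ratio, label

# (dN, dS) for the DNA-U3 coding region, as listed in Table 5
dna_u3 = {
    "Congo": (0.02148, 0.00561),
    "USA": (0.01460, 0.00823),
    "India": (0.05893, 0.05504),
    "Pakistan": (0.00348, 0.00447),
}
results = {pop: classify_selection(dn, ds) for pop, (dn, ds) in dna_u3.items()}
for pop, (ratio, label) in results.items():
    print(f"{pop}: dN/dS = {ratio:.3f} ({label})")
```

Note that these point estimates carry sizeable standard errors for small populations (for example n = 2 for the USA), so a ratio close to 1 should be read with caution.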
Table 5

Component wise population genetic parameters of Banana bunchy top virus.

Component | Population | dN | dS | dN/dS | π (CDS) | π (Full length)
DNA-R | Australia (n = 37) | 0.00207 (0.00089) a | 0.00884 (0.00398) a | 0.234163 | 0.35 | 0.46
DNA-R | Burundi (n = 3) | 0.00398 (0.00194) | 0.00724 (0.00497) | 0.549724 | 0.46 | 0.72
DNA-R | China (n = 9) | 0.00690 (0.00203) | 0.11547 (0.01807) | 0.059756 | 2.80 | 3.00
DNA-R | Congo (n = 69) | 0.00142 (0.00052) | 0.01960 (0.00479) | 0.072449 | 0.52 | 0.87
DNA-R | Egypt (n = 5) | 0.01424 (0.00312) | 0.15724 (0.02306) | 0.090562 | 3.93 | 4.85
DNA-R | India (n = 59) | 0.01092 (0.00191) | 0.09927 (0.01252) | 0.110003 | 2.75 | 2.95
DNA-R | Indonesia (n = 17) | 0.00195 (0.00084) | 0.01734 (0.00472) | 0.112457 | 0.51 | 0.62
DNA-R | Japan (n = 7) | 0.00171 (0.00099) | 0.01814 (0.00583) | 0.094267 | 0.52 | 0.43
DNA-R | Pakistan (n = 24) | 0.00062 (0.00026) | 0.00393 (0.00227) | 0.157761 | 0.13 | 0.26
DNA-R | Philippine (n = 16) | 0.00145 (0.00078) | 0.01453 (0.00456) | 0.099794 | 0.42 | 0.60
DNA-R | Rwanda (n = 2) | 0.00149 (0.00145) | 0.00540 (0.00528) | 0.275926 | 0.23 | 0.54
DNA-R | Samoa (n = 2) | 0.00149 (0.00144) | 0.04433 (0.01509) | 0.033612 | 1.04 | 1.17
DNA-R | Sri Lanka (n = 3) | 0.00798 (0.00301) | 0.04654 (0.01313) | 0.171465 | 1.58 | 1.53
DNA-R | Myanmar (n = 3) | 0.00099 (0.00099) | 0.02191 (0.00841) | 0.045185 | 0.54 | 0.84
DNA-R | Taiwan (n = 13) | 0.00869 (0.00185) | 0.08537 (0.01217) | 0.101792 | 2.25 | 3.02
DNA-R | Tonga (n = 55) | 0.00493 (0.00110) | 0.02075 (0.00421) | 0.23759 | 0.81 | 1.05
DNA-R | Vietnam (n = 9) | 0.01312 (0.00271) | 0.16801 (0.02192) | 0.078091 | 4.18 | 4.33
DNA-R | Total (n = 333) | 0.01252 (0.00229) | 0.16648 (0.01825) | 0.075204 | 3.98 | 4.45
DNA-U3 | Australia (n = 29) | 0.02309 (0.01033) | 0.01543 (0.00691) | 1.496436 | 2.07 | 1.45
DNA-U3 | Congo (n = 21) | 0.02148 (0.00797) | 0.00561 (0.00398) | 3.828877 | 1.69 | 2.10
DNA-U3 | Egypt (n = 4) | 0.02045 (0.00863) | 0.02876 (0.01665) | 0.711057 | 2.20 | 2.97
DNA-U3 | India (n = 9) | 0.05893 (0.01346) | 0.05504 (0.01372) | 1.07067 | 5.38 | 8.60
DNA-U3 | Pakistan (n = 7) | 0.00348 (0.00242) | 0.00447 (0.00454) | 0.778523 | 0.36 | 0.56
DNA-U3 | Samoa (n = 2) | 0.01746 (0.00980) | 0.01681 (0.01691) | 1.038667 | 1.70 | 1.79
DNA-U3 | Taiwan (n = 2) | 0.06266 (0.02339) | 0.09895 (0.04278) | 0.633249 | 6.83 | 3.97
DNA-U3 | Tonga (n = 37) | 0.02590 (0.00961) | 0.03434 (0.01250) | 0.754222 | 2.74 | 3.78
DNA-U3 | USA (n = 2) | 0.01460 (0.01063) | 0.00823 (0.00817) | 1.773998 | 1.28 | 0.66
DNA-U3 | Total (n = 121) | 0.03358 (0.01133) | 0.04488 (0.01309) | 0.748217 | 3.86 | 5.41
DNA-S | Australia (n = 34) | 0.00766 (0.00193) | 0.03630 (0.00955) | 0.211019 | 1.37 | 1.50
DNA-S | Burundi (n = 5) | 0.00697 (0.00283) | 0.02353 (0.00990) | 0.296218 | 1.06 | 1.17
DNA-S | China (n = 8) | 0.00892 (0.00328) | 0.10067 (0.02055) | 0.088606 | 2.76 | 3.64
DNA-S | Congo (n = 25) | 0.00498 (0.00165) | 0.03648 (0.00979) | 0.136513 | 1.18 | 1.22
DNA-S | Gabon (n = 2) | 0.00998 (0.00492) | 0.03371 (0.01644) | 0.296055 | 1.51 | 1.58
DNA-S | India (n = 45) | 0.00648 (0.00158) | 0.09444 (0.01666) | 0.068615 | 2.47 | 3.56
DNA-S | Indonesia (n = 7) | 0.00331 (0.00168) | 0.03558 (0.01175) | 0.09303 | 1.04 | 1.0
DNA-S | Japan (n = 3) | 0.00166 (0.00160) | 0.01663 (0.00929) | 0.09982 | 0.50 | 0.63
DNA-S | Malawi (n = 2) | 0.00497 (0.00341) | 0.04244 (0.01873) | 0.117107 | 1.32 | 1.02
DNA-S | Pakistan (n = 22) | 0.00259 (0.00098) | 0.01434 (0.00556) | 0.180614 | 0.51 | 0.48
DNA-S | Philippine (n = 17) | 0.00450 (0.00186) | 0.02594 (0.00853) | 0.173477 | 0.92 | 0.62
DNA-S | Samoa (n = 2) | 0.00248 (0.00246) | 0.07822 (0.02541) | 0.031705 | 1.89 | 1.48
DNA-S | Sri Lanka (n = 3) | 0.00497 (0.00280) | 0.03958 (0.01471) | 0.125568 | 1.26 | 1.42
DNA-S | Taiwan (n = 15) | 0.01076 (0.00299) | 0.09035 (0.01612) | 0.119092 | 2.60 | 4.19
DNA-S | Tonga (n = 58) | 0.00460 (0.00140) | 0.02736 (0.00876) | 0.168129 | 0.96 | 1.18
DNA-S | USA (n = 2) | 0.00248 (0.00242) | 0.01663 (0.01162) | 0.149128 | 0.56 | 0.27
DNA-S | Vietnam (n = 2) | 0 | 0.00824 (0.00809) | 0 | 0.18 | 1.20
DNA-S | Total (n = 252) | 0.01203 (0.00322) | 0.15249 1 (0.02345) | 0.080661 | 3.96 | 5.46
DNA-M | Australia (n = 38) | 0.01772 (0.00361) | 0.06685 (0.01724) | 0.265071 | 2.79 | 2.17
DNA-M | Burundi (n = 4) | 0.00375 (0.00260) | 0.02653 (0.01359) | 0.141349 | 0.89 | 0.89
DNA-M | China (n = 9) | 0.03767 (0.00693) | 0.08435 (0.02027) | 0.446592 | 4.60 | 5.33
DNA-M | Congo (n = 22) | 0.01404 (0.00350) | 0.03646 (0.01202) | 0.38508 | 1.87 | 2.01
DNA-M | Egypt (n = 2) | 0.01505 (0.00739) | 0.06313 (0.02917) | 0.238397 | 2.82 | 3.15
DNA-M | India (n = 13) | 0.03377 (0.00547) | 0.07776 (0.01687) | 0.434285 | 4.12 | 3.63
DNA-M | Indonesia (n = 3) | 0.01527 (0.00611) | 0.05001 (0.02125) | 0.305339 | 2.27 | 1.21
DNA-M | Pakistan (n = 9) | 0.00166 (0.00117) | 0.00271 (0.00264) | 0.612546 | 0.18 | 0.33
DNA-M | Philippine (n = 12) | 0.00775 (0.00318) | 0.04778 (0.01594) | 0.162202 | 1.66 | 1.06
DNA-M | Samoa (n = 2) | 0.00373 (0.00369) | 0.02464 (0.01779) | 0.15138 | 0.84 | 0.66
DNA-M | Sri Lanka (n = 3) | 0.01001 (0.00489) | 0.03304 (0.01668) | 0.302966 | 1.50 | 1.69
DNA-M | Taiwan (n = 9) | 0.01701 (0.00481) | 0.03682 (0.01427) | 0.461977 | 2.10 | 2.37
DNA-M | Thailand (n = 2) | 0.00378 (0.00372) | 0.01174 (0.01144) | 0.321976 | 0.56 | 1.24
DNA-M | Tonga (n = 64) | 0.02751 (0.00566) | 0.07546 (0.01861) | 0.364564 | 3.67 | 3.26
DNA-M | USA (n = 2) | 0.01127 (0.00643) | 0.02454 (0.01773) | 0.45925 | 1.41 | 0.66
DNA-M | Total (n = 194) | 0.04629 (0.00790) | 0.13683 (0.02857) | 0.33830 | 8.17 | 8.21
DNA-C | Australia (n = 39) | 0.00483 (0.00183) | 0.02081 (0.00824) | 0.2321 | 0.81 | 0.92
DNA-C | Burundi (n = 4) | 0.01294 (0.00422) | 0.01752 (0.01004) | 0.738584 | 1.37 | 1.09
DNA-C | China (n = 11) | 0.03415 (0.00627) | 0.16481 (0.02479) | 0.207208 | 5.58 | 5.59
DNA-C | Congo (n = 22) | 0.00524 (0.00185) | 0.02390 (0.00817) | 0.219247 | 0.91 | 1.34
DNA-C | Egypt (n = 2) | 0.07069 (0.01319) | 0.35203 (0.06966) | 0.200807 | 11.31 | 13.25
DNA-C | India (n = 12) | 0.04324 (0.00539) | 0.10264 (0.01637) | 0.421278 | 5.14 | 4.62
DNA-C | Indonesia (n = 3) | 0.00708 (0.00349) | 0.02604 (0.01270) | 0.271889 | 1.09 | 0.92
DNA-C | Pakistan (n = 5) | 0.00106 (0.00103) | 0.01548 (0.00751) | 0.068475 | 0.41 | 0.47
DNA-C | Philippine (n = 12) | 0.00463 (0.00218) | 0.02141 (0.00908) | 0.216254 | 0.81 | 0.81
DNA-C | Sri Lanka (n = 3) | 0.00711 (0.00346) | 0.07307 (0.02109) | 0.097304 | 2.05 | 2.43
DNA-C | Taiwan (n = 9) | 0.02073 (0.00383) | 0.08691 (0.01680) | 0.238523 | 3.14 | 4.04
DNA-C | USA (n = 2) | 0.00265 (0.00260) | 0.00958 (0.00934) | 0.276618 | 0.41 | 0.19
DNA-C | Tonga (n = 60) | 0.00655 (0.00194) | 0.03476 (0.00955) | 0.188435 | 1.23 | 1.28
DNA-C | Total (n = 184) | 0.02849 (0.00409) | 0.13653 (0.02201) | 0.208672 | 4.60 | 4.82
DNA-N | Australia (n = 35) | 0.00297 (0.00146) | 0.01753 (0.00751) | 0.169424 | 0.60 | 0.52
DNA-N | Burundi (n = 4) | 0.00279 (0.00194) | 0.00988 (0.00692) | 0.282389 | 0.43 | 0.44
DNA-N | China (n = 15) | 0.01732 (0.00508) | 0.15771 (0.02774) | 0.109822 | 4.34 | 5.54
DNA-N | Congo (n = 22) | 0.00441 (0.00149) | 0.02111 (0.00831) | 0.208906 | 0.79 | 1.25
DNA-N | India (n = 10) | 0.01684 (0.00335) | 0.08905 (0.01723) | 0.189107 | 2.90 | 4.36
DNA-N | Indonesia (n = 3) | 0.00187 (0.00186) | 0.03949 (0.01559) | 0.047354 | 1.00 | 0.73
DNA-N | Pakistan (n = 5) | 0 (0) | 0.00396 (0.00395) | 0 | 0.86 | 0.22
DNA-N | Philippine (n = 11) | 0.00143 (0.00142) | 0.02524 (0.00927) | 0.056656 | 0.66 | 0.72
DNA-N | Samoa (n = 3) | 0.03056 (0.00828) | 0.28592 (0.05765) | 0.106883 | 7.16 | 15.04
DNA-N | Sri Lanka (n = 2) | 0.01119 (0.00549) | 0.04028 (0.01965) | 0.277805 | 1.72 | 3.75
DNA-N | Taiwan (n = 11) | 0.01037 (0.00267) | 0.10100 (0.01865) | 0.102673 | 2.62 | 3.64
DNA-N | Tonga (n = 62) | 0.00406 (0.00169) | 0.02781 (0.00971) | 0.145991 | 0.90 | 1.53
DNA-N | USA (n = 2) | 0 | 0.03010 (0.01734) | 0 | 0.86 | 0.45
DNA-N | Total (n = 185) | 0.01752 (0.00374) | 0.18114 (0.02986) | 0.096721 | 4.63 | 7.09

Note: n, number of isolates; dS, number of synonymous substitutions per site; dN, number of nonsynonymous substitutions per site; a, numbers in parentheses are the standard errors. Isolates whose CDS were not given by the authors in GenBank were not included in the analysis. π denotes pair-wise average nucleotide diversity per 100 sites.

DNA-U3 encodes the U3 protein, whose function has not yet been determined; however, it is an integral [5] and essential component for infection, as demonstrated by infectivity assays for nanoviruses [61, 62]. Although DNA-U3 is an essential component, the higher dN/dS values noted for it in some countries where the rest of the components are under purifying selection indicate that some significant factors act differently on DNA-U3. Interestingly, three of the four DNA-U3 components from Burundi analyzed in this study were previously reported to have undergone recombination [31]; similarly, all of the DNA-U3 components from India and most of those from Samoa, the USA and Australia have also been reported as recombinant [31], pointing toward a possible role of recombination in the positive selection observed for DNA-U3 in some countries. The Indian subcontinent has also been noted as a major hub [31] of long-distance banana and BBTV movements: Stainton and colleagues (2015) identified it as both the major donor location (for BBTV dispersal events to other parts of the world) and the major recipient location (of virus introductions). In general, BBTV isolates from the PIO group are found throughout the natural geographical range of Musa balbisiana, whereas isolates from the SEA group are found throughout the ranges of M. balbisiana and M. acuminata [63]. However, the banana germplasm of some Indian regions is primarily composed of hybrids between M. balbisiana from the Indian subcontinent and M. acuminata from Southeast Asia [63, 64]. This might explain the neutral selection pressure seen for DNA-U3 in isolates from India.
Different selection pressures were also observed among different banana genome types [30]. Later, Chiaki and colleagues (2015) also noted that selection pressure was higher in viruses infecting banana varieties with the AAB or ABB genotypes than in those infecting varieties with the AA or AAA genotypes. The selection pressure data in the present study suggest that the BBTV genome is under negative selection pressure; however, the coding regions of DNA-U3 in some geographic regions around the world are under neutral or positive selection, which might be due to mixing of isolates from the PIO and SEA groups, to virus propagation in hosts of different genetic backgrounds, and/or to recombination. Haplotype diversity values were high for the BBTV components, ranging from 0.4 to 1.0 (Table 6). Tajima’s D values were calculated to determine the impact of demographic expansion and contraction in the different BBTV DNA component populations. Statistically significant negative values (P<0.02 or P<0.01) were obtained for the Tonga populations of DNA-R and DNA-C and for the Australian population of DNA-S. These results were further confirmed by Fu and Li’s D and F test values (Table 6), which suggests that these populations may be in an expansion phase. In contrast, no statistically significant positive or negative values were found in the remaining populations of BBTV components, suggesting that these populations may be undergoing a neutral or contraction period.
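For readers unfamiliar with the neutrality statistic used above, the following Python sketch implements the standard Tajima (1989) formula for D from the sample size, the number of segregating sites and the mean number of pairwise differences. The function name and the input values are hypothetical and chosen only for illustration; the values reported in Table 6 were produced by the authors' software, not by this code.

```python
import math

def tajimas_d(n, seg_sites, mean_pairwise_diff):
    """Tajima's D from sample size n, segregating sites S and mean pairwise differences k."""
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n ** 2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    if variance <= 0:
        raise ValueError("D is undefined for this combination of n and S")
    # D compares the two estimates of theta: k (pairwise) and S / a1 (Watterson)
    return (mean_pairwise_diff - seg_sites / a1) / math.sqrt(variance)

# Hypothetical inputs: 10 sequences, 16 segregating sites, k = 3.888889
d_value = tajimas_d(10, 16, 3.888889)
print(f"Tajima's D = {d_value:.3f}")
```

A negative D means that the pairwise estimate is smaller than the Watterson estimate, i.e. an excess of rare variants, which is why significantly negative values are commonly read as a signature of population expansion or purifying selection.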
Table 6

Neutrality test, haplotype and nucleotide diversity of each BBTV population.

Component | Geographical group | Fu & Li’s D* | Fu & Li’s F* | Tajima’s D | Nucleotide diversity (π) | Number of haplotypes (H) | Haplotype diversity (Hd)
DNA-R | Australia (n = 37) | -2.04035 | -2.09769 | -1.246 | 0.350 (3.015) a | 19 | 0.908
DNA-R | China (n = 9) | -0.00578** | 0.00812** | 0.29796* | 2.804 (24.139) | 8 | 0.972
DNA-R | Congo (n = 69) | -3.93364 | -3.84941 | -2.03161 | 0.529 (4.557) | 30 | 0.928
DNA-R | Egypt (n = 5) | -0.91326 | -1.00473 | -0.71713 | 3.937 (33.900) | 4 | 0.900
DNA-R | India (n = 59) | -0.56954 | -1.16610 | -1.22798 | 2.755 (23.720) | 48 | 0.991
DNA-R | Indonesia (n = 17) | -1.25467 | -1.56227 | -1.37207 | 0.519 (4.470) | 13 | 0.963
DNA-R | Japan (n = 7) | -1.27255 | -1.37953 | -1.19248 | 0.520 (4.476) | 7 | 1.000
DNA-R | Pakistan (n = 24) | -3.12612 | -3.22066 | -1.91443 | 0.133 (1.141) | 9 | 0.616
DNA-R | Philippine (n = 16) | -1.55326 | -1.65731 | -1.14040 | 0.424 (3.650) | 15 | 0.992
DNA-R | Taiwan (n = 13) | -2.28488 | -2.48829 | -1.57368 | 2.251 (19.384) | 11 | 0.962
DNA-R | Tonga (n = 55) | -4.99012** | -4.81334** | -2.35564** | 0.819 (7.047) | 44 | 0.989
DNA-R | Vietnam (n = 9) | -0.04727 | -0.08322 | 0.51654 | 4.184 (36.027) | 9 | 1.000
DNA-U3 | Australia (n = 29) | -0.08497 | 0.00827 | 0.20706 | 2.143 (5.013) | 28 | 0.995
DNA-U3 | Burundi (n = 4) | 0.95621 | 0.90358 | 0.95621 | 1.282 (3.000) | 4 | 1.000
DNA-U3 | China (n = 7) | -1.42725 | -1.52246 | -1.35841 | 0.824 (0.857) | 4 | 0.714
DNA-U3 | Congo (n = 21) | -1.55082 | -1.59338 | -0.77744 | 1.691 (3.957) | 20 | 0.995
DNA-U3 | India (n = 13) | -2.10432 | -2.34211 | -1.52051 | 3.901 (9.128) | 11 | 0.962
DNA-U3 | Egypt (n = 4) | -0.52807 | -0.52801 | -0.52807 | 2.208 (5.166) | 3 | 0.833
DNA-U3 | Pakistan (n = 7) | -1.42725 | -1.52246 | -1.35841 | 0.366 (0.8571) | 3 | 0.524
DNA-U3 | Philippine (n = 11) | -2.38446** | -2.55684** | -1.83180 | 7.564 (17.472) | 11 | 1.000
DNA-U3 | Tonga (n = 38) | -2.99753* | -2.89995* | -1.16514 | 2.731 (6.389) | 37 | 0.999
DNA-S | Australia (n = 34) | -4.33599** | -4.23105** | -1.978714* | 1.376 (7.2656) | 29 | 0.991
DNA-S | Burundi (n = 5) | -0.20090 | -0.21293 | -0.20090 | 1.061 (5.600) | 5 | 1.000
DNA-S | China (n = 8) | 0.24399 | 0.13415 | 0.27767 | 2.767 (14.607) | 8 | 1.000
DNA-S | Congo (n = 25) | -1.66789 | -1.79259 | -1.22107 | 1.188 (6.273) | 20 | 0.983
DNA-S | India (n = 45) | -3.21478* | -3.14296* | -1.33542 | 2.476 (13.075) | 38 | 0.989
DNA-S | Indonesia (n = 7) | 0.30430 | 0.31478 | 0.22467 | 1.046 (5.523) | 5 | 0.857
DNA-S | Pakistan (n = 22) | -2.37337 | -2.51062 | -1.51140 | 0.518 (2.735) | 9 | 0.658
DNA-S | Philippine (n = 17) | -1.17566 | -1.19675 | -0.68156 | 0.927 (4.897) | 15 | 0.978
DNA-S | Taiwan (n = 15) | 0.98697 | 0.62737 | -0.44960 | 2.608 (13.771) | 15 | 1.000
DNA-S | Tonga (n = 58) | -4.84147** | -4.47257** | -1.88300 | 0.966 (5.102) | 52 | 0.996
DNA-M | Australia (n = 38) | -0.83943 | -0.99304 | -0.77740 | 2.794 (9.890) | 35 | 0.994
DNA-M | Burundi (n = 4) | -0.31446 | -0.30226 | -0.31446 | 0.895 (3.166) | 4 | 1.000
DNA-M | China (n = 9) | -0.79528 | -0.89198 | -0.79111 | 4.606 (16.166) | 7 | 0.944
DNA-M | Congo (n = 22) | -2.82192* | -2.94040* | -1.66120 | 1.873 (6.632) | 21 | 0.996
DNA-M | India (n = 13) | -1.81015 | -2.00681 | -1.52243 | 4.121 (14.589) | 11 | 0.974
DNA-M | Pakistan (n = 9) | -1.68268 | -1.82046 | -1.51297 | 0.188 (0.666) | 4 | 0.583
DNA-M | Philippine (n = 12) | -0.59573 | -0.65666 | -0.09300 | 1.662 (5.833) | 12 | 1.000
DNA-M | Taiwan (n = 9) | 0.43134 | 0.37198 | 0.02015 | 2.105 (7.388) | 8 | 0.972
DNA-M | Tonga (n = 64) | -2.41535 | -2.28545 | -0.91800 | 3.675 (13.010) | 64 | 1.000
DNA-C | Australia (n = 39) | -2.52616* | -2.49715 | -1.29407 | 0.817 (3.973) | 33 | 0.992
DNA-C | Burundi (n = 4) | 0.18677 | 0.18886 | 0.18677 | 1.372 (6.666) | 4 | 1.000
DNA-C | China (n = 11) | -0.02852 | -0.17311 | -0.14515 | 5.585 (27.145) | 9 | 0.945
DNA-C | Congo (n = 22) | -2.31978 | -2.39685 | -1.23370 | 0.912 (4.432) | 20 | 0.987
DNA-C | India (n = 12) | -1.78246 | -2.00370 | -1.49321 | 5.144 (25.000) | 10 | 0.970
DNA-C | Pakistan (n = 5) | -1.12397 | -1.15583 | -1.12397 | 0.412 (2.000) | 4 | 0.900
DNA-C | Philippine (n = 12) | -1.04405 | -0.98064 | -0.34465 | 0.814 (3.954) | 12 | 1.000
DNA-C | Taiwan (n = 9) | -1.85984 | -2.01186 | -1.45335 | 3.144 (15.277) | 9 | 1.000
DNA-C | Tonga (n = 60) | -4.95616** | -4.65426** | -2.03254* | 1.239 (6.023) | 59 | 0.999
DNA-N | Australia (n = 35) | -3.08712* | -3.03503* | -1.51902 | 0.607 (2.823) | 22 | 0.956
DNA-N | Burundi (n = 4) | -0.78012 | -0.72052 | -0.78012 | 0.430 (2.000) | 3 | 0.833
DNA-N | China (n = 15) | 0.92233 | 0.73699 | 0.74237 | 4.340 (20.180) | 12 | 0.971
DNA-N | Congo (n = 22) | -2.46716 | -2.62629* | -1.65832 | 0.794 (3.692) | 20 | 0.991
DNA-N | India (n = 10) | -2.19752** | -2.39401* | -1.83388* | 2.906 (13.511) | 9 | 0.978
DNA-N | Pakistan (n = 5) | -0.81650 | -0.77152 | -0.81650 | 0.086 (0.400) | 2 | 0.400
DNA-N | Philippine (n = 11) | -0.86187 | -0.84325 | -0.40305 | 0.665 (3.090) | 11 | 1.000
DNA-N | Taiwan (n = 11) | -2.02194 | -2.20438 | -1.65409 | 2.628 (12.218) | 11 | 1.000
DNA-N | Tonga (n = 62) | -4.81607** | -4.50832** | -1.79112 | 0.908 (4.222) | 47 | 0.987

Note:

a Numbers in parentheses are the average number of nucleotide differences. Isolates whose CDS were not given by the authors in GenBank were not included in the analysis, and populations with fewer than four isolates were excluded because of the software requirement. π denotes pair-wise average nucleotide diversity per site.

* P<0.02

** P<0.01.

A detailed analysis of full-length sequences of the BBTV genomic components, using the various methods implemented in the Recombination Detection Program (RDP), revealed many intergenomic (homologous recombination between the same components) (S2 Table) and intragenomic (heterologous recombination between two different components) (S3 Table) recombination events in the BBTV genome. Of the eighty-two events in total, fifty-four (66%) were intergenomic and twenty-eight (34%) were intragenomic. Component-wise, DNA-U3 was involved in the majority (thirty-five events, about 43%) of the detected recombination events, followed by DNA-M (thirteen events, about 16%), while DNA-R and DNA-N were involved in only nine events (11%) each. The recombination occurrence for each component revealed that DNA-S, which encodes the coat protein of BBTV, has the lowest occurrence, 1.8% (5 recombination events among 274 components), followed by DNA-R with 2.6% (9 recombination events among 345 components) (Table 7). It is worth noting that DNA-U3 showed the highest occurrence of recombination in the BBTV genome, 16.8% (35 recombination events among 208 components), which is about 9 times higher than that of DNA-S, the least recombined component. Interestingly, DNA-U3, which was found to have the greatest genetic diversity (Table 4), shows the highest involvement in the recombination events occurring in the BBTV genome, while DNA-R, the most conserved component, is also among the least recombined components.
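The percentages quoted in this paragraph follow from simple ratios, as the short Python sketch below shows. The event counts are the sums of the intergenomic and intragenomic events per component in Table 7 and the sequence counts are those given in the Table 7 note; the variable names are illustrative assumptions.

```python
# component: (recombination events, number of sequences analysed), from Table 7
components = {
    "DNA-R": (9, 345),
    "DNA-U3": (35, 208),
    "DNA-S": (5, 274),
    "DNA-M": (13, 206),
    "DNA-C": (11, 194),
    "DNA-N": (9, 198),
}

total_events = sum(events for events, _ in components.values())

# Occurrence: events per 100 sequences of that component
occurrence = {c: 100 * e / n for c, (e, n) in components.items()}
# Share: fraction of all detected events that involve that component
share = {c: 100 * e / total_events for c, (e, _) in components.items()}

for c in components:
    print(f"{c}: occurrence {occurrence[c]:.1f}%, share of events {share[c]:.0f}%")

fold = occurrence["DNA-U3"] / occurrence["DNA-S"]
print(f"DNA-U3 vs DNA-S occurrence: {fold:.1f}-fold")
```

Keeping occurrence (normalised by the number of sequences) separate from share (normalised by the number of events) matters here because the components were sampled to very different depths.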
Table 7

Occurrence of recombination in Banana bunchy top virus genome.

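Component | Among PIO: isolates | Among PIO: events | Among PIO: % | Among SEA: isolates | Among SEA: events | Among SEA: % | Between PIO & SEA: isolates | Between PIO & SEA: events | Between PIO & SEA: % | Entire population: isolates | Entire population: events | Entire population: %
Intergenomic Recombination
DNA-R | 266 | 1 | 0.37 | 79 | 1 | 1.26 | 345 | 4 | 1.16 | 345 | 6 | 1.73
DNA-U3 | 166 | 9 | 5.42 | 42 | 4 | 9.52 | 208 | 8 | 3.84 | 208 | 21 | 10.09
DNA-S | 214 | 2 | 0.93 | 60 | 1 | 1.66 | 274 | 1 | 0.36 | 274 | 4 | 1.45
DNA-M | 164 | 4 | 2.43 | 42 | 2 | 4.76 | 206 | 2 | 0.97 | 206 | 8 | 3.88
DNA-C | 156 | 4 | 2.56 | 38 | 2 | 5.26 | 194 | 2 | 1.03 | 194 | 8 | 4.12
DNA-N | 154 | 6 | 3.89 | 44 | 1 | 2.27 | 198 | 0 | 0 | 198 | 7 | 3.53
Total | 1425 | 26 | 1.82 | 305 | 11 | 3.60 | 1425 | 17 | 1.19 | 1425 | 54 | 3.78
Intragenomic Recombination
DNA-R | 266 | 1 | 0.37 | 79 | 0 | 0 | 345 | 2 | 0.57 | 345 | 3 | 0.86
DNA-U3 | 166 | 6 | 3.01 | 42 | 3 | 7.14 | 208 | 5 | 2.40 | 208 | 14 | 6.73
DNA-S | 214 | 0 | 0 | 60 | 0 | 0 | 274 | 1 | 0.36 | 274 | 1 | 0.36
DNA-M | 164 | 2 | 1.21 | 42 | 1 | 2.38 | 206 | 2 | 0.97 | 206 | 5 | 2.42
DNA-C | 156 | 0 | 0 | 38 | 0 | 0 | 194 | 3 | 1.54 | 194 | 3 | 1.54
DNA-N | 154 | 1 | 0.37 | 44 | 1 | 2.27 | 198 | 0 | 0 | 198 | 2 | 1.51
Total | 1425 | 10 | 0.70 | 305 | 5 | 1.63 | 1425 | 13 | 0.91 | 1425 | 28 | 2.175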

Note: The population sizes (n) for each component and subgroup are as follows: DNA-R [n(total) = 345 {n(PIO) = 266, n(SEA) = 79}], DNA-U3 [n(total) = 208 {n(PIO) = 166, n(SEA) = 42}], DNA-S [n(total) = 274 {n(PIO) = 214, n(SEA) = 60}], DNA-M [n(total) = 206 {n(PIO) = 164, n(SEA) = 42}], DNA-C [n(total) = 194 {n(PIO) = 156, n(SEA) = 38}] and DNA-N [n(total) = 198 {n(PIO) = 154, n(SEA) = 44}], where ‘total’ means all sequences of isolates for a particular genomic component.

To understand the contribution of recombination to BBTV genetic diversity, the recombined regions identified by RDP for the respective events were selected, and the diversity of these geographic populations was analyzed with and without recombinants. The analysis (Table 8) revealed that recombination increases the genetic diversity of many of the geographic populations that harbor recombinant isolates. In some instances (recombination event (1), Table 8), the recombinants increase the diversity of the population about 4-fold. In some cases (recombination events (3), (15), (16), (17), (18), (33), (38), (39), (40), (48), (50), (53), (67), (72), Table 8) the diversity of the population with recombinants is lower than that of the non-recombinant population; however, the average over the entire dataset (events for which both recombinant and non-recombinant populations of a given geographic region were available) showed a net increase of about 1.4-fold, suggesting a significant contribution of recombination to BBTV genetic diversity.
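The fold increase discussed above is the ratio of diversity with recombinants to diversity without them. The Python sketch below recomputes it for three π values taken from Table 8; the function and variable names are assumptions made for illustration, and only a subset of events is shown, so the mean printed here is not the table-wide average of about 1.4.

```python
def fold_change(without_recombinants, with_recombinants):
    """Ratio of diversity with recombinants to diversity without them."""
    return with_recombinants / without_recombinants

# pi per 100 sites (without, with recombinants), taken from Table 8
examples = {
    "DNA-R (1), India": (0.83, 3.83),
    "DNA-U3 (5), Australia": (0.98, 2.58),
    "DNA-U3 (3), Tonga": (4.49, 3.18),
}

folds = {event: fold_change(wo, w) for event, (wo, w) in examples.items()}
for event, value in folds.items():
    direction = "increase" if value > 1 else "decrease"
    print(f"{event}: {value:.2f}-fold ({direction})")

# Mean over this illustrative subset only
subset_mean = sum(folds.values()) / len(folds)
print(f"mean fold change of the subset: {subset_mean:.2f}")
```

A value below 1, as for event (3) in Tonga, corresponds to the cases in which the population including recombinants is less diverse than the non-recombinant population.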
Table 8

Contribution of genetic diversity by recombinant isolates in their population.

Recombination event | Subpopulation | Without recombinants: identity range | Without recombinants: π | Without recombinants: θw | With recombinants: identity range | With recombinants: π | With recombinants: θw | Fold increase: π | Fold increase: θw
INTERGENOMIC RECOMBINATION
Among PIO subgroup
DNA-R (1) | India | 98.7–99.4 | 0.83±0.22 | 1.00±0.61 | 98.7–99.4 | 3.83±1.95 | 5.03±2.48 | 4.6 | 5.03
DNA-U3 (3) | Tonga | 87–99.6 | 4.49±0.32 | 4.80±1.61 | 86.9–99.6 | 3.18±0.53 | 3.76±1.41 | 0.70 | 0.78
DNA-U3 (5) | Australia | 98.5–100 | 0.98±0.21 | 0.83±0.49 | 92.6–100 | 2.58±0.55 | 2.59±1.01 | 2.63 | 2.64
DNA-U3 (6) | Tonga | 64.0–99.7 | 14.47±3.27 | 12.55±5.08 | 59.6–99.7 | 17.08±1.36 | 10.85±3.85 | 1.18 | 0.86
DNA-U3 (10) | Tonga | 100 | 0.36±0.18 | 0.36±0.36 | 99.5–99.7 | 0.72±0.25 | 0.72±0.55 | 2 | 2
DNA-M (15) | Congo | 97.9–100 | 0.77±0.21 | 1.08±0.58 | 90.8–100 | 1.73±0.55 | 3.37±1.31 | 2.24 | 3.12
DNA-M (15) | Tonga | 95.7–99.3 | 1.54±0.43 | 1.59±0.97 | 95.7–100 | 1.22±0.27 | 1.50±0.76 | 0.79 | 0.94
DNA-M (16) | Congo | 97.8–98.0 | 1.96±0.57 | 1.88±1.20 | 97.8–98.9 | 1.53±0.35 | 1.54±0.91 | 0.78 | 0.81
DNA-C (17) | India | 94.2–99.8 | 2.16±0.81 | 2.72±1.23 | 94.0–99.4 | 2.15±0.74 | 2.87±1.26 | 0.99 | 1.05
DNA-C (18) | India | 92.8–99.8 | 2.91±0.97 | 3.22±1.43 | 93.1–99.8 | 2.74±0.92 | 3.19±1.38 | 0.94 | 0.99
DNA-C (20) | India | 83.2–100 | 4.89±3.08 | 6.93±3.35 | 83.2–100 | 7.24±2.79 | 6.61±3.12 | 1.48 | 0.95
DNA-N (22) | Tonga | 88.2–100 | 6.16±1.05 | 4.80±3.03 | 77.1–100 | 12.81±4.92 | 14.74±7.32 | 2.07 | 3.07
DNA-N (26) | Tonga | 58.3–100 | 2.01±0.26 | 3.44±1.04 | 57.7–100 | 2.86±0.27 | 3.54±1.04 | 1.42 | 1.02
Among SEA subgroup
DNA-U3 (30) | China | 94.9–100 | 3.52±2.09 | 4.23±2.98 | 62.7–100 | 7.64±2.99 | 7.72±4.56 | 2.17 | 1.82
DNA-M (33) | Philippine | 99.1–100 | 0.26±0.19 | 0.41±0.41 | 98.2–100 | 0.23±0.17 | 0.39±0.39 | 0.88 | 0.95
Between PIO and SEA subgroups
DNA-R (38) | Congo | 93.0–100 | 2.81±0.51 | 2.23±1.06 | 93.0–100 | 2.74±0.48 | 2.13±0.99 | 0.97 | 0.95
DNA-R (38) | Tonga | 91.8–100 | 2.25±0.73 | 2.20±0.98 | 87.8–100 | 2.92±0.77 | 3.25±1.31 | 1.29 | 4.45
DNA-R (38) | Indonesia | 96.3–100 | 1.29±0.23 | 0.94±0.56 | 96.3–100 | 1.34±0.18 | 0.91±0.54 | 1.03 | 0.96
DNA-R (38) | India | 89.5–100 | 6.12±0.36 | 6.12±4.56 | 78.3–100 | 7.27±1.49 | 7.18±3.80 | 1.18 | 1.17
DNA-R (38) | Taiwan | 97.0 | 2.89±1.44 | 2.89±2.22 | 94.0–97.0 | 5.31±1.83 | 5.31±3.42 | 1.83 | 1.83
DNA-R (39) | China | 99.3–100 | 0.91±0.30 | 0.91±0.75 | 99.3–100 | 0.79±0.25 | 0.74±0.60 | 0.86 | 0.65
DNA-R (39) | Congo | 97.9–99.3 | 2.74±0.81 | 3.42±1.37 | 97.9–99.3 | 2.95±0.67 | 3.46±1.34 | 1.07 | 1.01
DNA-R (39) | India | 85.6–100 | 2.77±1.02 | 3.86±1.74 | 85.5–100 | 2.83±0.85 | 3.98±1.69 | 1.02 | 1.03
DNA-R (39) | Tonga | 49.0–100 | 2.24±0.46 | 3.65±1.34 | 49.3–100 | 2.50±0.43 | 3.89±1.28 | 1.11 | 1.06
DNA-R (40) | India | 97.8–100 | 0.51±0.14 | 0.85±0.54 | 97.8–100 | 0.50±0.13 | 0.82±0.51 | 0.98 | 0.96
DNA-R (40) | Congo | 92.2–100 | 1.33±0.46 | 1.33±1.00 | 92.2–100 | 1.33±0.36 | 1.45±0.99 | 1 | 1.09
DNA-R (40) | Tonga | 97.8–100 | 0.28±0.10 | 1.25±0.64 | 94.4–100 | 0.40±0.17 | 1.70±0.77 | 1.43 | 1.36
DNA-R (40) | Australia | 98.9–100 | 0.74±0.34 | 0.74±0.74 | 98.9–100 | 0.74±0.22 | 0.60±0.60 | 1 | 0.81
DNA-R (41) | India | 99.0–100 | 0.57±0.16 | 0.60±0.37 | 99.0–100 | 0.60±0.13 | 0.65±0.38 | 1.05 | 1.08
DNA-U3 (44) | China | 97.9 | 2.23±1.11 | 2.23±1.76 | 83.1–100 | 8.64±1.28 | 8.45±3.16 | 3.87 | 3.78
DNA-U3 (45) | China | 89.9–100 | 5.17±0.40 | 4.01±1.66 | 84.5–100 | 6.82±0.96 | 5.87±2.27 | 1.32 | 1.46
DNA-U3 (46) | Tonga | 95.6–100 | 3.32±0.64 | 2.77±1.54 | 80.7–100 | 7.80±3.28 | 9.36±4.35 | 2.34 | 3.37
DNA-U3 (48) | Philippine | 93.9–97.0 | 4.54±1.10 | 4.95±3.59 | 87.9–97.0 | 7.21±1.26 | 7.42±4.27 | 1.59 | 1.49
DNA-U3 (48) | Taiwan | 93.9 | 6.06±3.03 | 6.06±5.24 | 90.9–100 | 4.44±0.94 | 3.98±2.75 | 0.73 | 0.66
DNA-U3 (48) | Tonga | 87.9–100 | 7.011±0.61 | 6.16±3.04 | 72.7–100 | 8.57±1.53 | 10.95±4.62 | 1.22 | 1.77
DNA-S (50) | Australia | 92.4 | 15.04±7.52 | 15.07±10.75 | 97.9–100 | 0.76±0.08 | 1.50±0.51 | 0.05 | 0.09
DNA-S (50) | Pakistan | 99.4 | 0.67±0.33 | 0.67±0.53 | 98.3–100 | 0.42±0.11 | 1.05±0.40 | 0.62 | 1.56
DNA-S (50) | Taiwan | 98.9–99.4 | 0.88±0.32 | 0.88±0.63 | 98.9–100 | 0.81±0.12 | 0.96±0.46 | 0.92 | 1.09
DNA-C (53) | China | 89.6–100 | 6.22±2.93 | 6.22±4.16 | 87.0–100 | 4.80±1.43 | 5.00±2.40 | 0.80 | 0.89
INTRAGENOMIC RECOMBINATION
Among PIO and SEA subgroup
DNA-C (67) | Australia | 99.1–100 | 1.70±0.52 | 1.39±1.12 | 98.6–100 | 1.83±0.37 | 1.57±1.07 | 1.07 | 1.12
DNA-C (67) | Congo | 83.3–92.2 | 15.78±5.29 | 15.78±9.89 | 85.7–100 | 8.86±2.37 | 10.94±5.58 | 0.56 | 0.69
DNA-C (67) | Tonga | 97.0–100 | 1.76±0.29 | 2.40±1.24 | 97.0–100 | 1.32±0.17 | 1.82±0.94 | 0.75 | 0.75
DNA-C (67) | Taiwan | 88.4–100 | 4.90±2.50 | 6.44±3.51 | 67.1–100 | 11.11±4.18 | 11.45±5.71 | 2.26 | 1.77
Among PIO subgroup
DNA-U3 (69) | Congo | 98.6–100 | 1.33±0.62 | 1.33±1.33 | 98.6–100 | 1.35±0.53 | 1.07±1.09 | 1.01 | 0.80
DNA-N (70) | Tonga | 96.1–100 | 0.50±0.22 | 0.49±0.49 | 90.9–100 | 1.24±0.62 | 1.95±1.17 | 2.48 | 3.98
DNA-M (71) | Congo | 94.2 | 7.69±3.84 | 7.69±5.95 | 88.4–96.2 | 12.82±4.6 | 12.30±7.88 | 1.66 | 1.59
DNA-R (72) | Congo | 94.7–96.0 | 4.27±1.21 | 4.05±2.65 | 94.0–100 | 1.98±0.32 | 2.45±1.03 | 0.46 | 0.60
DNA-R (72) | Tonga | 61.7–100 | 2.35±0.59 | 3.21±1.29 | 62.2–100 | 2.46±0.44 | 0.30±1.12 | 1.04 | 0.09
Average | | | | | | | | 1.4 | 1.5

Note: The analyses were performed only for those recombined regions in the different genomic components that were identified as having undergone intergenomic or intragenomic recombination by the Recombination Detection Program (RDP) version 4 Beta14 (Martin et al., 2015). Numbers in parentheses denote the recombination events for which both recombinant and non-recombinant populations of a given geographic region were available; recombination events involving the entire population of a geographic region were not included because no non-recombinant population was available.

The frequency of recombined regions in the BBTV genome was analyzed by plotting them on the various components (Fig 3), using TJ1 as the reference for nucleotide coordinates. The data show that DNA-U3 is highly involved in both inter- and intragenomic recombination and that DNA-R and DNA-S are the least recombined components. In the BBTV genome, CR-SL, a functional regulatory region, is identified as the recombination hotspot for both intragenomic and intergenomic recombination.
Fig 3

Recombination map of Banana bunchy top virus genome.

The genomic components of BBTV are represented by the inner circle, with CR-SL, CR-M and ORF shown in black, pink and blue, respectively. The nucleotide coordinates correspond to the P.TJ1 isolate, used as the reference. The red circles represent intergenomic recombination, while the blue circles represent intragenomic recombination. The intensity of the colours depicts the frequency of recombination in a particular region, according to the key above. The recombination events described in S2 and S3 Tables are marked. The figures are not drawn to scale.


Discussion

Understanding the genetic structure of virus populations and their evolutionary mechanisms is an important aspect of managing viral diseases [65]. In Pakistan, information on the molecular analysis of BBTV was very limited, with only a partial genome [37] and a few full-length DNA components sequenced and deposited in GenBank [16, 25, 37]. However, inferences based on full-length BBTV genomes suggest that genetic exchange by recombination and reassortment might have played an important role in BBTV evolution around the world [31]. Therefore, fifty-seven genomic components, including five complete genomes, were sequenced from Pakistan in this study, and their genetic diversity and recombination were analyzed.

To understand the existing variability in the BBTV population, a detailed analysis of the genetic diversity of all its components was performed, first for the Pakistani population and then for the entire BBTV population worldwide. The genetic diversity data for the Pakistani population (Table 3) suggested that the pairwise average nucleotide diversity (π) of the full-length components ranged from 0.26 to 0.69 per 100 sites, with the Watterson estimator (θw) of the population mutation rate ranging from 0.36 to 0.99 per 100 sites. The analysis showed that DNA-R, a highly conserved component, has the most diverse Common Region-Major (CR-M), whereas DNA-U3, a highly diverse component, has the most conserved CR-M and the most diverse Common Region Stem-loop (CR-SL) (Table 3). An earlier study of recombination in the CR-SL of DNA-U3 from Pakistan also verified the diverse nature of this subgenomic regulatory region [16]. The analysis also indicated heterogeneity in the genetic diversity values associated with each component and its different parts, although the overall diversity remained relatively small compared to previously reported values for the Pacific Indian Ocean (PIO) group [4, 26, 58], which probably split from the total BBTV population much earlier than the first introduction of BBTV into Pakistan in 1988.

The genetic diversity of the entire BBTV population around the world (Table 4) was also calculated and compared with that of the Pakistani population. In contrast to the pattern observed in the Pakistani population, the CR-M of DNA-U3 is the most diverse CR-M in the entire BBTV population. The analysis also confirmed the heterogeneity of genetic diversity values associated with each component and its different parts. However, the diversity values (π) for full-length sequences ranged from 4.42 to 8.54 per 100 sites, with θw ranging from 6.61 to 11.07 per 100 sites. These values are roughly 10 to 20 times greater than those of the Pakistani population, suggesting that other factors might be at play in the observed diversity.

Multiple sequence alignment of the full-length DNA-R component worldwide showed 92–100% and 89.8–100% sequence identity (Table 4) among members of the PIO and SEA groups, respectively. This represents an average increase of about 3% and 7% in the PIO and SEA groups, respectively, compared to earlier studies [15, 26, 33, 58]. For the CR-M and CR-SL, the average increase was approximately 38% and 18.8%, respectively, in the SEA population, and about 12.3% and 1.85%, respectively, in the PIO population. Notably, multiple sequence alignment of the full-length components of the Pakistani population showed DNA-R to be the most conserved component, with 99.1–100% sequence identity, while the CR-SL was the most conserved functional region, with 100% sequence identity in all components except DNA-U3 and DNA-N, which had sequence identities of 90.2–100% and 98.8–100%, respectively (Table 3).
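
The two diversity statistics quoted above (π and θw, both per 100 sites) can be illustrated with a minimal Python sketch. This is not the software used in the study. The function name `diversity_per_100_sites`, the toy alignment, and the choice to drop any alignment column containing a gap or ambiguous base are assumptions made here for illustration only.

```python
from itertools import combinations


def diversity_per_100_sites(seqs):
    """Return (pi, theta_w) for aligned sequences, scaled to 'per 100 sites'.

    Assumption for this sketch: columns with gaps or ambiguous bases are
    removed entirely before any counting (complete deletion).
    """
    seqs = [s.upper() for s in seqs]
    n = len(seqs)
    if n < 2 or len(set(map(len, seqs))) != 1:
        raise ValueError("need at least two aligned sequences of equal length")

    # Keep only columns in which every sequence has an unambiguous base.
    cols = [c for c in zip(*seqs) if all(b in "ACGT" for b in c)]
    if not cols:
        raise ValueError("no usable alignment columns")
    n_sites = len(cols)

    # pi: average number of pairwise differences, divided by the site count.
    pairs = list(combinations(range(n), 2))
    total_diffs = sum(c[i] != c[j] for c in cols for i, j in pairs)
    pi = total_diffs / len(pairs) / n_sites

    # Watterson's theta: segregating sites S scaled by a1 = sum_{i=1}^{n-1} 1/i.
    seg_sites = sum(len(set(c)) > 1 for c in cols)
    a1 = sum(1.0 / i for i in range(1, n))
    theta_w = seg_sites / a1 / n_sites

    return 100.0 * pi, 100.0 * theta_w


# Hypothetical toy alignment: 4 sequences, 10 sites, 2 segregating sites.
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGTACGTAC", "ACCTACGTAC"]
pi, theta_w = diversity_per_100_sites(toy)
print(f"pi = {pi:.2f}, theta_w = {theta_w:.2f} per 100 sites")
```

Because π weights each variant by its frequency while θw only counts segregating sites, the two estimates diverge when rare variants are over- or under-represented, which is why both are reported side by side.
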
In contrast, multiple sequence analysis of the full-length components in the PIO group showed DNA-C to be the most conserved component (92.4–100%) and the CR-M of DNA-C to be the most conserved functional region (92.1–100%) (Table 4). These differences in genetic diversity between the components of the Pakistani BBTV population and those of the PIO group might be due to various factors, such as differential evolutionary pressure on the important conserved proteins encoded by the respective components, or the introduction into the country of reassorted genomic components originating from different regions. However, the phylogenetic analyses of individual genomic components (Fig 2, S1–S5 Figs) and the ML phylogenetic analysis of entire genomes (Fig 1) indicate that all Pakistani isolates cluster together as a single BBTV clade, indicating that they have a common origin. This observation argues strongly against any possible reassortment. Therefore, the observed difference in the genetic diversity values of Pakistani isolates compared to the PIO group is not due to a different origin; rather, it might be due to different evolutionary pressure on each component and/or possible recombination.

The phylogenetic analysis of the full-length DNA-R component (S1 Fig) showed that Pakistani isolates have close homology with Egyptian and Indian isolates. This is similar to previous studies based on full-length DNA-R [25, 37, 66], in which Pakistani isolates showed close homology with isolates from Egypt, India and Australia. Similar analysis of the other full-length BBTV components (DNA-U3, -S, -M and -N) indicated close homology of Pakistani isolates with those from India (Fig 2, S3–S5 Figs). For DNA-C (S4 Fig), the order of homology of Pakistani isolates with other PIO isolates was Egypt, Sri Lanka and India. Notably, the phylogenetic analysis of concatenated BBTV sequences (Fig 1) showed clustering of Pakistani isolates within the PIO group, reflecting their homology with members of this group.
However, within the PIO group, Pakistani isolates have close homology with isolates from India, Congo, Sri Lanka, Rwanda, Malawi, and Burundi. Based on the whole-genome and component-wise (DNA-U3, -S, -M and -N) ML analyses, it is evident that Pakistani isolates are very closely related to Indian isolates (except for the DNA-R and DNA-C components, for which they are closer to Egyptian isolates than to Indian ones). Previous phylogenetic analyses from Pakistan, which were based on Neighbor-Joining analysis of DNA-R only, indicated that Pakistani isolates were closer to Egyptian isolates [25, 37]. Based on the more rigorous Maximum Likelihood analyses of entire genomes as well as the other genomic components (i.e. DNA-U3, -S, -M and -N), it is evident that Pakistani isolates are more closely related to Indian isolates than to those of any other country in the PIO group.

The tendency of BBTV to generate genetic diversity is perhaps relevant to its ecological fitness. Selection pressure is an important estimator of the evolutionary constraints imposed on coding regions [67]. The dN/dS ratios for the coding regions of BBTV components (Table 5) showed that populations of the DNA-R, DNA-S, DNA-M, DNA-C, and DNA-N components are under strong purifying (negative) selection, while DNA-U3 is under either positive or negative selection, which corroborates the results of earlier studies for components R and S [30, 64]. In a comparative study of the susceptibility of two banana cultivars (Dwarf Brazilian, AAB, and Williams, AAA) to BBTV, Dwarf Brazilian showed a lower percentage of BBTV infection (39%) than Williams (79%) in field experiments [68]. Based on their observations, Hooks and colleagues (2009) hypothesized that one or more morphological differences between the two cultivars might affect the ability of P. nigronervosa to inoculate BBTV.
The banana pseudostem is waxy, and the banana aphid prefers to feed on the pseudostem, so differences in wax content or composition between the two cultivars may explain the observed differences in virus transmission between the varieties studied. Furuya and colleagues (2012) [69], in a susceptibility study of Dwarf Cavendish (AAA) and Itobasho (BB), also observed reduced virus transmission in the Itobasho variety.

Our analysis showed an excess of synonymous over nonsynonymous substitutions, indicating strong purifying (negative) selection as an additional mechanism constraining genetic variation [70]. BBTV is disseminated by aphids in a persistent manner; hence it is plausible that this constraint is imposed on the CP gene to prevent the accumulation of deleterious mutations that might impede virus-vector interactions [64, 71]. The majority of substitutions in DNA-S were synonymous. As the sequence of DNA-S is conserved across all areas, the role of the CP is crucial for the persistence of the virus in the host plant [30].

Another possible sign that protein-coding sequence affects recombination patterns in BBTV is that the DNA-U3 component, which has no confirmed protein-coding function, has a higher concentration of detectable recombination breakpoints (S2 and S3 Tables, Fig 3) than the known protein-coding genes of the other components. Remarkably, BBTV DNA-U3 also seems to be the component most frequently exchanged by reassortment [31]. The higher recombination frequency in this component suggests that it is evolving largely neutrally, without the risk that recombinants might express defective chimeric proteins [72, 80]. Therefore, there is little conservation of coevolved epistatic interactions within the component.
In comparison to the full-length sequences, low genetic diversity was observed in the coding regions (Table 6), indicating that recombination is not confined to coding regions but also extends to the CR-M and CR-SL (Fig 3) in BBTV. At the population level, significant negative values of the neutrality test in different BBTV components (Table 6) suggest population expansion, which is consistent with previous studies of BBTV components R and S [30]. When a population is expanding, the number of segregating sites increases more rapidly than the nucleotide diversity, leading to a negative test value. Positive values in a few populations of BBTV components indicate balancing selection, under which alleles are maintained at intermediate frequencies, so that there are more pairwise differences than expected from the number of segregating sites [73]. This reflects considerable sequence heterogeneity, which provides a reservoir of virus variants in the population and enables significant adaptation to changing environmental conditions. The gene flow provided by recombination can therefore promote the evolution and local adaptation of the virus.

Recombination is a major contributor to genetic diversity [18, 74] and genetic variation [75] in ssDNA viruses. Recombination analysis of BBTV components based on nucleotide sequences showed that DNA-U3 exhibits complex intra- and intergenomic recombination patterns (Fig 3), with a recombination occurrence (Table 7) of about 16.8%, about 9 times that of the least recombined component, DNA-S (1.8%). The recombination occurrence for the other BBTV components ranged from 2.6% to 7.92%. These observations suggest that recombination in components such as DNA-U3 may be selectively more favorable than recombination in components such as DNA-R, -S, -M, -C and -N [31].
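
The sign logic of the neutrality test described above can be made concrete with a short Python sketch. The passage does not name the test, so the sketch assumes Tajima's D, which compares the mean number of pairwise differences with the number of segregating sites. The function name `tajimas_d` and both toy alignments are hypothetical, and columns with gaps or ambiguous bases are simply dropped.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for aligned sequences (assumed test; see lead-in)."""
    seqs = [s.upper() for s in seqs]
    n = len(seqs)
    if n < 4 or len(set(map(len, seqs))) != 1:
        raise ValueError("need at least four aligned sequences of equal length")

    # Complete deletion: keep only columns with unambiguous bases everywhere.
    cols = [c for c in zip(*seqs) if all(b in "ACGT" for b in c)]
    seg_sites = sum(len(set(c)) > 1 for c in cols)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # k: mean number of pairwise differences across all sequence pairs.
    pairs = list(combinations(range(n), 2))
    k = sum(c[i] != c[j] for c in cols for i, j in pairs) / len(pairs)

    # Constants from Tajima (1989) for the variance of (k - S/a1).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (k - seg_sites / a1) / sqrt(variance)


# Many rare variants (each mutation seen once), as expected after expansion.
expanding = ["AAAAAAAA", "CAAAAAAA", "ACAAAAAA",
             "AACAAAAA", "AAACAAAA", "AAAACAAA"]
# Two haplotypes at intermediate frequency, as under balancing selection.
balanced = ["AAAAAAAA"] * 3 + ["CCCCAAAA"] * 3

print("expanding:", tajimas_d(expanding))  # negative value
print("balanced: ", tajimas_d(balanced))   # positive value
```

In the first toy alignment the segregating sites outnumber what the pairwise differences would predict, so the statistic is negative; in the second the pairwise differences dominate, so it is positive, matching the interpretation given in the text.
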
DNA-R and DNA-S showed more frequent intergenomic than intragenomic recombination, with percentages of 1.73 and 1.45, respectively. The few incidences of recombination in DNA-R and DNA-S suggest that, because of their conserved nature and core functions [16, 64, 76], they are more prone to intergenomic than to intragenomic recombination. The contribution of recombination to genetic diversity (Table 8) was assessed by comparing the recombined regions with the corresponding non-recombined regions for each recombination event detected by RDP. The analysis showed that, overall, genetic diversity increased due to recombination in 27 of the 48 populations analyzed (56.25%). Within the BBTV components, DNA-N (population of Tonga) and DNA-U3 (populations of Australia, Tonga, China, Philippines and Congo) showed on average 100% and 90% increased diversity in recombinant populations, respectively. This indicates that, along with component reassortment, recombination may also be a significant evolutionary process driving the diversification of BBTV [30, 64, 76–78].

The recombined regions in the BBTV genome (Fig 3) were involved as many as seven times in intergenomic recombination and as many as four times in intragenomic recombination in various BBTV isolates. DNA-U3 showed higher frequencies of recombined regions than the other BBTV components, and overall the recombined regions reside mostly in the CR-SL. This region is a recombination hotspot in members of the Geminiviridae [12, 20, 72] because Rep introduces a nick in the stem-loop, which serves as the initiation site for rolling-circle replication [79] by host DNA polymerase. Since BBTV also replicates by a rolling-circle mechanism [14], the CR-SL of DNA-U3 may similarly be subject to high levels of recombination.

In conclusion, five full genomes were sequenced from BBTV-infected banana plants from Pakistan.
In-depth analysis of the genetic diversity of the Pakistani and worldwide BBTV populations showed some interesting contradictions in the diversity of functional regions, which might be attributable to recombination. This study highlights the benefits of characterizing complete BBTV genomes from Pakistan rather than focusing on individual components. The analysis also supports the observation that different BBTV genomic components are diversifying at different rates and that, within one component, different parts show different levels of divergence. These differences suggest that geographic factors also play a role in shaping the diversity and evolution of BBTV. Recombination analysis identified DNA-U3 as a recombination-prone component and the CR-SL as a recombination hotspot in the BBTV genome. In certain cases, recombination leads to an approximately four-fold increase in genetic diversity in the recombined population.

Abbreviations, accession numbers and geographical origin of Banana bunchy top virus isolates and their components used in the study.

(DOCX)

Intergenomic recombination in Banana bunchy top virus genomes.

(DOCX)

Intragenomic recombination in Banana bunchy top virus genomes.

(DOCX)

Phylogenetic analysis of BBTV DNA-R illustrating intra and intergenomic recombination events at different nodes.

(TIF)

Phylogenetic analysis of BBTV DNA-S illustrating intra and intergenomic recombination events at different nodes.

(TIF)

Phylogenetic analysis of BBTV DNA-M illustrating intra and intergenomic recombination events at different nodes.

(TIF)

Phylogenetic analysis of BBTV DNA-C illustrating intra and intergenomic recombination events at different nodes.

(TIF)

Phylogenetic analysis of BBTV DNA-N illustrating intra and intergenomic recombination events at different nodes.

(TIF)