Sonia Hussain, Muhammad Farooq, Hassan Jamil Malik, Imran Amin, Brian E Scheffler, Jodi A Scheffler, Shu-Sheng Liu, Shahid Mansoor.
Abstract
BACKGROUND: Whiteflies (Bemisia tabaci) are phloem sap-sucking pests that damage crop plants worldwide because of their broad host range and ability to transmit viruses. B. tabaci is now known to be a complex of cryptic species that differ from each other in many characteristics, such as mode of interaction with viruses, invasiveness, and resistance to insecticides. Asia II 1 is an indigenous species found on the Indian subcontinent and in south-east Asia, while the species named Middle East Asia Minor 1 (MEAM1) likely originated in the Middle East and has spread worldwide in recent decades. The purpose of this study is to find genomic differences between these two species.
Keywords: Asia II 1; Insecticide; MEAM1; Sequencing; Virus; Whitefly
Year: 2019 PMID: 31215403 PMCID: PMC6582559 DOI: 10.1186/s12864-019-5877-9
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Mapping Summary
| Parameter | Value |
|---|---|
| Total NGS libraries | 7 |
| Total insert size | 550 |
| Sequencer | Illumina HiSeq 2500 & MiSeq |
| Total raw data generated | HiSeq: 14 GB; MiSeq: 16 Gb; Total: 31.15 Gb |
| Average coverage | 47.34X |
| Average coverage after filtration | 34.52X |
| Total no. of reads generated | HiSeq: 142,605,246; MiSeq: 56,300,942; Total: 198,906,188 |
| Total no. of reads quality passed | 181,434,767 |
| Total no. of reads mapped | 156,293,812 (86%) |
| Total no. of reads mapped properly | 149,439,368 (82%) |
| Reference genome covered | 88% |
| Mean read length | 159 bp |
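The percentages in the mapping summary can be reproduced from the read counts. The following is a minimal sketch, assuming the mapped percentages are taken relative to the quality-passed reads (the table does not state the denominator); the variable and function names are illustrative only.

```python
# Read counts copied from the Mapping Summary table.
hiseq_reads = 142_605_246
miseq_reads = 56_300_942
total_reads = hiseq_reads + miseq_reads

quality_passed = 181_434_767
mapped = 156_293_812
properly_mapped = 149_439_368


def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Assumption: mapped percentages use quality-passed reads as the denominator.
mapped_pct = percent(mapped, quality_passed)
proper_pct = percent(properly_mapped, quality_passed)
passed_pct = percent(quality_passed, total_reads)

print(f"total reads:     {total_reads:,}")
print(f"quality passed:  {passed_pct:.1f}% of raw reads")
print(f"mapped:          {mapped_pct:.1f}% of quality-passed reads")
print(f"properly mapped: {proper_pct:.1f}% of quality-passed reads")
```

Under this assumption the mapped and properly mapped fractions round to the 86% and 82% given in the table.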
Fig. 1 Total coding regions are 15,664. All the coding regions with less than 10% each of their length are covered with at least 5X coverage; 53% of coding regions (8,366) are covered over their full length with at least 5X coverage
Variant Statistics
| Statistic | Value |
|---|---|
| Number of variants | 2,530,451 |
| Number of effects | 3,504,011 |
| Variant rate | 1 / 235 bases |
| SNP | 2,327,972 |
| INS | 103,960 |
| DEL | 98,519 |
| Missense / Silent | 0.4257 |
| Ts/Tv ratio | 1.7147 |
| Heterozygous | 122,045 |
| Homozygous | 2,349,906 |
| Heterozygous/Homozygous | 0.05193612 |
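Several of these statistics are related by simple arithmetic. The sketch below checks that the SNP, INS and DEL counts add up to the number of variants, recomputes the heterozygous/homozygous ratio, and back-calculates the sequence length implied by the variant rate. That last figure is only a rough estimate, because the rate of 1 variant per 235 bases is itself rounded; all names are illustrative.

```python
# Values copied from the Variant Statistics table.
num_variants = 2_530_451
snp, ins, deletion = 2_327_972, 103_960, 98_519
heterozygous, homozygous = 122_045, 2_349_906
bases_per_variant = 235  # "1 / 235 bases", a rounded figure

# The three variant types should account for every variant.
type_sum = snp + ins + deletion

# Ratio reported in the last row of the table.
het_hom_ratio = heterozygous / homozygous

# Rough estimate only: length of sequence implied by the rounded variant rate.
implied_length_bp = num_variants * bases_per_variant

print(f"SNP + INS + DEL = {type_sum:,}")
print(f"het/hom ratio   = {het_hom_ratio:.8f}")
print(f"implied length  = about {implied_length_bp / 1e6:.0f} Mb")
```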
Classification of effects and their number in the whole genome
| Impact class | Type of effect | Count | Percent |
|---|---|---|---|
| High Effect | Total | 1,821 | 0.052 |
| | Splice acceptor variant | 96 | 0.003 |
| | Splice donor variant | 135 | 0.004 |
| | Start loss | 56 | 0.002 |
| | Stop gain | 371 | 0.011 |
| | Stop lost | 96 | 0.003 |
| | Frame shift | 1,102 | 0.031 |
| Moderate Effect | Total | 35,583 | 1.015 |
| | Conservative inframe deletion | 49 | 0.001 |
| | Conservative inframe insertion | 83 | 0.002 |
| | Disruptive inframe deletion | 98 | 0.003 |
| | Disruptive inframe insertion | 74 | 0.002 |
| | Missense variant | 35,285 | 1.004 |
| Low Effect | Total | 95,439 | 2.724 |
| | 5′ UTR premature start gain | 3,020 | 0.086 |
| | Splice region variant | 10,980 | 0.312 |
| | Stop retained | 106 | 0.003 |
| | Synonymous variant | 83,150 | 2.366 |
| | Initiator codon / non-synonymous start | 15 | 0.000 |
| Modifier Effects | Total | 3,371,168 | 96.209 |
| | 3′ UTR | 174,811 | 4.974 |
| | 5′ UTR | 23,577 | 0.671 |
| | Downstream | 485,837 | 13.823 |
| | Upstream | 421,908 | 12.041 |
| | Non-coding transcript | 470 | 0.013 |
| | Intron variant | 1,479,087 | 42.082 |
| | Intergenic regions | 794,375 | 22.67 |
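The percentage for each impact class follows from its count divided by the total number of effects (3,504,011 in the Variant Statistics table). A minimal sketch of that calculation, using only the four class totals; the dictionary and variable names are illustrative:

```python
# Total number of effects, from the Variant Statistics table.
total_effects = 3_504_011

# Class totals from the classification table.
class_totals = {
    "High": 1_821,
    "Moderate": 35_583,
    "Low": 95_439,
    "Modifier": 3_371_168,
}

# Percentage of all effects that fall in each impact class.
class_percent = {
    name: round(100.0 * count / total_effects, 3)
    for name, count in class_totals.items()
}

for name, pct in class_percent.items():
    print(f"{name:<9} {class_totals[name]:>10,} {pct:>8.3f}%")

# The four class totals together account for every effect.
print("sum of class totals:", f"{sum(class_totals.values()):,}")
```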
Fig. 2 Distribution of variants in different genic regions
Number of variant genes in each sub-class of high effects. One gene may have more than one effect, so the same gene may be counted in more than one category of high effects
| Type of High Effects | No of Genes |
|---|---|
| Splice acceptor variant | 92 |
| Splice donor variant | 129 |
| Start Loss | 55 |
| Stop gain | 346 |
| Stop lost | 91 |
| Frame shift | 765 |
| Total | 1294 |
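The total of 1,294 is smaller than the sum of the six category counts because a gene with several high-effect variants is counted once per category but only once in the total. The sketch below illustrates this with four genes taken from the gene tables later in this document, two of which (Bta01355 and Bta14634) carry two types of high-effect variant each. It is an illustration of the counting logic only, not the authors' pipeline.

```python
from collections import Counter

# Category counts and total from the table above.
category_counts = [92, 129, 55, 346, 91, 765]
reported_total = 1_294
print("sum of categories:", sum(category_counts))
print("extra counts from multi-category genes:",
      sum(category_counts) - reported_total)

# Small example using rows from the gene tables in this document.
gene_effects = {
    "Bta01355": {"start lost", "splice acceptor"},
    "Bta14634": {"frame shift", "splice donor"},
    "Bta08717": {"frame shift"},
    "Bta04696": {"splice acceptor"},
}

# Per-category gene counts: a gene contributes to every category it appears in.
per_category = Counter(
    effect for effects in gene_effects.values() for effect in effects
)
sum_over_categories = sum(per_category.values())
unique_genes = len(gene_effects)

print("per category:", dict(per_category))
print("sum over categories:", sum_over_categories)
print("unique genes:", unique_genes)
```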
Fig. 3 Histogram representation of GO classification of genes with high impact variants. These genes are classified into CC: cellular component, BP: biological process and MF: molecular function. The genes belonging to each sub-class of these three categories are listed in the supplementary data
Genes potentially involved in insecticide resistance with variants between Asia II 1 and MEAM1
| Gene ID | Annotation | Type of Variant | Variant Position |
|---|---|---|---|
| Bta08717 | Acetylcholinesterase-like protein | Frame Shift | Scaffold325:2419087 |
| Bta12286 | Cathepsin B | start lost | Scaffold562:2252138 |
| Bta06690 | Cathepsin F | stop gain | Scaffold2605:1316025 |
| Bta07152 | Cathepsin L-like protease | Frame Shift | Scaffold2737:56518 |
| Bta02560 | Cathepsin L-like protease | Frame Shift | Scaffold132:3684567 |
| Bta04696 | Cytochrome P450 | Splice acceptor | Scaffold1685:811440 |
| Bta06044 | Cytochrome P450 | Stop lost | Scaffold231:1494714 |
| Bta01556 | Phosphatidylethanolamine-binding protein | Frame Shift | Scaffold1224:613594 |
| Bta01355 | Phosphatidylethanolamine-binding protein 1 | start lost, splice acceptor variant | Scaffold1195:116803, Scaffold1195:118926 |
| Bta15207 | Phosphatidylethanolamine-binding protein 1 | Start lost | Scaffold923:587527 |
| Bta07891 | Phosphatidylethanolamine-binding protein 1 | splice donor | Scaffold300:6735496 |
| Bta12136 | Phosphatidylethanolamine-binding protein 1 | Frame Shift | Scaffold545:18333 |
| Bta13188 | Phosphatidylethanolamine-binding protein 1 | Splice acceptor | Scaffold637:1563358 |
| Bta02907 | Phosphatidylethanolamine-binding protein, putative | Frame Shift | Scaffold14:2449776 |
List of gene IDs that are potentially involved in TYLCV transmission and have genetic variants between Asia II 1 and MEAM1
| Gene ID | Annotation | Type of Variant | Variant Position |
|---|---|---|---|
| Bta10341 | Aldo-keto reductase | Frame Shift | Scaffold403:3624744 |
| Bta04072 | Elicitin-like protein 6 | Frame Shift | Scaffold161:5952976 |
| Bta02276 | Ubiquitin carboxyl-terminal hydrolase | Frame Shift | Scaffold130:858376 |
| Bta14634 | Unknown protein | Frame Shift, Splice Donor Variant | Scaffold811:176696, Scaffold811:176710 |
List of gene IDs that are potentially involved in ToCV transmission and have genetic variants between Asia II 1 and MEAM1
| Gene ID | Annotation | Type of Variant | Variant Position |
|---|---|---|---|
| Bta08892 | 70 kDa heat shock protein | Frame shift | Scaffold3328:264318 |
| Bta01665 | AAA-ATPase-like domain-containing protein | frame shift | Scaffold1224:5022892 |
| Bta12603 | AAA-ATPase-like domain-containing protein | stop gain | Scaffold597:2078628 |
| Bta05346 | Afadin, putative | stop lost | Scaffold199:1272506 |
| Bta11978 | Alpha-glucosidase | Frame shift | Scaffold521:859550 |
| Bta01804 | Ankyrin repeat and LEM domain-containing protein | Frame shift | Scaffold123:4405622 |
| Bta01772 | Cathepsin B | Frame shift | Scaffold123:2832350 |
| Bta07402 | Cathepsin B | Frame shift | Scaffold2816:1342943 |
| Bta02120 | Cathepsin L-like protease | Frame shift | Scaffold1261:554552 |
| Bta02560 | Cathepsin L-like protease | Frame shift | Scaffold132:3684567 |
| Bta07152 | Cathepsin L-like protease | Frame shift | Scaffold2737:56518 |
| Bta06739 | Cation transport regulator-like protein 1 | Frame shift | Scaffold2605:2569958 |
| Bta08022 | CG13675, isoform D | Frame shift | Scaffold3040:3058531 |
| Bta04412 | CG14375 | Frame shift | Scaffold165:195426 |
| Bta03710 | CG17612, isoform A | splice acceptor variant | Scaffold155:194033 |
| Bta11746 | CG7120, isoform F | splice donor, frame shift | Scaffold52:4009995 |
| Bta10928 | Chromodomain Y-like protein 2 | Frame shift | Scaffold477:1214758 |
| Bta12891 | Citron Rho-interacting kinase | Frame shift | Scaffold613:2332964 |
| Bta02184 | Cystatin | frame shift | Scaffold128:1309680 |
| Bta07162 | DDB1- and CUL4-associated factor | start loss, stop gain | Scaffold2737:410077, Scaffold2737:419749 |
| Bta07434 | DNA-directed RNA polymerase, omega subunit family protein | stop gain, frame shift | Scaffold2890:209245 |
| Bta14689 | Dolichyl-diphosphooligosaccharide--protein glycosyltransferase subunit STT3B | splice acceptor variant | Scaffold811:2521172 |
| Bta15680 | E3 ubiquitin-protein ligase TTC3 | Frame shift | Scaffold988:3252798 |
| Bta03681 | Eukaryotic translation initiation factor 3 subunit A | Frame shift | Scaffold1512:1481746 |
| Bta14560 | Galectin | Frame shift | Scaffold809:3964391 |
| Bta10009 | General transcription factor 3C polypeptide 2 | Frame shift | Scaffold382:2610001 |
| Bta04387 | GH16255p | splice acceptor variant | Scaffold1647:2597569 |
| Bta00770 | GK11989 | stop lost, frame shift | Scaffold1103:753116 |
| Bta01833 | Klarsicht, isoform E | stop gain, frame shift | Scaffold123:5600056 |
| Bta09051 | Laminin subunit beta-1 | stop gain | Scaffold338:1218247 |
| Bta01704 | Loquacious | Frame shift | Scaffold123:255863 |
| Bta03800 | Lysosomal-trafficking regulator | Frame shift | Scaffold155:4253425 |
| Bta05467 | Major royal jelly-related protein | stop lost | Scaffold199:6871677 |
| Bta05773 | NADH dehydrogenase [ubiquinone] 1 alpha subcomplex subunit 12 | stop gain | Scaffold2229:151223 |
| Bta15454 | Neuroendocrine convertase 1 | Frame shift | Scaffold959:3096344 |
| Bta10191 | Nidogen-2 | stop gain | Scaffold3978:2723 |
| Bta13257 | Protein patched | Frame shift | Scaffold641:259689 |
| Bta15368 | Protein phosphatase 1 L | Frame shift | Scaffold959:15812 |
| Bta13589 | Protein unc-45-like protein A | Frame shift | Scaffold651:2187902 |
| Bta02051 | Regucalcin | start loss | Scaffold1240:111531 |
| Bta10926 | Replication factor-a protein 1 | Frame shift | Scaffold477:1149858 |
| Bta12190 | Sortilin-related receptor | Frame shift | Scaffold545:2426274 |
| Bta02847 | Sulfotransferase | stop gain | Scaffold14:67127 |
| Bta08229 | Symplekin | splice acceptor variant | Scaffold317:609334 |
| Bta07946 | Terribly reduced optic lobes, isoform AN | splice acceptor | Scaffold3040:556936 |
| Bta05242 | Transcriptional protein SWT1 | Frame shift | Scaffold1898:321532 |
| Bta09856 | Trehalase | stop gain | Scaffold374:3016858 |
| Bta03298 | Trypsin-like serine protease | stop lost | Scaffold147:7182519 |
| Bta09090 | Tudor domain protein | stop lost | Scaffold338:1990664 |
| Bta08596 | Tudor domain-containing protein 1 | Frame shift | Scaffold322:4722919 |
| Bta03892 | Ubiquitin carboxyl-terminal hydrolase | start lost | Scaffold1580:568946 |
| Bta01518 | Unknown protein | stop gain | Scaffold1214:734963 |
| Bta01571 | Unknown protein | Frame shift | Scaffold1224:1139169 |
| Bta01615 | Unknown protein | Frame shift | Scaffold1224:3224397 |
| Bta02665 | Unknown protein | Frame shift | Scaffold1339:520464 |
| Bta02767 | Unknown protein | Frame shift | Scaffold137:1379435 |
| Bta02836 | Unknown protein | Frame shift | Scaffold139:1098948 |
| Bta02920 | Unknown protein | Frame shift | Scaffold14:3202002 |
| Bta03301 | Unknown protein | stop gain, frame shift | Scaffold147:7328434 |
| Bta03426 | Unknown protein | stop gain | Scaffold1496:690294 |
| Bta03435 | Unknown protein | Frame shift | Scaffold1496:1047497 |
| Bta04551 | Unknown protein | Frame shift | Scaffold165:5163918 |
| Bta04829 | Unknown protein | Frame shift | Scaffold17:652047 |
| Bta04921 | Unknown protein | Frame shift | Scaffold17:652047 |
| Bta05143 | Unknown protein | stop gain | Scaffold18461:1072084 |
| Bta05268 | Unknown protein | stop gain, frame shift | Scaffold1971:32055 |
| Bta05546 | Unknown protein | Frame shift | Scaffold2013:237841 |
| Bta05683 | Unknown protein | Frame shift | Scaffold2124:427571 |
| Bta05758 | Unknown protein | Frame shift | Scaffold2225:1204179 |
| Bta05761 | Unknown protein | stop gain | Scaffold2225:1258041 |
| Bta05893 | Unknown protein | stop gain | Scaffold226:1397519 |
| Bta06123 | Unknown protein | splice donor | Scaffold231:3876649 |
| Bta07727 | Unknown protein | Frame shift | Scaffold300:708005 |
| Bta07839 | Unknown protein | stop gain | Scaffold300:4527825 |
| Bta08000 | Unknown protein | stop gain, frame shift | Scaffold3040:2567504 |
| Bta08242 | Unknown protein | Frame shift | Scaffold317:1074159 |
| Bta08287 | Unknown protein | stop gain | Scaffold320:265593 |
| Bta08375 | Unknown protein | stop gain | Scaffold320:3813827 |
| Bta08462 | Unknown protein | Frame shift | Scaffold322:385722 |
| Bta08745 | Unknown protein | Frame shift | Scaffold325:3471439 |
| Bta10862 | Unknown protein | splice acceptor variant | Scaffold471:791307 |
| Bta11840 | Unknown protein | Frame shift | Scaffold52:7764853 |
| Bta12278 | Unknown protein | stop gain | Scaffold562:2009445 |
| Bta12668 | Unknown protein | start lost | Scaffold607:1307735 |
| Bta12727 | Unknown protein | Frame shift | Scaffold607:2833985 |
| Bta13235 | Unknown protein | Frame shift | Scaffold64:63239 |
| Bta13327 | Unknown protein | splice donor | Scaffold641:3718364 |
| Bta13745 | Unknown protein | Frame shift | Scaffold657:1097200 |
| Bta13859 | Unknown protein | stop gain | Scaffold67:1393372 |
| Bta13954 | Unknown protein | Frame shift | Scaffold699:810303 |
| Bta15302 | Unknown protein | Frame shift | Scaffold942:1732675 |
| Bta15415 | Unknown protein | Frame shift | Scaffold959:1270849 |
| Bta07758 | Zinc finger protein | Frame shift | Scaffold300:1778386 |
| Bta06175 | Zinc finger protein 227 | stop gain | Scaffold232:1822927 |
| Bta08766 | Zinc finger protein 34 | Frame shift | Scaffold325:3972542 |
| Bta11305 | Zinc finger protein 845 | Frame shift | Scaffold493:2884873 |
Structural Variants
| Type | Scaffold | Start | End | Length (bp) | CNV | Genes* |
|---|---|---|---|---|---|---|
| duplication | Scaffold112 | 2,190,001 | 2,470,000 | 280,000 | 1.59861 | |
| duplication | Scaffold130 | 2,120,001 | 2,590,000 | 470,000 | 1.50412 | Bta02314 Bta02317 Bta02318 Bta02321 Bta02311 Bta02319 Bta02313 Bta02315 Bta02320 Bta02322 Bta02312 Bta02316 |
| duplication | Scaffold310 | 2,080,001 | 2,870,000 | 790,000 | 1.51561 | Bta08154 Bta08157 Bta08159 Bta08153 Bta08161 Bta08158 Bta08160 Bta08155 Bta08156 |
| duplication | Scaffold343 | 3,950,001 | 4,160,000 | 210,000 | 2.19297 | Bta09326 |
| duplication | Scaffold403 | 2,470,001 | 2,980,000 | 510,000 | 1.55897 | Bta10316 Bta10315 Bta10317 Bta10318 Bta10319 |
*Annotations of the genes are described in Additional file 6
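The length column is consistent with the start and end columns if the coordinates are read as 1-based and inclusive, so that length = end - start + 1. The sketch below checks this for the five duplications; the coordinate convention is an assumption inferred from the numbers, and the names are illustrative.

```python
# (scaffold, start, end, reported length) from the Structural Variants table.
structural_variants = [
    ("Scaffold112", 2_190_001, 2_470_000, 280_000),
    ("Scaffold130", 2_120_001, 2_590_000, 470_000),
    ("Scaffold310", 2_080_001, 2_870_000, 790_000),
    ("Scaffold343", 3_950_001, 4_160_000, 210_000),
    ("Scaffold403", 2_470_001, 2_980_000, 510_000),
]


def interval_length(start, end):
    """Length of a 1-based, inclusive interval (assumed convention)."""
    return end - start + 1


# Rows whose reported length disagrees with the computed one.
mismatches = [
    scaffold
    for scaffold, start, end, reported in structural_variants
    if interval_length(start, end) != reported
]

# Total sequence spanned by the listed duplications.
total_duplicated_bp = sum(
    interval_length(start, end) for _, start, end, _ in structural_variants
)

print("rows with inconsistent length:", mismatches)
print(f"total duplicated span: {total_duplicated_bp:,} bp")
```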