Literature DB >> 31470795

The Komodo dragon (Varanus komodoensis) genome and identification of innate immunity genes and clusters.

Monique L van Hoek1, M Dennis Prickett2, Robert E Settlage3, Lin Kang4, Pawel Michalak4,5,6, Kent A Vliet7, Barney M Bishop8.   

Abstract

BACKGROUND: We report the sequencing, assembly and analysis of the genome of the Komodo dragon (Varanus komodoensis), the largest extant lizard, with a focus on antimicrobial host-defense peptides. The Komodo dragon diet includes carrion, and a complex milieu of bacteria, including potentially pathogenic strains, has been detected in the saliva of wild dragons. They appear to be unaffected, suggesting that dragons have robust defenses against infection. While little information is available regarding the molecular biology of reptile immunity, it is believed that innate immunity, which employs antimicrobial host-defense peptides including defensins and cathelicidins, plays a more prominent role in reptile immunity than it does in mammals. .
RESULTS: High molecular weight genomic DNA was extracted from Komodo dragon blood cells. Subsequent sequencing and assembly of the genome from the collected DNA yielded a genome size of 1.6 Gb with 45x coverage, and the identification of 17,213 predicted genes. Through further analyses of the genome, we identified genes and gene-clusters corresponding to antimicrobial host-defense peptide genes. Multiple β-defensin-related gene clusters were identified, as well as a cluster of potential Komodo dragon ovodefensin genes located in close proximity to a cluster of Komodo dragon β-defensin genes. In addition to these defensins, multiple cathelicidin-like genes were also identified in the genome. Overall, 66 β-defensin genes, six ovodefensin genes and three cathelicidin genes were identified in the Komodo dragon genome.
CONCLUSIONS: Genes with important roles in host-defense and innate immunity were identified in this newly sequenced Komodo dragon genome, suggesting that these organisms have a robust innate immune system. Specifically, multiple Komodo antimicrobial peptide genes were identified. Importantly, many of the antimicrobial peptide genes were found in gene clusters. We found that these innate immunity genes are conserved among reptiles, and the organization is similar to that seen in other avian and reptilian species. Having the genome of this important squamate will allow researchers to learn more about reptilian gene families and will be a valuable resource for researchers studying the evolution and biology of the endangered Komodo dragon.

Entities:  

Keywords:  Antimicrobial peptide; Cathelicidin; Defensin; Gene cluster; Innate immunity; Komodo dragon; Monitor lizard; Varanus komodoensis

Mesh:

Substances:

Year:  2019        PMID: 31470795      PMCID: PMC6716921          DOI: 10.1186/s12864-019-6029-y

Source DB:  PubMed          Journal:  BMC Genomics        ISSN: 1471-2164            Impact factor:   3.969


Background

Komodo dragon (Varanus komodoensis) is the world’s largest extant lizard, weighing up to 75–100 kg and measuring up to three meters in length. This species of monitor lizard, indigenous to Komodo and nearby islands in southern Indonesia (Fig. 1), is a relic of very large varanids that once populated Indonesia and Australia, most of which, along with other megafauna, died out after the Pleistocene [1]. Komodo dragons are endangered and actively conserved in zoos around the world and in Komodo National Park, a UNESCO World Heritage site, due to their vulnerable status [2]. They are believed to have evolved from other varanids in Australia, first appearing approximately 4 million years ago [1].
Fig. 1

Komodo dragon (Varanus komodoensis). Tujah, a large male Komodo dragon residing at the St. Augustine Alligator Farm Zoological Park, and the source of the DNA used in the present study. Photograph courtesy of the St. Augustine Alligator Farm Zoological Park in St. Augustine, Florida

Komodo dragon (Varanus komodoensis). Tujah, a large male Komodo dragon residing at the St. Augustine Alligator Farm Zoological Park, and the source of the DNA used in the present study. Photograph courtesy of the St. Augustine Alligator Farm Zoological Park in St. Augustine, Florida On their native Indonesian islands, Komodo dragons are the dominant terrestrial predators, even though their diet is based mainly on carrion [3]. The saliva of wild dragons (as opposed to zoo-kept animals) has been found to contain as many as 58 species of bacteria, many of which are pathogenic [3-5], which may also contribute to their effectiveness as predators. The lizards themselves appear to be unaffected by these bacteria, despite biting each other in fights and having bleeding gums during feedings. Furthermore, their plasma has been shown to have potent antimicrobial properties [6]. Thus, we hypothesized that Komodo dragons would have robust innate immunity and this innate immunity may be partially mediated by antimicrobial peptides. There are few studies regarding the reptilian immune response; however, as in mammals, reptiles have both an innate and adaptive immune response with cell mediated and humoral components. The reptile immune response is primarily dependent on an efficient innate immune response as the adaptive immune response does not consistently demonstrate evidence of a memory response [7]. Innate immunity, which includes chemokines and cytokines, provides the first line of defense against infection in higher vertebrates and is partially mediated by antimicrobial host-defense peptides [8, 9]. Antimicrobial host-defense peptides play complex roles in host defense against infection, with peptides exhibiting a range of pathogen-directed antimicrobial effects as well as host-directed immunomodulatory, chemotactic, inflammomodulatory and wound healing properties [8, 9]. The role and prevalence of antimicrobial peptides in the innate immune response of reptiles is only now being understood [10-15]. The plasma and cell extracts of crocodiles, alligators and Komodo dragons have been shown by several groups to have antimicrobial properties [6, 10, 16–20]. Recently, our group has made significant technical advances in developing a method for the identification and characterization of native antimicrobial peptides (BioProspector process), which we employed in the discovery of novel, non-canonical, active antimicrobial peptides in alligator plasma [21-23] and Komodo dragon plasma [24, 25]. The major classes of antimicrobial host-defense peptides in vertebrates include defensins and cathelicidins [8, 9]. These peptides are produced as part of the host-defense innate immune response by cells throughout the body, including epithelium, endothelium and white blood cells. Like most cationic antimicrobial host-defense peptides, defensins and cathelicidins tend to be relatively small peptides (< 100 amino acids in length) that simultaneously exhibit cationic and amphipathic qualities. They are generally membrane-active peptides that can disrupt bacterial membrane integrity as part of their antimicrobial mechanism. The cationic and amphipathic properties of these peptides contribute to their ability to preferentially target and disrupt bacterial membranes, which tend to be rich in anionic lipids, rather than host cell membranes, whose outer surfaces tend to be predominantly neutral in nature. The family of vertebrate defensin peptides includes alpha-, beta-, theta- and ovo-defensin subclasses, with alpha- and theta-defensins being unique to mammals and ovodefensins to birds and reptiles [26, 27]. Peptides in each subclass exhibit compact three-dimensional conformations stabilized by characteristic conserved patterns of cysteine residues and associated disulfide bond networks. The disulfide bond networks in each defensin subclass are critical to their ability to adopt well-defined structures, which are essential to their antimicrobial and host-directed properties. Cathelicidins are another major class of host-defense antimicrobial peptides and are unique to vertebrates [28]. The functional cathelicidin peptides exhibit diverse sequences and structures. However, they are distinguished by the presence of conserved N-terminal pre-pro-cathelin domains in the cathelicidin precursor proteins [29]. Cathelicidins are often packaged in azurotrophic granules in neutrophils and have been identified in chicken heterophils (avian white blood cells) [30]. The detailed characteristics of each peptide subclass are described in the relevant sections below. Advances in genomic techniques and the availability of sequenced genomes have rapidly expanded our understanding of the presence of innate immunity genes across different classes. The anole lizard has been found to have genes for most of the major classes of antimicrobial peptides that are produced by mammals and other vertebrates, including β-defensins and cathelicidins [13]. As in the case of birds, genes for α-defensin peptides have not been reported to date in reptiles; this class of antimicrobial peptides appears to be restricted to mammals [13]. However, the status of antimicrobial peptide genes in the Komodo dragon has not been determined, due to the lack of a published Komodo dragon genome. Their tolerance to regular exposure to potentially pathogenic bacteria in their saliva and apparent resistance to bacterial infection suggests that Komodo dragon’s evolutionary adaptations may extend to their innate immunity and the host-defense peptides that they employ. As part of our effort to extend our earlier study of Komodo dragon cationic antimicrobial peptides [24], genomic DNA and RNA were obtained from Komodo dragon blood samples and sequenced in order to provide a Komodo dragon-specific DNA sequence database to facilitate de novo peptide sequencing [24]. Here, we report the sequencing, assembly, and analysis of the Komodo dragon genome. This work will also provide evidence of the robust innate immunity of these lizards and will be a valuable resource for researchers studying the evolution and the biology of the endangered Komodo dragon. The analysis reported here is focused on genes associated with innate immunity and host-defense peptides. However, further investigation of the Komodo dragon genome may have broader impact on our understanding of the biology and evolution of reptiles.

Results and discussion

Cell types in Komodo dragon blood

A sample of blood was obtained from a Komodo dragon named Tujah at the Saint Augustine Alligator Farm Zoological Park in accordance with required safety and regulatory procedures, and with appropriate approvals. At the time of collection, we were interested in collecting both genomic DNA for sequencing as well as mRNA to generate a cDNA library to facilitate our proteomic studies. In birds, the heterophils (white blood cells) are known to express multiple antimicrobial peptides [30]. Antimicrobial peptides identified from chicken heterophils exhibit significant antimicrobial [31, 32] and host-directed immunomodulatory activities [29]. Accordingly, after obtaining an initial sample of fresh Komodo dragon blood, we allowed the white blood cells to settle out of the blood and collected them because they were likely to be involved with antimicrobial peptide expression. The collected Komodo dragon white blood cells were then divided evenly, with half being processed for the isolation of genomic DNA in preparation for sequencing and library generation, and the other half reserved for mRNA extraction for our proteomic studies. We then performed smears and identified the various cell types that we observed. Immune cell identification in Komodo dragon blood is challenging due to limited published literature for reference. The various cell types that were observed in Wright-stained blood smears are shown in Fig. 2. We identified these cells based on similarity to the immune cells we had previously identified in the American alligator blood [12]. Of interest were the large and elongated nucleated red blood cells of this reptile. In addition, we were able to identify heterophils (similar to granulocytes), a probable source of cathelicidin peptides, as well as monocyte and lymphocyte cells.
Fig. 2

Komodo dragon red blood cells and immune cells. Blood cells from Komodo dragon were visualized by Wright stain and imaged at 40x. Cell types are identified as: A. nucleated red blood cell, B. monocyte, C. lymphocyte, and D. heterophil

Komodo dragon red blood cells and immune cells. Blood cells from Komodo dragon were visualized by Wright stain and imaged at 40x. Cell types are identified as: A. nucleated red blood cell, B. monocyte, C. lymphocyte, and D. heterophil A second sample of Komodo dragon blood was later collected and processed for genomic DNA extraction by Dovetail Genomics for additional sequencing. The researchers at Dovetail Genomics did not separate white blood cells, and instead extracted DNA from cells pelleted directly from whole blood.

Assembly and annotation of the Komodo dragon genome

Previous analyses of Komodo dragon erythrocytes using flow cytometry estimated the genome to be approximately 1.93 Gb in size [33]. Using deep Illumina sequencing and Dovetail approaches, we obtained a draft genome assembly that was 1.60 Gb large, similar to the genome size of A. carolinensis lizard genome which is 1.78 Gb [34]. The draft assembly contains 67,605 scaffolds with N50 of 23.2 Mb (Table 1). A total of 17,213 genes were predicted, and 16,757 (97.35%) of them were annotated. Completeness estimates with CEGMA [35] were 56% (‘complete’) and 94% (‘partial’). The estimated percentage of repeats in the genome is 35.05% with the majority being LINEs (38.4%) and SINEs (5.56%) (Additional file 1: Fig. S1 & Additional file 2: Table S1). Genomic data will be available at NCBI with raw sequencing reads deposited in the Sequence Read Archive (#SRP161190), and the genome assembly at DDBJ/ENA/GenBank under the accession #VEXN00000000. The assembly version described in this paper is VEXN01000000.
Table 1

Genome assembly attributes

Assembly Attributes
Total Size1599,705,180 bp
No. of Scaffold67,605
No. of Scaffold > 5 K2608
Scaffold N5023,221,552 bp
GC Content44.30%
No. of Predicted Gene17,213
No. (Percentage) of Annotated Gene16,757 (97.35%)
Genome assembly attributes

Identification of potential innate immunity and antimicrobial peptide genes

Innate immunity in reptiles is a critical aspect of their evolutionary success, but it remains poorly understood in these animals. Innate immunity is defined as those aspects of immunity that are not antibodies and not T-cells. Innate immune responses to invading pathogens can include the expression of cytokines; the activation and recruitment of macrophages, leukocytes and other white blood cells; and the expression of antimicrobial peptides such as defensins and cathelicidins [13, 15]. We have taken a genomics-based approach [36] to identifying innate immunity genes in the Komodo dragon genome in this work. We have sequenced the Komodo genome and examined it for genes and clusters of important innate immunity antimicrobial peptide genes (β-defensins, ovodefensins and cathelicidins), which are likely involved in expressions of innate immunity in this giant lizard.

β-Defensin and related genes in Komodo genome

Defensins are one example of disulfide-stabilized antimicrobial peptides, with β-defensins being a uniquely vertebrate family of disulfide-stabilized, cationic antimicrobial peptides involved in the resistance to microbial colonization at epithelial surfaces [37-39]. The β-defensin peptides are defined by a characteristic six-cysteine motif with conserved cysteine residue spacing (C–X6–C–X (3–5)–C–X (8–10)–C–X6–CC) [40] and associated disulfide bonding pattern (Cys1-Cys5, Cys2-Cys4 and Cys3-Cys6); however, variations in the number of and spacing between cysteine residues has been observed. As with other cationic antimicrobial peptides, β-defensins typically exhibit a net positive (cationic, basic) charge. One of the first extensive reports of an in vivo role for β-defensin peptide expression in reptiles is the inducible expression of β-defensins in wounded anole lizards (Anolis carolinensis) [10, 11, 14, 41–43]. Reptile neutrophils appear to have granules that contain both cathelicidin-like peptides as well as β-defensin peptides. β-defensin-like peptides are also found in reptile eggs [26]. It is well-known that some species of lizard can lose their tails as a method of predator escape, and that these tails then regenerate from the wound site without inflammation or infection. β-defensin peptides are expressed both within the azurophilic granulocytes in the wound-bed as well as in the associated epithelium [41, 43] and are observed in phagosomes containing degraded bacteria. There is a distinct lack of inflammation in the wound, which is associated with regeneration, and two β-defensins in particular are expressed at high levels in the healing tissues [10, 42] Overall, there appears to be a significant role for the β-defensins in the wound healing and regeneration in the anole lizard [44]. β-defensin genes have been generally observed to reside in clusters within the genomes of vertebrates [45, 46]. In humans, as many as 33 β-defensin genes were identified in five clusters [47, 48]. Recently, analyses of the genomes of several avian species including duck, zebra finch and chicken revealed that the genome of each species contained a β-defensin cluster [49-52]. A β-defensin-like gene cluster has recently been identified in the anole lizard (Prickett, M.D., unpublished work in progress), which is closely related to the Komodo dragon [13]. Interestingly, the cathepsin B gene (CTSB) has been identified as a strong marker for β-defensin clusters in humans, mice, and chickens [51]. Thus, we examined the Komodo genome for the cathepsin B gene (CTSB) as a potential marker to aid in the identification of the β-defensin cluster(s) therein. Through these analyses, we identified a total of 66 potential β-defensin genes in the Komodo dragon genome, of which 18 are thought to be Komodo dragon-specific β-defensin genes (Table 2). The β-defensin genes identified from the Komodo dragon genome exhibit variations in cysteine spacing, gene size, the number of cysteine residues that comprise the β-defensin domain, as well as the number of β-defensin domains. With respect to the conserved cysteine residue spacing, especially at the end (C–X6–C–X (3–5)–C–X (8–10)–C–X6–CC), we found considerable variability in our analysis of the β-defensin genes in the Komodo dragon genome, in that five Komodo dragon β-defensin genes have seven resides between the last cysteines, 16 have six residues between the last cysteines, 42 have five residues between the last cysteines, and three Komodo dragon β-defensin genes exhibit more complex cysteine-residue spacing patterns (Table 2).
Table 2

Identified Komodo dragon Defensin genes grouped based on scaffold locations of gene clusters

NameClosest Anole OrthologClosest Snake OrthologParalogsSequences
Scaffold 210This scaffold contains the cluster of six β-ovodefensin genes, one intercluster β defensin, and 15 β-defensin genes that are similar to anole lizard β-defensins. The cluster begins with the CTSB gene. These genes include large β defensins.
VkBDic1_VARCOLzBD81SnBDic1,2GLAKRKPRSRRECYSLDGSCYLGRCPSVLKKYGWCGTLKRCCIR
VkBD1_VARKOLzBD1SnBD1,AMGDLYDSLECHNHHGHCRRICFHNERSIGTCTNRRQLCCK
VkBD2_VARKOLzBD2SnBD2,VkBD3,4,5,6GKGDFVTLGCLFRGGACKTDTCRDNEVQIGNCSNTKQLCCKKTR
VkBD3_VARKOLzBD3VkBD2,4,5,6GALQDKKGSCDLALHNCRMGFCSQEEIPTGEFCFEPVILCCKHLPEKSKSPERLEGVASFADVLRNLL
VkBD4_VARKOLzBD4SnBD3,VkBD2,3,5,6GGGQREARFVSHCLRRGGICRYDDCSEGEEQIGTCYHHTMICCRDEVM
VkBD5_VARKOLzBD4SnBD3,4,VkBD2,3,4,6GASLNLSARICWQRGGRCRSSPCYYDEAQIGTCYHARLKCCRQTE
VkBD6_VARKOLzBD5VkBD2,3,4,5GHSNEIDTQQCLKDKGTCHPTICPTQKMSKGTCYSGAQLCCVGESVHMCVRPFFPSCCFSRLSCSRSGGL
VkBD7_VARKOLzBD6,82SnBD5,6,VkBD8GQPHICKRMGGYCQITSKPCPYGFLPTTCGIFQTCCAKKPPTTNDCEKLGGFCRVPLHEKCPSGQDLPANCGINARCCKPETA
VkBD8_VARKOLzBD6,82SnBD5,6,VkBD7GQVPACTKLGGHCVTPLTVTCPYGSLGANCGINAICCAR
VkBD9_VARKOTIGTFLVIQAKLCRKAGGFCISRFAHCPSSEIILIRCRSAQKCCKQ
VkBD10_VARKOLzBD11SnBD7,EGCPVEAMLLTCTSLGGFCMLEPNQTCPSGLILDVPCHFKRRCCSKKEV
VkBD11_VARKOGQSLSCRELGGLCVALADDDCDSEEILPADCGRLTLCCKG
VkBD12_VARKOLzBD12–17

SnBD8

(python only)

VkBD13,14EQALRRVPPVQIRSIRRYCNGRDGYCLDRKFFCESGFVYKEEYNDCMFQNKNKCCVIQKSK
VkBD13_VARKOLzBD12–17SnBD8 (python only)VkBD12,14DPSLPSKKAATSRFTTFACNGNFGYCLPRNWRCFSGLVYKEEFNDCPLPDIFKCCL
VkBD14_VARKOLzBD12–17SnBD8 (python only)VkBD12,13NKGLPTTTRVGKIPCLGPWGHCFFRKYRCASGFVMKERFNNCPNTRTLKCCVL
VkBD15_VARKOGFLHVLARHCQKLGGTCKPHKAQCKVRIQILVPCGPGRKCCL
Scaffold 286We identified 40 β defensins genes, including multiple large defensins (with some peptides containing eight instead of typical six cysteine residues) and peptides with multiple domains.
VkBD16_VARKOVkBD17, 18GGGLQDGEDPPGPRASCNSGRGYCLRCGAACPSGQIYVYNDCANPCSNKCCVRR
VkBD17_VARKOVkBD16,18GQSSYWGHLSCNSGRGYCLPCHLRCPYGYYYYNDCPGQCRYKCCVRR
VkBD18_VARKOVkBD16,17APDRVPHEQAPPCNHGLGYCHPCSLPCTSGKYYYYNDCDKPCANRCCTKQ
VkBD19_VARKOGTELMEEEYLARAILCHSGQGVCRPRDLQCPSGLTYIYNDCPKTELYKCCAK
VkBD20_VARKOEQNRLARSKCVCRKFCYPNEYPMGMCAIILVPMCCTFDPQGGS
VkBD21_VARKOVkBD22VSDKSSSCRSAGGKCYLILCPRGTGRIGKCSFTHVCCK
VkBD22_VARKOVkBD21GRTEVHDVRMCKLYGGECFLLVCPPGRAFIGKCSRLNVCCRG
VkBD23_VARKOLzBD26SnBD9, 10 (Colubroidea only)GLAPDTPHDRSSCELNSGFCYSSNCPRCFGQYGFCDDKHPSCCVRDNSIPGCEVVTTPTYDKSSKTTAVPTTSETAPPQTAPPQTAPPQTAPPQTAPDESDKKC
VkBD24_VARKOLzBD27GYGLVPAGLSDTIKCRTTPRSFCKAVVCPPTFEPTGTCFGGSMHCCSK
VkBD25_VARKOGEFANISNYYQCRHAGAMCAIFKCPAPFNSIGKCGIFKPCCI
VkBD26_VARKOGSTEIIGEAKCAEYTGTCRLFECPISSMVENIGYCREDYVCCIR
VkBD27_VARKOLzBD31DGGTCRQLKGFCSTRRCPDKTFEVLGRCTPRKACCQKKANA
VkBD28_VARKOLzBD32,33SnBD13,14GNAQAAEPDTLQCVRAGGSCNFGECRPPSVASGTCKGETLNCCKW
VkBD29_VARKOLzBD34,35, 37,73VkBD30,31GHSVSNEAECKREGGFCIIAIGNPCRFPYGFIGKCSWWKFCCK
VkBD30_VARKOLzBD34,35, 37,73VkBD29,31GSARWVKNEDECKEDGGTCLAIVIGGCGPLLQIGICGLGRNCCK
VkBD31_VARKOLzBD34,35, 37,73VkBD29,30GCSISDRKECREGGGFCLPFLPNMCQVFSIIIGKCSRWSYCCKW
VkBD32_VARKOASTINTSQQCKNAGGHCIWSSCGFPLCPVGRCGYWSLCCKA
VkBD33_VARKOGTPASPVRGPLQCVKQGGFCMSGNCRFPLRKLGTCHRFKACCIR
VkBD34_VARKOLzBD39SnBD15GIYTTLDLEMLSCLSAKDAYCKTGHCLEYSVSLGKCNSYLSCCRSMINNLWRCAIAKGYCSMWTCPPPLIKNGRCSRDSPCCVQ
VkBD35_VARKOLzBD41,43SnBD17 (Colubroidea only)VkBD36GLAMRTERIDSAEECAKIQGFCTSEPECNTVYPIQGTCGEGTLCCLE
VkBD36_VARKOLzBD41,43SnBD17 (Colubroidea only)VkBD35TGYTCVLEEILTKEDCELIGATCIDGEDCNPPFPSQGTCGEGTVCCIP
VkBD37_VARKOLzBD50GLTTYVTSPEECESIKGFCTPRLCQENFRTLGECSAGIPCCTR
VkBD38_VARKOSnBD17 (python only)GFTRDIWDRPTCRSSQGFCWLITCPWPFAKHIGECIWPILRCCA
VkBD39_VARKOLzBD52,53,55 (2 defensin domains)VkBD43DLAADCRLREGRCTRTSCSKTDIYLGECAKEIQCCKRDPALHCERQGGLCTQASCTGSDINLGPCGRGLKCCKKDPVAHCEAQGGKCVQNSCPTSDVFLGRCGNGFQCCKT
VkBD40_VARKOLzBD52–55GQPQKYIEKKECMDFGAQCTKAHCGEDYIFYGFCRTGVACCKR
VkBD41_VARKOLzBD57SnBD22 (pseudo-gene)GVALPFQDDSVCIERGGRCMHLPCHPLRRIGRCLLNTYCCH
VkBD42_VARKOLzBD58GSTQFIKLAHKCLDAGGRCQKQACDFRKSIGRCNDQELCCKR
VkBD43_VARKOLzBD52,53,55 (2 defensin domains)VkBD39DPVASCASQGGKCTPDICPSNFVLVGQCDQKLLCCKRDPVAHCKSQGGRCMQSRCASNNEYLGDCGPAVWCCKM
VkBD44_VARKOGFAHMGINFREQCQWAEGKCQLYHCPAGWKKIGKCSKVVPCCSDK
VkBD45_VARKOGSNPKQCEQKRAQCMIVCPRTHKKVGHCGRSLSCCAQR
VkBD46_VARKOGYTSAVARCRRRGGKCHFGRCPQGKNRIGLCFLGTPCCTR
VkBD47_VARKOLzBD63GFTTEIRRARTCIEANGKCRLATCLHDPWERIGKCGDKRFCCKK
VkBD48_VARKOVkBD50,52, 53,54GHTFGFLRCRRNGGHCFPRGCPVGWKPRGRCIGRFTCCVRQGKRPAMEKARQ
VkBD49_VARKOGHTFGVGECRDQFGSCKFAVCRNGWRRVGWCFLAVPCCRR
VkBD50_VARKOVkBD48,52, 53,54GTGQTPGALRCQRNGGLCFPWGCPPGWRPRGRCFGRFTCCIR
VkBD51_VARKOGRTSGLVACQRKRGICLVGRCRPGWRQVGWCSRGVSCCKW
VkBD52_VARKOVkBD48,50, 53,54GQKPVILRCLRNGGRCFSWRCPSGWMLRGPCLRPFICCVR
VkBD53_VARKOVkBD48,50, 52,54GRTFGPPRCRRIGGFCIPRRCPSGWRQRGPCLGPFTCCKR
VkBD54_VARKOVkBD48,50, 52,53GTARTFGPARCRRIGGLCIPWGCPSGWRRRGRCFGRFTCCKR
VkBD55_VARKOGGSSDGQPVSAPSSARLMRRCRSFYSPCKTCPVLLKELSCQVVHHPCCPPLKLLHL
Scaffold 7We identified seven β-defensins on the edge of a large scaffold. This cluster ends with the gene XHO1
VkBD56_VARKOVkBD58,60GCLGVYEKGRRQCCTRNGHCYFLFCKKGTVKIGTCNFFSRCCSR
VkBD57_VARKOGLAGENELDQRGCRRRCLVRHCTRRGRFYVCRRYYICCGR
VkBD58_VARKOVkBD56,60ECLGVYVKGRRQCRTQNGHCYFFYCKKDTYKIGICNFFTICCSR
VkBD59_VARKOVSSPEECGRHGGLRLAGPRFSCWSGLLTGHCDAKHRCCKR
VkBD60_VARKOVkBD56,58ECHGMYVNGKRQCHTQNGHCYFLYCKKDTYKIGTCNFFSRCCSR
VkBD61_VARKOLzBD77GFSWSNDATLRCRYTGGYCDSLICRWPLRNAGFRCKNNRPCCKR
VkBD62_VARKOLzBD78GSARTPRSDLECQVHKGMCFPHGCPAQWSRIGSCSVRKHRCCR
Scaffold 45We identified three β-defensin genes and TRAM2 on the edge of a large scaffold. The gene VKBD80 may be a duplication. TRAM2represents the end of this cluster.
VkBD79_VARKOPGPYPCNWVCHTSPGVQSVCSAMPVRLKLLLGLASVCISVSTLCCFRQH
VkBD80a_VARKOLzBD80SnBD40,GRCRRLKGVCRHTLCHPVEVYVGRCNNGMGNCCVDDAEDIRKHIK
VkBD80b_VARKOLzBD80SnBD40,GRRRRLKGVCRHTLCHPVEVYVGRCNNGMGNCCVDDAEDIRKHIK

Reptile β-defensin orthologs are also indicated (where they exist) as well as the β-defensin peptide sequences encoded by each gene. Overall, 66 β-defensin genes were identified, of which 4 encode multiple defensin domains (VkBD7, VkBD34, VkBD39, and VkBD43)

Identified Komodo dragon Defensin genes grouped based on scaffold locations of gene clusters SnBD8 (python only) Reptile β-defensin orthologs are also indicated (where they exist) as well as the β-defensin peptide sequences encoded by each gene. Overall, 66 β-defensin genes were identified, of which 4 encode multiple defensin domains (VkBD7, VkBD34, VkBD39, and VkBD43) As with birds and other reptiles, the majority of Komodo dragon defensin genes appear to reside in two separate clusters within the same syntenic block (Fig. 3). One cluster is a β-ovodefensin cluster flanked on one end by the gene for XK, Kell blood group complex subunit-related family, member 6 (XKR6) and on the other end by the gene for Myotubularin related protein 9 (MTMR9). The intercluster region of circa 400,000 bp includes the genes for Family with sequence similarity 167, member A (FAM167A); BLK proto-oncogene, Src family tyrosine kinase (BLK); Farnesyl-diphosphate farnesyl transferase 1 (FDFT1); and CTSB (cathepsin B), which is a flanking gene for the β-defensin cluster (Fig. 3). In birds, turtles, and crocodilians, the other end of the β-defensin cluster is followed by the gene for Translocation associated membrane protein 2 (TRAM2). As is the case with all of the other squamate (lizards and snakes) genomes surveyed, the flanking gene for the end of the β-defensin cluster cannot be definitively determined at present as there are no squamate genomes with intact clusters available.
Fig. 3

β-defensin gene family clusters. Scaffold locations of the identified Komodo dragon defensin and ovodefensin genes, highlighting the defensin and ovodefensin clusters in the Komodo dragon genome

β-defensin gene family clusters. Scaffold locations of the identified Komodo dragon defensin and ovodefensin genes, highlighting the defensin and ovodefensin clusters in the Komodo dragon genome The end of the cluster could either be flanked by XPO1 or TRAM2 or neither. Two of the three genes found on scaffold 45 with TRAM2 (VkBD80a, VkBD80b) are nearly identical and potentially the result of an assembly artifact. The genes are orthologs for the final gene in the avian, turtle, and crocodilian β-defensin clusters. The anole ortholog for this gene is isolated and is not associated with TRAM2, XPO1, nor any other β-defensins, and there are no β-defensins found in the proximity of anole TRAM2. Two of the seven genes associated with XPO1 have orthologs with one of the five anole genes associated with XPO1 but it cannot be determined in either species if these are part of the rest of the β-defensin cluster or part of an additional cluster. The snake orthologs are associated with TRAM2 but are not part of the cluster.

Structural diversity

Diversity can be seen in variations in structure of the β-defensin domain. Typically, a β-defensin consists of 2–3 exons: a signal peptide, an exon with the propiece and β-defensin domain with six cysteines, and in some cases, a short third exon. Variations in the number of β-defensin domains, exon size, exon number, atypical spacing of cysteines, and/or the number of cysteines in the β-defensin domain can be found in all reptilian species surveyed (unpublished). There are three β-defensins with two defensin domains (VkBD7, VkBD34, and VkBD43) and one with three defensin domains (VkBD39). The Komodo dragon β-defensin genes VkBD12, VkBD13, and VkBD14 and their orthologs in anoles have atypically large exons. The group of β-defensins between VkBD16 and VkBD21 also have atypically large exons. Atypical spacing between cysteine residues is found in three β-defensins, VkBD20 (1–3–9-7), VkBD57 (3–4–8-5), and VkBD79 (3–10–16-6). There are four β-defensins with additional cysteine residues in the β-defensin domain: VkBD6 with 10 cysteine residues, and a group of three β-defensins, VkBD16, VkBD17, and VkBD18, with eight cysteine residues. The two β-defensin domains of VkBD7 are homologous to the one β-defensin domain of VkBD8 with orthologs in other species of Squamata. In the anole lizard A. carolinensis there are two orthologs, LzBD6 with one β-defensin domain and the non-cluster LzBD82 with two β-defensin domains. The orthologs in snakes (SnBD5 and SnBD6) have one β-defensin domain. VkBD34 is an ortholog of LzBD39 in anoles and SnBD15 in snakes. VkBD39 and VkBD43 consist of three and two homologous β-defensin domains respectively, which are homologous to the third exons of LzBD52, LzBD53, and LzBD55, all of which have two non-homologous β-defensin domains. VkBD40 with one β-defensin domain is homologous to the second exons of LzBD52, LzBD53, LzBD54 (with one defensin domain), and LzBD55. An increase in the number of cysteines in the β-defensin domain results in the possibly of forming additional disulfide bridges. Examples of this variation can be found in the psittacine β-defensin, Psittaciforme AvBD12 [52]. The β-defensin domain of VkBD6 appears to consist of 10 cysteines, four of which are part of an extension after a typical β-defensin domain with an additional paired cysteine (C-X6-C-X4-C-X9-C-X6-CC-X7-C-X7-CC-X5-C). The group of Komodo β-defensins VkBD16, VkBD17, and VkBD18, in addition to having an atypical cysteine spacing, also have eight cysteines within a typical number of residues. The β-defensin following this group, VkBD19, is a paralog of these three genes; however, the β-defensin domain contains the more typical six cysteine residues. The gene structures of these Komodo β-defensin genes are subject to confirmation with supporting evidence. There are a number of atypical structure elements in anole lizards including additional non β-defensin domain exons or larger exons. Analyses of the peptide sequences encoded by the newly identified Komodo dragon β-defensin genes revealed that the majority (53 out of 66) of them are predicted to have a net positive charge at physiological conditions, as is typical for this class of antimicrobial peptide (Table 3). However, it is notable that four peptides (VkBD10, VkBD28, VkBD30 and VkBD34) are predicted to be weakly cationic or neutral (+ 0.5–0) at pH 7, while nine peptides (VkBD3, VkBD4, VkBD11, VkBD19, VkBD23, VkBD26, VkBD35, VkBD36 and VkBD37) are predicted to be weakly to strongly anionic. These findings suggest while these peptides exhibit canonical β-defensin structural features and reside in β-defensin gene clusters, one or more of these genes may not encode for β-defensin-like peptides or canonical β-defensins, because β-defensins typically are cationic and their positive charge contributes towards their antimicrobial activity.
Table 3

Physical properties of identified β-defensin peptides

Defensin domainLengthMolecular weightCys Residue SpacingChargeIsoelectric point
VkBD1_VARKO414816.466–3–9-64.57.9826
VkBD2_VARKO444770.496–4–9-648.5177
VkBD3_VARKO687510.696–4–9-6-15.635
VkBD4_VARKO485466.136–4–9-6−0.56.1876
VkBD5_VARKO455112.86–4–9-64.58.5056
VkBD6_VARKO707548.76–4–9-7-7 − 54.57.9825
VkBD7_VARKO838922.436–6–8-5/6–6–7-568.2233
VkBD8_VARKO393850.546–6–7-52.57.9391
VkBD9_VARKO465087.156–6–9-57.59.7437
VkBD10_VARKO495346.376–6–7-50.56.8341
VkBD11_VARKO404130.686–7–9-5-53.8346
VkBD12_VARKO617362.56–6–12-769.0239
VkBD13_VARKO566373.346–6–12-727.9229
VkBD14_VARKO536051.226–6–12-79.510.3828
VkBD15_VARKO424598.666–6–9-69.510.246
VkBD16_VARKO545598.256–2–3-10-3-327.796
VkBD17_VARKO475538.286–2–3-9-3-368.4828
VkBD18_VARKO505606.316–2–3-10-3-32.57.512
VkBD19_VARKO525871.767–6–10-7−0.55.6992
VkBD20_VARKO434861.771–3–9-727.9237
VkBD21_VARKO384009.736–4–9-56.59.1233
VkBD22_VARKO424645.556–4–9-54.58.5176
VkBD23_VARKO10411,026.096–4–9-6−34.6839
VkBD24_VARKO485012.867–4–9-63.58.2461
VkBD25_VARKO424575.416–4–9-53.58.2423
VkBD26_VARKO444914.626–4–11-5−34.2446
VkBD27_VARKO414535.296–4–10 − 579.6914
VkBD28_VARKO454618.166–4–9-605.9979
VkBD29_VARKO444958.86–7–9-54.58.5135
VkBD30_VARKO444538.296–8–9-506.0306
VkBD31_VARKO445009.926–7–10-538.1509
VkBD32_VARKO414339.016–4–9-53.58.1572
VkBD33_VARKO444782.746–4–9-58.510.5052
VkBD34_VARKO849330.957–4–9-5/7–4–9-54.58.0587
VkBD35_VARKO475058.76–5–9-5-53.9254
VkBD36_VARKO485026.656–5–9-5−83.371
VkBD37_VARKO434702.366–4–9-5−14.7382
VkBD38_VARKO445172.096–4–10-62.57.9416
VkBD39_VARKO11111,483.566–4–9-5/6–4–9-5/6–4–9-557.86
VkBD40_VARKO434914.686–4–9-52.57.9327
VkBD41_VARKO414641.486–4–8-53.57.9625
VkBD42_VARKO424717.476–4–8-55.58.8017
VkBD43_VARKO747874.16–4–9-5/6–4–9-52.57.65
VkBD44_VARKO455073.966–4–9-558.5147
VkBD45_VARKO384229.986–3–9-589.7441
VkBD46_VARKO404388.176–4–9-59.510.8323
VkBD47_VARKO445072.966–4–10-56.59.154
VkBD48_VARKO525934.016–4–9-51311.9504
VkBD49_VARKO404566.326–4–9-55.58.814
VkBD50_VARKO424627.46–4–9-5710.4698
VkBD51_VARKO404467.326–4–9-5911.3419
VkBD52_VARKO404623.656–4–9-5811.1693
VkBD53_VARKO404537.416–4–9-51011.9738
VkBD54_VARKO424760.656–4–9-51112.1412
VkBD55_VARKO566092.226–2–9-66.58.7963
VkBD56_VARKO445031.943–4–8-58.59.2085
VkBD57_VARKO404852.666–4–9-58.59.9206
VkBD58_VARKO445254.166–7–8-56.58.8379
VkBD59_VARKO404369.026–4–9-55.58.6986
VkBD60_VARKO4452326–4–10-56.58.6105
VkBD61_VARKO445146.916–4–9-679.5242
VkBD62_VARKO434829.593–10–16-67.59.1865
VkBD79_VARKO495260.296–4–9-648.2526
VkBD80a_VARKO455112.966–4–9-65.58.5212
VkBD80b_VARKO4551666–4–9-66.59.0493
VkBDic1_VARKO445069.056–4–9-51010.3026

Molecular weights, charges and isoelectric points were calculated based on predicted peptide sequences using the EMBOSS tool pepstats [53]

Physical properties of identified β-defensin peptides Molecular weights, charges and isoelectric points were calculated based on predicted peptide sequences using the EMBOSS tool pepstats [53]

Identification of Komodo dragon ovodefensin genes

Ovodefensin genes have been found in multiple avian and reptile species [26], with expression found in egg white and other tissues. Ovodefensins including the chicken peptide gallin (Gallus gallus OvoDA1) have been shown to have antimicrobial activity against the Gram-negative E. coli and the Gram-positive S. aureus. Presumptive β-ovodefensins are found in a cluster in the same syntenic block as the β-defensin cluster in birds and reptiles. There have been 19 β-ovodefensins found in A. carolinensis (one with an eight cysteine β-defensin domain) and five in snakes (four with an eight cysteine β-defensin domain) (Prickett, M.D., unpublished work in progress). The Komodo dragon cluster consists of six β-ovodefensins (Tables 4 and 5). Two of these may be Komodo dragon specific; VkOVOD1, which is a pseudois an ortholog of SnOVOD1 in addition to the first β-ovodefensin in turtles and crocodilians. The defensin domains VkOVOD3, VkOVOD4, and VkOVOD6 consist of eight cysteines, orthologs of SnOVOD2, SnOVOD3, and SnOVOD5, respectively. VkOVOD4 and VkOVOD6 are orthologs of LzOVOD14.
Table 4

Ovodefensin peptides predicted in the Komodo dragon genome

NameClosest Anole OrthologClosest Snake OrthologSequences
Scaffold 210 OvodefensinsThis scaffold contains the cluster of six β-ovodefensin genes, one intercluster β defensin, and 15 β-defensin genes that are similar to anole lizard β-defensins. The cluster begins with the CTSB gene. These genes include large β defensins.
VkOVOD1_VARKOLzOVOD1GM*PPHLCFLAMTLHLACSFCGMNFCAGAPRKLWCYCAAQRCCAL
VkOVOD2_VARKOSnOVOD1GRRRASDTQSCIRREGYCKPQCNETGPVPELNLGTCTNRQQKCCRPQEL
VkOVOD3_VARKOSnOVOD2GCKGHRSCSSLGGRCAKSCKSLGSGYTAYHTCDCARSGLQCCVPHKKG
VkOVOD4_VARKOLzOVOD14SnOVOD3GYRQHCSSYCHGFCSSKCPQHKVAQANCACALYGGICCVS
VkOVOD5_VARKOGLRDWGSEALLCSGSAPISTADFRRWCWGTLALCHGSYAMEYRRGPFCPPCCL
VkOVOD6_VARKOSnOVOD5RHCNTYCYGQCLKSCPYGWVPTSNCACQHTSNICCLPP

Reptile ovodefensin orthologs are also indicated (where they exist) as well as the ovodefensin peptide sequences encoded by each gene

Table 5

Physical properties of identified ovodefensin peptides

Defensin domainLengthMolecular weightCys Residue SpacingChargeIsoelectric point
VkOVOD1_VARKO454930.779–2–4-8-1-448.0905
VkOVOD2_VARKO495582.36–3–13-648.5149
VkOVOD3_VARKO484950.695–6–12–3-1-68.58.9348
VkOVOD4_VARKO404271.93–3–3-9-1-64.58.0913
VkOVOD5_VARKO535869.7614–6–13-21.57.5215
VkOVOD6_VARKO384252.893–3–3-9-1-637.8246

Molecular weights, charges and isoelectric points were calculated based on predicted peptide sequences using the EMBOSS tool pepstats [53]

Ovodefensin peptides predicted in the Komodo dragon genome Reptile ovodefensin orthologs are also indicated (where they exist) as well as the ovodefensin peptide sequences encoded by each gene Physical properties of identified ovodefensin peptides Molecular weights, charges and isoelectric points were calculated based on predicted peptide sequences using the EMBOSS tool pepstats [53]

Identification of the Komodo dragon cathelicidin genes

Cathelcidin peptide genes have recently been identified in reptiles through genomics approaches [13]. Several cathelicidin peptide genes have been identified in birds [52, 54–58], snakes [59, 60] and the anole lizard [11, 14, 61]. The release of functional cathelicidin antimicrobial peptides has been observed from chicken heterophils, suggesting that reptilian heterophils may also be a source of these peptides [30, 62]. Alibardi et al. have identified cathelicidin peptides being expressed in anole lizard tissues, including associated with heterophils [11, 14, 61]. Cathelicidin antimicrobial peptides are thought to play key roles in innate immunity in other animals [29] and so likely play this role in the Komodo dragon as well. In anole lizards, the cathelicidin gene cluster, consisting of 4 genes, is organized as follows: cathelicidin cluster . We searched for a similar cathelicidin cluster in the Komodo dragon genome. Searching the Komodo dragon genome for cathelicidin-like genes revealed a cluster of three genes that have a “cathelin-like domain”, which is the first requirement of a cathelicidin gene, located at one end of saffold 84. However, this region of scaffold 84 has assembly issues with gaps, isolated exons, and duplications. Identified Komodo dragon cathelicidin genes have been named after their anole orthologs. Two of the Komodo dragon cathelicidins (Cathelicidin2 and Cathelicidin4.1) are in sections with no assembly issues. By contrast, Cathlicidin4.2 was constructed using a diverse set of exons 1–3 and a misplaced exon 4 to create a complete gene, which is paralogous to Cathelicidin4.1. As the cluster is found at one end of the scaffold, there may be additional unidentified cathelicidins that are not captured in this assembly. A common feature of cathelicidin antimicrobial peptide gene sequences is that the N-terminal cathelin-domain encodes for at least 4 cysteines. In our study of alligator and snake cathelicidins we also noted that typically following the last cysteine, a three-residue pattern consisting of VRR or similar sequence immediately precedes the predicted C-terminal cationic antimicrobial peptide [12, 13, 15, 60, 63]. Additional requirements of a cathelicidin antimicrobial peptide gene sequence are that it encodes for a net-positive charged peptide in the C-terminal region, it is typically encoded by the fourth exon, and it is typically approximately 35 aa in length (range 25–37) [13, 15]. Since the naturally occurring protease responsible for cleavage and release of the functional antimicrobial peptides is not known, prediction of the exact cleavage site is difficult. As can be seen in Table 6, the predicted amino acid sequences for each of the identified Komodo dragon cathelicidin gene candidates are listed. Performing our analysis on each sequence, we made predictions and conclusions about whether each potential cathelicidin gene may encode for an antimicrobial peptide.
Table 6

Predicted cathelicidin antimicrobial peptide gene sequences

Cathelicidin GenePredicted Sequence of Precursor Protein
Cathelicidin2_VARKOMELFLLGVTLLMLGVPGATSPPLDAPSALLPRDVARMAVEDFNQEAEGQAVFRLLKLKHTHKVKFRWGFHFSLNFTIKETHCRKSDNYRIENCRYKTKGPIQDCSAQVSVLNFMPDSPLTSVKCRRGNGRGSRKAQPSAMLQAEERQPQVHVEVYLPSAYSIAALSVAEEEPSRPAARPH
Cathelicidin4.1_VARKOMASPWALPLMLLLLGMVGAAPSTAPPSSYDQVIAAAVDIYNQQQKPEYAFRLLEAEPQPDWNSSAQTNQVLRFSIKETVCRPSEKADLSQCDYKPDGVDRDCSGFYSPQQSPPTIMVQCEDVDQELDRVTRFRWRRFFRKAKRFLKRHGVSIAIGTVRLLRRFG
Cathelicidin4.2_VARKOMASPRALLLMLLLLRTVGAAPSAAPPSSYEQVIAAAVDTYNQQQKPAYAFRLLEAEPQPDWNPSAGTTQPLRFSIKETVCRPSEKADLSQCDYKPNGVDRDCSGFYSPQQQSPPTILVQCEDLDRVTRRRWRRFFQKAKRFVKRHGVSIAVGAYRIIG

Full predicted animo acid sequence of Komodo cathelicidin proteins. Note, the Cathelicidin2_VARKO lacks the VTR sequence that is typically present in other reptile cathleicidin precursor proteins, including Cathelicidin4.1_VARKO and Cathelicidin4.2_VARKO

Predicted cathelicidin antimicrobial peptide gene sequences Full predicted animo acid sequence of Komodo cathelicidin proteins. Note, the Cathelicidin2_VARKO lacks the VTR sequence that is typically present in other reptile cathleicidin precursor proteins, including Cathelicidin4.1_VARKO and Cathelicidin4.2_VARKO It can be seen that the predicted N-terminal protein sequence of Cathelicidin2_VARKO (VK-CATH2) contains four cysteines (underlined, Table 6). However, there is not an obvious “VRR” or similar sequence in the ~ 10 amino acids following the last cysteine residue as we saw in the alligator and related cathelicidin sequences [12, 13, 15]. In addition, analysis of the 35 C-terminal amino acids reveals a predicted peptide sequence lacking a net positive charge. For these reasons, we predict that the Cathelicidin2_VARKO gene sequence does not encode for an active cathelicidin antimicrobial peptide at its C-terminus (Table 7).
Table 7

Predicted active cathelicidin peptides and calculated properties (APD3 [64])

Peptide NamePredicted Functional Cathelicidin SequenceLengthNet ChargeMW
VK-CATH2No C-terminal cathelicidin peptide predicted.NANANA
VK-CATH4.1FRWRRFFRKAKRFLKRHGVSIAIGTVRLLRRFG33 aa+ 124133.02
VK-CATH4.2RRWRRFFQKAKRFVKRHGVSIAVGAYRIIG30 aa+ 103660.39
Predicted active cathelicidin peptides and calculated properties (APD3 [64]) For the identified Cathelicidin4.1_VARKO gene, the predicted cathelin-domain includes the requisite four cysteine residues (Table 6), and the sequence “VTR” is present within 10 amino acids of the last cysteine, similar to the “VRR” sequence in the alligator cathelicidin gene [12, 13, 15]. The 33-aa C-terminal peptide following the “VTR” sequence is predicted to have a net + 12 charge at physiological pH, and a large portion of the sequence is predicted to be helical [65, 66], which is consistent with cathelicidins. The majority of known cathelicidins contain segments with significant helical structure [67]. Finally, analysis of the sequence using the Antimicrobial Peptide Database indicates that the peptide is potentially a cationic antimicrobial peptide [64]. Hence, we predict that this gene likely encodes for an active cathelicidin antimicrobial peptide, called VK-CATH4.1 (Table 7). In addition, this peptide demonstrates some homology to other known antimicrobial peptides in the Antimicrobial Peptide Database [64] (Table 8). It shows a particularly high degree of sequence similarity to cathelicidin peptides identified from squamates, with examples included in Table 8. Thus, the predicted VK-CATH4.1 peptide has many of the hallmark characteristics of a cathelicidin peptide and is a strong candidate for further study. Table 8 shows the alignment of VK_CATH4.1 with known peptides in the Antimicrobial Peptide Database [64].
Table 8

Comparison to other cathelicidins

NameSpeciesSequenceLengthVK-CATH4.1/ VK-CATH4.2 Similarity (%)
VK-CATH4.1 V. komodoensis ----FRWRRFFRKAKRFLKRHGVSIAIGTVRLLRRFG- 33100/70.00
VK-CATH4.2 V. komodoensis ----RRWRRFFQKAKRFVKRHGVSIAVGAYRIIG---- 3070.00/100.00
Reptiles (Squamates)
 GJ-CATH G. japonicus -----RWRRFWGKAKRGIKKHGVSIALAALRLRG---- 2962.07/65.52
 AC-CATH A. carolinensis -----RWGRFWRGAKRFVKKHGVSIALAGLR------- 2665.38/65.38
 PB-CATH P. bivitatis -----RWRRFIRGAGRFARRYGWRIALG---------- 2360.87/56.52
 NA-CATH N. atra ----KRFKKFFKKLKNSVKKRAKKFFKKPKVIGVTFPF 3421.21/26.67
Other Vertebrates
 AM-CATH A. mississippiensis -KIKKGFKKIFKRLPP--IGVGVSIPLAGKR------- 2824.00/24.00
 CAP18 O. cuniculus -GLRKRLRKFRNKIKEKLKKIGQKIQGFVPKLAPRTDY 3733.33/26.67
 SMAP-29 O. aries ----RGLRRLGRKIAHGVKKYGPTVL-RIIRIAG---- 2924.14/34.48
 cc-CATH C. coturnix LVQRGRFGRFLKKVRRFIPKVIIAAQIG-----SRFG 3239.29/28.00
 CMAP27 G. gallus -----RFGRFLRKIRRFRPKVTITIQGS-----ARFG- 2740.74/29.17
 Lactoferritin B. taurus -FKCRRWQWR-------MKKLGAPSITCV---RRAF-- 2527.27/25.00

Homology and alignment of proposed Komodo dragon cathelicidin peptides with known peptides in the Antimicrobial Peptide Database APD3 [64]. These sequence alignments and sequence similarities were generated using Clustal ω [68]

Comparison to other cathelicidins Homology and alignment of proposed Komodo dragon cathelicidin peptides with known peptides in the Antimicrobial Peptide Database APD3 [64]. These sequence alignments and sequence similarities were generated using Clustal ω [68] For the identified Cathelicidin4.2_VARKO gene, the predicted cathelin domain includes the requisite four cysteine residues (Table 6). As was noted in the Cathelicidin4.1_VARKO gene, the sequence “VTR” is present within 10 amino acids of the fourth cysteine residue, and immediately precedes the C-terminal segment, which encodes for a 30-aa peptide that is predicted to be antimicrobial [64]. The amino acid sequence of the C-terminal peptide is predicted to have a net + 10 charge at physiological pH, and it demonstrates varied degrees of homology to other known antimicrobial peptides in the Antimicrobial Peptide Database [64]. Thus, like VK-CATH4.1, this candidate peptide also exhibits many of the hallmark characteristics associated with cathelicidin peptides, and is a second strong candidate for further study. Table 8 shows the homology and alignment of VK-CATH4.2 with known peptides from the Antimicrobial Peptide Database. Finally, the gene sequence encoding the functional peptide VK-CATH4.2 is found on exon 4, which is the typical location of the active cathelicidin peptide. This exon encodes the peptide sequence LDRVTRRRWRRFFQKAKRFVKRHGVSIAVGAYRIIG. The predicted peptide VK-CATH4.2 is highly homologous with peptides from other predicted cathelicidin genes, with similar predicted C-terminal peptides, from A. carolinensis, G. japonicus, and P. bivittatus (Table 8). Residues 2–27 of VK-CATH4.2 are 65% identical and 80% similar to the anole Cathelicidin-2 like predicted C-terminal peptide (XP_008116755.1, aa 130–155). Residues 2–30 of VK-CATH4.2 are 66% identical and 82% similar to the gecko Cathelicidin-related predicted C-terminal peptide (XP_015277841.1, aa 129–151). Finally, aa 2–24 of VK-CATH4.2 are 57% identical and 73% similar to the Cathelicidin-related OH-CATH-Like predicted C-terminal peptide (XP_007445036.1, aa 129–151).

Conclusions

Reptiles, including Komodo dragons, are evolutionarily ancient, are found in diverse and microbially-challenging environments, and they accordingly appear to have evolved robust innate immune systems. All of these features suggest that reptiles may express interesting antimicrobial peptides. A few reptilian antimicrobial peptides including defensin and cathelicidin peptides have been previously identified and studied that demonstrate broad-spectrum antimicrobial and antifungal activities. While defensins and cathelicidins are known in three of the four orders of reptiles: the testudines, crocodilians, and the squamata, few peptides have been identified to date in lizards and none in varanids (including Komodo dragon). Genes encoding antimicrobial peptides involved in innate immunity have previously been found in birds and reptiles, some of which are localized within clusters in the genome. Cathelicidin genes have been identified in birds and reptiles, including crocodilians, lizards and snakes. Clusters of β-defensin genes were recently identified in birds by one of our team [52]. While the origins of these gene clusters have not been well established, the phenomenon may have biological significance, potentially helping to coordinate the expression of these genes. Thus, these functionally related loci may have been selectively maintained through reptile and avian innate immunity evolution. This paper presents a new genome, that of the Komodo dragon, one of the largest extant lizards and the largest vertebrate to exhibit the ability to reproduce through parthenogenesis. Annotated genomes have been published for only a limited number of lizard species, and the present Komodo dragon genome is the first varanid genome assembly to be reported, and therefore will help to expand our understanding of lizard evolution in general. We present an annotated genome that contains as many as 17,213 genes. While there are many aspects of evolution and biology of interest to study in the Komodo dragon, we chose to focus on aspects of innate immunity, specifically antimicrobial peptides, as this was the source of our interest in the Komodo genome [24]. Antimicrobial peptides are present in mammals, birds, amphibians and fish but have not been well-characterized in reptiles despite the central position of this class in vertebrate evolution. We have sought to contribute to this understanding through our prior studies of antimicrobial peptides from birds [52], alligators [12, 21–23], snakes [12, 60, 63, 69–72], and now Komodo dragon [24, 25]. In the present study, we report the identification of genes encoding Komodo dragon defensin and cathelicidin peptides. We have elucidated 66 potential β–defensin genes, including 18 that appear to be unique to Komodo dragons. The remaining 48 peptides appear to have homologs in anole lizards and/or snakes. Similar to avian genomes, the Komodo dragon genome does not contain α-defensin genes; this class of antimicrobial peptides appears to be restricted to mammals [13]. Additionally, six potential β-ovodefensins were identified in the genome. These β–defensin and β-ovodefensin genes are localized in defensin-gene clusters within the genome. In addition to defensins, we identified three potential cathelicidin genes in the genome; however, upon further analysis it was determined that one of these apparent cathelicidin genes did not actually encode a cathelicidin peptide. The remaining two genes, Cathelicidin4.1_VARKO and Cathelicidin4.2_VARKO, are predicted to encode functional cathelicidin peptides at the C-terminal end of the precursor peptide. These peptides show significant degrees of similarity to other reptile cathelicidins. These findings are significant; however, the identified defensin and cathelicidin gene clusters appear to reside near scaffold edges, and therefore may not represent the full complement of defensin and cathelicidin genes that may be present in the Komodo dragon genome. The defensin and cathelicidin genes and gene clusters that we have identified here exhibit similarities to those that have been reported for the anole lizard and snakes, but they also show characteristics that are unique to the Komodo dragon. We anticipated that the findings presented here should contribute to a deeper understanding of innate immunity and antimicrobial peptides in reptiles and vertebrates in general.

Methods & experimental procedures

Komodo dragon blood samples

Komodo dragon (Varanus komodoensis) blood was collected by staff at the St. Augustine’s Alligator Farm Zoological Park (St. Augustine, FL) in compliance with relevant guidelines, using protocols approved by the GMU IACUC (GMU IACUC# 0266). Blood was collected in plastic blood collecting tubes treated with K2EDTA as the anticoagulant. Samples were immediately placed on ice, and then shipped on ice overnight to GMU.

Library preparation and multiplexing

Genomic DNA was prepared from a sample that had been enriched for leukocytes by a settling protocol (24 h, 37 °C, 5% CO2) from fresh Komodo dragon blood. DNA-seq libraries were constructed using PrepX ILM DNA Library Reagent Kit (Catalog No. 400044, Lot No. F0199) on the Apollo 324 robot (WaferGen, CA). Briefly, 150 ng of genomic DNA was resuspended in 50 μl of nuclease-free water and fragmented to 200–250 bp, using Covaris M220 to 300 bp at Peak Incident Power of (W) 50, Duty Factor of 20%, Cycles per Burst of 200, and Treatment Time of 75 s. Briefly, the ends were repaired and an ‘A’ base added to the 3′ end, preparing the DNA fragments for ligation to the adapters, which have a single ‘T’ base overhang at their 3′ end. The adapters enabled PCR amplification and hybridization to the flow cell. Following ligation, the excess adapters were removed and 300 ± 50 bp fragments (225 bp insert) were enriched for library amplification by PCR. The library that was generated was then validated using an Agilent 2100 Bioanalyzer and quantitated using a Quant-iT dsDNA HS Kit (Invitrogen) and qPCR. The samples were multiplexed based on qPCR quantitation to obtain similar distribution of reads of multiplexed samples.

Chicago library preparation

High molecular weight genomic DNA was extracted from blood cells collected from fresh Komodo dragon whole blood. A Chicago library was prepared as described previously [73]. Briefly, ≥ 0.5 μg of high molecular weight genomic DNA (50 kbp mean fragment size) was extracted from whole Komodo dragon blood using a Qiagen blood and cell midi kit, reconstituted into chromatin in vitro, and fixed with formaldehyde. Fixed chromatin was then digested with MboI, the 5′ overhangs were filled in with biotinylated nucleotides, and then free blunt ends were ligated. After ligation, crosslinks were reversed and the DNA purified from protein. Purified DNA was treated to remove biotin that was not internal to ligated fragments. The DNA was sheared to ~ 350 bp mean fragment size, and sequencing libraries were generated using NEBNext Ultra enzymes and Illumina-compatible adapters. Biotin-containing fragments were then isolated using streptavidin beads before PCR enrichment of the library.

Cluster generation and HiSeq paired-end sequencing

Libraries were clustered onto a flow cell using Illumina’s TruSeq PE Cluster Kit v3-cBOT-HS (PE-401-3001) and sequenced on an Illumina HiSeq 2500. The Chicago library was sequenced using 2 × 101 PE Rapid-Run (153 M read pairs) and the TruSeq SBS Kit v3-HS (200-cycles) (FC-401-3001), while the Virginia Bioinformatics Institute Genomics Core provided a 2 × 151 PE Rapid-Run (149 M read pairs) using TruSeq Rapid SBS Kit-200 cycle (2500) (FC-402–4001) and two TruSeq Rapid SBS Kit-50 cycles (FC-402–4002).

Scaffolding the draft genome with HiRise

N50 is defined as the scaffold length such that the sum of the lengths of all scaffolds of this size or less is equal to 50% of the total assembly length. The initial Komodo dragon draft genome assembly in FASTA format generated at Virginia Tech with Illumina 150 PE (Celera Assembler 8.2, default parameters, [74]) resulted in 1599 Mbp with a scaffold N50 of 35.8 kbp. This assembly, additional Illumina shotgun sequences (100 PE) and Chicago library sequence in FASTQ format were used as input data for HiRise, a software pipeline designed specifically for using Chicago library sequence data to assemble genomes [73]. Shotgun and Chicago library sequences were aligned to the draft input assembly using a modified SNAP read mapper (http://snap.cs.berkeley.edu). The separations of Chicago read pairs mapped within draft scaffolds were analyzed by HiRise to produce a likelihood model, and the resulting likelihood model was used to identify putative misjoins and score prospective joins. After scaffolding, shotgun sequences were used to close gaps between contigs.

Genome annotation and completeness

Assembly sequences were first masked using RepeatMasker (v4.0.3, http://www.repeatmasker.org/) with parameters set to “-s -a -nolow” and using a customized repeat library. Protein-coding genes were predicted using MAKER2 [75], which used anole lizard (A. carolinensis, version AnoCar2.0) and python (P. bivittatus, version bivittatus-5.0.2) protein sequences that were downloaded from Ensembl (www.ensembl.org) and RefSeq (www.ncbi.nlm.nih.gov/refseq) as protein homology evidence, along with the previously assembled RNA-seq data [24] as the expression evidence, and integrated with prediction methods including Blastx, SNAP [76] and Augustus [77]. The SNAP HMM file was generated by training the anole lizard gene sequences. An Augustus model file was generated by training 3026 core genes of vertebrates from a genome completeness assessment tool BUSCO [78]. Predicted genes were subsequently used as query sequences in a Blastx database search of NR database (the non-redundant database, http://www.ncbi.nlm.nih.gov/). Blastx alignments with e-value greater than 1e− 10 were discarded, and the top hit was used to annotate the query genes. Repeat families were identified by using the de novo modeling package RepeatModeler (http://www.repeatmasker.org/RepeatModeler/). Then, the de novo identified repeat sequences were combined with manually selected vertebrate repeats from RepBase (https://www.girinst.org/repbase/) to form a customized repeat library. The completeness of assembly was estimated using CEGMA by examining 248 core eukaryotic genes [35].

Transcriptome

A transcriptome generated from RNA isolated from Komodo blood cells has been previously described [24] and was used here to aid in the assembly annotation. Briefly, 280–300 bp libraries (160–180 bp insert) were generated, clustered onto a flow cell using Illumina’s TruSeq PE Cluster Kit v3-cBOT-HS and sequenced using TruSeq SBS Kit v3-HS (300 cycles, 2 × 150 cycle paired-end) on an Illumina HiSeq 2500.

Identification of defensin and cathelicidin genes within the genome

Lizard and snake defensin and cathelicidin genes had been previously identified in prior analyses of published genomes for Anolis carolinensis [34] Ophiophagus hannah (king cobra) [79] Python bivittatus (Burmese python) [80] as well as the pit vipers Protobothrops mucrosquamatus (https://www.ncbi.nlm.nih.gov/genome/annotation_euk/Protobothrops_mucrosquamatus/100/) and Vipera berus berus (https://www.ncbi.nlm.nih.gov/bioproject/170536) (https://www.hgsc.bcm.edu/reptiles/european-adder-genome-project) (Additional file 3: Table S2). This data was used in our analyses of the Komodo dragon genome. Genes from A. carolinensis (β-defensins, ovodefensins, cathelicidins, and genes flanking the defensin and cathelicidin clusters) were used as queries in a TBLASTN against the Komodo genome. Due to the diversity of β-defensins, homology searches are not sufficient to identify the entire β-defensin repertoire, so a combination of strategies was used. Genomic scaffolds containing hits were extracted and genes identified by BLAST were manually curated using Artemis [19]. Scaffolds with hits to β-defensins were then further examined manually for the characteristic β-defensin motif and signal peptides not previously identified by the initial BLAST search. Gene structures were determined based on previously annotated A. carolinensis orthologs when possible. Annotated β-defensin genes were named by using the initials for the species and genus (Vk) as a prefix and a five-letter abbreviation as a suffix (VkBDx_VARKO) and numbered in order following CTSB on scaffold 210. Β-ovodefensins were similarly named in order following MTMR9 (VkOVODx_VARKO). Β-defensins on scaffold 826 were numbered using anole orthologs as a reference for gene order. Β-defensins on other scaffolds were named based on their anole orthologs. Cathelicidins were named based on their anole orthologs.

Peptide prediction

Predicted amino acid sequences were compared to other known protein sequences using blast-p at NCBI (https://www.ncbi.nlm.nih.gov) tool [81, 82]. Prediction of size, charge, helicity and other properties of proposed antimicrobial peptides was performed using Antimicrobial Peptide Database APD3 Calculation and Prediction tool http://aps.unmc.edu/AP/prediction/prediction_main.php [64]. Homology searching against other peptides in the APD3 database was done using the proffered option after the calculation and prediction tool was applied. Figure S1. Repeat families. The Komodo dragon genomic profile of repeat element families. (JPG 662 kb) Table S1. Repeat element families. (DOCX 61 kb) Table S2. Reptile Defensin Orthologs. (DOCX 158 kb)
  76 in total

1.  EMBOSS: the European Molecular Biology Open Software Suite.

Authors:  P Rice; I Longden; A Bleasby
Journal:  Trends Genet       Date:  2000-06       Impact factor: 11.639

Review 2.  [beta]-defensins in lung host defense.

Authors:  Brian C Schutte; Paul B McCray
Journal:  Annu Rev Physiol       Date:  2002       Impact factor: 19.318

Review 3.  Defensins: antimicrobial peptides of innate immunity.

Authors:  Tomas Ganz
Journal:  Nat Rev Immunol       Date:  2003-09       Impact factor: 53.106

4.  A whole-genome assembly of Drosophila.

Authors:  E W Myers; G G Sutton; A L Delcher; I M Dew; D P Fasulo; M J Flanigan; S A Kravitz; C M Mobarry; K H Reinert; K A Remington; E L Anson; R A Bolanos; H H Chou; C M Jordan; A L Halpern; S Lonardi; E M Beasley; R C Brandon; L Chen; P J Dunn; Z Lai; Y Liang; D R Nusskern; M Zhan; Q Zhang; X Zheng; G M Rubin; M D Adams; J C Venter
Journal:  Science       Date:  2000-03-24       Impact factor: 47.728

5.  Discovery of new human beta-defensins using a genomics-based approach.

Authors:  H P Jia; B C Schutte; A Schudy; R Linzmeier; J M Guthmiller; G K Johnson; B F Tack; J P Mitros; A Rosenthal; T Ganz; P B McCray
Journal:  Gene       Date:  2001-01-24       Impact factor: 3.688

6.  Discovery of five conserved beta -defensin gene clusters using a computational search strategy.

Authors:  Brian C Schutte; Joseph P Mitros; Jennifer A Bartlett; Jesse D Walters; Hong Peng Jia; Michael J Welsh; Thomas L Casavant; Paul B McCray
Journal:  Proc Natl Acad Sci U S A       Date:  2002-02-19       Impact factor: 11.205

7.  Aerobic salivary bacteria in wild and captive Komodo dragons.

Authors:  Joel M Montgomery; Don Gillespie; Putra Sastrawan; Terry M Fredeking; George L Stewart
Journal:  J Wildl Dis       Date:  2002-07       Impact factor: 1.535

Review 8.  Cathelicidins--a family of multifunctional antimicrobial peptides.

Authors:  R Bals; J M Wilson
Journal:  Cell Mol Life Sci       Date:  2003-04       Impact factor: 9.261

Review 9.  Genomics-based approaches to gene discovery in innate immunity.

Authors:  ToddE Scheetz; Jennifer A Bartlett; Jesse D Walters; Brian C Schutte; Thomas L Casavant; Paul B McCray
Journal:  Immunol Rev       Date:  2002-12       Impact factor: 12.988

10.  Gene prediction with a hidden Markov model and a new intron submodel.

Authors:  Mario Stanke; Stephan Waack
Journal:  Bioinformatics       Date:  2003-10       Impact factor: 6.937

View more
  3 in total

1.  Major histocompatibility complex genes and locus organization in the Komodo dragon (Varanus komodoensis).

Authors:  Kent M Reed; Robert E Settlage
Journal:  Immunogenetics       Date:  2021-05-12       Impact factor: 2.846

2.  Structure, function, and evolution of Gga-AvBD11, the archetype of the structural avian-double-β-defensin family.

Authors:  Nicolas Guyot; Hervé Meudal; Sascha Trapp; Sophie Iochmann; Anne Silvestre; Guillaume Jousset; Valérie Labas; Pascale Reverdiau; Karine Loth; Virginie Hervé; Vincent Aucagne; Agnès F Delmas; Sophie Rehault-Godbert; Céline Landon
Journal:  Proc Natl Acad Sci U S A       Date:  2019-12-23       Impact factor: 11.205

3.  The Two Domains of the Avian Double-β-Defensin AvBD11 Have Different Ancestors, Common with Potential Monodomain Crocodile and Turtle Defensins.

Authors:  Nicolas Guyot; Céline Landon; Philippe Monget
Journal:  Biology (Basel)       Date:  2022-04-30
  3 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.