Lucie Kucerova, Michal Zurovec, Barbara Kludkiewicz, Miluse Hradilova, Hynek Strnad, Frantisek Sehnal.
Abstract
Seroins are small lepidopteran silk proteins known to possess antimicrobial activities. Several seroin paralogs and isoforms were identified in studied lepidopteran species and their classification required detailed phylogenetic analysis based on complete and verified cDNA sequences. We sequenced silk gland-specific cDNA libraries from ten species and identified 52 novel seroin cDNAs. The results of this targeted research, combined with data retrieved from available databases, form a dataset representing the major clades of Lepidoptera. The analysis of deduced seroin proteins distinguished three seroin classes (sn1-sn3), which are composed of modules: A (includes the signal peptide), B (rich in charged amino acids) and C (highly variable linker containing proline). The similarities within and between the classes were 31-50% and 22.5-25%, respectively. All species express one, and in exceptional cases two, genes per class, and alternative splicing further enhances seroin diversity. Seroins occur in long versions with the full set of modules (AB1C1B2C2B3) and/or in short versions that lack parts or the entire B and C modules. The classes and the modular structure of seroins probably evolved prior to the split between Trichoptera and Lepidoptera. The diversity of seroins is reflected in proposed nomenclature.
Year: 2019 PMID: 30846749 PMCID: PMC6405961 DOI: 10.1038/s41598-019-40401-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Simplified cladogram of lepidopteran superfamilies. The superfamilies represented in our RNA sequencing analysis are indicated by arrows on the right side. Based on Mitter et al.[15] and Regier et al.[23].
Affiliation of analyzed lepidopteran species to families and superfamilies.
| Species name and abbreviation | Superfamily | Family | Data origin* | Number of Sn1 seroins** | Number of Sn2 seroins** | Number of Sn3 seroins** |
|---|---|---|---|---|---|---|
| | Hepialoidea | Hepialidae | Illumina | 2 | 1 | ? |
| | Tineoidea | Tineidae | Illumina | 1 | ? | 1 |
| | Sesioidea | Castniidae | (GenBank data) | 1 | ? | ? |
| | Tortricoidea | Tortricidae | Roche 454 | 4 | ? | 1 |
| | Tortricoidea | Tortricidae | (GenBank data) | 2 | ? | ? |
| | Tortricoidea | Tortricidae | (GenBank data) | 1 | ? | ? |
| | Carposinoidea | Carposinidae | (GenBank data) | 1 | ? | ? |
| | Pyraloidea | Crambidae | Illumina | 4 | 2 | 1 |
| | Pyraloidea | Crambidae | (GenBank data) | 2 | ? | ? |
| | Pyraloidea | Pyralidae | Lepbase | 1 | 1 | 1 |
| | Pyraloidea | Pyralidae | (GenBank data) | 4 | 1 | 2 |
| | Pyraloidea | Pyralidae | Roche 454 | 4 | 1 | ? |
| | Pyraloidea | Pyralidae | Roche 454 | 4 | 3 | 1 |
| | Geometroidea | Geometridae | (GenBank data) | ? | ? | 1 |
| | Bombycoidea | Brahmeidae | Illumina | 2 | 3 | 1 |
| | Bombycoidea | Saturniidae | (GenBank data) | 3 | ? | ? |
| | Bombycoidea | Saturniidae | Roche 454 | 3 | 2 | 1 |
| | Bombycoidea | Saturniidae | (GenBank data) | 1 | ? | ? |
| | Bombycoidea | Bombycidae | (GenBank data) | 2 | 3 | 1 |
| | Bombycoidea | Bombycidae | (GenBank data) | 2 | ? | ? |
| | Noctuoidea | Noctuidae | (GenBank data) | ? | ? | 2 |
| | Noctuoidea | Noctuidae | (GenBank data) | 1 | 2 | 1 |
| | Noctuoidea | Noctuidae | Illumina/GenBank | 3 | 2 | 1 |
| | Noctuoidea | Noctuidae | Illumina | 3 | 2 | ? |
| | Noctuoidea | Noctuidae | (GenBank data) | 1 | 2 | 1 |
| | Noctuoidea | Noctuidae | (GenBank data) | ? | ? | 1 |
| | Noctuoidea | Noctuidae | (GenBank data) | ? | ? | 1 |
| | Papilionoidea | Papilionidae | (GenBank data) | 2 | 1 | 1 |
| | Papilionoidea | Papilionidae | (GenBank data) | 2 | 1 | 1 |
| | Papilionoidea | Papilionidae | (GenBank data) | 1 | 4 | 1 |
| | Papilionoidea | Pieridae | Lepbase | 3+ | 1 | 1 |
| | Papilionoidea | Nymphalidae | (GenBank data) | ? | 1 | 1 |
| | Papilionoidea | Nymphalidae | Lepbase | 1 | ? | ? |
The source of data and the numbers of cDNA forms of indicated seroin classes (listed from the ancestral to the derived superfamilies).
*New data were obtained by transcriptome sequencing using either Roche 454 pyrosequencing or the Illumina method. Additional data were drawn from GenBank or Lepbase, or were provided by the colleagues listed in the Acknowledgements.
**Numbers of seroins in each class. Question marks indicate that the absence of a class could be double checked by additional data screening or independent reverse transcription.
+The number of transcripts derived from the Seroin 1 gene in P. napi (PnSn1) is uncertain and is expected to be higher.
Classification of identified seroins and recognition of conserved splicing versions in class Sn1.
| Species | Class 1, N-terminal splicing version | Class 1, C-terminal splicing version | Class 1, long splicing version | Class 2 | Class 3 |
|---|---|---|---|---|---|
| | — | — | HcSn1A, HcSn1A2 | HcSn2 | ? |
| | TbSn1 | — | — | ? | TbSn3 |
| | ? | ? | TlSn1 | ? | ? |
| | CpSn1A | CpSn1B, CpSn1B2 | CpSn1C | ? | CpSn3 |
| | ? | CfSn1B | CfSn1A | ? | ? |
| | GmoSn1 | ? | ? | ? | ? |
| | ? | ? | CsSn1 | ? | ? |
| | OnSn1A, OnSn1A2 | OnSn1B | OnSn1C | OnSn2A, OnSn2B | OnSn3 |
| | ? | OfSn1B | OfSn1A | ? | ? |
| | PiSn1 | ? | ? | PiSn2 | PiSn3 |
| | — | — | AtSn1A, AtSn1B, AtSn1C, AtSn1D | AtSn2 | AtSn3A, AtSn3B |
| | — | AkSn1A | AkSn1B, AkSn1B2, AkSn1C | AkSn2 | ? |
| | GmSn1A | GmSn1B, GmSn1B2 | GmSn1C | GmSn2A, GmSn2B, GmSn2C | GmSn3 |
| | ? | ? | ? | ? | ObSn3 |
| | AeSn1A | AeSn1B | — | AeSn2A, AeSn2B, AeSn2C | AeSn3 |
| | SrSn1A, SrSn1C | SrSn1B | — | ? | ? |
| | AySn1A, AySn1C | AySn1B | — | AySn2A, AySn2B | AySn3 |
| | AmSn1 | ? | ? | ? | ? |
| | BmSn1-2 | BmSn1-1 | — | BmSn2A, BmSn2B, BmSn2C | BmSn3 |
| | BhSn1-2 | BhSn1-1 | — | ? | ? |
| | ? | ? | ? | ? | SeSn3A, SeSn3A2 |
| | ? | ? | SlSn1 | SlSn2A, SlSn2B | SlSn3 |
| | SliSn1A2, SliSn1A4 | SliSn1A | — | SliSn2A, SliSn2B | SliSn3 |
| | MbSn1B | MbSn1A, MbSn1A2 | — | MbSn2A, MbSn2B | ? |
| | ? | HaSn1 | ? | HaSn2A, HaSn2B | HaSn3 |
| | ? | ? | ? | ? | HasSn3 |
| | ? | ? | ? | ? | AsSn3 |
| | ? | ? | PxSn1A, PxSn1B | PxSn2 | PxSn3 |
| | ? | ? | PmSn1A, PmSn1B | PmSn2 | PmSn3 |
| | ? | ? | PpSn1 | PpSn2A, PpSn2B, PpSn2C, PpSn2D | PpSn3 |
| | ? | ? | PnSn1A, PnSn1B, PnSn1C | PnSn2 | PnSn3 |
| | ? | ? | ? | DpSn2 | DpSn3 |
| | ? | ? | HmSn1 | ? | ? |
*Seroins are unequivocally identified using the newly proposed nomenclature (see the section on seroin nomenclature). Question marks show that the available data might be insufficient for detection of the respective seroin.
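The names in the table follow a visible pattern: a species abbreviation, the literal `Sn`, a class number from 1 to 3, and an optional suffix (for example `GmSn2A`, `SliSn1A2`, `BmSn1-2`). A minimal Python sketch of splitting such names into their parts is given below. The regular expression is inferred only from the names listed above and is an assumption, not the authors' formal definition; the suffix is returned uninterpreted because its meaning is not specified here. The names `SeroinName` and `parse_seroin_name` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from the names in the table (an assumption, not an official grammar):
# species abbreviation (1 capital + 1-2 lowercase letters), "Sn", class digit 1-3,
# then an optional suffix such as "A", "A2" or "-2".
SEROIN_NAME = re.compile(
    r"^(?P<species>[A-Z][a-z]{1,2})Sn(?P<cls>[1-3])(?P<suffix>[A-Z]\d*|-\d+)?$"
)


class SeroinName(NamedTuple):
    species: str            # species abbreviation, e.g. "Gm"
    seroin_class: int       # 1, 2 or 3
    suffix: Optional[str]   # uninterpreted variant label, or None


def parse_seroin_name(name: str) -> SeroinName:
    """Split a seroin name such as 'GmSn2A' into species, class and suffix."""
    # Normalise an en dash to a plain hyphen so both spellings are accepted.
    cleaned = name.strip().replace("\u2013", "-")
    match = SEROIN_NAME.match(cleaned)
    if match is None:
        raise ValueError(f"not a recognised seroin name: {name!r}")
    return SeroinName(match["species"], int(match["cls"]), match["suffix"])


if __name__ == "__main__":
    for example in ["GmSn2A", "SliSn1A2", "BmSn1-2", "TbSn3"]:
        print(example, parse_seroin_name(example))
```

Grouping parsed names by `(species, seroin_class)` would give per-class counts of the kind tabulated above, provided every name in the input follows the assumed pattern.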
Figure 2. Comparison of the domain organization and major splice isoforms of Sn1, Sn2 and Sn3 proteins. Seroins occur in long versions with the full set of modules (AB1C1B2C2B3) and/or in short versions that lack parts or the entire B and C modules. The modules are indicated by colored boxes. Three major splicing isoforms are generated in Sn1, four isoforms in Sn2 and two isoforms in Sn3. The alignment depicts the presence of individual modules in different isoforms.
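The module notation used in the caption (AB1C1B2C2B3 for the long version) can be treated as a simple string encoding. The Python sketch below splits such a layout string into modules and lists which modules of the full set are absent. It is only an illustration of the notation: the helper names are hypothetical, and the short layouts in the example (`"AB3"`, `"A"`) are made up for demonstration and are not isoforms reported in the source.

```python
import re
from typing import List

# Full module set of the long version, as written in the caption: AB1C1B2C2B3.
FULL_SET: List[str] = ["A", "B1", "C1", "B2", "C2", "B3"]


def split_modules(layout: str) -> List[str]:
    """Split a layout string such as 'AB1C1B2C2B3' into its modules."""
    modules = re.findall(r"A|[BC]\d", layout)
    # Reject strings containing anything other than A, Bn or Cn tokens.
    if "".join(modules) != layout:
        raise ValueError(f"unrecognised module layout: {layout!r}")
    return modules


def missing_modules(layout: str) -> List[str]:
    """Return the modules of the full set that are absent from the layout."""
    present = set(split_modules(layout))
    return [module for module in FULL_SET if module not in present]


def is_long_version(layout: str) -> bool:
    """True if the layout carries the complete module set in order."""
    return split_modules(layout) == FULL_SET


if __name__ == "__main__":
    # "AB3" and "A" are hypothetical short layouts used only for illustration.
    for layout in ["AB1C1B2C2B3", "AB3", "A"]:
        print(layout, is_long_version(layout), missing_modules(layout))
```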
Comparison of the number of reads recognized as class specific in several silk gland transcriptomes.
| Sample (cDNA) | Total number of reads for the library | Sn1 reads | Sn2 reads | Sn3 reads |
|---|---|---|---|---|
| | 68365 | 1083 | 97 | 3 |
| | 76492 | 3428 | 154 | 12 |
| | 81015 | 4954 | 300 | 40 |
| | 120793 | 6919 | 6 | 701 |
| | 76434 | 214 | 0 | 8 |
| | 76439 | 346 | 1 | 0 |
| | 25429 | 60 | 1 | 1 |
| | 32551 | 6 | 3 | 1 |
*Reads were identified by TBLASTN search; the query sequence was the 80 C-terminal amino acids and the e-value threshold was set to e−20. Penn inst: penultimate-instar larvae; Wander st.: post-feeding, wandering last-instar larvae; Prepupa: apolyzing last-instar larvae (initial phase of pupation).
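The read-counting step described in the footnote can be sketched in Python as follows. The sketch makes several assumptions not stated in the source: that the TBLASTN hits are available as tab-separated lines in the standard 12-column tabular layout (query ID first, subject ID second, e-value in the eleventh column), that a hit passes when its e-value is at or below the cutoff, and that each read is counted once per query. The query and read identifiers are invented for illustration. The second helper expresses a class count as a percentage of the library size, which makes libraries of different depth comparable; the library total and Sn1 count used in the example are taken from the first row of the table.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set

E_VALUE_CUTOFF = 1e-20  # threshold quoted in the table footnote


def reads_per_class(hit_lines: Iterable[str],
                    cutoff: float = E_VALUE_CUTOFF) -> Dict[str, int]:
    """Count distinct reads (subject IDs) per query that pass the e-value cutoff.

    Assumes 12-column tab-separated hits: query ID, subject ID, ..., e-value (column 11).
    """
    reads: Dict[str, Set[str]] = defaultdict(set)
    for line in hit_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank and comment lines
        fields = line.rstrip("\n").split("\t")
        query_id, read_id, e_value = fields[0], fields[1], float(fields[10])
        if e_value <= cutoff:  # inclusive comparison is an assumption
            reads[query_id].add(read_id)
    return {query: len(ids) for query, ids in reads.items()}


def percent_of_library(class_reads: int, total_reads: int) -> float:
    """Express a class-specific read count as a percentage of all reads."""
    return 100.0 * class_reads / total_reads


if __name__ == "__main__":
    # Invented hits, for illustration only.
    hits = [
        "Sn1_query\tread_001\t62.5\t80\t30\t0\t1\t80\t5\t244\t3e-25\t101",
        "Sn1_query\tread_002\t40.0\t80\t48\t0\t1\t80\t5\t244\t2e-08\t50.1",
        "Sn3_query\tread_003\t70.0\t80\t24\t0\t1\t80\t5\t244\t1e-31\t120",
    ]
    print(reads_per_class(hits))
    # First table row: 1083 Sn1 reads in a library of 68365 reads.
    print(round(percent_of_library(1083, 68365), 3))
```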
Figure 3. Simplified family-level phylogeny of seroins. The full-size ML phylogenetic tree is shown in Fig. S5A. Broadened horizontal lines indicate that more than one species from the respective family is included. Branch support calculated by the aBayes test is shown next to the branches; only values higher than 50 are presented.