Paul L Babb1,2, Matjaž Gregorič3, Nicholas F Lahens4, David N Nicholson5, Cheryl Y Hayashi6, Linden Higgins7, Matjaž Kuntner3,8, Ingi Agnarsson7,9, Benjamin F Voight1,2,4.
Abstract
Natural silks crafted by spiders comprise some of the most versatile materials known. Artificial silks, based on the sequences of their natural brethren, replicate some desirable biophysical properties and are increasingly utilized in commercial and medical applications today. To characterize the repertoire of protein sequences giving silks their biophysical properties and to determine the set of expressed genes across each unique silk gland contributing to the formation of natural silks, we report here draft genomic and transcriptomic assemblies of Darwin's bark spider, Caerostris darwini, an orb-weaving spider whose dragline is one of the toughest known biomaterials on Earth. We identify at least 31 putative spidroin genes, with expansion of multiple spidroin gene classes relative to the golden orb-weaver, Trichonephila clavipes. We observed substantial sharing of spidroin repetitive sequence motifs between species as well as new motifs unique to C. darwini. Comparative gene expression analyses across six silk gland isolates in females plus a composite isolate of all silk glands in males demonstrated gland- and sex-specific expression of spidroins, facilitating putative assignment of novel spidroin genes to classes. Broad expression of spidroins across silk gland types suggests that silks emanating from a given gland represent composite materials to a greater extent than previously appreciated. We hypothesize that the extraordinary toughness of C. darwini major ampullate dragline silk may relate to the unique protein composition of major ampullate spidroins, combined with the relatively high expression of stretchy flagelliform spidroins whose union into a single fiber may be aided by novel motifs and cassettes that act as molecule-binding helices. Our assemblies extend the catalog of sequences and sets of expressed genes that confer the unique biophysical properties observed in natural silks.
Year: 2022 PMID: 35666730 PMCID: PMC9170102 DOI: 10.1371/journal.pone.0268660
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
Summary statistics for the C. darwini genome and transcriptome assemblies.
| Metric | Meta-assembly (b) | Meta-assembly (c) |
|---|---|---|
| Genome | | |
| Genome size (a) | 1.81 Gb | |
| Assembly size | 1.53 Gb (1.51 Gb non-gap) | 1.45 Gb (1.43 Gb non-gap) |
| % genome captured | 84.9% | 80.4% |
| Number of contigs | 253,859 | 465,207 |
| N50 contig size | 87,597 bp | 94,553 bp |
| Number of scaffolds | 232,896 | 45,784 |
| N50 scaffold size | 453,395 bp | 489,784 bp |
| Largest scaffold | 4,645,134 bp | 4,645,134 bp |
| Scaffolds >100 kb | 3,082 | 3,082 |
| BUSCO % recovered (e) | 94.8% | 95.2% |
| Transcriptome | | |
| Read input | 6.19 × 10^8 reads | |
| Number of transcripts | 1,056,281 | |
| N50 transcript contig size | 911 bp | |
| BUSCO % recovered (e) | 97.5% | |
Statistics regarding construction of the draft meta-assembled genome and “all isolates” transcriptome:
(a) genome size estimate calculated based on k-mer frequency (K = 25 scale);
(b) gap-closed meta-assembly of AllPaths LG + SOAPdenovo2 + Platanus (minimum scaffold length = 100 bp);
(c) gap-closed meta-assembly of AllPaths LG + SOAPdenovo2 + Platanus (minimum scaffold length = 1,000 bp + 49 additional scaffolds containing BLAST hits for previously published spider spidroin gene sequences);
(d) unique QC-filtered paired and single reads remapped to assembly;
(e) completeness based upon matches to 2,058 I. scapularis BUSCO loci.
Additional genome assembly metrics are provided in and for transcriptome metrics.
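The N50 values in the table summarize assembly contiguity. As a minimal sketch of the standard definition (not the authors' pipeline), the N50 is the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the summed length. The function name `n50` and the toy lengths are illustrative assumptions.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Standard definition: sort lengths from longest to shortest and return
    the length at which the cumulative sum first reaches at least half of
    the total assembled length.
    """
    if not lengths:
        raise ValueError("n50() needs at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example with made-up lengths (not data from the paper).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_lengths)
```

Because the statistic depends on the total length, dropping short scaffolds (as in the 1,000 bp minimum-length assembly) lowers the total and can raise the N50 without changing any individual scaffold.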
Spidroin repeat motif summary for C. darwini and motif sharing with T. clavipes.
| Metric | Count (%) |
|---|---|
| Number of motif variant sequence types detected | 2771 |
| Types private to | 1493 (53.9) |
| Types private to | 403 (14.5) |
| Shared between C. darwini and T. clavipes | 875 (31.6) |
| Number of defined Motif Groups | 140 |
| Number of defined Motif Sub-groups | 302 |
| Number after variant descriptions collapsed | 246 |
| Total Motif Occurrences observed for n = 38 | 6950 |
| Total Motif Occurrences observed for n = 28 | 4074 |
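The percentages in the motif table can be checked from the raw counts. The short sketch below assumes the bracketed values are shares of the 2771 detected motif variant types; the dictionary keys are illustrative placeholders, since the species labels for the two "private" rows are not reproduced in the table above.

```python
# Counts copied from the motif table; the keys are placeholder labels.
type_counts = {
    "private_first_row": 1493,
    "private_second_row": 403,
    "shared": 875,
}

# The three categories should partition all detected motif variant types.
total_types = sum(type_counts.values())

# Share of each category as a percentage of all types, rounded to one decimal.
percent_of_types = {
    label: round(100 * count / total_types, 1)
    for label, count in type_counts.items()
}

for label, pct in percent_of_types.items():
    print(f"{label}: {type_counts[label]} ({pct}%)")
```

The three counts sum to the reported total of 2771, and the rounded shares reproduce the 53.9, 14.5, and 31.6 listed in the table.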