R Henrik Nilsson, Andy F S Taylor, Rachel I Adams, Christiane Baschien, Patrik Cangren, Claudia Coleine, Sydney I Glassman, Yuuri Hirooka, Laszlo Irinyi, Wieland Meyer, Keith A Seifert, Frantisek Sklenář, Sung-Oui Suh, Richard Summerbell, Sten Svantesson, Michael Weiss, Joyce Hc Woudenberg, Silke Van den Wyngaert, Neriman Yilmaz, Urmas Kõljalg, Kessy Abarenkov.
Abstract
Recent DNA-based studies have shown that the built environment is surprisingly rich in fungi. These indoor fungi - whether transient visitors or more persistent residents - may hold clues to the rising levels of human allergies and other medical and building-related health problems observed globally. The taxonomic identity of these fungi is crucial in such pursuits. Molecular identification of the built mycobiome is no trivial undertaking, however, given the large number of unidentified, misidentified, and technically compromised fungal sequences in public sequence databases. In addition, the sequence metadata required to make informed taxonomic decisions - such as country and host/substrate of collection - are often lacking even from reference and ex-type sequences. Here we report on a taxonomic annotation workshop (April 10-11, 2017) organized at the James Hutton Institute/University of Aberdeen (UK) to facilitate reproducible studies of the built mycobiome. The 32 participants went through public fungal ITS barcode sequences related to the built mycobiome for taxonomic and nomenclatural correctness, technical quality, and metadata availability. A total of 19,508 changes - including 4,783 name changes, 14,121 metadata annotations, and the removal of 99 technically compromised sequences - were implemented in the UNITE database for molecular identification of fungi (https://unite.ut.ee/) and shared with a range of other databases and downstream resources. Among the genera that saw the largest number of changes were Penicillium, Talaromyces, Cladosporium, Acremonium, and Alternaria, all of them of significant importance in both culture-based and culture-independent surveys of the built environment.
Keywords: Indoor mycobiome; built environment; fungi; metadata; molecular identification; open data; sequence annotation; systematics; taxonomy
Year: 2018 PMID: 29559822 PMCID: PMC5804120 DOI: 10.3897/mycokeys.28.20887
Source DB: PubMed Journal: MycoKeys ISSN: 1314-4049 Impact factor: 2.984
Overview of genera. The 10 genera that saw the largest number of taxonomic changes during the workshop, plus the number of such changes.
| Genus | Number of changes |
|---|---|
| | 714 |
| | 601 |
| | 533 |
| | 372 |
| | 327 |
| | 196 |
| | 167 |
| | 136 |
| | 132 |
| | 106 |
| Total | 3284 |
Results of the taxonomic annotation part of the workshop. Name updates = number of sequences whose names were updated. RefS designations = number of reference sequences designated for individual SHs. Chimeras = number of chimeric sequences identified. Low read quality = number of sequences marked as being of substandard technical quality. The chimeras and the low read quality sequences were excluded from further use in UNITE (although kept in the system for future reference). Studies = number of distinct studies that saw at least one change to at least one sequence.
| | Name updates | RefS designations | Chimeras | Low read quality | Sum of changes | Studies |
|---|---|---|---|---|---|---|
| Sequences | 4783 | 505 | 5 | 94 | 5387 | 250 |
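
As a quick consistency check, the "Sum of changes" column can be recomputed from the four per-category counts in the table above. The following is a minimal Python sketch; the dictionary and variable names are illustrative and do not come from the original study.

```python
# Counts copied from the taxonomic annotation table above.
# The key names are hypothetical labels chosen for this sketch.
taxonomic_changes = {
    "name_updates": 4783,
    "refs_designations": 505,
    "chimeras": 5,
    "low_read_quality": 94,
}

# "Sum of changes" is the total over all four categories.
sum_of_changes = sum(taxonomic_changes.values())

# Sequences excluded from further use in UNITE:
# chimeras plus low read quality sequences.
excluded = taxonomic_changes["chimeras"] + taxonomic_changes["low_read_quality"]

print(sum_of_changes)  # 5387, matching the table
print(excluded)        # 99, matching the abstract
```

The 99 excluded sequences correspond to the "removal of 99 technically compromised sequences" reported in the abstract.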
Results of the metadata annotation part of the workshop, specified for the built mycobiome sequence set (BMS) and the outdoor mycobiome sequence set (OMS). Country and host of collection plus host association were assembled for both of these. The number of sequences processed, plus the number of underlying published and unpublished scientific studies, are also provided. For the BMS, the nine MIxS-BE annotation standard items targeted at the workshop are specified in separate columns. The sequence numbers shown in the table refer to the number of sequences annotated for each data item.
| | Number of sequences (annotated) | Number of different studies | Country of collection | Different countries | Host of collection | Different hosts | Host association | Comment |
|---|---|---|---|---|---|---|---|---|
| BMS | 924 (922) | 33 | 543 | 10 | 218 | 2 | 218 | 865 |
| OMS | 7657 (5264) | 218 | 4452 | 84 | 1524 | 275 | 1272 | 3181 |
| Both jointly | 8581 (6186) | 250 | 4995 | 84 | 1742 | 276 | 1490 | 4046 |

| | build_occup_type | space_typ_state | substructure_type | ventilation_type | indoor_space | indoor_surf | surf_material | surface-air contaminant | filter_type |
|---|---|---|---|---|---|---|---|---|---|
| BMS | 597 | 732 | 19 | 95 | 4 | 76 | 130 | 195 | 0 |
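
The metadata counts above can be tallied to see how they relate to the totals quoted in the abstract. The sketch below is an illustrative Python check. It assumes that each annotated data item counts as one metadata change; this is our assumption, although the arithmetic does reproduce the 14,121 metadata annotations in the abstract. The 5,387 figure is the "Sum of changes" from the taxonomic annotation table. All variable names are hypothetical.

```python
# "Both jointly" row: sequences annotated per general metadata item.
general_items = {
    "country_of_collection": 4995,
    "host_of_collection": 1742,
    "host_association": 1490,
    "comment": 4046,
}

# MIxS-BE items, annotated for the built mycobiome sequence set (BMS) only.
mixs_be_items = {
    "build_occup_type": 597,
    "space_typ_state": 732,
    "substructure_type": 19,
    "ventilation_type": 95,
    "indoor_space": 4,
    "indoor_surf": 76,
    "surf_material": 130,
    "surface-air contaminant": 195,
    "filter_type": 0,
}

# Assumption: one annotated data item = one metadata change.
metadata_annotations = sum(general_items.values()) + sum(mixs_be_items.values())

# "Sum of changes" from the taxonomic annotation table.
taxonomic_changes = 5387

total_changes = metadata_annotations + taxonomic_changes

print(metadata_annotations)  # 14121
print(total_changes)         # 19508
```

Under this assumption, the two workshop parts together account for the 19,508 changes reported in the abstract.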
Figure 1. Analysis of the built environment sequences for country of collection. Country centroids based on the geographical centres of contiguous country land masses are marked with bubbles of different size on the global map to indicate the number of built environment sequences originating from these countries as stated explicitly in the underlying INSDC records or as restored during the present effort and in Abarenkov et al. (2016) (57 distinct countries, sequence count ranging from 1 to 3,091). The figure is based on Abarenkov et al. (2016) plus the data added during the workshop, such that it indicates the scientific state of ITS-based Sanger-derived sequencing of the built mycobiome as of spring 2017.
Figure 3. Analysis of the MIxS-BE “building occupancy type” (type of building where the underlying sample was taken). The figure is based on Abarenkov et al. (2016) plus the data added during the workshop, such that it indicates the scientific state of ITS-based Sanger-derived sequencing of the built mycobiome as of spring 2017.