Mariana Buongermino Pereira1,2, Tobias Österlund1,2, K Martin Eriksson3,4, Thomas Backhaus2,3, Marina Axelson-Fisk1, Erik Kristiansson5,6.
Abstract
BACKGROUND: Integrons are genomic elements that mediate horizontal gene transfer by inserting and removing genetic material using site-specific recombination. Integrons are commonly found in bacterial genomes, where they maintain a large and diverse set of genes that plays an important role in adaptation and evolution. Previous studies have started to characterize the wide range of biological functions present in integrons. However, the efforts have so far mainly been limited to genomes from cultivable bacteria and amplicons generated by PCR, thus targeting only a small part of the total integron diversity. Metagenomic data, generated by direct sequencing of environmental and clinical samples, provides a more holistic and unbiased analysis of integron-associated genes. However, the fragmented nature of metagenomic data has previously made such analysis highly challenging.
Keywords: Antibiotic resistance; Functional annotation; Gene cassettes; Horizontal gene transfer; Integrons; Metagenomics; ORFans
Year: 2020 PMID: 32689930 PMCID: PMC7370490 DOI: 10.1186/s12864-020-06830-5
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Description of the computational pipeline used to detect attC sites in metagenomic data. Assembled metagenomic DNA sequences are used as input. Next, the gHMM-based HattCI is used to detect the attC sites present in the input sequences. Subsequently, the secondary structure of the detected attC sites is evaluated by a covariance model implemented in Infernal, which runs the search in its most sensitive mode. Identified attC sites on the same strand are considered to be part of the same integron when they are at most 4,000 nucleotides (nt) apart. Note that integrons with only one attC site are removed from the analysis in order to ensure a high true positive rate. Finally, ORFs are predicted upstream of the attC sites.
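The grouping step in the caption (same strand, at most 4,000 nt apart, single-site integrons discarded) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `AttCSite` record, the function name `group_into_integrons`, the requirement that sites share a contig, and the choice to measure distance from the end of one site to the start of the next are all assumptions made for the example.

```python
from collections import defaultdict
from typing import NamedTuple

MAX_GAP_NT = 4_000  # maximum distance between neighbouring attC sites (from the caption)


class AttCSite(NamedTuple):
    """Hypothetical record for one detected attC site."""
    contig: str
    strand: str  # "+" or "-"
    start: int   # 0-based, inclusive
    end: int     # exclusive


def group_into_integrons(sites, max_gap=MAX_GAP_NT):
    """Chain attC sites on the same contig and strand; drop single-site groups."""
    by_key = defaultdict(list)
    for site in sites:
        by_key[(site.contig, site.strand)].append(site)

    integrons = []
    for group in by_key.values():
        group.sort(key=lambda s: s.start)
        current = [group[0]]
        for site in group[1:]:
            # Assumption: distance = gap between the previous site's end and this site's start.
            if site.start - current[-1].end <= max_gap:
                current.append(site)
            else:
                if len(current) >= 2:  # keep only integrons with two or more attC sites
                    integrons.append(current)
                current = [site]
        if len(current) >= 2:
            integrons.append(current)
    return integrons
```

Under these assumptions, two sites 830 nt apart on the same strand would form one integron, while an isolated site further along the contig would be discarded.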
Size of each dataset in terms of assembled gigabases and number of sequences, together with the number of predicted attC sites and ORFs
| Dataset | Assembled gigabases | Number of sequences | Predicted attC sites (copies per million bases) | Predicted ORFs |
| --- | --- | --- | --- | --- |
| CAMERA [ | 66 | 179,126,552 | 354 (0.005) | 360 |
| MG-RAST [ | 13 | 7,881,749 | 5,377 (0.4) | 6,471 |
| NTenv (GenBank) [ | 87 | 86,661,686 | 5,094 (0.06) | 6,467 |
| EBI Metagenomics [ | 3 | 3,886,782 | 1,283 (0.4) | 1,668 |
| Tara Oceans [ | 61 | 57,540,959 | 2,746 (0.05) | 3,507 |
| Aquatic microbiome [ | 1 | 4,094,883 | 2 (0.002) | 2 |
| Marine biofilm2 | 3 | 2,046,453 | 1,440 (0.5) | 1,909 |
| Human gut [ | 10 | 6,589,348 | 2 (0.0002) | 2 |
| Human gut from diabetic patients [ | 2 | 891,652 | 2 (0.001) | 2 |
| Human gut from travelers [ | 18 | 20,555,914 | 14 (0.0008) | 14 |
| Elephant gut [ | 1 | 311,295 | 29 (0.03) | 41 |
| Corn and prairie crops soil [ | 2 | 4,944,181 | 29 (0.02) | 30 |
| Microbial fuel cells [ | 0.15 | 207,982 | 38 (0.3) | 42 |
| Subarctic microbiomes [ | 0.04 | 169,650 | 2 (0.05) | 2 |
1 In parentheses, copies per million bases.
2 Prepared by the authors.
3 Non-redundant hits.
4 Non-redundant hits. Amino acid sequences.
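The parenthesised densities in the dataset table appear to follow directly from the attC counts and the assembled sizes. A small sketch of that arithmetic, where the function name `copies_per_million_bases` is hypothetical and the gigabase values are the rounded figures from the table, so results are only approximate:

```python
def copies_per_million_bases(n_sites: int, assembled_gigabases: float) -> float:
    """attC sites per million assembled bases (1 gigabase = 1,000 megabases)."""
    return n_sites / (assembled_gigabases * 1_000)


# Rows taken from the table: (predicted attC sites, assembled gigabases)
rows = {
    "MG-RAST": (5_377, 13),
    "Tara Oceans": (2_746, 61),
    "Marine biofilm": (1_440, 3),
}

for dataset, (n_sites, gigabases) in rows.items():
    density = copies_per_million_bases(n_sites, gigabases)
    print(f"{dataset}: {density:.2g} attC sites per million bases")
```

Normalising by assembled size makes datasets of very different sizes comparable; for example, the marine biofilm data is far smaller than Tara Oceans but roughly ten times denser in attC sites.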
Fig. 2 Boxplots for (a) ORF length and (b) G/C content for the integron-associated genes identified in this study. For comparison, the corresponding data for three reference bacterial species have been included: Escherichia coli K-12, Staphylococcus aureus NCTC8325 and Bifidobacterium longum NCC2705. (c) Cluster analysis of the integron-associated genes. The x-axis shows the cluster threshold in sequence identity (a higher value corresponds to more homogeneous clusters) and the y-axis the number of clusters produced.
Fig. 3 Functional annotation, using COG functional categories, of the integron-associated genes (solid bars) and other genes found in metagenomes (striped bars). Of the 13,397 integron-associated genes in our catalog, 2,277 genes matched a COG with a known function. 116,259,264 ORFs were not associated with integrons in metagenomes, out of which 50,201,496 matched a COG with a known function. Percentages on the plot are given in relation to those numbers.
Fig. 4 Gene ontology analysis of the integron-associated genes using PFAM families. Out of the 13,397 integron-associated genes in our catalog, 3,488 matched a PFAM family with a known function, which were in turn mapped to the metagenomics GO slim. Not all PFAM families mapped to a GO term; as a result, 1,534 genes had a corresponding GO term. Level 1 terms were removed and those with at least 5 counts were kept (for the full list of GO terms and their counts, please see Additional file 3: Table S2).
Results from BLAST searches against the integron database INTEGRALL and against the antibiotic and metal resistance databases ResFinder and BacMet, respectively. The similarity thresholds used were 70% and 97%.
| Database | Hits at 70% similarity | Hits at 97% similarity |
| --- | --- | --- |
| INTEGRALL [ | 201 (1.5%) | 51 (0.38%) |
| ResFinder [ | 31 (0.23%) | 25 (0.19%) |
| BacMet [ | 7 (0.052%) | 4 (0.030%) |
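The percentages in this table are consistent with the hit counts being expressed relative to the 13,397 integron-associated genes mentioned in the figure captions. The sketch below reproduces that calculation; treating 13,397 as the denominator is an assumption inferred from the numbers, and `percent_of_catalog` is a hypothetical helper name.

```python
CATALOG_SIZE = 13_397  # integron-associated genes in the catalog (assumed denominator)

# Hit counts from the table: (70% similarity threshold, 97% similarity threshold)
hits = {
    "INTEGRALL": (201, 51),
    "ResFinder": (31, 25),
    "BacMet": (7, 4),
}


def percent_of_catalog(n_hits: int, catalog_size: int = CATALOG_SIZE) -> float:
    """Share of the gene catalog, in percent, that matched a database."""
    return 100 * n_hits / catalog_size


for database, (at_70, at_97) in hits.items():
    print(
        f"{database}: {percent_of_catalog(at_70):.2g}% at 70%, "
        f"{percent_of_catalog(at_97):.2g}% at 97%"
    )
```

Even at the looser 70% threshold, only a small share of the catalog matches known integron or resistance genes.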