Simon Orozco-Arias1,2, Ana María Núñez-Rincón2, Reinel Tabares-Soto1, Diana López-Álvarez2,3.
Abstract
The co-occurrence of plant species is a fundamental aspect of plant ecology that contributes to understanding ecological processes, including the establishment of ecological communities, and has applications in biological conservation. A priori algorithms can be used to measure the co-occurrence of species whose spatial distribution is given by coordinates. We used 17 species of the genus Brachypodium, with records downloaded from the Global Biodiversity Information Facility data repository or obtained from bibliographical sources, to test an algorithm based on the spatial point process technique used by Silva et al. (2016), which generates association rules for co-occurrence analysis. Brachypodium spp. have emerged as an effective model for monocot species; they grow in different environments, latitudes, and elevations, and thereby represent a wide range of biotic and abiotic conditions that may be associated with adaptive natural genetic variation. We created seven datasets of two, three, four, six, seven, 15, and 17 species to test the algorithm with four different distances (1, 5, 10, and 20 km). Several measurements (support, confidence, lift, Chi-square, and p-value) were used to evaluate the quality of the results generated by the algorithm. No negative association rules were created in the datasets, while 95 positive co-occurrence rules were found for the datasets with six, seven, 15, and 17 species. Using 20 km in the dataset with 17 species, we found 16 positive co-occurrences involving five species, suggesting that these species coexist. These findings are corroborated by the results obtained in the dataset with 15 species, from which two species with broad distributions present in the previous dataset were removed, and which yielded seven positive co-occurrences. We found that B. sylvaticum has co-occurrence relations with several species, such as B. pinnatum, B. rupestre, B. retusum, and B. phoenicoides, due to its wide distribution in Europe, Asia, and North Africa.
We demonstrate the utility of the implemented algorithm for analyzing the co-occurrence of 17 species of the genus Brachypodium; the results agree with distributions observed in nature. Data mining has been applied in the biological sciences, where a great amount of complex and noisy data has been generated in recent years at an unprecedented scale. In particular, ecological data analysis represents an opportunity to explore and comprehend biological systems with data mining and bioinformatics tools.
Keywords: Association rules; Bioinformatics; Brachypodium; Co-occurrence analysis; Data mining
Year: 2019 PMID: 30656067 PMCID: PMC6336012 DOI: 10.7717/peerj.6193
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Basic parameter values used in the implementation of the algorithm.
| Variable | Value |
|---|---|
| Minimum support | 0.15 |
| Minimum confidence | 0.3 |
| Negative minimum lift | 1 |
| Positive minimum lift | 1 |
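The quality measures named in the abstract (support, confidence, lift, Chi-square, and p-value) can be illustrated for a single rule A -> C with a minimal Python sketch. It is not the authors' implementation: the function names, the toy transactions, and the placeholder labels `sp_A`, `sp_B`, and `sp_C` are hypothetical, and treating a rule as positive when its lift exceeds the threshold of 1 is an assumption about how the parameters in the table are applied. The Chi-square value is the standard statistic for a 2x2 contingency table (1 degree of freedom, no continuity correction).

```python
import math

# Thresholds taken from the parameter table above.
MIN_SUPPORT = 0.15
MIN_CONFIDENCE = 0.3
MIN_LIFT = 1.0


def rule_metrics(transactions, antecedent, consequent):
    """Return support, confidence, lift, chi-square and p-value for A -> C."""
    a, c = frozenset(antecedent), frozenset(consequent)
    n = len(transactions)
    if n == 0:
        raise ValueError("at least one transaction is required")

    n_a = sum(1 for t in transactions if a <= t)         # contain A
    n_c = sum(1 for t in transactions if c <= t)         # contain C
    n_ac = sum(1 for t in transactions if (a | c) <= t)  # contain A and C

    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0

    # Cells of the 2x2 contingency table (A present/absent x C present/absent).
    only_a = n_a - n_ac
    only_c = n_c - n_ac
    neither = n - n_a - n_c + n_ac
    denom = n_a * (n - n_a) * n_c * (n - n_c)
    chi2 = n * (n_ac * neither - only_a * only_c) ** 2 / denom if denom else 0.0
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))

    return {"support": support, "confidence": confidence, "lift": lift,
            "chi2": chi2, "p_value": p_value}


def is_positive_rule(m):
    """Assumed filter: keep rules above all thresholds, with lift > 1."""
    return (m["support"] >= MIN_SUPPORT
            and m["confidence"] >= MIN_CONFIDENCE
            and m["lift"] > MIN_LIFT)


# Toy transactions with placeholder species labels (not data from the study).
toy = [{"sp_A", "sp_B"}, {"sp_A", "sp_B"}, {"sp_A"}, {"sp_B", "sp_C"},
       {"sp_A", "sp_B", "sp_C"}, {"sp_C"}, {"sp_A", "sp_B"}, {"sp_C"}]
metrics = rule_metrics(toy, {"sp_A"}, {"sp_B"})
print(metrics, is_positive_rule(metrics))
```

In this toy example the rule `sp_A -> sp_B` has support 0.5, confidence 0.8, and lift 1.28, so it passes the assumed filter. The Chi-square p-value is reported separately, because the parameter table sets no threshold for it.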
Datasets specification using Brachypodium species.
| Dataset | Species | No. species | Total records |
|---|---|---|---|
| Dataset 1 | | 2 | 105,986 |
| Dataset 2 | | 3 | 325 |
| Dataset 3 | | 4 | 26,018 |
| Dataset 4 | | 6 | 177,685 |
| Dataset 5 | | 7 | 177,691 |
| Dataset 6 | | 15 | 35,697 |
| Dataset 7 | | 17 | 179,026 |
Description of the native distribution of the 17 Brachypodium species used in datasets.
| Name | Native distribution | Records |
|---|---|---|
| | Italy | 11 |
| | West Mediterranean | 8,908 |
| | Eurasia, SW Asia | 40,552 |
| | Mediterranean | 22,228 |
| | West Eurasia | 3,209 |
| | PanEurasia (Eurasia, Macaronesia) | 102,777 |
| | Circum-Mediterranean (Mediterranean, SW Asia) | 119 |
| | Circum-Mediterranean (Mediterranean, Macaronesia, SW Asia) | 40 |
| | Circum-Mediterranean (Mediterranean, Macaronesia, SW Asia) | 166 |
| | Macaronesia: Canary Islands (Spain) | 17 |
| | Spain: Betic mountain ranges (southern Spain) | 198 |
| | Tropical Africa and South Africa | 105 |
| | Taiwan | 107 |
| | Madagascar | 2 |
| | America (from Mexico to N Bolivia) | 533 |
| | South Africa | 49 |
| | East Mediterranean and SW Asia | 6 |
| Total | | 179,026 |
Results of all itemsets with four different distances.
| Itemsets | Distance (km) | Total transactions | Unique transactions | All rules | Positive rules |
|---|---|---|---|---|---|
| Dataset 1 (two species) | 1 | 4,666 | 2 | 2 | 0 |
| | 5 | 6,699 | 2 | 2 | 0 |
| | 10 | 10,878 | 2 | 2 | 0 |
| | 20 | 19,454 | 2 | 2 | 0 |
| Dataset 2 (three species) | 1 | 27 | 7 | 4 | 0 |
| | 5 | 39 | 7 | 4 | 0 |
| | 10 | 52 | 7 | 5 | 0 |
| | 20 | 109 | 9 | 5 | 0 |
| Dataset 3 (four species) | 1 | 943 | 2 | 2 | 0 |
| | 5 | 1,265 | 2 | 2 | 0 |
| | 10 | 1,826 | 2 | 2 | 0 |
| | 20 | 2,607 | 2 | 2 | 0 |
| Dataset 4 (six species) | 1 | 111,891 | 72 | 4 | 4 |
| | 5 | 128,843 | 72 | 4 | 4 |
| | 10 | 148,157 | 72 | 6 | 6 |
| | 20 | 160,731 | 71 | 16 | 16 |
| Dataset 5 (seven species) | 1 | 111,892 | 75 | 4 | 4 |
| | 5 | 128,848 | 77 | 4 | 4 |
| | 10 | 106,851 | 75 | 4 | 4 |
| | 20 | 160,735 | 86 | 16 | 16 |
| Dataset 6 (15 species) | 1 | 23,149 | 41 | 2 | 0 |
| | 5 | 26,039 | 77 | 2 | 0 |
| | 10 | 30,082 | 110 | 2 | 0 |
| | 20 | 32,490 | 121 | 7 | 7 |
| Dataset 7 (17 species) | 1 | 112,321 | 118 | 4 | 4 |
| | 5 | 129,566 | 207 | 4 | 4 |
| | 10 | 148,972 | 273 | 6 | 6 |
| | 20 | 161,475 | 281 | 16 | 16 |
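The dependence of the transaction counts on distance can be illustrated with a simplified Python sketch of how georeferenced records might be grouped into transactions. It assumes that each record yields one transaction containing every species recorded within the chosen radius, using great-circle (haversine) distance. This is an illustrative simplification, not the spatial point process procedure of Silva et al. (2016), and it is not expected to reproduce the counts in the table. The function names, placeholder species labels, and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def build_transactions(records, radius_km):
    """One transaction per record: the species found within radius_km of it.

    Brute-force O(n^2) neighbour search; a spatial index would be needed
    for datasets with many thousands of records.
    """
    transactions = []
    for _, lat_i, lon_i in records:
        nearby = frozenset(
            species for species, lat_j, lon_j in records
            if haversine_km(lat_i, lon_i, lat_j, lon_j) <= radius_km
        )
        transactions.append(nearby)
    return transactions


# Hypothetical records: (species label, latitude, longitude).
records = [
    ("sp_A", 40.00, -3.00),
    ("sp_B", 40.05, -3.00),  # roughly 5.6 km north of sp_A
    ("sp_C", 41.00, -3.00),  # roughly 111 km north of sp_A
]

for radius in (1, 5, 10, 20):
    tx = build_transactions(records, radius)
    print(radius, len(tx), len(set(tx)))  # radius, total, unique transactions
```

With these toy records, `sp_A` and `sp_B` share a transaction only once the radius reaches 10 km. Under this assumption, a larger radius lets more species combinations appear in the same transaction.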
Figure 1. Number of appearances of each species in the rules generated in dataset 7 using all distances.
Figure 2. Behavior of species in the rules generated in (A) dataset 6 and (B) dataset 7.
Figure 3. Frequency of transactions created for each species. Species with positive generated rules are presented with dotted lines.
Figure 4. Rules composition and complexity for datasets 6 and 7.
Figure 5. Co-occurrences found using a geographical distance of 20 km for the different species in (A) dataset 6 and (B) dataset 7.