Carlos R Sanquetta, Jaime Wojciechowski, Ana P Dalla Corte, Alexandre Behling, Sylvio Péllico Netto, Aurélio L Rodrigues, Mateus N I Sanquetta.
Abstract
BACKGROUND: The traditional method for estimating tree biomass is allometry. In this method, models are tested and equations are fitted by regression, usually by ordinary least squares, although other analogous methods are also used. Owing to the nature of tree biomass data, the assumptions of regression are not always met, which introduces uncertainty into the inferences. This article demonstrates that the Data Mining (DM) technique can be used as an alternative to the traditional regression approach to estimate tree biomass in the Atlantic Forest, providing better results than allometry while being simple, versatile and flexible enough to apply to a wide range of conditions.
Year: 2015 PMID: 26250142 PMCID: PMC4528850 DOI: 10.1186/s12859-015-0662-5
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1. Statistical criteria of model selection applied to 180 individual-tree biomass observations of native trees of the Atlantic Forest, using Data Mining. (Distance measures: Chebyshev, Manhattan, squared Euclidean, Euclidean; weighting: ●: 1/d², ○: 1/d; all variables)
Fig. 2. Relationship between the statistical criteria for model selection applied to the individual-tree biomass data of native trees of the Atlantic Forest, Brazil (R² adj: adjusted coefficient of determination; Syx: standard error of estimate; AIC: Akaike Information Criterion; BIC: Bayesian Information Criterion)
Fig. 3. Statistical criteria of model selection applied to the individual-tree biomass data of native trees of the Atlantic Forest, using Data Mining: a - log-transformed data; b - reduced data size. (Distance measures: Chebyshev, Manhattan, squared Euclidean, Euclidean; weighting: ●: 1/d², ○: 1/d; all variables)
Model selection criteria for the Schumacher-Hall equation fitted to 180 individual-tree biomass observations of native species of the Atlantic Forest biome, Brazil
| R² adj. | Syx | AIC | BIC |
|---|---|---|---|
| 0.8082 | 10.4521 | 847.83 | 1792.14 |
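The AIC and BIC values in this table can be cross-checked against Syx. The sketch below assumes the least-squares forms AIC = n ln(SSres/n) + 2k and BIC = n ln(SSres) + k ln(n), a residual sum of squares back-calculated as SSres = Syx²(n − k), and k = 3 for the three coefficients of the Schumacher-Hall equation. None of these choices is stated in this record, and the function name is hypothetical.

```python
import math


def aic_bic_from_syx(syx, n, k):
    """Back-calculate SS_res from Syx, then compute AIC and BIC.

    Assumed forms (not confirmed by the source record):
      SS_res = Syx^2 * (n - k)
      AIC    = n * ln(SS_res / n) + 2 * k
      BIC    = n * ln(SS_res) + k * ln(n)
    """
    ss_res = syx ** 2 * (n - k)                    # residual sum of squares
    aic = n * math.log(ss_res / n) + 2 * k         # Akaike criterion
    bic = n * math.log(ss_res) + k * math.log(n)   # Bayesian criterion
    return ss_res, aic, bic


# Values taken from the table: Syx = 10.4521, n = 180; k = 3 is an assumption.
ss_res, aic, bic = aic_bic_from_syx(syx=10.4521, n=180, k=3)
print(f"SS_res = {ss_res:.1f}, AIC = {aic:.2f}, BIC = {bic:.2f}")
```

Under these assumptions both results land within a few hundredths of the tabulated 847.83 and 1792.14, which suggests, but does not prove, that these are the forms behind the table. Note that this BIC form omits the division by n inside the logarithm, so it differs from the more common form by the constant n ln(n) and ranks models identically when n is fixed.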
Fig. 4. Graphical analysis of residuals of the models applied to the individual-tree biomass data of native trees of the Atlantic Forest: a - Data Mining; b - Schumacher-Hall model
Fig. 5. Illustration of the DM technique using the nearest-neighbor distance
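The nearest-neighbor idea behind the DM technique can be sketched as a distance-weighted average over the k most similar trees, using the four distance measures and the two weightings (1/d and 1/d²) named in the figure legends. This is a minimal illustration, not the authors' implementation: the function names, the choice of predictors (diameter and height), the value of k and the sample values are all hypothetical.

```python
import math


def distance(a, b, metric="euclidean"):
    """Distance between two feature vectors under one of four measures."""
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if metric == "chebyshev":
        return max(diffs)
    if metric == "manhattan":
        return sum(diffs)
    if metric == "sq_euclidean":
        return sum(d * d for d in diffs)
    if metric == "euclidean":
        return math.sqrt(sum(d * d for d in diffs))
    raise ValueError(f"unknown metric: {metric}")


def knn_estimate(query, features, biomass, k=3, metric="euclidean", power=1):
    """Estimate biomass as the 1/d**power weighted mean of the k nearest trees."""
    ranked = sorted(
        (distance(query, f, metric), w) for f, w in zip(features, biomass)
    )[:k]
    # An exact match has zero distance; return its biomass to avoid dividing by 0.
    exact = [w for d, w in ranked if d == 0]
    if exact:
        return sum(exact) / len(exact)
    weights = [1.0 / d ** power for d, _ in ranked]
    return sum(wt * w for wt, (_, w) in zip(weights, ranked)) / sum(weights)


# Hypothetical reference trees: (diameter in cm, height in m) -> biomass in kg.
features = [(10.0, 8.0), (12.0, 9.0), (20.0, 15.0), (30.0, 18.0)]
biomass = [25.0, 40.0, 150.0, 400.0]

estimate = knn_estimate((11.0, 8.5), features, biomass, k=2, metric="manhattan")
print(estimate)
```

With power=2 the same call applies the 1/d² weighting, which gives closer neighbors more influence. In practice the predictors would usually be rescaled first so that no single variable dominates the distance.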
Criteria for model selection of individual tree biomass estimation in the Atlantic Forest biome, Brazil
| No. | Criterion | Formulation (equation no.) |
|---|---|---|
| 1 | Adjusted coefficient of determination | (7), with auxiliary definition (8) |
| 2 | Standard error of estimate | (9) |
| 3 | Akaike Information Criterion (Akaike […]) | (10) |
| | Akaike Information Criterion unbiased for small samples (a), used when … | (11) |
| 4 | Bayesian Information Criterion of Schwarz (Schwarz […]) | (12) |
| 5 | Residual analysis | (13) |
(a) According to Burnham and Anderson [45].
Where: n = number of cases; k = number of parameters of the model; ŵ = estimated biomass; w = observed biomass; w̄ = mean observed biomass.
In AIC, AICc and BIC, k must be increased by 1 to account for the degree of freedom used by the variance.
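The formulas for equations (7) to (13) are not reproduced in this record. A plausible reconstruction from the symbol definitions above is sketched below. Equations (7), (8), (9), (11) and (13) are the standard textbook forms and are assumptions here, as is the shorthand SS_res for the residual sum of squares. Equations (10) and (12) were chosen because, with n = 180 and k = 3, they approximately reproduce the AIC and BIC reported for the Schumacher-Hall equation (847.83 and 1792.14).

```latex
\begin{align}
R^2_{adj} &= 1 - \left(1 - R^2\right)\frac{n - 1}{n - k} \tag{7} \\
R^2 &= 1 - \frac{\sum_{i=1}^{n} (w_i - \hat{w}_i)^2}{\sum_{i=1}^{n} (w_i - \bar{w})^2} \tag{8} \\
S_{yx} &= \sqrt{\frac{\sum_{i=1}^{n} (w_i - \hat{w}_i)^2}{n - k}} \tag{9} \\
AIC &= n \ln\!\left(\frac{SS_{res}}{n}\right) + 2k,
  \qquad SS_{res} = \sum_{i=1}^{n} (w_i - \hat{w}_i)^2 \tag{10} \\
AIC_c &= AIC + \frac{2k(k + 1)}{n - k - 1} \tag{11} \\
BIC &= n \ln\left(SS_{res}\right) + k \ln n \tag{12} \\
e_i &= w_i - \hat{w}_i \tag{13}
\end{align}
```

Whether k in (7) and (9) counts the intercept, and whether the residuals in (13) were expressed in absolute or percentage terms, cannot be determined from this record.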