Abstract
The purpose of this study is to construct a valid and rigorous model for detecting fraudulent financial statements (FFS). The sample comprises companies that issued fraudulent financial statements and companies that did not, between 2002 and 2013. In the first stage, two decision tree algorithms, classification and regression trees (CART) and the Chi-squared automatic interaction detector (CHAID), are applied to select the major variables. The second stage combines CART, CHAID, Bayesian belief network, support vector machine and artificial neural network to construct the FFS detection models. According to the results, the CHAID-CART model performs best, with an overall accuracy of 87.97% (the FFS detection accuracy is 92.69%).
Keywords: Artificial neural network; Bayesian belief network; Decision tree CART; Decision tree CHAID; Fraudulent financial statements; Support vector machine
Year: 2016 PMID: 26848429 PMCID: PMC4729758 DOI: 10.1186/s40064-016-1707-6
Source DB: PubMed Journal: Springerplus ISSN: 2193-1801
Fig. 1 DT concept diagram
Fig. 2 BBN concept diagram
Fig. 3 SVM concept diagram
Fig. 4 ANN concept diagram
Fig. 5 ANN neural cell
Research variables and definitions
| Variables | No. | Variable description | Definition/formula (the year before the year of fraud) |
|---|---|---|---|
| Financial variables | X1 | Accounts receivable ratio | Accounts receivable ÷ total assets |
| | X2 | Current assets ratio | Current assets ÷ total assets |
| | X3 | Fixed assets ratio | Fixed assets ÷ total assets |
| | X4 | Operating income to total assets | Operating income ÷ total assets |
| | X5 | Net income to total assets | Net income ÷ total assets |
| | X6 | Net income to fixed assets | Net income ÷ fixed assets |
| | X7 | The proportion of cash against total assets | Cash ÷ total assets |
| | X8 | Natural logarithm of total assets | ln total assets |
| | X9 | Natural logarithm of total liabilities | ln total liabilities |
| | X10 | Gross profit ratio | Gross profit ÷ net sales |
| | X11 | Operating expenses ratio | Operating expenses ÷ net sales |
| | X12 | Debt ratio | Total liabilities ÷ total assets |
| | X13 | Current ratio | Current assets ÷ current liabilities |
| | X14 | Quick ratio | Quick assets ÷ current liabilities |
| | X15 | Inventory turnover | Cost of goods sold ÷ average inventory |
| | X16 | Cash flow ratio | Operating cash flow ÷ current liabilities |
| | X17 | Pre-tax profit ratio | Pre-tax profit ÷ net sales |
| | X18 | Accounts receivable turnover | Net sales ÷ average accounts receivable |
| | X19 | Sales growth rate | (Current year’s sales − last year’s sales) ÷ last year’s sales |
| | X20 | Debt-to-equity ratio | Total liabilities ÷ total equity |
| | X21 | Returns on assets before tax, interest, and depreciation | Income before tax, interest and depreciation ÷ average total assets |
| | X22 | The ratio of current liabilities against total assets | Current liabilities ÷ total assets |
| | X23 | Total asset turnover | Net sales ÷ average total assets |
| Non-financial variables | X24 | The major stockholders’ stockholding ratio | Number of stocks held by the major shareholders ÷ total number of common stocks outstanding |
| | X25 | Duality of board director and CEO | 1 if the board director also serves as CEO; otherwise, 0 |
| | X26 | Size of the board of directors | Number of directors |
| | X27 | The ratio of pledged stocks held by directors and supervisors | Number of pledged stocks held by directors and supervisors ÷ number of stocks held by directors and supervisors |
| | X28 | The ratio of stocks held by directors and supervisors | Number of stocks held by directors and supervisors ÷ total number of common stocks outstanding |
| | X29 | Audited by BIG4 (the big four CPA firms) | 1 for companies audited by BIG4; otherwise, 0 |
| | X30 | Number of outside supervisors | Number of outside supervisors |
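To make the definitions concrete, the following minimal sketch computes a handful of the financial variables (current assets ratio, debt ratio, quick ratio, cash flow ratio and sales growth rate) from one set of statement figures. The `Statement` class, the `ratios` function, the dictionary keys and all numbers are hypothetical and chosen only for illustration; none of them come from the study's data.

```python
from dataclasses import dataclass


@dataclass
class Statement:
    """Hypothetical statement figures for one company-year (illustrative only)."""
    total_assets: float
    current_assets: float
    current_liabilities: float
    total_liabilities: float
    quick_assets: float
    operating_cash_flow: float
    net_sales: float
    prior_net_sales: float


def ratios(s: Statement) -> dict:
    """Compute a few variables using the formulas listed in the table."""
    return {
        "current_assets_ratio": s.current_assets / s.total_assets,
        "debt_ratio": s.total_liabilities / s.total_assets,
        "quick_ratio": s.quick_assets / s.current_liabilities,
        "cash_flow_ratio": s.operating_cash_flow / s.current_liabilities,
        "sales_growth_rate": (s.net_sales - s.prior_net_sales) / s.prior_net_sales,
    }


# Made-up figures, not taken from the paper.
example = Statement(
    total_assets=1000.0,
    current_assets=400.0,
    current_liabilities=200.0,
    total_liabilities=600.0,
    quick_assets=250.0,
    operating_cash_flow=50.0,
    net_sales=880.0,
    prior_net_sales=800.0,
)

for name, value in ratios(example).items():
    print(f"{name}: {value:.4f}")
```

Per the table header, the study measures each variable in the year before the year of fraud, so the inputs would be drawn from that earlier fiscal year.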
Fig. 6 Research design and procedure
Selection results of decision tree CART
| Variables | Variable importance |
|---|---|
| X16 (cash flow ratio) | 0.496 |
| X02 (current assets ratio) | 0.480 |
| X19 (sales growth rate) | 0.016 |
| X09 (natural logarithm of total liabilities) | 0.008 |
Selection results of decision tree CHAID
| Variables | Variable importance |
|---|---|
| X12 (debt ratio) | 0.2347 |
| X16 (cash flow ratio) | 0.2181 |
| X14 (quick ratio) | 0.2119 |
| X02 (current assets ratio) | 0.1337 |
| X21 (returns on assets before tax, interest, and depreciation) | 0.1282 |
| X11 (operating expenses ratio) | 0.0734 |
Detection accuracy of CART models—tenfold cross validation
| Model | FFS (%) | Non-FFS (%) | Overall accuracy (%) |
|---|---|---|---|
| CART–CART | 88.59 | 77.78 | 83.19 |
| CART–CHAID | 81.88 | 79.51 | 80.70 |
| CART–BBN | 77.18 | 73.22 | 75.20 |
| CART–SVM | 75.17 | 74.19 | 74.68 |
| CART–ANN | 75.84 | 74.16 | 75.00 |
Type I error and Type II error of CART models
| Model | Type I error rate (%) | Type II error rate (%) | Overall error rate (%) |
|---|---|---|---|
| CART–CART | 11.41 | 22.22 | 16.81 |
| CART–CHAID | 18.12 | 20.49 | 19.30 |
| CART–BBN | 22.82 | 26.78 | 24.80 |
| CART–SVM | 24.83 | 25.81 | 25.32 |
| CART–ANN | 24.16 | 25.84 | 25.00 |
Detection accuracy of CHAID models—tenfold cross validation
| Model | FFS (%) | Non-FFS (%) | Overall accuracy (%) |
|---|---|---|---|
| CHAID–CART | 92.69 | 83.24 | 87.97 |
| CHAID–CHAID | 79.19 | 71.37 | 75.28 |
| CHAID–BBN | 81.88 | 80.13 | 81.01 |
| CHAID–SVM | 79.87 | 78.23 | 79.05 |
| CHAID–ANN | 83.20 | 81.59 | 82.40 |
Type I error and Type II error of CHAID models
| Model | Type I error rate (%) | Type II error rate (%) | Overall error rate (%) |
|---|---|---|---|
| CHAID–CART | 7.31 | 16.76 | 12.03 |
| CHAID–CHAID | 20.81 | 28.63 | 24.72 |
| CHAID–BBN | 18.12 | 19.87 | 18.99 |
| CHAID–SVM | 20.13 | 21.77 | 20.95 |
| CHAID–ANN | 16.80 | 18.41 | 17.60 |
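The accuracy and error tables are arithmetically linked: each Type I error rate equals 100 minus the FFS accuracy, each Type II error rate equals 100 minus the non-FFS accuracy, and every reported overall accuracy matches the unweighted mean of the two class accuracies rounded half up. The last point would hold if the FFS and non-FFS groups were of equal size, which is an inference from the numbers rather than something stated here. The sketch below reproduces one row under that assumption; the function name `error_rates` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def error_rates(ffs_acc: str, non_ffs_acc: str):
    """Derive error rates (in %) from per-class accuracies given as strings.

    Assumes equal-sized FFS and non-FFS groups, so the overall accuracy is
    the simple mean of the two class accuracies (an inference, see above).
    """
    ffs = Decimal(ffs_acc)
    non_ffs = Decimal(non_ffs_acc)
    type_1 = 100 - ffs        # FFS cases that were not detected
    type_2 = 100 - non_ffs    # non-FFS cases flagged as FFS
    overall_acc = ((ffs + non_ffs) / 2).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )
    overall_err = 100 - overall_acc
    return type_1, type_2, overall_acc, overall_err


# CHAID-CART row: FFS 92.69 %, non-FFS 83.24 %
print(error_rates("92.69", "83.24"))
```

Decimal arithmetic is used so that the half-up rounding of values such as 87.965 is not disturbed by binary floating-point representation.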
The t-test of the models
| Model | CART | CHAID | df | P value |
|---|---|---|---|---|
| CART | 134.147*** | 155.612*** | 9 | 0.000 |
| CHAID | 190.738*** | 115.696*** | 9 | 0.000 |
| BBN | 117.247*** | 307.987*** | 9 | 0.000 |
| SVM | 117.391*** | 193.874*** | 9 | 0.000 |
| ANN | 169.994*** | 223.735*** | 9 | 0.000 |
* Significant at P < 0.1; ** significant at P < 0.05; and *** significant at P < 0.01
The Wilcoxon rank-sum test of the models
| Model | Mean rank (CART) | Mean rank (CHAID) | Z score | P value |
|---|---|---|---|---|
| CART | 7.1 | 13.9 | −2.5324 | 0.0057 |
| CHAID | 15.1 | 5.9 | 3.4395 | 0.0003 |
| BBN | 5.5 | 15.5 | −3.7418 | <0.0001 |
| SVM | 5.5 | 15.5 | −3.7418 | <0.0001 |
| ANN | 5.5 | 15.5 | −3.7418 | <0.0001 |
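The Z scores in this table can be recovered from the mean ranks alone. The sketch below assumes 10 observations per group (consistent with tenfold cross validation and with df = 9 in the t-test table), the normal approximation to the rank-sum statistic with a 0.5 continuity correction, and a one-sided P value. These are assumptions that happen to reproduce the reported figures, not a confirmed description of the authors' procedure. The function names are hypothetical.

```python
import math


def rank_sum_z(mean_rank_a: float, n_a: int = 10, n_b: int = 10) -> float:
    """Z score of group A's rank sum under the normal approximation.

    Assumes no tie correction and a 0.5 continuity correction toward zero.
    """
    n = n_a + n_b
    w = mean_rank_a * n_a                         # rank sum of group A
    mu = n_a * (n + 1) / 2                        # expected rank sum under H0
    sigma = math.sqrt(n_a * n_b * (n + 1) / 12)   # std. dev. under H0
    diff = w - mu
    if diff > 0:
        diff -= 0.5
    elif diff < 0:
        diff += 0.5
    return diff / sigma


def one_sided_p(z: float) -> float:
    """Upper-tail probability of |z| under the standard normal distribution."""
    return 0.5 * math.erfc(abs(z) / math.sqrt(2))


# Mean ranks of the CART-selected models, taken from the table.
for model, mean_rank in [("CART", 7.1), ("CHAID", 15.1), ("BBN", 5.5)]:
    z = rank_sum_z(mean_rank)
    print(model, round(z, 4), round(one_sided_p(z), 4))
```

A mean rank of 5.5 against 15.5 is the most extreme split possible with 10 values per group (ranks 1 to 10 versus 11 to 20), which is why the BBN, SVM and ANN rows share the same Z score.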