Suduan Chen, Yeong-Jia James Goo, Zone-De Shen.
Abstract
Fraudulent financial statements are becoming an increasingly serious problem, so building a valid model to forecast them has become an important question for academic research and financial practice. After screening the important variables with stepwise regression, the study builds classification models with logistic regression, a support vector machine (SVM), and a decision tree, and compares them. Both financial and nonfinancial variables are used to build the forecasting model. The research objects are companies that issued fraudulent and nonfraudulent financial statements between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the decision tree C5.0 gives the best classification performance, with a hit ratio of 85.71%.
Year: 2014 PMID: 25302338 PMCID: PMC4180392 DOI: 10.1155/2014/968712
Source DB: PubMed Journal: ScientificWorldJournal ISSN: 1537-744X
Figure 1: Train and test subsets design.
Figure 2: Research model.
Results of stepwise regression variable screening.
| Variable code | Variable classification | Variable description | Pr > ChiSq |
|---|---|---|---|
| | Financial | Accounts receivables/total assets | 0.2401 |
| | Financial | Inventory/current assets | 0.0339 |
| | Financial | Interest protection multiples | 0.0694 |
| | Financial | Debt ratio | 0.0294 |
| | Financial | Cash flow ratio | 0.0025 |
| | Financial | Accounts payable turnover | 0.0295 |
| | Financial | Operation profit/last year operation profit >1.1 | 0.0267 |
| | Nonfinancial | Pledge ratio of shares of the directors and supervisors | 0.0473 |
Hit ratio of the three models on the training datasets.
| Research model | C5.0 | Logistic | SVM |
|---|---|---|---|
| Hit ratio | 93.94% | 83.33% | 78.79% |
C5.0 cross-validation results.
| C5.0 model | Actual value | Predicted Non-FFS | Predicted FFS | Hit ratio | Type I error | Type II error |
|---|---|---|---|---|---|---|
| CV1 | Non-FFS | 25 | 3 | 83.93% | 10.71% | 21.42% |
| CV1 | FFS | 6 | 22 | | | |
| CV2 | Non-FFS | 25 | 3 | 87.50% | 10.71% | 14.28% |
| CV2 | FFS | 4 | 24 | | | |
| CV3 | Non-FFS | 25 | 3 | 85.71% | 10.71% | 17.85% |
| CV3 | FFS | 5 | 23 | | | |
| Average | Non-FFS | 25 | 3 | 85.71% | 10.71% | 17.85% |
| Average | FFS | 5 | 23 | | | |
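The three measures in this table can be recomputed from each fold's confusion matrix. The sketch below assumes, based on the reported numbers, that the Type I error is the share of actual Non-FFS cases predicted as FFS and the Type II error is the share of actual FFS cases predicted as Non-FFS. The function name `fold_metrics` is illustrative and does not come from the source.

```python
def fold_metrics(nn, nf, fn, ff):
    """Metrics (in percent) for one cross-validation fold.

    nn: actual Non-FFS predicted Non-FFS   nf: actual Non-FFS predicted FFS
    fn: actual FFS predicted Non-FFS       ff: actual FFS predicted FFS
    """
    total = nn + nf + fn + ff
    return {
        "hit_ratio": 100.0 * (nn + ff) / total,
        # Assumed definition: Non-FFS firms wrongly flagged as FFS.
        "type_i": 100.0 * nf / (nn + nf),
        # Assumed definition: FFS firms missed by the model.
        "type_ii": 100.0 * fn / (fn + ff),
    }

# C5.0 confusion matrices copied from the table above.
c50_folds = {
    "CV1": (25, 3, 6, 22),
    "CV2": (25, 3, 4, 24),
    "CV3": (25, 3, 5, 23),
}

c50_metrics = {fold: fold_metrics(*cells) for fold, cells in c50_folds.items()}
for fold, m in c50_metrics.items():
    print(fold, {k: round(v, 2) for k, v in m.items()})
```

The recomputed Type II errors differ from the table in the last digit (for example 6/28 rounds to 21.43% against the reported 21.42%), which is consistent with the table truncating rather than rounding.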
SVM cross-validation results.
| SVM model | Actual value | Predicted Non-FFS | Predicted FFS | Hit ratio | Type I error | Type II error |
|---|---|---|---|---|---|---|
| CV1 | Non-FFS | 26 | 2 | 73.21% | 7.14% | 46.42% |
| CV1 | FFS | 13 | 15 | | | |
| CV2 | Non-FFS | 26 | 2 | 71.43% | 7.14% | 50.00% |
| CV2 | FFS | 14 | 14 | | | |
| CV3 | Non-FFS | 26 | 2 | 71.43% | 7.14% | 50.00% |
| CV3 | FFS | 14 | 14 | | | |
| Average | Non-FFS | 26 | 2 | 72.02% | 7.14% | 48.81% |
| Average | FFS | 14 | 14 | | | |
Logistic regression cross-validation results.
| Logistic regression model | Actual value | Predicted Non-FFS | Predicted FFS | Hit ratio | Type I error | Type II error |
|---|---|---|---|---|---|---|
| CV1 | Non-FFS | 25 | 3 | 80.36% | 10.71% | 28.57% |
| CV1 | FFS | 8 | 20 | | | |
| CV2 | Non-FFS | 26 | 2 | 82.14% | 7.14% | 28.57% |
| CV2 | FFS | 8 | 20 | | | |
| CV3 | Non-FFS | 25 | 3 | 80.36% | 10.71% | 28.57% |
| CV3 | FFS | 8 | 20 | | | |
| Average | Non-FFS | 25 | 3 | 80.95% | 9.52% | 28.57% |
| Average | FFS | 8 | 20 | | | |
Summary of classification results.
| Model | Type I error | Type II error | Hit ratio | Ranking |
|---|---|---|---|---|
| Logistic regression | 9.52% | 28.57% | 80.95% | 2 |
| SVM | 7.14% | 48.81% | 72.02% | 3 |
| DT C5.0 | 10.71% | 17.85% | 85.71% | 1 |
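The summary hit ratios and the ranking can be checked against the per-fold results. This is a minimal sketch, assuming the summary hit ratio is the arithmetic mean of the three fold values and that models are ranked by that mean; the variable names are illustrative.

```python
# Per-fold hit ratios (%) copied from the three cross-validation tables.
hit_ratios = {
    "DT C5.0": [83.93, 87.50, 85.71],
    "Logistic regression": [80.36, 82.14, 80.36],
    "SVM": [73.21, 71.43, 71.43],
}

# Mean hit ratio per model across the three folds.
mean_hit = {model: sum(v) / len(v) for model, v in hit_ratios.items()}

# Higher mean hit ratio ranks first.
ranking = sorted(mean_hit, key=mean_hit.get, reverse=True)

for rank, model in enumerate(ranking, start=1):
    print(f"{rank}. {model}: {mean_hit[model]:.2f}%")
```

Under these assumptions the means agree with the hit ratio column above, and the order matches the ranking column.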
Paired-samples t test.
| Model | t | DF | Significance (two-tailed) |
|---|---|---|---|
| C5.0 − logistic | −5.201 | 2 | 0.35 |
| Logistic − SVM | −16.958 | 2 | 0.03 |
| SVM − C5.0 | 9.823 | 2 | 0.10 |
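The source does not say which per-fold measure the paired test was run on. The sketch below assumes it was the Type II error rate, because applying a standard paired-samples t test to the per-fold Type II errors from the cross-validation tables gives the same test statistics as the table. The two-tailed p-value uses the closed form of the Student t distribution that holds only for 2 degrees of freedom. The function names are illustrative.

```python
import math

# Per-fold Type II error rates (%) copied from the cross-validation tables.
TYPE_II = {
    "C5.0": [21.42, 14.28, 17.85],
    "Logistic": [28.57, 28.57, 28.57],
    "SVM": [46.42, 50.00, 50.00],
}

def paired_t(a, b):
    """Return (t, df) for a paired-samples t test on the differences a - b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n), n - 1

def two_tailed_p_df2(t):
    """Two-tailed p-value for Student's t with exactly 2 degrees of freedom.

    For df = 2 the CDF is 1/2 + t / (2 * sqrt(t^2 + 2)), so the
    two-tailed p-value is 1 - |t| / sqrt(t^2 + 2).
    """
    return 1.0 - abs(t) / math.sqrt(t * t + 2.0)

results = {}
for first, second in [("C5.0", "Logistic"), ("Logistic", "SVM"), ("SVM", "C5.0")]:
    t, df = paired_t(TYPE_II[first], TYPE_II[second])
    results[(first, second)] = (t, df, two_tailed_p_df2(t))
    print(f"{first} - {second}: t = {t:.3f}, df = {df}, p = {results[(first, second)][2]:.3f}")
```

Under these assumptions the three pairs give t of about -5.201, -16.958, and 9.823, with two-tailed p-values of roughly 0.035, 0.003, and 0.010.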
Selection of the research variables.
| Variable classification | Variable code | Variable description and computation |
|---|---|---|
| Financial variables | | Accounts receivables/total assets |
| Financial variables | | Gross profit/total assets |
| Financial variables | | Inventory/current assets |
| Financial variables | | Inventory/total assets |
| Financial variables | | Net profit after tax/total assets |
| Financial variables | | Net profit after tax/fixed assets |
| Financial variables | | Cash/total assets |
| Financial variables | | Log total assets |
| Financial variables | | Log total liabilities |
| Financial variables | | Interest protection multiples (debt service coverage ratio, times interest earned) |
| Financial variables | | Gross profit margin |
| Financial variables | | Operating expense ratio |
| Financial variables | | Debt ratio |
| Financial variables | | Inventory turnover |
| Financial variables | | Cash flow ratio |
| Financial variables | | Net profit ratio before tax |
| Financial variables | | Accounts payable turnover |
| Financial variables | | Revenue growth rate |
| Financial variables | | Debt/equity ratio |
| Financial variables | | Earnings before interest, taxes, depreciation, and amortization |
| Financial variables | | Current liabilities/total assets |
| Financial variables | | Total assets turnover |
| Financial variables | | Account receivable/last year accounts receivable >1.1 |
| Financial variables | | Operation profit/last year operation profit >1.1 |
| Nonfinancial variables | | Shareholding ratio of the major shareholders |
| Nonfinancial variables | | Shareholding ratio of directors and supervisors |
| Nonfinancial variables | | Whether the chairman concurrently holds the position of CEO |
| Nonfinancial variables | | Board size |
| Nonfinancial variables | | Pledge ratio of shares of the directors and supervisors |