Lúcia Adriana Dos Santos Gruginskie, Guilherme Luís Roehe Vaccaro.
Abstract
The quality of a country's judicial system can be assessed by the overall duration of its lawsuits, or lead time. Excessive lead times can affect a country's economy, which has prompted measures such as the creation of the Saturn Center in Europe. Although performance indicators exist to measure the lead time of lawsuits, the analysis and fitting of prediction models remain underdeveloped themes in the literature. To contribute to this subject, this article compares different prediction models according to their accuracy, sensitivity, specificity, precision, and F1 measure. The database came from TRF4 (Tribunal Regional Federal da 4ª Região), a federal court in southern Brazil, and covers the 2nd Instance civil lawsuits completed in 2016. The models were fitted using support vector machine, naive Bayes, random forest, and neural network approaches with categorical predictor variables. The response variable was the lead time of the 2nd Instance judgment, measured in days and categorized in bands. The comparison showed that the support vector machine and random forest approaches produced better measurements than the other models. The models were evaluated using k-fold cross-validation similar to that applied to the test models.
Year: 2018 PMID: 29856787 PMCID: PMC5983432 DOI: 10.1371/journal.pone.0198122
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Related works on overall case length.
| Keyword | Support Vector Machine | Naive Bayes | Random Forests | Data Mining | Regression Analysis | Survival Analysis |
|---|---|---|---|---|---|---|
| Length of judicial proceedings | 0 | 0 | 0 | 2 | 4 | 0 |
| Court delay | 0 | 0 | 0 | 2 | 19 | 3 |
| Disposition time and court | 0 | 0 | 0 | 1 | 51 | 2 |
| Filings court time | 0 | 0 | 0 | 0 | 0 | 0 |
| Time to Court Case Resolution | 0 | 0 | 0 | 0 | 0 | 0 |
Source: The authors
Fig 1. Histogram of lead time.
Fig 2. Distribution of lead time by cabinet.
Fig 3. Distribution of lead time by lawsuit class.
Fig 4. (A) Justice type and lead time; (B) lawsuit type and lead time; (C) main plaintiff and lead time; (D) main defendant and lead time.
Fig 5. Distribution of lead time by party.
Model parameters.
| Approach | Parameters |
|---|---|
| Support Vector Machine | Function = Kernel, Gamma = 0.95, cost = 0.3 |
| Naive Bayes | No parameters |
| Random Forest | Number of variables = 50, number of trees = 500 |
| Neural Network | Hidden layers = 5, Initial weights in [-0.1, 0.1], decay = 0.0005, maximum number of iterations = 500 |
Source: The authors.
Measures of model performance in the test phase.
| Approach and Class | Accuracy | Sensitivity | Specificity | Precision | F1 |
|---|---|---|---|---|---|
| Support Vector Machine | 77.58 | 91.61 | 54.41 | 76.84 | 83.58 |
| Naive Bayes | 69.83 | 74.26 | 62.52 | 76.59 | 75.41 |
| Random Forests | 77.65 | 89.72 | 57.73 | 77.80 | 83.33 |
| Neural Network | 75.76 | 90.46 | 51.50 | 75.49 | 82.30 |
| Support Vector Machine | 78.79 | 42.85 | 89.35 | 54.18 | 47.86 |
| Naive Bayes | 71.17 | 47.07 | 78.25 | 38.87 | 42.58 |
| Random Forests | 78.29 | 43.67 | 88.47 | 52.67 | 47.75 |
| Neural Network | 76.20 | 40.67 | 86.64 | 47.22 | 43.70 |
| Support Vector Machine | 89.65 | 22.23 | 97.36 | 49.11 | 30.60 |
| Naive Bayes | 85.70 | 29.64 | 92.12 | 30.09 | 29.86 |
| Random Forests | 89.00 | 26.31 | 96.17 | 44.05 | 32.94 |
| Neural Network | 89.42 | 12.55 | 98.22 | 44.62 | 19.59 |
| Support Vector Machine | 97.67 | 58.49 | 99.62 | 88.43 | 70.39 |
| Naive Bayes | 96.72 | 36.39 | 99.72 | 86.49 | 51.23 |
| Random Forests | 97.46 | 57.05 | 99.47 | 84.19 | 68.01 |
| Neural Network | 97.60 | 55.52 | 99.70 | 90.07 | 68.70 |
Source: The authors.
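The per-class measures in the table above can be read as one-vs-rest statistics, in which each lead-time band in turn is treated as the positive class. The Python sketch below illustrates that calculation under this assumption. The function names `one_vs_rest_metrics` and `f1_score` and the 3x3 confusion matrix are hypothetical illustrations, not code or data from the study.

```python
def f1_score(precision, sensitivity):
    """Harmonic mean of precision and sensitivity (both given in percent)."""
    if precision + sensitivity == 0:
        return 0.0
    return 2 * precision * sensitivity / (precision + sensitivity)


def one_vs_rest_metrics(confusion, positive):
    """Percent metrics for one class of a square confusion matrix.

    Rows are actual classes and columns are predicted classes; the class at
    index `positive` is treated as positive and all others as negative.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    tp = confusion[positive][positive]
    fn = sum(confusion[positive]) - tp                       # missed positives
    fp = sum(confusion[r][positive] for r in range(n)) - tp  # false alarms
    tn = total - tp - fn - fp

    def pct(num, den):
        return 100.0 * num / den if den else float("nan")

    sensitivity = pct(tp, tp + fn)
    precision = pct(tp, tp + fp)
    return {
        "accuracy": pct(tp + tn, total),
        "sensitivity": sensitivity,
        "specificity": pct(tn, tn + fp),
        "precision": precision,
        "f1": f1_score(precision, sensitivity),
    }


# Hypothetical 3-band confusion matrix (illustrative values only).
example = [
    [80, 15, 5],
    [20, 25, 5],
    [5, 5, 40],
]
metrics_band_0 = one_vs_rest_metrics(example, positive=0)
```

As a consistency check, `f1_score(76.84, 91.61)` gives about 83.58, which matches the F1 reported in the first Support Vector Machine row of the table.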
Accuracy of the models in the test phase and the 10-fold cross-validation.
| Approach | Test phase | 10-fold cross-validation (mean ± standard deviation) |
|---|---|---|
| Support Vector Machine | 71.84 | 71.47 ± 0.55 |
| Naive Bayes | 61.71 | 60.64 ± 0.60 |
| Random Forests | 70.79 | 70.85 ± 0.52 |
| Neural Network | 69.49 | 69.78 ± 0.48 |
Source: The authors.
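The "mean ± standard deviation" column summarizes accuracy over the ten folds. A minimal sketch of how such a summary could be produced is shown below. The helper names, the shuffling seed, the use of the sample standard deviation, and the `evaluate` callback are illustrative assumptions rather than details taken from the paper.

```python
import random
import statistics


def kfold_indices(n_samples, k=10, seed=0):
    """Shuffle the indices 0..n_samples-1 and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, evaluate, k=10, seed=0):
    """Run k-fold cross-validation and return (mean, sample standard deviation).

    `evaluate(train_idx, test_idx)` is a caller-supplied function that fits a
    model on the training indices and returns its accuracy (%) on the test fold.
    """
    folds = kfold_indices(n_samples, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one forms the training set.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(evaluate(train_idx, test_idx))
    return statistics.mean(scores), statistics.stdev(scores)
```

Each observation is used for testing exactly once, so the spread of the ten fold scores gives a rough indication of how stable a model's accuracy is across different subsets of the data.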
Measures of performance of the models in the test phase and the 10-fold cross-validation (mean ± standard deviation).
| Approach | Accuracy | Sensitivity | Specificity | Precision |
|---|---|---|---|---|
| Support Vector Machine | 77.39 ± 0.45 | 91.17 ± 0.38 | 54.78 ± 0.98 | 76.78 ± 0.59 |
| Naive Bayes | 68.70 ± 0.44 | 73.29 ± 0.46 | 61.16 ± 0.95 | 75.58 ± 0.55 |
| Random Forests | 77.40 ± 0.36 | 89.49 ± 0.39 | 57.56 ± 0.90 | 77.57 ± 0.51 |
| Neural Network | 75.79 ± 0.49 | 91.83 ± 0.80 | 49.48 ± 2.58 | 74.90 ± 0.78 |
| Support Vector Machine | 78.62 ± 0.40 | 42.14 ± 1.29 | 89.22 ± 0.28 | 53.17 ± 0.74 |
| Naive Bayes | 70.53 ± 0.59 | 46.23 ± 1.11 | 77.60 ± 0.55 | 37.48 ± 0.87 |
| Random Forests | 78.10 ± 0.42 | 42.06 ± 1.08 | 88.57 ± 0.32 | 51.67 ± 0.91 |
| Neural Network | 77.25 ± 0.50 | 37.19 ± 2.46 | 88.89 ± 0.96 | 49.34 ± 1.05 |
| Support Vector Machine | 89.55 ± 0.35 | 24.74 ± 1.78 | 97.02 ± 0.21 | 48.98 ± 3.02 |
| Naive Bayes | 85.65 ± 0.38 | 29.52 ± 1.32 | 92.12 ± 0.44 | 30.21 ± 1.85 |
| Random Forests | 88.85 ± 0.35 | 28.43 ± 2.30 | 95.81 ± 0.25 | 43.90 ± 2.59 |
| Neural Network | 89.14 ± 0.29 | 15.77 ± 3.60 | 97.59 ± 0.44 | 42.74 ± 4.20 |
| Support Vector Machine | 97.38 ± 0.16 | 55.43 ± 1.76 | 99.60 ± 0.09 | 88.07 ± 2.63 |
| Naive Bayes | 96.39 ± 0.19 | 32.73 ± 1.57 | 99.76 ± 0.05 | 87.88 ± 2.44 |
| Random Forests | 97.24 ± 0.17 | 55.48 ± 1.59 | 99.44 ± 0.08 | 84.20 ± 1.82 |
| Neural Network | 97.37 ± 0.15 | 54.20 ± 2.12 | 99.65 ± 0.09 | 89.44 ± 2.47 |
Source: The authors.
Comparison between the approaches according to accuracy, specificity and F1 in the evaluation phase: results of Tukey's multiple comparisons test.
| Measure | Best performance |
|---|---|
| Class accuracy | SVM |
| Specificity | RF |
| F1 | SVM |
| Total accuracy | SVM (p value ≤ 0.05) |
Source: The authors.
* greater value