Zhigang Sun, Aiping Jiang, Guotao Wang, Min Zhang, Huizhen Yan.
Abstract
Existing work on material identification for loose particles inside sealed relays focuses on selecting and optimizing classification algorithms and overlooks the features in the material dataset. In this paper, we propose a feature optimization method for material identification of loose particles inside sealed relays. First, for the missing value problem, multiple methods were used to process the material dataset. Comparing the identification accuracy achieved by a Random-Forest-based classifier (RF classifier) on the resulting datasets showed that direct discarding was the best-performing method. Second, for the uneven data distribution problem, multiple standardization and normalization methods were applied to the material dataset. Comparing the resulting identification accuracies showed that min-max standardization was the best-performing method. Then, for the feature selection problem, a multi-index-fusion feature selection method was designed, and its advantage over the compared methods was verified through several tests. Test results show that the identification accuracy achieved by the RF classifier on the dataset improved from 59.63% to 63.60%. Test results on ten material verification datasets show that the identification accuracies achieved by the RF classifier improved in every case, with an average improvement of 3.01%. This work advances loose particle material identification and supplements existing loose particle detection research. It is also the highest loose particle material identification accuracy achieved to date in aerospace engineering, which has practical value for improving the reliability of aerospace systems. In principle, the method can also be applied to feature optimization in machine learning more generally.
Keywords: RF classifier; feature selection; material identification; missing value processing; sealed relays; standardization and normalization processing
Year: 2022 PMID: 35591257 PMCID: PMC9102643 DOI: 10.3390/s22093566
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
Figure 1. Block diagram of the loose particle material identification experimental system.
Figure 2. DZJC-III PIND loose particle automatic detection system.
Model of the sealed relay samples, material information, and weight information about the contained loose particles.
| Model One | | Model Two | | Model Three | |
|---|---|---|---|---|---|
| Material | Weight | Material | Weight | Material | Weight |
| Copper wires | 0.04 mg | Solder particles | 0.03 mg | Aluminum particles | 0.03 mg |
| | 0.08 mg | | 0.07 mg | | 0.06 mg |
| | 0.11 mg | | 0.11 mg | | 0.10 mg |
| | 0.15 mg | | 0.15 mg | | 0.15 mg |
| | 0.19 mg | | 0.19 mg | | 0.19 mg |
| Hot glue particles | 0.03 mg | PVC particles | 0.03 mg | Silica gel particles | 0.6 mg |
| | 0.08 mg | | 0.07 mg | | 0.9 mg |
| | 0.12 mg | | 0.11 mg | | 1.2 mg |
| | 0.15 mg | | 0.15 mg | | 1.4 mg |
| | 0.16 mg | | 0.19 mg | | 1.6 mg |
Figure 3. Sealed relay samples. (a) Sealed relay samples of model one; (b) sealed relay samples of model two; (c) sealed relay samples of model three.
Experimental conditions specified according to the Chinese GJB65B standard.
| Impact Acceleration | Vibration Frequency | Vibration Acceleration |
|---|---|---|
| 200 g | 27 Hz | 5 g |
| 200 g | 40 Hz | 5 g |
| 200 g | 100 Hz | 5 g |
Detailed description of loose particle features.
| Feature Name | Feature Description | Symbolic Representation |
|---|---|---|
| Pulse area | Area of the pulse signal. | |
| Degree of symmetry between left and right | Initiation process of the signal, characterized by the symmetry angle. | |
| Onset speed of the signal from the perspective of the pulse rising speed | Initiation velocity of the signal, characterized by the pulse rise percentage. | |
| Duration | Difference between the start time and the end time. | |
| Energy density | Measure of the distribution characteristics of the signal energy. | |
| Pulse ratio | Ratio of pulse duration to pulse length. | |
| Crest factor | Extreme degree of the peak value in the waveform. | |
| Degree of symmetry between upper and lower | Ratio of rise time to fall time. | |
| Area ratio | Area ratio of the spectrum. | |
| Zero-crossing rate | Number of sign changes in a section of signal; its magnitude is related to the frequency of the signal. | |
| Variance | Measures the difference between each variable and the overall mean. | |
| Spectral centroid | Describes the spectral distribution and characterizes the frequency of the loose particle signal. | |
| Cepstral coefficient | Obtained from the square root of the mean square frequency; describes the energy spectrum. | |
| Cepstral coefficient difference | Obtained from the square root of the frequency variance; also describes the energy spectrum. | |
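The symbolic definitions of the features are not preserved in this record, so the sketch below uses common textbook definitions of three of them (zero-crossing rate, crest factor, and variance) purely as an illustration. The function names and the sample values in `pulse` are hypothetical, and the paper's own formulas may differ.

```python
import math

def zero_crossing_rate(x):
    """Fraction of adjacent sample pairs whose signs differ (assumed definition)."""
    crossings = sum(1 for a, b in zip(x, x[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(x) - 1)

def crest_factor(x):
    """Peak absolute amplitude divided by the RMS value (assumed definition)."""
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    return max(abs(v) for v in x) / rms

def variance(x):
    """Population variance: mean squared deviation from the mean."""
    mu = sum(x) / len(x)
    return sum((v - mu) ** 2 for v in x) / len(x)

# Hypothetical pulse samples, not taken from the paper's dataset.
pulse = [0.0, 1.0, -1.0, 1.0, -1.0]
zcr = zero_crossing_rate(pulse)
cf = crest_factor(pulse)
var = variance(pulse)
```

A faster-oscillating pulse yields a higher zero-crossing rate, which is why the table relates that feature to signal frequency.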
Description of missing values in the material dataset.
| Label | Total Number of Data Points | Number of Missing Values |
|---|---|---|
| 0 | 175,416 | 231 |
| 1 | 169,943 | 243 |
| 2 | 177,936 | 186 |
| 3 | 168,796 | 84 |
| 4 | 173,105 | 208 |
| 5 | 174,580 | 157 |
Prediction accuracies achieved by the RF classifier on six datasets.
| Number | Mean/% | Median/% | Mode/% | Lagrange Interpolation/% | Newton Interpolation/% | Direct Discarding/% |
|---|---|---|---|---|---|---|
| 1 | 57.92 | 58.02 | 57.90 | 59.45 | 59.32 | 59.83 |
| 2 | 58.05 | 57.85 | 57.89 | 59.51 | 59.36 | 59.06 |
| 3 | 58.01 | 57.88 | 57.84 | 59.43 | 59.29 | 59.75 |
| 4 | 57.92 | 58.01 | 57.86 | 59.46 | 59.51 | 59.99 |
| 5 | 57.96 | 57.87 | 57.83 | 59.45 | 59.42 | 59.37 |
| 6 | 58.10 | 57.89 | 57.91 | 59.46 | 59.36 | 59.64 |
| 7 | 58.05 | 57.91 | 57.82 | 59.50 | 59.47 | 59.52 |
| 8 | 57.99 | 57.93 | 57.89 | 59.51 | 59.41 | 59.70 |
| 9 | 58.12 | 57.89 | 57.82 | 59.47 | 59.37 | 59.94 |
| 10 | 58.08 | 57.85 | 57.84 | 59.46 | 59.49 | 59.50 |
| Mean value | 58.02 | 57.91 | 57.86 | 59.47 | 59.40 | 59.63 |
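To make the difference between two of the compared strategies concrete, the following is a minimal sketch of direct discarding and mean filling on a toy dataset. The `rows` data and both function names are hypothetical; the paper's actual processing code is not shown in this record.

```python
from statistics import mean

# Hypothetical samples (one row per sample); None marks a missing value.
rows = [
    [1.0, 10.0],
    [2.0, None],
    [3.0, 30.0],
]

def discard_missing(rows):
    """Direct discarding: drop every sample that has at least one missing value."""
    return [r for r in rows if all(v is not None for v in r)]

def fill_with_mean(rows):
    """Mean filling: replace each missing value with the mean of its column."""
    columns = list(zip(*rows))
    col_means = [mean(v for v in col if v is not None) for col in columns]
    return [[m if v is None else v for v, m in zip(r, col_means)] for r in rows]

discarded = discard_missing(rows)
filled = fill_with_mean(rows)
```

Because the missing-value table reports only a few hundred missing values per label against roughly 170,000 data points, discarding removes very little data, which is consistent with it performing at least as well as the filling methods here.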
Prediction accuracies achieved by the RF classifier on the three datasets.
| Number | z-Score Standardization/% | Min–Max Standardization/% | Row Normalization/% |
|---|---|---|---|
| 1 | 63.86 | 63.96 | 63.00 |
| 2 | 63.95 | 63.99 | 62.89 |
| 3 | 63.27 | 63.40 | 63.01 |
| 4 | 63.37 | 63.31 | 62.94 |
| 5 | 63.45 | 63.38 | 63.01 |
| 6 | 63.31 | 63.41 | 63.01 |
| 7 | 63.34 | 63.27 | 63.17 |
| 8 | 63.42 | 63.51 | 62.93 |
| 9 | 63.40 | 63.41 | 62.97 |
| 10 | 63.40 | 63.41 | 63.08 |
| Mean value | 63.48 | 63.51 | 63.00 |
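The three scaling methods compared above can be sketched as follows. This is an illustration under stated assumptions: z-score uses the population standard deviation, and row normalization is taken to mean scaling each sample to unit Euclidean (L2) length, which this record does not confirm. The function names and input values are hypothetical.

```python
import math
from statistics import mean, pstdev

def z_score(column):
    """Shift a feature column to zero mean and scale it to unit standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    return [(v - mu) / sigma for v in column]

def min_max(column):
    """Rescale a feature column linearly onto the interval [0, 1]."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]

def row_normalize(row):
    """Scale one sample (row) to unit L2 norm (assumed meaning of row normalization)."""
    norm = math.sqrt(sum(v * v for v in row))
    return [v / norm for v in row]

# Hypothetical inputs, not taken from the material dataset.
scaled = min_max([2.0, 4.0, 6.0])
standardized = z_score([2.0, 4.0, 6.0])
unit_row = row_normalize([3.0, 4.0])
```

The first two operate per feature (column), while row normalization operates per sample, so it changes the relative scale of features within each sample rather than equalizing feature ranges across the dataset.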
Figure 4. Heat map of the material dataset.
Feature selection effects of the material dataset.
| Feature | Pcc 1 | Ranking Number (by Absolute Value of Pcc) | | Ranking | Cumulative Sum of Rankings | Comprehensive Ranking |
|---|---|---|---|---|---|---|
| | −0.0701 | 4 | 0 | 1 | 5 | 4 |
| | 0.0574 | 9 | 0 | 1 | 10 | 9 |
| | 0.0684 | 5 | 0 | 1 | 6 | 5 |
| | 0.0636 | 7 | 0 | 1 | 8 | 7 |
| | −0.1945 | 3 | 0 | 1 | 4 | 3 |
| | 0.0636 | 7 | 0 | 1 | 8 | 7 |
| | −0.0062 | 14 | 4.325047 × 10⁻⁷ | 14 | 28 | 14 |
| | −0.0137 | 13 | 1.521347 × 10⁻²⁸ | 13 | 26 | 13 |
| | 0.0392 | 10 | 1.666064 × 10⁻²²¹ | 10 | 20 | 10 |
| | −0.2025 | 2 | 0 | 1 | 3 | 2 |
| | −0.0317 | 12 | 6.039703 × 10⁻¹⁴⁵ | 12 | 24 | 12 |
| | −0.2077 | 1 | 0 | 1 | 2 | 1 |
| | −0.0331 | 11 | 2.512781 × 10⁻¹⁵⁸ | 11 | 22 | 11 |
| | 0.0641 | 6 | 0 | 1 | 7 | 6 |
1 Pcc: Pearson correlation coefficient.
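The ranking arithmetic in this table can be reproduced with a short sketch: rank the features by the absolute value of Pcc, add the ranking from the second index, and rank the sums. The Pcc values and second-index rankings below are copied from the table as reported; the helper `competition_rank` and the tie rule (tied values share the smallest rank) are assumptions chosen because they match the reported numbers, not a description of the authors' implementation.

```python
# Pcc values and second-index rankings, copied row by row from the table.
pcc = [-0.0701, 0.0574, 0.0684, 0.0636, -0.1945, 0.0636, -0.0062,
       -0.0137, 0.0392, -0.2025, -0.0317, -0.2077, -0.0331, 0.0641]
second_rank = [1, 1, 1, 1, 1, 1, 14, 13, 10, 1, 12, 1, 11, 1]

def competition_rank(values, reverse=False):
    """Rank 1 is best; tied values share the smallest rank they span."""
    ordered = sorted(values, reverse=reverse)
    return [ordered.index(v) + 1 for v in values]

# A larger absolute correlation gets a better (smaller) rank.
pcc_rank = competition_rank([abs(p) for p in pcc], reverse=True)

# Fuse the two indices by summing their rankings.
cumulative = [a + b for a, b in zip(pcc_rank, second_rank)]

# A smaller cumulative sum gets a better comprehensive rank.
comprehensive = competition_rank(cumulative)
```

With these inputs, `pcc_rank`, `cumulative`, and `comprehensive` match the corresponding table columns. How many top-ranked features are finally retained is not stated in this record, so the sketch stops at the ranking.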
Feature selection effects of different feature selection methods.
| Method | Before Feature Selection/% | After Feature Selection/% | Increase/% |
|---|---|---|---|
| Pearson correlation coefficient | 63.51 | 48.79 | −14.72 |
| | 63.51 | 63.51 | 0 |
| Multi-index-fusion | 63.51 | 64.46 | 0.95 |
Identification effects in different processing stages.
| Stage | Identification Accuracy/% |
|---|---|
| Missing value processing | 59.63 |
| Standardization and normalization | 63.51 |
| Feature selection | 63.60 |
Detailed description of the material verification dataset.
| Label | Total Number of Data Points | Label | Total Number of Data Points |
|---|---|---|---|
| 0 | 9,994 | 3 | 10,006 |
| 1 | 10,021 | 4 | 10,105 |
| 2 | 9,987 | 5 | 10,018 |
Identification effects in different verification stages.
| Stage | Identification Accuracy/% |
|---|---|
| Missing value processing | 67.53 |
| Standardization and normalization | 69.88 |
| Feature selection | 70.14 |
Feature optimization effects achieved by the RF classifier on ten material verification datasets.
| Number | Before Optimization/% | After Optimization/% | Increase/% |
|---|---|---|---|
| 1 | 67.53 | 70.14 | 2.61 |
| 2 | 64.82 | 68.19 | 3.37 |
| 3 | 60.89 | 64.13 | 3.24 |
| 4 | 62.05 | 64.88 | 2.83 |
| 5 | 61.22 | 64.27 | 3.05 |
| 6 | 65.70 | 68.16 | 2.46 |
| 7 | 64.69 | 67.75 | 3.06 |
| 8 | 62.49 | 66.08 | 3.39 |
| 9 | 63.76 | 66.91 | 3.15 |
| 10 | 66.62 | 69.58 | 2.96 |
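The average improvement quoted in the abstract can be checked against the reported "Increase/%" column with a few lines of arithmetic. The values are copied from the table as reported; the variable names are illustrative only.

```python
from statistics import mean

# "Increase/%" column for verification datasets 1 to 10, as reported.
increases = [2.61, 3.37, 3.24, 2.83, 3.05, 2.46, 3.06, 3.39, 3.15, 2.96]

average_increase = round(mean(increases), 2)
all_improved = all(v > 0 for v in increases)
```

The rounded mean of this column is 3.01, matching the figure stated in the abstract, and every dataset shows a positive increase.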