Peilin He, Pengfei Jia, Siqi Qiao, Shukai Duan.
Abstract
When an electronic nose (E-nose) is used to distinguish wound infections, traditional learning methods require large quantities of labeled wound infection samples, which are both scarce and expensive. We therefore introduce self-taught learning, combined with a sparse autoencoder and a radial basis function (RBF) classifier, into the field. Self-taught learning is a kind of transfer learning that transfers knowledge from other fields to the target field; it applies even when the labeled data (target field) and the unlabeled data (other fields) do not share class labels and come from entirely different distributions. In our paper, we obtain numerous cheap unlabeled pollutant gas samples (benzene, formaldehyde, acetone and ethylalcohol), whereas labeled wound infection samples are hard to obtain. We therefore use self-taught learning to exploit these gas samples and obtain a basis vector θ. Using θ, we then reconstruct a new representation of the wound infection samples under a sparsity constraint, and this representation is the input to the classifiers. We compare RBF with partial least squares discriminant analysis (PLSDA) and conclude that RBF performs better. We also vary the dimension of our data set and the quantity of unlabeled data to find the input matrix that produces the highest accuracy.
Keywords: electronic nose; self-taught learning; sparse autoencoder; wound infection
Year: 2017 PMID: 28991154 PMCID: PMC5677371 DOI: 10.3390/s17102279
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Schematic diagram of the experimental system.
Figure 2. Experimental setup.
Table 1. Sensitive characteristics of gas sensors.
| Sensors | Sensitive Characteristics |
|---|---|
| TGS813 | Methane, Propane, Ethanol, Isobutane, Hydrogen, Carbon monoxide |
| TGS816 | Combustible gases, Methane, Propane, Butane, Carbon monoxide, Hydrogen, Ethanol, Isobutane |
| TGS822 | Organic solvent vapors, Methane, Carbon monoxide, Isobutane, n-Hexane, Benzene, Ethanol, Acetone |
| TGS2600 | Gaseous air contaminants, Methane, Carbon monoxide, Isobutane, Ethanol, Hydrogen |
| MQ135 | Ammonia, Benzene series material, Acetone, Carbon monoxide, Ethanol, Smoke |
Note: The response of these sensors is non-specific. Table 1 lists only their main sensitive gases; they are also sensitive to other gases.
Table 2. Pathogens in wound infection and their metabolites.
| Pathogens | Metabolites |
|---|---|
| | Acetic acid, Aminoacetophenone, Ammonia, Ethanol, Formaldehyde, Isobutanol, Isopentyl acetate, Isopentanol, Methyl ketones, Trimethylamine, 1-Undecene, 2,5-Dimethylpyrazine isoamylamine, 2-Methylamine |
| | Acetaldehyde, Acetic acid, Aminoacetophenone, Butanediol, Decanol, Dimethyldisulfide, Dimethyltrisulfide, Dodecanol, Ethanol, Formaldehyde, Formic acid, Hydrogen sulfide, Indole, Lactic acid, Methanethiol, Methyl ketones, Octanol, Pentanols, Succinic acid, 1-Propanol |
| | Butanol, Dimethyldisulfide, Dimethyltrisulfide, Esters, Methyl ketones, Isobutanol, Isopentanol, Isopentyl acetate, Pyruvate, Sulphur compounds, Toluene, 1-Undecene, 2-Aminoacetophenone, 2-Butanone, 2-Heptanone, 2-Nonanone, 2-Undecanone |
Table 3. Concentration of the target gases.
| Gases | Concentration Range (ppm) | Number of Samples |
|---|---|---|
| benzene | [0.1721, 0.7056] | 480 (12 × 12) |
| formaldehyde | [0.0668, 0.1425] | 491 (12 × 11) |
| acetone | [0.0565, 1.2856] | 549 (12 × 12) |
| ethylalcohol | [0.0832, 0.6732] | 1144 (12 × 21) |
Table 4. Amount of samples.
| Pollutant Gas | Amount of Samples | Wound Infection | Amount of Samples |
|---|---|---|---|
| benzene | 132 | | 20 |
| formaldehyde | 203 | | 20 |
| acetone | 153 | | 20 |
| ethylalcohol | 164 | uninfected | 20 |
Figure 3. Response curve of the sensor array to S. aureus (one of the wound infections).
Figure 4. Basic neural network.
Figure 5. Sparse autoencoder.
Figure 6. Flow chart of self-taught learning.
Figure 7. Input feature matrix and reconstructed matrix of one sample.
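The abstract describes a pipeline in which a sparse autoencoder is trained on unlabeled gas samples, the learned basis is used to re-encode the labeled wound infection samples, and the new representation is passed to an RBF classifier. The following is only a minimal sketch of that idea under stated assumptions: the data are random stand-ins (not the paper's measurements), the function names, hyperparameters (`rho`, `beta`, `lam`, `lr`), the 25-feature layout for a 5 × 5 matrix, and the simple Gaussian-unit RBF network with least-squares output weights are all illustrative choices, not the authors' implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_sparse_autoencoder(X, n_hidden, rho=0.05, beta=3.0, lam=1e-4,
                             lr=0.1, n_iter=300, seed=0):
    """Fit a one-hidden-layer sparse autoencoder by batch gradient descent.

    X is (n_samples, n_features), scaled to [0, 1]. The loss is the mean
    squared reconstruction error plus weight decay (lam) plus a KL sparsity
    penalty (beta) that pulls the mean hidden activation towards rho.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0, 0.1, (d, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.1, (n_hidden, d)); b2 = np.zeros(d)
    history = []
    for _ in range(n_iter):
        H = sigmoid(X @ W1 + b1)                     # hidden activations
        Xhat = sigmoid(H @ W2 + b2)                  # reconstruction
        rho_hat = np.clip(H.mean(axis=0), 1e-6, 1 - 1e-6)
        kl = np.sum(rho * np.log(rho / rho_hat)
                    + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
        loss = (0.5 * np.sum((Xhat - X) ** 2) / n
                + 0.5 * lam * (np.sum(W1 ** 2) + np.sum(W2 ** 2))
                + beta * kl)
        history.append(loss)
        # Backpropagation (all gradients use the pre-update weights).
        d_out = (Xhat - X) * Xhat * (1 - Xhat) / n
        sparsity = beta * (-rho / rho_hat + (1 - rho) / (1 - rho_hat)) / n
        d_hid = (d_out @ W2.T + sparsity) * H * (1 - H)
        W2 -= lr * (H.T @ d_out + lam * W2)
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * (X.T @ d_hid + lam * W1)
        b1 -= lr * d_hid.sum(axis=0)
    return (W1, b1, W2, b2), history

def encode(X, params):
    """Map samples to hidden-layer activations using the learned basis."""
    W1, b1, _, _ = params
    return sigmoid(X @ W1 + b1)

def rbf_design(A, centres, gamma):
    """Gaussian RBF activations of every row of A against every centre."""
    sq_dist = ((A[:, None, :] - centres[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dist)

def rbf_fit(F, y, n_classes, gamma=1.0):
    """One Gaussian unit per training sample; least-squares output weights."""
    targets = np.eye(n_classes)[y]                   # one-hot labels
    W, *_ = np.linalg.lstsq(rbf_design(F, F, gamma), targets, rcond=None)
    return F, W

def rbf_predict(F_new, model, gamma=1.0):
    centres, W = model
    return (rbf_design(F_new, centres, gamma) @ W).argmax(axis=1)

# Synthetic stand-ins (assumptions): 25 features = one flattened 5 x 5 matrix.
rng = np.random.default_rng(0)
unlabeled = rng.random((200, 25))      # plays the role of pollutant gas samples
labeled = rng.random((40, 25))         # plays the role of wound infection samples
labels = rng.integers(0, 4, size=40)   # four hypothetical classes

params, history = train_sparse_autoencoder(unlabeled, n_hidden=10)
features = encode(labeled, params)     # new representation fed to the classifier
model = rbf_fit(features, labels, n_classes=4)
pred = rbf_predict(features, model)
print(features.shape, pred.shape, history[0] > history[-1])
```

Because the inputs here are random noise, the resulting accuracy carries no meaning; the sketch is meant only to show how the unlabeled data shape the encoder while the labeled data are used solely by the classifier.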
Table 5. Accuracy of three kinds of feature matrix with radial basis function (RBF) (%).
| Dimension | Raw: Training Set | Raw: Test Set | Sparse Autoencoder: Training Set | Sparse Autoencoder: Test Set |
|---|---|---|---|---|
| 3 × 3 | 98.3 | 60 | 100 | 60 |
| 4 × 4 | 100 | 70 | 91.6 | 75 |
| 5 × 5 | 88.3 | 80 | 80 | 90 |
Figure 8. Accuracy with different numbers of unlabeled samples. (a) shows the change of accuracy as the number of examples increases when the dimension is 3 × 3; (b) shows the change of accuracy as the number of examples increases when the dimension is 4 × 4.
Table 6. Accuracy with partial least squares discriminant analysis (PLSDA) and RBF (%).
| Dimension | Feature Matrix | RBF: Train Set | RBF: Test Set | PLSDA: Train Set | PLSDA: Test Set |
|---|---|---|---|---|---|
| 3 × 3 | raw | 98.3 | 60 | 53.3 | 45 |
| 3 × 3 | sa | 100 | 60 | 53.3 | 45 |
| 4 × 4 | raw | 100 | 70 | 80 | 60 |
| 4 × 4 | sa | 91.6 | 75 | 76.6 | 75 |
| 5 × 5 | raw | 88.3 | 80 | 76.6 | 60 |
| 5 × 5 | sa | 80 | 90 | 78.3 | 40 |
Note: “sa” in Table 6 represents “sparse autoencoder”.
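As a quick check on the RBF versus PLSDA comparison, the test-set columns of the table above can be tabulated directly. This is a small sketch that only restates the reported numbers; the variable names are illustrative.

```python
# Test-set accuracies (%) copied from the RBF vs. PLSDA table above.
# Keys: (dimension, feature matrix); values: (RBF test, PLSDA test).
test_acc = {
    ("3x3", "raw"): (60, 45),
    ("3x3", "sa"): (60, 45),
    ("4x4", "raw"): (70, 60),
    ("4x4", "sa"): (75, 75),
    ("5x5", "raw"): (80, 60),
    ("5x5", "sa"): (90, 40),
}

# Margin of RBF over PLSDA for each setting, in percentage points.
margins = {key: rbf - plsda for key, (rbf, plsda) in test_acc.items()}
rbf_wins = sum(m > 0 for m in margins.values())
ties = sum(m == 0 for m in margins.values())

# Setting with the highest RBF test accuracy.
best_setting = max(test_acc, key=lambda key: test_acc[key][0])

for key, margin in margins.items():
    print(key, f"RBF - PLSDA = {margin:+d}")
print("RBF ahead in", rbf_wins, "settings; tied in", ties)
print("best RBF setting:", best_setting)
```

On these figures RBF is ahead on the test set in five of the six settings and tied in one (4 × 4 with the sparse autoencoder), with the largest gap at 5 × 5 with the sparse autoencoder.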
Figure 9. Accuracy with RBF. When the dimension changes from 3 × 3 to 5 × 5, (a) shows the accuracy of the unprocessed data set (training set and test set); (b) shows the accuracy of the data set processed by self-taught learning.
Figure 10. Accuracy with PLSDA. When the dimension changes from 3 × 3 to 5 × 5, (a) shows the accuracy of the unprocessed data set (training set and test set); (b) shows the accuracy of the data set processed by self-taught learning.
Table 7. Accuracy with different hidden layer sizes (%).
| Hidden Layer | 5 | 10 | 20 | 40 | 100 | 700 | 2000 | 10,000 |
|---|---|---|---|---|---|---|---|---|
| Training set | 96.6 | 80 | 93.3 | 84.15 | 91.6 | 91.6 | 91.6 | 91.6 |
| Test set | 65 | 90 | 90 | 87.5 | 80 | 75 | 75 | 75 |
Note: the dimension is 5 × 5 and the classifier is RBF.
Table 8. Accuracy with different hidden layer sizes (%).
| Hidden Layer | 5 | 10 | 20 | 40 | 100 | 700 | 2000 | 10,000 |
|---|---|---|---|---|---|---|---|---|
| Training set | 88.3 | 91.6 | 100 | 89.15 | 100 | 91.6 | 95 | 100 |
| Test set | 80 | 75 | 75 | 80 | 75 | 85 | 80 | 70 |
Note: the dimension is 4 × 4 and the classifier is RBF.
Table 9. Accuracy with different hidden layer sizes (%).
| Hidden Layer | 5 | 10 | 20 | 40 | 100 | 700 | 2000 | 10,000 |
|---|---|---|---|---|---|---|---|---|
| Training set | 96.6 | 100 | 100 | 98.3 | 100 | 96.6 | 98.3 | 98.3 |
| Test set | 60 | 60 | 60 | 75 | 65 | 75 | 70 | 65 |
Note: the dimension is 3 × 3 and the classifier is RBF.
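The three hidden-layer tables above can be read as a small parameter sweep. The sketch below restates the reported test-set accuracies and picks, for each dimension, the hidden-layer value or values with the highest test accuracy; the variable names are illustrative, and nothing here goes beyond the numbers in the tables.

```python
# Hidden-layer values and test-set accuracies (%) copied from the three tables above.
hidden_sizes = [5, 10, 20, 40, 100, 700, 2000, 10000]
test_acc = {
    "5x5": [65, 90, 90, 87.5, 80, 75, 75, 75],
    "4x4": [80, 75, 75, 80, 75, 85, 80, 70],
    "3x3": [60, 60, 60, 75, 65, 75, 70, 65],
}

# For each dimension, record the top test accuracy and every size that reaches it.
best = {}
for dim, accs in test_acc.items():
    top = max(accs)
    best[dim] = (top, [h for h, a in zip(hidden_sizes, accs) if a == top])

for dim, (top, sizes) in best.items():
    print(f"{dim}: best test accuracy {top}% at hidden layer = {sizes}")
```

Read this way, the best test accuracy is 90% for 5 × 5 (at 10 and 20), 85% for 4 × 4 (at 700), and 75% for 3 × 3 (at 40 and 700), so very large hidden layers do not improve test accuracy in any of the three tables.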