Mukhtar Opeyemi Yusuf, Divya Srivastava, Deepak Singh, Vijaypal Singh Rathor.
Abstract
A Completely Automated Public Turing Test to Tell Computers and Humans Apart (CAPTCHA) is a computer program that blocks malicious automated users. Text-CAPTCHA schemes incur lower computational costs and are therefore the most widely used. This paper investigates the effectiveness of state-of-the-art (SOTA) text-CAPTCHA schemes, proposes a Multiview deep learning system to break them, and highlights their weaknesses. Rather than the usual single-view feature extraction, the proposed model explores correlational features from multiple views to improve generalization and classification accuracy. The model combines convolutional neural networks and recurrent networks to preserve the spatial and sequential order of the input text-CAPTCHA. On eight different datasets, the proposed system achieved average accuracies ranging from 93.6% to 100%, and the average time to break a text-CAPTCHA scheme ranged from 0.0032 to 0.21 seconds. Furthermore, an ablation study with 71 human users was conducted to evaluate the effectiveness of the schemes. The results demonstrated that the proposed system outperforms the human users whom the schemes were designed to serve. Lastly, the proposed system outperforms existing SOTA systems by an accuracy margin of almost 40%.
Keywords: CAPTCHA; Connectionist temporal classification; Discriminative features; Multiview integration; Multiview learning classification; Security and privacy
Year: 2022 PMID: 36212088 PMCID: PMC9527388 DOI: 10.1007/s13042-022-01675-8
Source DB: PubMed Journal: Int J Mach Learn Cybern ISSN: 1868-8071 Impact factor: 4.377
Fig. 1 Security features commonly used in text-CAPTCHA schemes: (a) isolated characters, (b) overlapping, (c) rotation and random noise, (d) warping, (e) hollow scheme, (f) background as noise
Fig. 2 The proposed Multiview deep learning architecture. Two convolutional blocks are trained to extract spatial features from the Multiview text-CAPTCHA images. A bi-directional GRU then transforms the extracted features into sequential features. The resulting features are integrated, and a connectionist temporal classification (CTC) loss function is applied
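A network trained with a CTC loss emits one class distribution per time step, which still has to be turned into a character string. The sketch below shows greedy CTC decoding as one common way to do that; the source does not state which decoding strategy the authors used. The function name, the `BLANK = 0` index convention, and the three-character alphabet in the usage note are illustrative assumptions.

```python
from typing import List, Sequence

BLANK = 0  # assumption: class index 0 is reserved for the CTC blank symbol


def ctc_greedy_decode(frame_probs: Sequence[Sequence[float]], alphabet: str) -> str:
    """Greedy CTC decoding: best class per frame, merge repeats, drop blanks.

    frame_probs[t][k] is the score of class k at time step t.
    Class k > 0 maps to alphabet[k - 1]; class 0 is the blank.
    """
    # Pick the highest-scoring class index at every time step.
    best_path: List[int] = [
        max(range(len(probs)), key=probs.__getitem__) for probs in frame_probs
    ]

    decoded: List[str] = []
    previous = BLANK
    for idx in best_path:
        # Emit a character only when it differs from the previous frame
        # and is not the blank; a blank between two equal characters
        # therefore lets the same character appear twice in a row.
        if idx != previous and idx != BLANK:
            decoded.append(alphabet[idx - 1])
        previous = idx
    return "".join(decoded)
```

For instance, with the hypothetical alphabet `"AB7"`, a best path of `A, A, blank, A, B, B, blank, 7` collapses to the string `AAB7`: the repeated frames merge, while the blank between the second and third `A` keeps them as two separate characters.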
Fig. 3 Randomly selected samples and the corresponding generated Multiview data used in this paper
Fig. 4 Structure of a single LSTM cell
Fig. 5 Structure of a single GRU cell
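To make the GRU cell concrete, here is a minimal pure-Python sketch of one time step in a standard GRU formulation. It is not taken from the paper: the parameter names (`W_z`, `U_z`, `b_z`, and so on) are illustrative, and the final interpolation follows one common convention, `h = (1 - z) * h_prev + z * h_cand`; some references swap the roles of `z` and `1 - z`.

```python
import math
from typing import Dict, List

Vector = List[float]


def _sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def _affine(w, x: Vector, u, h: Vector, b: Vector) -> Vector:
    """Compute W.x + U.h + b, one output row at a time."""
    return [
        sum(wi * xi for wi, xi in zip(w_row, x))
        + sum(ui * hi for ui, hi in zip(u_row, h))
        + bias
        for w_row, u_row, bias in zip(w, u, b)
    ]


def gru_step(x: Vector, h_prev: Vector, p: Dict[str, list]) -> Vector:
    """One GRU time step; p holds the (hypothetically named) weights."""
    # Update gate: how much of the candidate state to let in.
    z = [_sigmoid(a) for a in _affine(p["W_z"], x, p["U_z"], h_prev, p["b_z"])]
    # Reset gate: how much of the previous state feeds the candidate.
    r = [_sigmoid(a) for a in _affine(p["W_r"], x, p["U_r"], h_prev, p["b_r"])]
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    # Candidate hidden state.
    h_cand = [math.tanh(a) for a in _affine(p["W_h"], x, p["U_h"], gated, p["b_h"])]
    # Interpolate between the previous state and the candidate.
    return [(1.0 - zi) * hp + zi * hc for zi, hp, hc in zip(z, h_prev, h_cand)]
```

With all weights and biases set to zero, both gates evaluate to 0.5 and the candidate state is zero, so the step simply halves the previous hidden state. This makes a convenient sanity check.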
Fig. 6 Architecture of a bi-directional GRU
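A bi-directional recurrent layer reads the sequence once left to right and once right to left, then joins the two hidden states at each time step. The sketch below illustrates only that wiring. The `Step` callables stand in for a real GRU cell, and concatenation is assumed as the merge rule (summation or averaging are also used in practice); none of these names come from the paper.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Step = Callable[[Vector, Vector], Vector]  # (x_t, h_prev) -> h_t


def run_direction(inputs: Sequence[Vector], step: Step, h0: Vector) -> List[Vector]:
    """Unroll a recurrent step over the inputs and collect every hidden state."""
    states: List[Vector] = []
    h = h0
    for x in inputs:
        h = step(x, h)
        states.append(h)
    return states


def bidirectional(
    inputs: Sequence[Vector], fwd_step: Step, bwd_step: Step, h0: Vector
) -> List[Vector]:
    """Run forward and backward passes and concatenate states per time step."""
    forward = run_direction(inputs, fwd_step, h0)
    backward = run_direction(list(reversed(inputs)), bwd_step, h0)
    backward.reverse()  # re-align backward states with the original time order
    return [f + b for f, b in zip(forward, backward)]  # list concatenation
```

Using a toy running-sum step, `lambda x, h: [h[0] + x[0]]`, on the inputs `[[1.0], [2.0], [3.0]]` gives forward states 1, 3, 6 and time-aligned backward states 6, 5, 3, so every output vector carries context from both ends of the sequence.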
CAPTCHA schemes used in this study. Six security features were assessed per scheme (rotation, warping, background noise, random lines/arcs, random coloration, overlapping); the count column gives how many of them are marked "–" for that scheme
| Dataset | Features marked "–" (of 6) | Size (number of samples) |
|---|---|---|
| CAPTCHA_V2 | 2 | 2140 |
| PyPl CAPTCHA | 2 | 604800 |
| Railway CAPTCHA | 3 | 100K |
| Sphinx CAPTCHA | 1 | 990K |
| Images-1L-CAPTCHA | 2 | 100K |
| Strokes CAPTCHA | 2 | 10K |
| Sample CAPTCHA | 2 | 25K |
| New_Data CAPTCHA | 4 | 10K |
Fig. 7 Random samples from the datasets used
Pre-processing techniques applied to the CAPTCHA datasets used in this study. Three techniques were considered (median filtering, thresholding, bilateral filtering); the count column gives how many of them are marked "–" for that dataset
| Dataset | Techniques marked "–" (of 3) |
|---|---|
| CAPTCHA_V2 | 1 |
| Images-1L-CAPTCHA | 1 |
| PyPl CAPTCHA | 1 |
| Railway CAPTCHA | 1 |
| Sphinx CAPTCHA | 1 |
| Strokes CAPTCHA | 1 |
| Sample CAPTCHA | 0 |
| New_Data CAPTCHA | 1 |
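As a rough illustration of two of the pre-processing techniques named in this table, the sketch below applies a 3×3 median filter and a global threshold to a grayscale image stored as nested lists. The window size, the cutoff of 128, and the function names are assumptions chosen for illustration; the source does not give the authors' parameters, and bilateral filtering is left out for brevity.

```python
from statistics import median
from typing import List

Image = List[List[int]]  # grayscale pixel values in the range 0-255


def median_filter3(img: Image) -> Image:
    """Replace each pixel with the median of its 3x3 neighbourhood.

    Windows are clipped at the image border, so edge pixels use fewer
    neighbours; even-sized windows are truncated to an integer median.
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            window = [
                img[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            out[y][x] = int(median(window))
    return out


def threshold(img: Image, cutoff: int = 128) -> Image:
    """Binarize: pixels at or above the cutoff become 255, the rest 0."""
    return [[255 if p >= cutoff else 0 for p in row] for row in img]
```

A single bright speck in an otherwise uniform patch is removed by the median filter, which is why this filter is commonly applied to salt-and-pepper style noise before binarization.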
Average time taken to break each text-CAPTCHA scheme used in this study
| Dataset | Time taken (in seconds) |
|---|---|
| CAPTCHA_V2 | 0.0032 |
| Images-1L-CAPTCHA | 0.134 |
| PyPl CAPTCHA | 0.21 |
| Railway CAPTCHA | 0.067 |
| Sphinx CAPTCHA | 0.0514 |
| Strokes CAPTCHA | 0.0832 |
| Sample CAPTCHA | 0.1129 |
| New_Data CAPTCHA | 0.132 |
Dataset security features in the proposed system compared with dataset security features in existing systems [19, 41]
Accuracy results for the text-CAPTCHA schemes used in this study
| Dataset | Accuracy | Epochs | Training time |
|---|---|---|---|
| CAPTCHA_V2 | 100% | 5 | 16 min 35 s |
| Images-1L-CAPTCHA | 100% | 6 | 35 min 16 s |
| PyPl CAPTCHA | 94% | 25 | 1 h 23 min |
| Railway CAPTCHA | 98% | 25 | 1 h 08 min |
| Sphinx CAPTCHA | 97.3% | 25 | 1 h 52 min |
| Strokes CAPTCHA | 97% | 25 | 1 h 26 min |
| Sample CAPTCHA | 97.8% | 38 | 1 h 09 min |
| New_Data CAPTCHA | 93.6% | 26 | 1 h 22 min |
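For CAPTCHA breaking, accuracy is often counted per image: a prediction scores only if every character matches the label. The source does not spell out how the accuracy column was computed, so the sketch below shows that whole-string definition purely as an assumption, with a hypothetical function name and made-up strings in the usage note.

```python
from typing import Sequence


def sequence_accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of samples whose predicted string equals the label exactly."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0  # convention chosen here for an empty evaluation set
    correct = sum(pred == true for pred, true in zip(predictions, labels))
    return correct / len(labels)
```

With predictions `["ab12", "xy9z", "q7w3", "mmnn"]` against labels `["ab12", "xy9z", "q7w8", "mmnn"]`, one string differs in a single character, so the function returns 0.75 even though most individual characters are correct.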
Fig. 8 Pie chart showing the demographics of the participants in the ablation study
Fig. 9 Chart comparing the accuracy of human users and the proposed system on all the text-CAPTCHA datasets used in this paper
Fig. 10 Graphs showing the training loss, test loss, and accuracy of the proposed system during training
Comparison of the proposed system with existing systems on the CAPTCHA_V2 dataset
| System | Accuracy |
|---|---|
| Elie et al. | 51.1% |
| Ye et al. | 51.6% |
| Proposed system | |
Performance of the proposed system compared with existing systems on text-CAPTCHA schemes that have corresponding security features
| A: Proposed system | Accuracy (A) | B: Existing system | Accuracy (B) | C: Existing system | Accuracy (C) |
|---|---|---|---|---|---|
| Railway CAPTCHA | 98% | Dig | 95% | – | – |
| Images-1L-CAPTCHA | 100% | Baidu(2013) | 89% | – | – |
| Strokes CAPTCHA | 97% | Slashdot | 86.4% | – | – |
| PyPl CAPTCHA | 94% | Sina | 90% | Sina | 90% |
| Sample CAPTCHA | 97.8% | Taobao, reCAPTCHA (2011) | 90.7%, 87.4% | Tencent | 75.4% |
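The margins implied by this table can be worked out directly. The short sketch below subtracts each existing system's accuracy from that of the corresponding proposed-system scheme, in percentage points. It assumes that "Taobao, reCAPTCHA (2011)" pairs with "90.7%, 87.4%" in the order listed. The schemes on each side are different datasets matched by security features, so the differences are indicative rather than a like-for-like benchmark.

```python
from typing import List, Tuple

# (proposed-system scheme, its accuracy %, existing-system scheme, its accuracy %)
Row = Tuple[str, float, str, float]

ROWS: List[Row] = [
    ("Railway CAPTCHA", 98.0, "Dig", 95.0),
    ("Images-1L-CAPTCHA", 100.0, "Baidu(2013)", 89.0),
    ("Strokes CAPTCHA", 97.0, "Slashdot", 86.4),
    ("PyPl CAPTCHA", 94.0, "Sina", 90.0),
    ("Sample CAPTCHA", 97.8, "Taobao", 90.7),            # pairing by order is assumed
    ("Sample CAPTCHA", 97.8, "reCAPTCHA (2011)", 87.4),  # pairing by order is assumed
    ("Sample CAPTCHA", 97.8, "Tencent", 75.4),
]


def margins(rows: List[Row]) -> List[Tuple[str, str, float]]:
    """Accuracy difference (proposed minus existing) in percentage points."""
    return [
        (proposed, existing, round(acc_a - acc_b, 1))
        for proposed, acc_a, existing, acc_b in rows
    ]


if __name__ == "__main__":
    for proposed, existing, gap in margins(ROWS):
        print(f"{proposed} vs {existing}: +{gap} points")
```

Under these assumptions the margins range from 3.0 points (Railway CAPTCHA vs Dig) to 22.4 points (Sample CAPTCHA vs Tencent).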