Rodolfo A. Pazos R., Marco A. Aguirre L., Juan J. González B., José A. Martínez F., Joaquín Pérez O., Andrés A. Verástegui O.
Abstract
In recent decades the popularity of natural language interfaces to databases (NLIDBs) has increased, because in many cases the information obtained from them is used for making important business decisions. Unfortunately, the complexity of their customization by database administrators makes them difficult to use. In order for an NLIDB to obtain a high percentage of correctly translated queries, it must be correctly customized for the database to be queried. In most cases the performance reported in the NLIDB literature is the highest possible, i.e., the performance obtained when the interfaces were customized by their implementers. For end users, however, the performance that matters more is the one the interface can yield when the NLIDB is customized by someone other than the implementers. Unfortunately, very few articles report NLIDB performance when the NLIDBs are not customized by the implementers. This article presents a semantically enriched data dictionary (which permits solving many of the problems that occur when translating from natural language to SQL) and an experiment in which two groups of undergraduate students customized our NLIDB and English Language Frontend (ELF), considered one of the best commercial NLIDBs available. The experimental results show that, when customized by the first group, our NLIDB correctly answered 44.69 % of the queries and ELF 11.83 % for the ATIS database, and when customized by the second group, our NLIDB attained 77.05 % and ELF 13.48 %. The performance attained by our NLIDB, when customized by ourselves, was 90 %.
Keywords: Databases; Natural language interface; Natural language processing; Semantic modelling
Year: 2016 PMID: 27190752 PMCID: PMC4851672 DOI: 10.1186/s40064-016-2164-y
Source DB: PubMed Journal: Springerplus ISSN: 2193-1801
Related works on customization and evaluation of NLIDBs
| NLIDB | Customized by | Complexity of the DB | Complexity of the query corpora | Comparison versus other NLIDBs | Performance |
|---|---|---|---|---|---|
| Masque/sql | – | – | – | – | – |
| Precise | The implementers | – | – | AT&T, CMU, MIT, SRI, BBN, UNISYS, MITRE, HEY (on ATIS); EQ, Mooney (on Mooney’s dataset) | Accuracy: 93.8 % (on ATIS) |
| NLPQC | Presumably the DBA | Library of the Concordia University: low | Low | – | – |
| CLEF | The implementers | – | Moderate | – | – |
| DaNaLIX | The implementers | Geobase, Jobdata: low | Geoquery880, Jobquery640: moderate | COCKTAIL, GENLEX, NaLIX | – |
| C-PHRASE | – | Geobase: low | Geoquery880: moderate | Precise, WASP, SCISSOR, Z&C | – |
| Giordani and Moschitti ( | The implementers | Geobase: low | Geoquery500, Geoquery700: moderate | Precise, Krisp | F1*: 87 % |
| ELF, Conlon et al. ( | An expert | High | Unknown | – | – |
* F1 is the harmonic mean of precision and recall
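The F1 figure reported for Giordani and Moschitti is conventionally computed as the harmonic mean of precision and recall; a minimal sketch of that definition:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Perfect precision and recall yield a perfect F1.
print(f1_score(1.0, 1.0))  # 1.0
```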
Fig. 1 Database schema of the SID
Fig. 2 Example of column description in the SID
Fig. 3 Proposed architecture for the NLIDB
Fig. 4 Functionality layers of the translation module
Fig. 5 Example of lexical tagging
Fig. 6 Example of shallow analysis for discriminating syntactic categories
Fig. 7 Example of information generated during the lexical analysis
Fig. 8 Example of treatment of imprecise and alias values using the SID
Fig. 9 Example of table and column identification using the SID
Fig. 10 Example of separation of the Select and Where phrases
Fig. 11 Example of information generated during the semantic analysis
Fig. 12 Example of determination of implicit joins
Fig. 13 Example of connected semantic graph
Fig. 14 Example of grammatical descriptors in column descriptions
Evaluation results of the first experiment
| NLIDB | Total queries | Correct w/initial customization | Correct w/fine-tuning | Minutes for customizing one query | Recall w/initial customization (%) | Recall w/fine-tuning (%) |
|---|---|---|---|---|---|---|
| ELF | 70 | 7 | 8.28 | 93.4 | 10 | 11.83 |
| Our NLIDB | 70 | 17 | 31.28 | 8.4 | 24.28 | 44.69 |
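The recall columns above follow directly from correct answers over total queries (counts are averages over student teams, hence the fractional values); a minimal sketch, noting that last-digit rounding in the paper occasionally differs:

```python
def recall_pct(correct: float, total: int) -> float:
    """Recall as a percentage of correctly answered queries."""
    return round(100 * correct / total, 2)

# First experiment, 70 queries per interface.
print(recall_pct(7, 70))      # ELF, initial customization -> 10.0
print(recall_pct(31.28, 70))  # Our NLIDB, fine-tuned -> 44.69
```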
Evaluation results of the second experiment
| NLIDB | Total queries | Correct w/initial customization | Correct w/fine-tuning | Minutes for customizing one query | Recall w/initial customization (%) | Recall w/fine-tuning (%) |
|---|---|---|---|---|---|---|
| ELF | 70 | 7 | 9.44 | 49.2 | 10 | 13.48 |
| Our NLIDB | 70 | 17 | 53.94 | 3.2 | 24.28 | 77.05 |
Evaluation results with a customization performed by the implementers
| NLIDB | Total queries | Correct w/initial customization | Correct w/fine-tuning | Minutes for customizing one query | Recall w/initial customization (%) | Recall w/fine-tuning (%) |
|---|---|---|---|---|---|---|
| ELF | 70 | 7 | 11 | 40 | 10 | 15.7 |
| Our NLIDB | 70 | 17 | 63 | 2.6 | 24.28 | 90 |
Evaluation results using the Geoquery250 corpus
| NLIDB | Total queries | Queries with substitutions | Correctly answered | Recall (including substitutions) (%) | Recall (excluding substitutions) (%) |
|---|---|---|---|---|---|
| Our NLIDB | 250 | 0 | 141 | 56.4 | 56.4 |
| ELF | 250 | 0 | 89 | 35.6 | 35.6 |
| C-Phrase | 250 | 63 (38 correct) | 179 | 71.6 | 56.4 |
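The two Geoquery250 recall columns differ only in whether correct answers obtained via substitutions are counted; a minimal sketch using the C-Phrase row (38 of its 63 substitution queries were correct):

```python
def geoquery_recalls(total: int, correct: int, correct_via_subst: int):
    """Recall (%) including and excluding substitution-assisted answers."""
    incl = 100 * correct / total
    excl = 100 * (correct - correct_via_subst) / total
    return incl, excl

incl, excl = geoquery_recalls(250, 179, 38)
print(incl, excl)  # 71.6 56.4
```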