Ram Kumar, S C Sharma.
Abstract
Query expansion is an important approach for improving the performance of information retrieval tasks. Although many methods have been proposed, they do not give acceptable results for all kinds of queries, particularly phrase queries and individual-term queries. The main cause is the reliance on identical data sources and weighting strategies when expanding such terms, which leaves the model unable to capture the comprehensive relationship between the query terms. To tackle this issue, we developed a novel query expansion approach that analyzes different data sources, namely WordNet, Wikipedia, and the Text REtrieval Conference (TREC). This paper presents an Improved Aquila Optimization-based COOT (IAOCOOT) algorithm for query expansion, which retrieves the semantic aspects that match the query term. The semantic heterogeneity associated with document retrieval mainly affects relevance matching between the query and the document, chiefly because the similarity among words is not evaluated correctly. To overcome this problem, we use a Modified Needleman Wunsch algorithm to deal with uncertainty and imprecision in the information retrieval process and with the semantic ambiguity of indexed terms, from both local and global perspectives. The k most similar words are determined and returned from a candidate set through a top-k word selection technique, which is widely used in different tasks. The proposed IAOCOOT model is evaluated with standard information retrieval performance metrics, and its validity is assessed by comparison with other state-of-the-art techniques.
Keywords: Aquila optimization; COOT optimization; Information retrieval system; Modified Needleman Wunsch; Query expansion; Semantic information retrieval
Year: 2022 PMID: 35967462 PMCID: PMC9364863 DOI: 10.1007/s11227-022-04708-9
Source DB: PubMed Journal: J Supercomput ISSN: 0920-8542 Impact factor: 2.557
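The abstract names two building blocks, Needleman Wunsch alignment and top-k word selection, without giving their details. The following is a minimal sketch of the classic, unmodified Needleman Wunsch global alignment score applied at the character level, combined with a simple top-k selection. The function names, the scoring constants (`match`, `mismatch`, `gap`), and the character-level use are assumptions made for illustration; the paper's modified variant is not reproduced here.

```python
import heapq


def needleman_wunsch_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Classic Needleman-Wunsch global alignment score of two strings.

    The scoring constants are illustrative defaults, not values from the paper.
    """
    # prev[j] is the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Either align the two characters or open a gap in one string.
            curr[j] = max(diagonal, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[len(b)]


def top_k_similar(query: str, candidates: list[str], k: int) -> list[str]:
    """Return the k candidates with the highest alignment score to the query."""
    return heapq.nlargest(
        k, candidates, key=lambda word: needleman_wunsch_score(query, word)
    )


if __name__ == "__main__":
    print(top_k_similar("vaccine", ["vaccines", "vaccination", "aspirin"], 2))
```

Keeping only two rows of the dynamic-programming table is enough when just the score is needed; recovering the alignment itself would require the full table and a traceback.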
Fig. 1 Overall architecture of the proposed model
Various grouping for each activity
| Different group | Activities |
|---|---|
| 0 | g,h,i,j |
| 1 | c,q |
| 2 | m,k,e |
| 3 | a,b,l |
| 4 | r,s |
| 5 | d |
| 6 | f,m,o |
| 7 | p,n,t |
Fig. 2 Flowchart of the IAOCOOT algorithm
Symbol description
| Symbol | Description |
|---|---|
| | Ranking |
| | Additional term size |
| | Average value of the term similarity |
| | Expansion word |
| | Query similarity |
| | Target word |
| | Mean vector obtained for each context word |
| | Vocabulary |
| | Best location on the search space |
| | Mean location of Aquila in the current iteration |
| | Total number of iterations |
| | Population size |
| | Random integer in the range [0, 1] |
| | Random location of Aquila |
| | Dimensionality |
| | Levy flight function |
| | Total number of search cycles |
| | Integer that lies within [1, d] |
| | Constant value (0.005) |
| | Random numbers that are set as 0 and 1, respectively |
| | Constant with a value of 0.001 |
| | Constant with a value of 1.5 |
| | The upper and lower bound values |
| | Quality function value |
| | Coot position |
| | Selected position of the leader |
Symbol description
| Symbol | Description |
|---|---|
| | Query expansion terms |
| | Document collection |
| | Function that returns |
| | Vocabulary words included in |
| | Synsets |
| | Hypernyms |
| | Expansion term set |
| | Directed graph |
| | Query term and CET |
| | Frequency of |
| | Inverse document frequency of |
| | Total number of articles on Wikipedia |
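The symbol table lists a term frequency, an inverse document frequency, and the total number of Wikipedia articles, which suggests a TF-IDF style weight for candidate expansion terms. The exact formula is not part of this record, so the following is only a sketch of the textbook TF-IDF weight; the function name `tf_idf`, its parameters, and the choice of a natural logarithm are assumptions.

```python
import math
from collections import Counter


def tf_idf(term: str, doc_tokens: list[str], doc_freq: int, total_docs: int) -> float:
    """Textbook TF-IDF weight of a term in one tokenized document.

    doc_freq is the number of documents containing the term and total_docs the
    size of the collection; both are hypothetical inputs for this sketch.
    """
    if not doc_tokens or doc_freq <= 0:
        return 0.0
    # Relative frequency of the term within the document.
    tf = Counter(doc_tokens)[term] / len(doc_tokens)
    # Terms that occur in fewer documents receive a larger weight.
    idf = math.log(total_docs / doc_freq)
    return tf * idf


if __name__ == "__main__":
    tokens = ["covid", "vaccine", "vaccine", "dose"]
    print(tf_idf("vaccine", tokens, doc_freq=10, total_docs=1000))
```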
Comparative analysis of MAP (mean average precision) for the TREC dataset
| Dataset used | Method | MAP |
|---|---|---|
| TREC dataset | WordNet | 0.2901 |
| TREC dataset | Wikipedia | 0.3166 |
| TREC dataset | HQE | 0.3387 |
| TREC dataset | NOSIR | 0.3596 |
| TREC dataset | Proposed IAOCOOT | 0.3945 |
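For readers unfamiliar with the metric in the table above, the following sketch shows how mean average precision is conventionally computed from ranked result lists and relevance judgments. The function names and the toy document identifiers are assumptions for illustration; the sketch does not reproduce the paper's evaluation pipeline or its reported values.

```python
def average_precision(ranked_doc_ids: list[str], relevant_ids: set[str]) -> float:
    """Average precision of one ranked list against a set of relevant documents."""
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            # Precision at the rank of each relevant document that was retrieved.
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids)


def mean_average_precision(runs: list[tuple[list[str], set[str]]]) -> float:
    """Mean of the per-query average precision over all queries."""
    if not runs:
        return 0.0
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)


if __name__ == "__main__":
    runs = [
        (["d1", "d2", "d3", "d4"], {"d1", "d3"}),
        (["d5", "d6"], {"d6"}),
    ]
    print(mean_average_precision(runs))
```

Dividing by the number of relevant documents, rather than by the number of hits, penalizes relevant documents that are never retrieved.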
Comparative analysis of different expansion terms using the TREC dataset
| Various methods | 10 | 20 | 30 | 40 | 50 | 60 |
|---|---|---|---|---|---|---|
| WordNet | 0.2867 | 0.3045 | 0.3439 | 0.3402 | 0.3398 | 0.2853 |
| Wikipedia | 0.3065 | 0.2973 | 0.3667 | 0.3576 | 0.3521 | 0.3274 |
| HQE | 0.3398 | 0.3566 | 0.3512 | 0.3589 | 0.3464 | 0.3563 |
| NOSIR | 0.3361 | 0.3332 | 0.3698 | 0.3521 | 0.3497 | 0.3599 |
| Proposed IAOCOOT | 0.3334 | 0.3498 | 0.3675 | 0.3690 | 0.3587 | 0.3412 |
Fig. 3 Comparative analysis of precision-recall curves: (a) without query expansion, (b) with query expansion
Fig. 4 Comparative analysis of various performance metrics
Query expansion terms were obtained by using different methodologies in the TREC dataset
| Query ID | Original query | Expansion terms acquired from WordNet | Expansion terms acquired from Wikipedia | Expansion terms acquired from the proposed model |
|---|---|---|---|---|
| 132 | Covid19 vaccine | Immunogenicity, vitro diagnostics, outweigh risks, invading germs, aspirin, immunization, pre-clinical development | CDC’s covid-19 booster tool, vaccination card, COVAX global vaccine, acetaminophen, immunization, PubMed, aspirin | Pfizer, Moderna, 2-dose series, immunocompromised, booster dose, mRNA covid-19 vaccine, ibuprofen, metformin, allergic reactions, mammogram |
| 141 | Ukraine disaster | Turbine hall, combustible material, burning reactor, fission products, nuclear accident, coal miners, core catchers | Relief valves, turbine generation, RBMK control rods, steam explosion, steam boiler, radioactive fallout, ionized airglow | RBMK-type nuclear reactor, decontamination, Chornobyl nuclear power plant, acute radiation, ionizing radiation, radioactive decay, control rods |
| 153 | Ukraine refugee relief | 762 project, psychological counseling, GoFundMe Campaign, Swiss-based organization, troop buildup, sunflower of peace | Project hope, UNICEF, world central kitchen, voices of children, the international committee of children, hygiene kits | CARE, Convoy of hope, Doctors without borders, international medical corps, Internews, humanitarian aid, Kyiv independent |
Fig. 5 Comparative analysis of mean average precision
Query expansion for various metrics
| Performance metrics | |||
|---|---|---|---|
| Precision | 0.53 | 0.50 | 0.54 |
| Recall | 0.49 | 0.46 | 0.50 |
| F-measure | 0.46 | 0.40 | 0.45 |
| MAP | 0.42 | 0.38 | 0.40 |
Fig. 6 Query expansion analysis