Laura Douze1,2, Sylvia Pelayo1,2, Nassir Messaadi3, Julien Grosjean4,5, Gaétan Kerdelhué4,5, Romaric Marcilly1,2.
Abstract
BACKGROUND: A major factor in the success of any search engine is the relevance of the search results; a tool should sort the search results to present the most relevant documents first. Assessing the performance of the ranking formula is an important part of search engine evaluation. However, the methods currently used to evaluate ranking formulae mainly collect quantitative data and do not gather qualitative data, which help to understand what needs to be improved to tailor the formulae to their end users.
Keywords: human factors; information retrieval; search engine; search result ranking; topical relevance; user testing
Year: 2022 PMID: 35333180 PMCID: PMC8994140 DOI: 10.2196/30258
Source DB: PubMed Journal: JMIR Hum Factors ISSN: 2292-9495
Figure 1. A screenshot of a search results page of LiSSa.fr [25].
Weighting of each criterion in the ranking formulae A and B.
| Criterion | Weighting for formula A | Weighting for formula B |
| Title | 10 | 10 |
| Subtitle | 10 | 10 |
| Author keywords | 5 | 5 |
| Major MeSHa termsb | 4 | 4 |
| Minor MeSH terms | 1 | 1 |
| Nonexploded indexing | 3 | 3 |
| Exploded indexingc | 1 | 1 |
| Manual indexingd | 3 | 3 |
| Automatic indexing | 1 | 1 |
| Year of publication | 10 for the current year and −2 for each year in the past | 10 for the current year and −0.6 for each year in the past |
| Type of publication; for example, good practice guidelines, consensus statements, directives, literature reviews, and meta-analyses | 0 | 3 |
aMeSH: Medical Subject Headings.
bIn the field of biomedicine, articles are often indexed according to the MeSH thesaurus. LiSSa considers the MeSH terms to be major when they correspond to one of the article’s main themes or minor when they correspond to one of the article’s subthemes.
cThe MeSH thesaurus is structured like a tree; an MeSH term typically has several hierarchical levels above and below it. For example, asthma belongs to the bronchial diseases category and one of its narrower terms is status asthmaticus. A search for asthma will thus also find an article indexed as status asthmaticus but the latter will be less weighted because indexing is said to be exploded.
dSome documents are indexed by a National Library of Medicine indexer; this is referred to as manual indexing. Other documents are indexed by text mining tools, which is referred to as automatic indexing. Manual indexing is considered to be more accurate and efficient than automatic indexing.
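To make the weighting table concrete, the following minimal Python sketch scores one article under formulae A and B. It assumes that the score is the plain sum of the weights of every criterion the article satisfies, plus the year-of-publication points. The source gives the weights but not how they are combined, so the additive rule, the function and field names, and the absence of a lower bound on the year points are all illustrative assumptions rather than LiSSa's actual implementation.

```python
from dataclasses import dataclass

# Weights shared by formulae A and B, copied from the table above.
# ASSUMPTION: the key names are illustrative, not LiSSa identifiers.
CRITERION_WEIGHTS = {
    "title": 10,
    "subtitle": 10,
    "author_keywords": 5,
    "major_mesh": 4,
    "minor_mesh": 1,
    "nonexploded_indexing": 3,
    "exploded_indexing": 1,
    "manual_indexing": 3,
    "automatic_indexing": 1,
}


@dataclass(frozen=True)
class Formula:
    year_decay: float              # points lost per year in the past
    publication_type_bonus: float  # points for guidelines, reviews, etc.


FORMULA_A = Formula(year_decay=2.0, publication_type_bonus=0.0)
FORMULA_B = Formula(year_decay=0.6, publication_type_bonus=3.0)


def year_points(publication_year, current_year, decay):
    """10 points for the current year, minus `decay` per year in the past.

    ASSUMPTION: no lower bound is applied; the source does not say
    whether the year points can become negative.
    """
    return 10 - decay * (current_year - publication_year)


def score(matched, publication_year, current_year, is_promoted_type, formula):
    """ASSUMPTION: the total is a simple sum of the matched weights."""
    total = sum(CRITERION_WEIGHTS[name] for name in matched)
    total += year_points(publication_year, current_year, formula.year_decay)
    if is_promoted_type:
        total += formula.publication_type_bonus
    return total


# Hypothetical article: keyword in the title and in a major MeSH term,
# nonexploded and manual indexing, published 5 years before the search.
matched = {"title", "major_mesh", "nonexploded_indexing", "manual_indexing"}
score_a = score(matched, 2017, 2022, False, FORMULA_A)
score_b = score(matched, 2017, 2022, False, FORMULA_B)
score_b_promoted = score(matched, 2017, 2022, True, FORMULA_B)
```

Under these assumptions, the hypothetical article scores 20 with formula A and 27 with formula B (30 if it is one of the promoted publication types), which illustrates how formula B penalizes older articles less.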
Distribution of the participants according to the order in which the formulae and the predetermined search queries were presented (N=20).
| Order of tested formula and predetermined search query | General physician participants, n (%) | Registrar participants, n (%) |
| | | |
| Query 1 | 3 (15) | 2 (10) |
| Query 2 | 2 (10) | 3 (15) |
| | | |
| Query 1 | 2 (10) | 3 (15) |
| Query 2 | 3 (15) | 2 (10) |
List of the criteria shown to the participants.
| Name | Explanation |
| Title | The keyword is present in the article’s title. |
| Subtitle | The keyword is present in the article’s subtitle. |
| Author keywords | The keyword is present in the author keywords. |
| Abstract | The keyword is present in the article’s abstract. |
| Major MeSHa term | The keyword is present in the major MeSH term. |
| Minor MeSH term | The keyword is present in the minor MeSH term. |
| Exploded indexing or notb | Points are awarded if the indexing is not exploded (the keyword is the same as the MeSH term) vs exploded indexing (the keyword is found among the narrower MeSH terms). |
| Manual or automatic indexing | Points are awarded if the indexing is manual (performed by a National Library of Medicine indexer) rather than automatic (performed by text mining). |
| Association with a qualifier | Points are subtracted if the indexing qualifier is specified: for example, with |
| Year of publication | Points are awarded as a function of the article’s year of publication: the more recent it is, the more points it will be awarded. |
| Type of publication | Points are awarded if the article is a literature review, a good practice guideline, a consensus statement, a directive, or a meta-analysis. |
| Presence of an abstract | Points are awarded if an abstract in French is directly available on LiSSa (ie, without having to visit the journal’s website). |
| The journal’s importance | Points are awarded as a function of the journal’s impact. |
aMeSH: Medical Subject Headings.
bThe MeSH thesaurus contains qualifiers that can be linked to each keyword to make it more precise. For example, the index entry asthma can be specified by the qualifier diagnosis (asthma/diagnosis), to tell the reader that only the diagnosis of asthma is addressed in the article, and not its other aspects (treatment, complication, etc).
Participant characteristics.
| Participant number | Profile | Age (years) | Number of years of practice (including internship semesters for registrar) | Self-reported frequency of use of a search engine |
| P1 | GPa | 29 | 2 | Frequently |
| P2 | GP | 28 | 0 | Frequently |
| P3 | GP | 30 | 2.5 | Frequently |
| P4 | GP | 55 | 26 | Frequently |
| P5 | GP | 56 | 29 | Frequently |
| P6 | GP | 68 | 30 | Frequently |
| P7 | GP | 53 | 16 | Frequently |
| P8 | GP | 53 | 25 | Not often |
| P9 | GP | 55 | 27 | Frequently |
| P10 | GP | 33 | 5 | Frequently |
| P11 | Registrar | 24 | 0.5 | Never |
| P12 | Registrar | 26 | 0.5 | Never |
| P13 | Registrar | 28 | 4 | Never |
| P14 | Registrar | 26 | 1.5 | Frequently |
| P15 | Registrar | 30 | 4 | Not often |
| P16 | Registrar | 26 | 2 | Frequently |
| P17 | Registrar | 28 | 1.5 | Frequently |
| P18 | Registrar | 25 | 1.5 | Not often |
| P19 | Registrar | 31 | 4.5 | Frequently |
| P20 | Registrar | 29 | 5 | Not often |
aGP: general practitioner.
The mean and median ranking scores, the normalized discounted cumulative gain (NDCG), and overall satisfaction scores for formulae A and B (N=20 participants).
| | Formula A | Formula B | | |
| Mean ranking score, median (IQR), out of 5 | 3.57 (4-2.5) | 3.82 (4-3.5) | 3518.5 | .02 |
| Mean NDCG, median (IQR), out of 1 | 0.87 (0.95-0.83) | 0.97 (0.99-0.94) | 7 | .01 |
| Overall satisfaction score, median (IQR), out of 7 | 4.7 (5-4.6) | 5.8 (6-5.6) | 27.5 | .01 |
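For readers unfamiliar with the NDCG, the sketch below shows one common way to compute it in Python from the relevance ratings that a participant gives to a ranked list of results. The linear-gain, log2-discount variant used here is an assumption: this section does not state which NDCG variant the study used, and the function names are illustrative.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: each rating is divided by log2(position + 1),
    with positions starting at 1, so lower-ranked results count for less."""
    return sum(
        rel / math.log2(pos + 1)
        for pos, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances):
    """DCG of the observed order divided by the DCG of the ideal order
    (ratings sorted from highest to lowest). Returns 0.0 when the ideal
    DCG is zero, including for an empty list."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Hypothetical ratings (not study data), listed in the order displayed.
well_ranked = ndcg([5, 4, 3])
poorly_ranked = ndcg([3, 4, 5])
```

A list already sorted from most to least relevant yields an NDCG of 1, and any ordering that places less relevant articles first yields a value below 1.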
Figure 2. Types of negative verbalization about the articles for formula A or formula B; the number of participants is stated.
Figure 3. Distribution of the overall satisfaction score for formulae A and B (left panel), and the number of participants who gave formula A a higher, equal, or lower score than formula B (right panel).
Mean and median criterion ranks (n=18 participants).
| Criterion | Mean rank | Median rank (IQR) |
| Title | 1.8 | 1 (1.87-1) |
| Abstract | 4.5 | 3 (5-3) |
| Author keywords | 5.0 | 4 (6.75-4) |
| Subtitle | 5.7 | 4 (6.75-2.25) |
| Type of publication | 6.1 | 6 (7.38-5) |
| Major MeSHa term | 6.5 | 6 (8-5) |
| Year of publication | 7.8 | 7.5 (10-5.25) |
| Presence of an abstract | 8.1 | 7.8 (11.5-4) |
| Manual or automatic indexing | 8.1 | 9 (11-6) |
| Associated with a qualifier | 8.6 | 9 (11-7.62) |
| The journal’s impact | 9.3 | 10 (11-8) |
| Exploded or nonexploded indexing | 9.9 | 10 (12-7.63) |
| Minor MeSH term | 9.9 | 10.8 (11.75-9) |
aMeSH: Medical Subject Headings.
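The summary statistics in this table can be reproduced from the participants' individual rankings along the following lines. The sketch assumes that each participant assigns every criterion a rank from 1 (most important) to 13. The sample ranks, the function name, and the choice of the inclusive quartile method are hypothetical, as the source does not state how the quartiles were computed.

```python
from statistics import mean, median, quantiles


def summarize_ranks(ranks):
    """Mean, median, and quartiles of the ranks given to one criterion.

    ASSUMPTION: quartiles use the "inclusive" method of
    statistics.quantiles; other methods give slightly different bounds.
    """
    q1, _, q3 = quantiles(ranks, n=4, method="inclusive")
    return {
        "mean": mean(ranks),
        "median": median(ranks),
        "q1": q1,
        "q3": q3,
    }


# Hypothetical ranks given to one criterion by four participants
# (illustrative only, not study data).
example_ranks = [1, 1, 2, 3]
summary = summarize_ranks(example_ranks)
```

Sorting the criteria by the resulting mean rank gives the order in which they appear in the table.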
Figure 4. Boxplots of the scores (from 1 to 13) for each criterion. MeSH: Medical Subject Headings.
Adaptations of the ranking formula, the associated justifications, and their state of implementation.
| Criterion | Opportunity for improvement | Justification | State of implementation |
| Abstract | Take account of the keyword’s presence in the abstract. | Currently, the abstract is not considered at all, even though (on average) it was the second most important criterion, right after the title. However, the abstract is less strictly controlled than the author keywords and the MeSHa indexing, giving it less weight than the latter. | In total, 3 points have been attributed to this criterion. |
| Subtitle | Lower the weight attributed to the keyword’s presence in the subtitle, relative to its presence in the title. | The subtitle had the same weighting as the title (ie, 10) but was judged to be less important by the participants because it was less useful. | The number of points attributed to this criterion has dropped from 10 to 8. |
| Type of publication | Add a subcriterion to downgrade types of publication judged to be irrelevant by the users. | The | A subcriterion had been added to the |
| Associated with a qualifier | Promote subject headings without a qualifier, except when the keyword is a qualifier. | To prioritize articles that generally address the search subject in first search results, adding the | One point is added when the subject heading is not associated with a qualifier. |
| The journal’s impact | Add this criterion but do not give it much weight. | This criterion is not of major importance to users but can be useful for differentiating between 2 articles with the same score. It was recommended that this criterion should be taken into account when calculating the scores but should not be given much weight. | This item has not yet been incorporated into the LiSSa database. At present, this information is available for only 30% of articles; it will therefore be necessary to determine the relevance of integrating this criterion into the formula. |
| Operation of the ranking formula | Add the points awarded for the | During the tests, some publications considered by the participants to be off-topic were listed in the top search results (eg, an article on bipolar depression for the query on | This recommendation needs to be tested because it might have a negative effect on ranking the search results; it might overprioritize the articles with a large number of indexed keywords (>20, in some cases), relative to articles with few keywords. |
aMeSH: Medical Subject Headings.