The trouble with keyword searches
Keywords are meant to help us identify relevant papers when we conduct literature searches. Unfortunately, they’re not as effective at this as we might hope, because they aren’t always representative of an article’s content.
Subjectivity in keyword selection
Keywords may be selected by the author of a paper, in which case they are likely to represent the themes that the author deems most important in the article (Névéol et al., 2010). However, these may not correspond to the dominant themes found in the paper itself. When authors do not provide keywords for their own publications, the keywords may be selected by editors, who then add their own subjective interpretation of the text (Gerdsri et al., 2013, p. 420).
Terminology used in keywords may vary according to preference, so that different authors use different terms to represent the same concept. Where standardised indexing terms are used, such as the Medical Subject Headings (MeSH®) in the bibliographic database MEDLINE®, these can differ substantially from the author-selected keywords (Figure 1).
|Author keywords||MEDLINE Indexing Terms|
Rural health services
Health services accessibility
Hospitals, community/organisation & administration
Hospitals, rural/organisation & administration
Intensive care units/utilisation
Length of stay
Outcome assessment (health care)
Patient transfer/statistics & numerical data
Figure 1: Author keywords and MeSH indexing terms assigned to a sample article indexed in MEDLINE (Névéol et al., 2010)
Limited number of keywords
Author-selected keywords are also usually limited in number, typically to between 4 and 8 per article. So small a set is unlikely to provide a comprehensive overview of the topics or themes in an article. Indeed, indexers assigned an average of 13.0 (±11.9) terms per paper in a collection of 14,398 open-access articles in PubMed Central®, suggesting that a greater number of terms is required to capture the thematic content of most papers.
An alternative approach
Fortunately, there’s a better way to support literature searches. Entity linking allows us to consider the context of words as well as the relationships between them. By linking words and phrases that carry the same meaning to a single entity, we can extract entities from the text itself, rather than relying on subjectively assigned keywords. Entities represent the themes contained in the text, removing the ambiguity associated with varying use of terminology.
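To make the idea concrete, here is a minimal Python sketch of the "same meaning, one entity" step. It is a deliberately simplified dictionary lookup, not a full entity linker: it ignores context, which real systems use to disambiguate terms. The synonym table, the entity labels and the two sample sentences are all made up for illustration and are not drawn from MeSH or any real vocabulary.

```python
import re

# Hypothetical synonym table (an assumption for this sketch): each surface
# form, written in lower case, maps to one canonical entity label.
SURFACE_FORMS = {
    "heart attack": "Myocardial infarction",
    "myocardial infarction": "Myocardial infarction",
    "icu": "Intensive care unit",
    "intensive care unit": "Intensive care unit",
    "critical care unit": "Intensive care unit",
    "length of stay": "Length of stay",
    "hospital stay": "Length of stay",
}

# Try longer surface forms first so a long phrase is preferred over any
# shorter form that could match at the same position.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(form) for form in sorted(SURFACE_FORMS, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def extract_entities(text):
    """Return the set of canonical entities whose surface forms occur in text."""
    return {SURFACE_FORMS[match.group(1).lower()] for match in _PATTERN.finditer(text)}


# Two invented sentences that describe the same topics in different words.
abstract_a = "Patients admitted to the ICU after a heart attack had a longer hospital stay."
abstract_b = "Length of stay in the intensive care unit following myocardial infarction."

entities_a = extract_entities(abstract_a)
entities_b = extract_entities(abstract_b)

print(sorted(entities_a))
print(entities_a == entities_b)
```

Although the two sentences share almost no wording, both resolve to the same three entities, so a search on any one of those entities would retrieve both. A keyword search on "heart attack" alone would miss the second sentence.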
References
Névéol, A., Doğan, R. I., & Lu, Z. (2010). Author Keywords in Biomedical Journal Articles. AMIA Annual Symposium Proceedings, 2010, 537–541.
Gerdsri, N., Kongthon, A., & Vatananan, R. S. (2013). Mapping the knowledge evolution and professional network in the field of technology roadmapping: a bibliometric analysis. Technology Analysis & Strategic Management, 25(4), 403–422.