An Interactive Iterative Method for Electronic Searching of Large Literature Databases

Marco A Hernandez, Pace University


PubMed® is an online literature database hosted by the U.S. National Library of Medicine. Containing over 21 million citations for biomedical literature (both abstracts and full text) in the areas of the life sciences, behavioral studies, chemistry, and bioengineering, PubMed® represents an important tool for researchers. PubMed® searches return a list of citations based on keywords. Given the amount of information available through PubMed®, this study asks, “How do you leverage your search to look for subtle relationships between documents?” Data suggest that users rely on only the first page of returned results. Finding such relationships is a non-trivial problem. One reason is that no formal standard or syntax has been identified to denote these relationships; another is that the nature of these relationships is ambiguous. Computational search models mimic techniques from library science, which anticipate that users usually do not fully understand, or cannot completely articulate, their needs. Lacking complete information, librarians approximate the need and return relevant documents in an iterative fashion. Current search engines rely on mathematical modeling for text retrieval. At a fundamental level, this involves the use of a dictionary or ontology for term refinement, word stemming (e.g., removing the suffix “-ing”), and clustering analysis to represent the relationship between terms and documents. The nuances of the interaction with a librarian are lost in this transaction. This study created a tool, Iterative Matrix Search (IMS), that uses biomedical ontologies and electronic lexicons to include closely related words in a search. The tool returns an expanded keyword list and a list of the documents where hits were found, as well as a normalized score outlining the relationship between the documents and keywords. The study then surveyed life sciences researchers to gain an understanding of how they search for information. Participants submitted queries that had previously been unsuccessful against PubMed®. IMS expanded the participants’ initial queries and provided them with formatted results that allowed for connections through word associations. A follow-up interview determined the researchers’ perceptions of the tool’s utility in cataloging new information, knowledge, or insight. The study demonstrated the benefits of IMS in enabling novel insights and revealing relationships in the literature.
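
The general idea described above (expanding keywords with related terms, stemming, and scoring each document against each term) can be illustrated with a minimal Python sketch. This is not the IMS implementation: the tiny LEXICON, the crude suffix-stripping stem function, the function names, and the choice to normalize hit counts by document length are all assumptions made for illustration only.

```python
# Illustrative sketch only; not the IMS tool described in the abstract.

# Hypothetical lexicon mapping a keyword to closely related words.
LEXICON = {
    "tumor": ["neoplasm", "cancer"],
    "heart": ["cardiac"],
}


def stem(word):
    """Crude suffix stripping (an assumption; real systems use a proper stemmer)."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Lowercase, strip surrounding punctuation, and stem each word."""
    words = (w.strip(".,;:()").lower() for w in text.split())
    return [stem(w) for w in words if w]


def expand_query(keywords):
    """Return the stemmed keywords plus related lexicon terms, in order, without duplicates."""
    expanded = []
    for keyword in keywords:
        for term in [keyword] + LEXICON.get(keyword.lower(), []):
            stemmed = stem(term.lower())
            if stemmed not in expanded:
                expanded.append(stemmed)
    return expanded


def score_matrix(documents, terms):
    """Build a document-by-term matrix of hit counts normalized by document length."""
    matrix = {}
    for doc_id, text in documents.items():
        tokens = tokenize(text)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        matrix[doc_id] = {t: tokens.count(t) / total for t in terms}
    return matrix


if __name__ == "__main__":
    docs = {
        "d1": "Imaging of cardiac tumors",
        "d2": "Neoplasm growth in mice",
    }
    terms = expand_query(["tumor", "heart"])
    for doc_id, row in score_matrix(docs, terms).items():
        print(doc_id, row)
```

In this sketch a query for "tumor" also matches a document that mentions only "neoplasm", which is the kind of word association the expanded keyword list is meant to surface. An iterative workflow would feed the user's reaction to one matrix back into the next round of query expansion.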

Subject Area

Information science|Computer science

Recommended Citation

Hernandez, Marco A, "An Interactive Iterative Method for Electronic Searching of Large Literature Databases" (2013). ETD Collection for Pace University. AAI3569888.


