
Competency E

“Design, query, and evaluate information retrieval systems.”

 

Statement of Competency E

 

The foundational values of library and information professionals are not only to collect, preserve, and organize materials that meet the educational, recreational, and informational needs of the society in which they exist, but also to connect the individuals who make up those communities with those materials in a logical and skilled way.  “It is simply stated: we have vast amounts of information to which accurate and speedy access is becoming ever more difficult” (van Rijsbergen, 1975, p. 3).  Today, users interact with many types of information retrieval systems (IRSs), including catalogs, online public access catalogs (OPACs), databases, and search engines.  A clear understanding of IRSs is crucial for librarians who want to provide effective retrieval and service to users.  Knowledge of how information retrieval systems are created gives information professionals insight into how these systems function.  Information professionals must also be able to query these systems effectively in order to find the best and most accurate information for library users.  Understanding information retrieval systems also gives librarians the critical ability to evaluate the effectiveness of such systems.  Knowledge of how to design, query, and evaluate these systems is crucial if librarians and information professionals are to satisfy the information needs of users.

 

Design

 

Design of information retrieval systems is a complex process that must take into account the ways users seek and use information in order to create meaningful and beneficial systems.  According to scholar Gary Marchionini (1995), “interface design is based on understanding fundamental features of humans and systems” (Chapter 2) and “to develop such interfaces, models of information seeking must be taken into consideration for the basis for design” (Chapter 2).  Users must remain at the center of the design of information retrieval systems because they will be the ones seeking information from these systems.

 

There are a number of concepts that an information professional should know when designing an information retrieval system.  Two important concepts that must be balanced are precision and recall.  Precision measures the accuracy of a search: the proportion of retrieved items that are relevant to the query (Bates, 2012).  Recall, on the other hand, measures the completeness of a search: the proportion of all relevant items in the collection that the search retrieves (Bates, 2012).  An IRS that uses more specific search terms offers more precise results for the system’s end users; however, it also narrows the number of results and tends to reduce recall.  End users can therefore miss a significant amount of essential information when a system emphasizes precision, because relevant items are filtered out when they fail to meet narrow criteria.  A system that emphasizes recall also has drawbacks; one of the main problems associated with recall-based systems is their tendency to overwhelm users with long result lists, many of which may be irrelevant.
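
The trade-off between precision and recall can be made concrete with a small calculation.  The following is a minimal sketch in Python, not taken from any of the cited texts; the function name, the document numbers, and the two example searches are all hypothetical and chosen only for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single search.

    retrieved: IDs of the documents the system returned
    relevant:  IDs of the documents judged relevant to the user's need
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical collection in which documents 1-6 are relevant to the user's need.
relevant = {1, 2, 3, 4, 5, 6}

# A search with very specific terms: few results, all of them on target.
narrow_search = {1, 2, 3}
# A search with general terms: every relevant document, plus six irrelevant ones.
broad_search = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}

narrow = precision_recall(narrow_search, relevant)
broad = precision_recall(broad_search, relevant)
print("narrow search (precision, recall):", narrow)
print("broad search (precision, recall):", broad)
```

In this invented example the narrow search is perfectly precise but finds only half of the relevant documents, while the broad search finds all of them at the cost of a result list that is half noise.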

 

Controlled vocabulary is one of the tools that can be used to strike a balance between precision and recall in information retrieval systems.  A controlled vocabulary is a standardized set of searchable terms, such as subject headings and descriptors (Manning, Raghavan, & Schütze, 2008).  This tool offers a high degree of precision in searching because it standardizes the terms used to express concepts.  However, a system with a strictly controlled vocabulary can be challenging for new users who are not used to the required language.  Controlled vocabularies also reduce the flexibility of an information retrieval system, especially when definitions and terms change.  Natural language is another tool that can be employed in designing an effective IRS.  Natural language searching matches the user’s own words against the text of a document; this can be the full text or constituent parts such as the author, abstract, or title.  The main advantage of employing natural language in designing an IRS is that it provides a search mechanism that is user-friendly and more adaptable.  Although natural language searching requires no human indexing, it can be hindered by the inconsistencies of everyday language, such as synonyms and words with more than one meaning (Manning, Raghavan, & Schütze, 2008).
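
The difference between the two approaches can be sketched in a few lines of Python.  This is an illustrative toy only: the thesaurus, the three records, and both function names are hypothetical and do not come from any real system or from the cited sources.

```python
# Hypothetical thesaurus: each entry term points to one preferred (authorized) heading.
PREFERRED_TERM = {
    "cars": "automobiles",
    "autos": "automobiles",
    "automobiles": "automobiles",
    "movies": "motion pictures",
    "films": "motion pictures",
    "motion pictures": "motion pictures",
}

# Hypothetical records: a free-text title plus subject headings assigned by an indexer.
RECORDS = [
    {"id": 1, "title": "A history of cars in America", "subjects": {"automobiles"}},
    {"id": 2, "title": "Autos and the open road", "subjects": {"automobiles"}},
    {"id": 3, "title": "Cars (animated film) study guide", "subjects": {"motion pictures"}},
]


def natural_language_search(word):
    """Match the user's own word against the words of each title."""
    word = word.lower()
    return [r["id"] for r in RECORDS if word in r["title"].lower().split()]


def controlled_vocabulary_search(term):
    """Translate the user's term to its preferred heading, then match headings."""
    heading = PREFERRED_TERM.get(term.lower())
    if heading is None:
        return []  # the term is not in the vocabulary, so nothing is found
    return [r["id"] for r in RECORDS if heading in r["subjects"]]


print(natural_language_search("cars"))           # title matches only
print(controlled_vocabulary_search("cars"))      # subject heading matches
print(controlled_vocabulary_search("vehicles"))  # term outside the vocabulary
```

In this toy data the natural language search for "cars" misses the record titled with the synonym "autos" and picks up a record about a film, while the controlled vocabulary search gathers both automobile records under one heading.  The last call shows the corresponding weakness: a user who tries a word the vocabulary does not contain retrieves nothing.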

 

When designing an efficient information retrieval system, it is important to consider the kind of information the system will contain, as well as the type of users who will be looking for it.  To address these two factors, a designer needs to consider the information seeking behavior of the end users and use specific keywords and search terms to classify the information entered into the database.  Keywords and search terms can be assigned through subject indexing.  Subject indexing is the act of describing a document with index terms or symbols in order to indicate what it is about, summarize it, or make it easier to find (Manning, Raghavan, & Schütze, 2008).  There are two types of subject indexing systems: pre-coordinate and post-coordinate.  In pre-coordinate indexing, the indexer combines several concepts into a single subject heading or descriptor; these headings and descriptors are assigned to documents in order to ease the retrieval of information on complex subjects.  In post-coordinate indexing, the subject descriptors and headings assigned to documents each represent a single concept, and the user must combine them at search time in order to find information on complex subjects.  All of the above concepts must be taken into consideration when designing an information retrieval system.
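
The contrast between pre-coordinate and post-coordinate indexing can be sketched as two small lookup tables.  The Python below is a simplified illustration under stated assumptions: the headings, descriptors, and document numbers are invented, and real indexing systems are far richer than a dictionary.

```python
# Pre-coordinate: the indexer has already combined concepts into one heading string.
# (Hypothetical headings and document numbers.)
PRE_COORDINATE_INDEX = {
    "Libraries--Automation--History": {101},
    "Libraries--Automation--Costs": {102},
    "Museums--Automation--History": {103},
}

# Post-coordinate: each descriptor stands for a single concept.
POST_COORDINATE_INDEX = {
    "libraries": {101, 102},
    "museums": {103},
    "automation": {101, 102, 103},
    "history": {101, 103},
    "costs": {102},
}


def pre_coordinate_lookup(heading):
    """The user looks up one complete, ready-made heading."""
    return PRE_COORDINATE_INDEX.get(heading, set())


def post_coordinate_search(*descriptors):
    """The user combines single-concept descriptors at search time (an implicit AND)."""
    sets = [POST_COORDINATE_INDEX.get(d, set()) for d in descriptors]
    return set.intersection(*sets) if sets else set()


print(pre_coordinate_lookup("Libraries--Automation--History"))
print(post_coordinate_search("libraries", "history"))
print(post_coordinate_search("automation", "history"))
```

Both approaches lead to document 101 for the history of library automation, but the work happens at different moments: the indexer builds the combination in advance in the first table, and the searcher builds it at query time in the second.  The last search shows the extra flexibility of post-coordination, since it answers a combination ("automation" and "history" across any institution) for which no ready-made heading exists.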

 

Query

 

“Query formulation involves two kinds of mappings: a semantic mapping of the information seeker’s vocabulary used to articulate the task onto the system’s vocabulary used to gain access to the content; and an action mapping of the strategies and tactics the information seeker deems best to forward the task onto the rules and features the system interface allows” (Marchionini, 1995, chapter 3).

 

Querying an IRS requires an understanding of that system’s query language in order to gain access to its content.  The ability to query an IRS is crucial for librarians and information professionals, who must find the most relevant and useful information for themselves and for users.  One of the main factors in effective searching is the ability of users to make their searches fit the IRS.  A search can only fit the system if the user has a good understanding of the system’s design (Bates, 2012).  Search strategies also play an important role in ensuring that a user retrieves what is required; users are often encouraged to use more than one search strategy for good results.

 

Finding some information on the Internet is quite easy because the Web contains so much of it.  Retrieving information from databases is different because databases hold much less information than the Internet, and the documents they index are selected more strictly.  Because of this, users need to ensure that their searches fit these databases.  A user who wants more precise results may search using the controlled vocabulary.  A user may instead search the full text of the documents, which will return more results, but the results will also contain more irrelevant information.  Effective querying in this case depends on how well the user knows the system.

 

Effective querying also depends on the search strategy, crucial to which is understanding how information is structured in a database (Manning, Raghavan, & Schütze, 2008).  It is also important for a user to know the content of each information source.  For example, the content of one information retrieval system might include books, images, and journals, among others, while business or medical databases may be good sources of information on marketing, financial, and pharmaceutical topics.  Choosing the appropriate information resource for the search is an essential part of using information retrieval systems.  The selection of the right information sources should be followed by the development of strategies that are appropriate for each of them.  A user may want a broad or a precise search on a subject, and each of these searches would be conducted differently.  Although searches for different items will differ, what matters most in every case is how the search is structured.

 

There are many search tools that can be used for accessing a database.  Many information retrieval systems utilize Boolean searching.  Boolean searching uses operators such as AND, OR, and NOT to include or exclude particular kinds of information from the search results (Bates, 2012).  Drop-down menus are also available in many systems and are easier to use, since they assist users in phrasing their searches.  More modern and user-friendly search tools can suggest keywords, interpret natural language searches, and even anticipate what the user needs.  Although search tools are helpful, a good understanding of their structure and capabilities is key to effective searching.  Information professionals need to be aware of these structures and tools because they must interact with IRSs in order to provide information to users.  Librarians also need to be able to guide and train users in how to use these systems, so a thorough understanding is required.
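
Boolean searching can be illustrated with a tiny inverted index, in which each word points to the set of documents that contain it.  The Python sketch below is a simplified illustration and not a description of how any particular database works; the four sample documents and the helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical four-document collection.
DOCUMENTS = {
    1: "public library outreach programs",
    2: "academic library database evaluation",
    3: "database design for museums",
    4: "public school outreach",
}


def build_index(documents):
    """Build an inverted index mapping each word to the IDs of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


INDEX = build_index(DOCUMENTS)


def term(word):
    """Return the set of document IDs that contain the word."""
    return INDEX.get(word.lower(), set())


# The Boolean operators correspond to set operations on these document sets.
library_and_database = term("library") & term("database")  # AND narrows the search
library_or_database = term("library") | term("database")   # OR broadens the search
library_not_public = term("library") - term("public")      # NOT excludes documents

print("library AND database:", library_and_database)
print("library OR database:", library_or_database)
print("library NOT public:", library_not_public)
```

With this sample data, AND returns only the one document that mentions both words, OR returns every document that mentions either, and NOT removes the public library document from the library results.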

 

Evaluate

 

One of the main areas of information retrieval is evaluation, which C.J. van Rijsbergen (1975) defines as “the measurement of the effectiveness of retrieval” (p. 5).  Evaluation of information retrieval systems is a key proficiency for librarians and information professionals.  It is important for determining not only the potential usage of such systems, but also their efficiency in linking users with the information they seek.  An IRS may be evaluated in various ways.  It can be evaluated in terms of its ability to meet the needs and expectations of the users, the developers (technical), and the management (Manning, Raghavan, & Schütze, 2008).  For users, the IRS should be evaluated with regard to its effectiveness in providing the required information.  From a technical point of view, an information retrieval system can be assessed on its ability to carry out advanced searches and its ability to filter search terms, among other factors.  For the management, an information retrieval system can be evaluated on its value (i.e., cost effectiveness and return on investment).  Evaluation of such systems is useful to information professionals because they can use this skill to determine the value of systems for their patrons and for their own information seeking.  Evaluation is also useful for designers, as they can determine how a system will affect the information seeking of its potential users.

 

Evidence

 

Evidence 1: LIBR-202 Exercise #1: Querying Information Retrieval Systems

 

This assignment was created for LIBR-202: Information Retrieval.  This exercise was a hands-on experience with three information retrieval and management systems: Google Scholar, Library Literature & Information Science Full Text (H.W. Wilson), and RefWorks.  The exercise applied principles of query formation, using the query language specific to each system, to form ten queries on a topic of my choice.  To demonstrate success with query formation, we had to obtain five articles from each system, export the citations for each of these articles to RefWorks, and provide screenshots documenting these tasks.  In addition to forming queries and learning the specific requirements of each system’s query language, I evaluated each system by reflecting on the information seeking processes I employed with it and on the differences and similarities between the services.  This exercise gave me a hands-on understanding of the importance of query formation in information retrieval systems and a knowledge of the differences between systems, both good and bad.  Being able to effectively query and locate information, both for myself and for library users, is certainly a task that will be required of me in my future professional career.

  

Evidence 2: LIBR-202 Project #1: Designing an Information Retrieval System

 

This project, designed for LIBR-202: Information Retrieval, is a twofold articulation of a collection of objects and a description of the users who could benefit from a database built from this collection.  In the first part of the project, a theoretical discussion of classification, systems of classification, and standards is presented with relevant citations from course readings.  A collection of objects was chosen and described, along with a listing of specific attributes of that collection.  The first part of the project combines all of the aforementioned work into a chart which describes the individual objects of the collection in terms of the characteristics of each attribute.  The second part of the project is a description of the group of people who would benefit from a database composed of this collection of objects and a discussion of the specific information needs of this group.  A list of potential questions that individuals from this group would pose was formulated, along with a discussion of which attributes of the collection were needed to serve the user or to serve the design of the database.  This assignment helped me to link the theoretical concepts of classification, systems of classification, and standards to a practical application in describing a collection of objects.  It also helped me understand the importance of designing an information retrieval system around user needs.  While I don’t have plans to design my own information retrieval systems in the future, understanding how to do so will enrich my interaction with them in my professional career.

 

Evidence 3: LIBR-202 Project #3: Evaluation of an Information Retrieval System

 

This project, created for LIBR-202: Information Retrieval, is an examination and evaluation of the information retrieval system RefWorks.  The first part of the project is an analysis of theoretical concepts related to the design of information retrieval systems, as presented in course readings.  The second part of the project applies these concepts to an evaluation of the effectiveness of a particular functionality of RefWorks, a library-based information retrieval service.  In order to highlight the concepts discussed in the evaluation of RefWorks, screenshots were used to point out certain characteristics of the IRS.  This project helped me to define and understand fundamental theoretical concepts related to the design of information retrieval systems and provided practical experience in applying these concepts to evaluate the effectiveness of a system.  It also gave me a foundational understanding of how to conduct an evaluation of an information retrieval system, and of the reasons why certain things work or don’t.  Evaluation is a crucial part of assessing the usefulness of IRSs, and it is a skill that I will continue to use in my future career as a library professional in a public library.

 

Evidence 4: LIBR-210 Paper #1: Database Evaluation Letter to a Vendor

 

This paper was created for LIBR-210: Reference and Information Services.  The purpose of the paper was to create a hypothetical letter to a database vendor, written from the standpoint of a librarian.  For this assignment I wrote from the perspective of an academic librarian employed at San Jose State University who is interested in subscribing to the ProQuest database Ethnic NewsWatch.  The letter I composed includes notes about my experience with using the database in the areas of content, indexing, abstracting, search interface, navigability, and user support.  I specifically discuss both the strengths and weaknesses of the database, and provide suggestions for improvements to make the database more useful for student users at SJSU.  For example, I note that though the database makes efforts to provide materials in both English and Spanish, indexing and subject terms are only listed in English.  This essentially requires users to be bilingual in English and Spanish to use the database, even if they are limiting their searches to Spanish language works.  I suggest that subject terms be included in Spanish and other languages as the database continues to add more of these materials.  This assignment was a great exercise in critically evaluating a particular type of information retrieval system in relation to a variety of usability factors.  The format of writing a letter to the vendor directly from the perspective of a library professional also helped me to understand the importance of the evaluation duty that libraries have when deciding whether or not to purchase access to a database.  Evaluation skills for information retrieval systems will be something that I continue to use in my professional career, especially when retrieving information for users, teaching users how to use such systems, and making important decisions about purchasing these systems.

 

References

 

Bates, M. J. (2012). Understanding information retrieval systems: Management, types, and standards. Boca Raton, FL: CRC Press.

 

Manning, C. D., Raghavan, P., & Schütze, H. (2008). Introduction to information retrieval. Cambridge, UK: Cambridge University Press.

 

Marchionini, G. (1995). Information seeking in electronic environments. Cambridge, UK: Cambridge University Press.

 

Van Rijsbergen, C. J. (1975). Information retrieval. London, UK: Butterworths.

 

Evidence Files

 

Click to download the following files:

 

LIBR-202 Exercise #1: Querying Information Retrieval Systems

 

LIBR-202 Project #1: Designing an Information Retrieval System

 

LIBR-202 Project #3: Evaluation of an Information Retrieval System

 

LIBR-210 Paper #1: Database Evaluation Letter

© 2016 by Jennifer Archuleta Santure
