by Michael Kim, Past Chair (Published September 1, 2012)
Program: Web-scale Discovery Implementation with the End User in Mind
Moderator: Rafal Kasprowski, Rice University
Speakers: Rafal Kasprowski, Rice University; Debra Kolah, Rice University; Harry Kaplanian, EBSCO Publishing
Harry Kaplanian, the first speaker, began his presentation with a brief history of library information discovery services. By and large, machine-based library catalog services started in the 1980s. He argued that 1982 was a landmark year, when people began to see that library card catalogs might disappear. He identified the 1990s as the period when electronic databases, which are not part of a library’s physical collection, began to appear in libraries. With the introduction of electronic databases, Kaplanian argued, patrons no longer knew where to find electronic information easily. To help patrons, federated search became very popular after 2000. However, federated search services have many drawbacks, such as slow speed, incomplete result sets, and conflicts among the ranking algorithms of the multiple electronic databases being searched. After federated search, a newer generation of cataloging layer services became very popular. These services provide a single, fast index for all local content and return complete result sets. Their downsides, however, are that they are difficult to set up and are not tightly integrated with Library Information Systems (LIS).
Web-scale discovery is a more recent development that has become a “buzzword” over the past three to four years. Basically, these services provide a single, all-encompassing index of e-content that can be searched from a single search box. According to Kaplanian, there is no need to worry about relevancy ranking because vendors can adjust complex ranking criteria for customers, and the search will be “super-fast.” The downside is that the services may not be tightly integrated with the LIS. This generation of services may eventually be replaced by the next generation of LIS, called “Web-Scale Management” systems.
The next speaker, Debra Kolah, continued the discussion of web-scale discovery by focusing on its implementation at Rice University. She was part of the Resource Discovery Tools Working Group, a task force of eight members representing the Cataloging, Institutional Repository, Information Technology, Office of User Experience, and Electronic Resource Management teams, which was charged in December 2010 with finding a new discovery layer. The group’s interim report, submitted in March 2011, selected three products for trials. Its final report was completed in late July 2011. In the end, EBSCO’s EDS product went live in January 2012.
During the investigation, she consulted “Web Scale Discovery Services,”[1] a report that included very detailed analyses. In addition, she reviewed a few user studies written by Nancy Foster, Andrew Asher, and a team from Rice University.
Rafal Kasprowski followed with a third presentation that focused on the web-scale discovery product selection process at Rice University and made recommendations on what to look for in selecting a web-scale discovery product.
The advantage of web-scale discovery systems over federated search products is that they make records from content providers available from a single index. Additionally, search results are combined and ranked for relevance as if they came from a single source following the same metadata organization. Hence, it is crucial to check whether the web-scale discovery vendor actually provides this service.
It is also important to verify a vendor’s claims by finding out whether the product: 1. combines search results and ranks their relevance independently of each separate database’s scheme; 2. provides complementary federated searching in addition to a true unified index; and 3. uses a grand unified index of all records and presents a library’s individual holdings as a subset of the larger index.
Kasprowski explained that web-scale discovery systems can be hosted either by the vendor or by the library. In the vendor-hosted model, the service sits on an external vendor server as SaaS (Software as a Service). The locally hosted model, however, may give the library an opportunity for extensive and unique customizations and some code-level access to the service.
He also covered related issues, such as content coverage, the handling of institutional records, and the application of local practices. Kasprowski pointed to other considerations, including the quality of customer service and technical support, the number of customers with similar LIS and Institutional Repository software, the degree of integration with the vendor’s other services, and compatibility with remote access software (proxy, VPN, etc.).
Kasprowski acknowledged that it is difficult to ask all the right questions when dealing with the unknown. He suggested asking detailed questions, assuming as little as possible, and confirming one’s understanding by asking. He also stressed that it is equally important to resist the urge to find the ultimate product. All solutions are imperfect, and libraries may have to repeat a new product search every two to three years.
The session was sponsored by ProQuest & Dialog. The PowerPoint slides are available at http://www.slideshare.net/rkaspro
