{CHArt logo}

Digital Art History? Exploring Practice in a Network Society

Annette A. Ward, Margaret E. Graham, K. Jonathan Riley and Nic Sheen

Enhancing a Historical Digital Art Collection: Evaluation of Content-Based Image Retrieval on Collage

Keywords: CBIR - Content-based image retrieval, digital images, Virage® software

A growing number of users are visiting museum, library, archive, and other cultural heritage digital image collections online. However, searching for images using traditional text-based methods can be challenging. Enhanced software that retrieves images by colour, texture, and shape may be increasingly useful when words do not provide the desired results. Such software can also provide a novel way to browse through a database.

Content-based image retrieval, or CBIR, is a computer-derived technique for retrieving images based on elements such as colour, texture, and shape. It uses features of a selected painting, print, drawing, or other object to find visually similar images. CBIR locates matches in a collection regardless of whether they share keywords with the original image. Retrieval of images by keywords, subject descriptors, or indexing terms is not CBIR, even if the keywords describe the content of the image.

As part of a research project funded by Resource: the Council for Museums, Archives and Libraries (CMAL/RE/103), commercial CBIR software was added to the digital image websites at three institutions: the British Library, the British Broadcasting Corporation (BBC) and Collage (collage.cityoflondon.gov.uk/), the Corporation of London Guildhall Library and Guildhall Art Gallery website. The goal of this research was to assess the effectiveness of content-based image retrieval through a systematic evaluation while giving users access to images without relying on indexing terminology. The following sections explain the integration of content-based image retrieval on the Collage site and highlight some results of the online user questionnaire.

Selection of Operational Sites and Collaborators

Of critical importance in meeting project objectives was selecting collaborative partners that met specific criteria which included a digitized image collection of at least 10,000 images, a progressive outlook toward information retrieval, and willingness to adopt new technology. Since technical expertise and valuable data were shared, it was important to establish a relationship based on shared goals, outcomes, and trust. Confidentiality agreements and consensus regarding delegation of work were also required.

It was also important that each collaborator benefit from the partnership. In this instance, the Guildhall enhanced their reputation by offering advanced search capabilities. In addition, demographic information supplied by respondents to the online questionnaire provided a user profile that helps Guildhall staff meet client needs and develop services and products. It was especially important that the Guildhall did not incur additional expenses for any of those activities. Costs for implementing the project were shared between the Institute for Image Data Research and iBase, freeing Guildhall resources to accomplish their original mission. Data collected regarding the users' assessment of Collage and content-based image retrieval provides essential information for researchers and software developers to evaluate technology that refines searches. This, in turn, is important to the future of information retrieval.

The original medieval library at the London Guildhall was founded in the 1420s through the will of Richard Whittington. The modern institution dates from 1824 and includes the Print Room, the Guildhall Library and the Guildhall Art Gallery. The Guildhall Art Gallery houses over 4,000 paintings of which approximately 250 are exhibited at any one time. Together, the collections from the Print Room and the Guildhall Art Gallery present an extensive and comprehensive view of London and the foundation for Collage. The collection includes paintings, engravings, maps, photographs, prints, and drawings. Most, but not all, relate to London. The majority of prints and drawings include topographical views of London from the 17th to the 20th centuries. There are extensive collections of satirical prints, panoramas, images related to social themes, and 16th-century maps. An unrivalled collection of London maps dating from the 16th century to the present includes parish, ward, borough, and thematic maps.

Collage was conceived in 1995 in order to digitize the collection of the London Guildhall Library. By 1997 the database contained 36,000 images. The next phase of the project delivered public access to the images within the Print Room at the Guildhall Library via a simple user interface. This in-house database was redesigned for the Internet at the end of 1998. Due to copyright restrictions, visitors to the web version of Collage may access 22,400 images of the total 36,000. At the time, Collage was the largest digital imaging project in Europe.

iBase has provided the software for Collage since its inception, working closely with the London Guildhall Library to provide technical expertise. iBase Image Systems was formed in 1992 as the result of collaboration between the founding directors and the National Museum of Film, Photography and Television to provide an image database for an early photographic collection of 1920s racing cars for Zoltan Glass. iBase developed a revolutionary hard disk solution that was adopted by other UK heritage institutions, including the British Library and the Natural History Museum.

The Institute for Image Data Research (IIDR) at The University of Northumbria in Newcastle upon Tyne was established in 1997 as a multidisciplinary research institute integrating researchers from computing, information and library management, psychology, art history, philosophy, apparel marketing, business and management, and engineering to study the relationships between images, computer technology, and people. Research has investigated aspects of special relevance to art history including: image retrieval; how people search for, retrieve, and use images; impact of digital images on art historians; developing and testing software for retrieving trademark and historic watermark images; and assessing the feasibility of applying content-based image retrieval software to a variety of applications such as the one described here.

Technical Application of Content-based Image Retrieval Software to Collage

Application of content-based image retrieval software to the web-based version of Collage required a smooth transition with no confusion to users and no disruption to the standard web-based services. Additionally, it was important to establish use of CBIR as voluntary, provide a simple mechanism for users to test CBIR within the Collage website, allow users the option to specify preferences on search parameters, return results of the CBIR search in a format familiar to Collage users, encourage users to complete the online questionnaire, and develop a system that would work with common browsers.

Development of the site also imposed technical considerations and constraints. For example, regular activities of the London Guildhall Library could not be disrupted by the project, service of the Collage site could not be adversely impacted during CBIR software application, performance of the Collage site could not be diminished by introduction of CBIR, the user interface had to be independent of the CBIR software, allowing the potential for testing different types of CBIR software, and the CBIR software had to be held on the Institute's server.

System Design

Application of the CBIR software to Collage comprised five integrated stages. General testing of the site was simplified because of the modular architecture of the computer system whereby components could be developed and tested before integrating the complete system.

Creating the Database.
Approximately 31,000 images on 6 CDs were processed.

Designing the Interface to Initiate the CBIR Search.
Content-based image retrieval software from Virage® was linked to the Collage system to perform image analysis and image comparison. Image analysis was executed only once for the entire image database resulting in feature vector information, a mathematical representation of the visual content of an image. Image comparison for this database involved assessing three feature vectors (colour, texture, and shape) resulting in a score that quantifies image characteristics. Sizes of the feature vectors vary slightly but are approximately 1.3 KB for each image, producing a database of over 44 MB for all of the images and taking several hours to generate.

Conducting a CBIR Search.
In order to activate content-based image retrieval, the user must begin with the traditional Collage search mechanism (Fig. 1). A word must be entered in the search box or one of several search categories must be selected. When 'Cries of London' was entered in the search box, the text-based search yielded 35 pages, of which two are shown in Fig. 2.

Collage search

Fig. 1. A traditional Collage search must be conducted before beginning a CBIR search.

text-based search

Fig. 2. A text-based search for 'Cries of London' yielded 35 pages of which two are shown.

Communication between the user and the Collage server in a traditional search is fairly simple; the addition of CBIR software makes it more complicated. When the user selects a single image for closer inspection, a 'Standard Visual Search' or an 'Advanced Visual Search' may be conducted using the links below the image (Fig. 3). If a user conducts a 'Standard Visual Search,' the search parameters are set at default values. Colour is set at 1, visual texture at 60, and shape at 30 out of a possible 100. Testing of the 'Standard Visual Search' determined the values that produced the most desirable matches. Since many images in Collage are not colour images, these settings, which give colour the least weight, were found to be most effective.

specific image search

Fig. 3. A Standard Visual Search or an Advanced Visual Search may be conducted once a specific image has been selected from a traditional Collage search.

When an 'Advanced Visual Search' is conducted, the user ranks 'colours in the image,' 'visual texture in the image,' and 'shapes in the image' on a five-point Likert-type scale indicating the characteristic as 'not at all important' to 'very important'. Values are assigned to each button on the scale with 'not at all important' set at 1, and subsequent buttons set at 20, 40, 60, and 80. The software requires that each variable has a value greater than zero, thus 1 is assigned to indicate practically no importance.
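The mapping from scale position to weight can be sketched as follows. This is a minimal illustration rather than the project's actual code: the names `LIKERT_WEIGHTS` and `advanced_weights` are hypothetical, and only the five weight values are taken from the description above.

```python
# Hypothetical sketch: only the values 1, 20, 40, 60, 80 come from the text.
# Position 1 is 'not at all important'; position 5 is 'very important'.
LIKERT_WEIGHTS = (1, 20, 40, 60, 80)


def advanced_weights(colour_button, texture_button, shape_button):
    """Convert three button positions (1 to 5) into search weights."""
    buttons = {
        "colour": colour_button,
        "texture": texture_button,
        "shape": shape_button,
    }
    weights = {}
    for feature, position in buttons.items():
        if not 1 <= position <= 5:
            raise ValueError(f"{feature}: button position must be 1 to 5")
        # The lowest weight is 1, never 0, because the software
        # requires every variable to be greater than zero.
        weights[feature] = LIKERT_WEIGHTS[position - 1]
    return weights
```

For example, a user who rates colour 'not at all important', texture at the midpoint, and shape 'very important' would, under this sketch, send weights of 1, 40, and 80.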

When CBIR is initiated, the identification number (ID) of the selected image is read and sent, with the corresponding weights for colour, texture and shape, to the IIDR server (Fig. 4). The feature vectors for the image are extracted from the database and, using the weightings, compared with all of the other image feature vectors in the database to produce a list of distance measures. The IDs are then sorted to create a list of image IDs in order of similarity. Searching all the images and producing the results takes about 10 seconds. In order not to affect the general performance of Collage, the CBIR engine was placed on a second server. This moved a heavy processing load away from the main Collage database, which is maintained exclusively by iBase, and shifted the maintenance and monitoring of the CBIR segment to the Institute for Image Data Research.
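The comparison and sorting step might look something like the sketch below. The Virage® comparison method is not described in this paper, so the distance used here (a weighted sum of Euclidean distances, one per feature) is an assumption chosen only for illustration. The function name `rank_by_similarity` and the layout of the `features` dictionary are likewise hypothetical.

```python
import math


def rank_by_similarity(query_id, features, weights):
    """Return all other image IDs, most similar to the query first.

    `features` maps image ID -> {"colour": [...], "texture": [...], "shape": [...]}.
    `weights` maps feature name -> weight, e.g. {"colour": 1, "texture": 60, "shape": 30}.
    ASSUMPTION: the distance is a weighted sum of per-feature Euclidean
    distances; the commercial software may compute it differently.
    """
    query = features[query_id]
    scored = []
    for image_id, vectors in features.items():
        if image_id == query_id:
            continue  # the original image is not compared with itself
        distance = sum(
            weight * math.dist(query[name], vectors[name])
            for name, weight in weights.items()
        )
        scored.append((distance, image_id))
    scored.sort()  # ascending: smallest distance means closest match
    return [image_id for _, image_id in scored]
```

Because the feature vectors are computed once in advance, a search of this kind only has to compute and sort distances, which is consistent with the whole database being searched in a matter of seconds.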

CBIR/Collage system design

Fig. 4. Logical design of the CBIR/Collage system illustrating a text-based search (A) and the distribution of CBIR processing between the Collage website and the Institute for Image Data Research at University of Northumbria (B).

Displaying results from the CBIR search.
At the end of a CBIR search, distance measures for the entire Collage database are sorted in ascending order. However, only the first 18 images are collected for potential viewing and only 8 in addition to the original image are displayed (Fig. 5). The 10,000 images for which the Guildhall does not hold the copyright are excluded from transmission over the web. A surplus of returned images ensures that enough remain for viewing once these copyright-restricted images are filtered out. Additionally, because Collage displays 9 images with its traditional searches, it was decided that the CBIR results page should use the same nine-image format. Four pages of results from a text-based search are compared with actual CBIR results in Fig. 6.
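The selection of images for display can be sketched as below. This is an illustration under stated assumptions, not the deployed code: the function name `select_for_display` and the set `web_permitted` (the IDs of images that may be shown on the web) are hypothetical. The figures of 18 candidates and 8 displayed matches come from the description above.

```python
def select_for_display(ranked_ids, web_permitted, candidates=18, shown=8):
    """Pick the matches to show alongside the original image.

    `ranked_ids` is the full list of image IDs, closest match first.
    `web_permitted` is the set of IDs that may be transmitted over the web.
    """
    # Collect a surplus of the closest matches...
    shortlist = ranked_ids[:candidates]
    # ...drop any that cannot be shown on the web, keeping rank order...
    permitted = [image_id for image_id in shortlist if image_id in web_permitted]
    # ...and keep at most eight, so that the page holds nine images
    # once the original image is included.
    return permitted[:shown]
```

If more than ten of the 18 candidates were restricted, fewer than eight matches would be returned under this sketch. The paper does not say how the live system handled that case.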

CBIR search results

Fig. 5. Results of the CBIR search are displayed using a gallery format similar to the traditional Collage search.

comparison CBIR/text-based search

Fig. 6. Four pages of a text-based search are compared with one page of CBIR results.

Testing on the Web.
During project development, Collage was made available on the Institute's network so that project and other researchers could test and evaluate the software. While the computer system was constructed, the online questionnaire was developed. Questions were devised to assess CBIR functionality and satisfaction, the Collage service, and demographic information about the users. The format of the questionnaire was designed to match the original Collage site and was pilot tested on browsers to ensure compatibility on alternative systems.

Results and Analysis

Recruitment to the site was promoted in November and December 2000 by distribution of brochures at classes at Northumbria University and at professional conferences. Once the initial recruitment phase was complete, the site was left undisturbed while data collection progressed. Responses to the questionnaire were collected online and were transferred to a Microsoft® Excel spreadsheet and input into a file for analysis using SPSS 10.0. Frequencies, percentages, and means were computed on the questionnaire. Although data are still being collected via the online questionnaire, results presented here include responses received up to 4th February 2002.

Respondents.
The 181 visitors to the site who completed the questionnaire were almost evenly split between females (81 or 45%) and males (84 or 46%). Age was distributed throughout the categories with the greatest percentage (29%) aged 46-55. Similar numbers of respondents were reported for age ranges 26-35, 36-45, and 56-65. Over half of the respondents indicated having an undergraduate or postgraduate degree with nearly one-third of the respondents possessing a postgraduate degree. Students comprised 16% of the respondents whereas 39% reported full-time employment, and 22% reported they were retired.

Respondents reported a range of professions with no apparent groupings emerging. 'Retired' (11%) and 'education' (10%) were the most frequently reported categories. Sixty-two percent of the respondents were UK residents with 22% USA residents. However, there were also visitors from Australia, Canada, France, Germany, Austria, Poland, Portugal and even Nepal.

Over one-third (38%) of the respondents had visited the Collage site before and about one-half reported using the internet two or more times a day. Twenty-two percent reported finding Collage through a link from another site, and about the same proportion (roughly 17% in each case) found it through a web search, were told about it, or found it by accident. Visitors to the Collage site came for various reasons. About one-third (34%) came just to look, about one-third came to look for a particular print, and about one-third came to research the art collection. Some came for more than one reason.

Ratings of CBIR Retrieval.
Respondents indicated that nearly one-half (48%) of all the images retrieved by the CBIR software were good matches to the original, whereas about 39% of the images were not. CBIR software returns images in order from the nearest match to the least similar. Thus, it is not surprising that the number of respondents who indicated images were a good match generally decreased from image 1 through image 8.

Ease of Visual Image Search.
The Standard Visual Search was conducted by 74% of the users and 37% conducted the Advanced Visual Search. Some did both. Over 85% of the users 'agreed' or 'strongly agreed' the search was fast and easy to use. Although over 50% concurred that the number of images returned was adequate, nearly a quarter (23%) 'disagreed' or 'strongly disagreed'. This confirms results of another question that asked respondents about the number of images they preferred returned. In that question, almost one-half of the users (49%) preferred more images returned with nearly 24% wanting more than 18.

Usefulness of Visual Image Search.
There were some interesting results regarding the usefulness of the visual image search. Forty percent of the users 'agreed' or 'strongly agreed' that they found what they wanted by using the visual search, and 56% responded that results of the search met their expectations. Over 65% indicated results were useful. Of special importance is the rating regarding whether the respondents would like to use the visual image search again. Nearly 80% indicated positively and only 10% disagreed.

Satisfaction with Visual Image Search.
Nearly one-half of the respondents (48%) 'agreed' or 'strongly agreed' that the visual image search was better than a word search. Only 16% of respondents provided a negative rating of this item. About one-half (52%) were satisfied with the results. Over three-quarters (79%) of the respondents indicated that results were interesting and 73% of the users responded that the visual image search was a good method to retrieve images. Approximately 52% 'agreed' the search was fun to use as compared with the 44% who were negative or neutral. Clearly, reaction is mixed; however, none of the respondents indicated they 'strongly agreed' with the statement that the search was fun to use.

Results taken alone may not portray an accurate picture of the usefulness of the CBIR technology. Although respondents reported CBIR was inconsistent in retrieving similar images nearly 40% of the time, respondents also indicated results were interesting and CBIR was good for image retrieval. Subjects for this study indicated value in using the technology even when the retrieved images were not always 'similar'.

Conclusions and Recommendations

Searching digital images will become more challenging as collections and users continue to increase. Growing collections necessitate image retrieval that expands search capability beyond traditional word-based indexing. Content-based image retrieval offers a solution and has great potential. With traditional searching, the end-user may not always be able to define the image for which they are searching. Perhaps the user's language is different from that used in the text-descriptors of the image collection, or the user may be challenged with learning differences that make using indexing terms difficult. With respect to new applications, visual searching may be useful to aid indexing of large collections. Simplifying cataloguing by identifying similar images may be an economical application.

It may be necessary for computer technologists to rethink how they evaluate the success of CBIR. Mathematical modelling of the similarity of retrieved images may not present the best picture. End-user evaluations are critical to evaluating enhanced technology.

Enhancements to CBIR can still be made. Allowing the respondent to specify the number of returned images from the CBIR search will provide users with the opportunity to peruse a greater selection of items if they choose. The ability to save the search criteria for future reference will also be important. Combining the option of using text-based descriptors with CBIR can provide an innovative way of image retrieval that integrates two strong retrieval devices. Additionally, using CBIR technology on areas of an image to extract specific elements would be an exciting advance, as well as a useful tool.

Those 'serendipitous' results that often displease the computer programmer may be appreciated by end-users. An artist explained that if she wanted the exact image back, she would not do a CBIR search; she would stick with what she had in the first place. Variation of returned images may be desirable. CBIR may supply the user with alternative images that may stimulate creative thought and inspiration, and may lead the user along a different search and, ultimately, a more productive path.

November 2002