Workshop on Personal Photo Libraries
17th Annual Symposium and Open House, Human-Computer Interaction Laboratory, University of Maryland at College Park
Juey Chong Ong
My father took plenty of travel pictures when I was a child. Each time we came home from a trip, rolls of Kodachrome slide film would be sent for processing. Several weeks later, they would be returned in yellow plastic boxes and my father would haul out the projector and screen for a visual feast of another great family adventure. The slides then went into biscuit tins, rarely to see the light of day (or projector) again. Today, 20 to 30 years later, they still sit in the same biscuit tins in Singapore: well preserved, but lacking any identification except maybe for some label on the box that indicates the travel destination and — sometimes — the date of the trip.
The photographs of many families and individuals suffer a similar fate. Indeed, the notion of storing photographs in shoe boxes is so popular that department stores sell shoebox-sized photo storage boxes made of archival-quality paper board.
The photo and imaging industry offers many tools for archiving and preserving our photographs. Today a quality photofinisher will even return processed prints and negatives in archival-quality storage envelopes. But the industry offers little in terms of helping us create libraries of our photographs, i.e., collections that can be used and enjoyed. Prints are usually returned without negative numbers and dates. It’s up to each of us to number, label, annotate, caption and organize each print, a painstaking task that is usually left undone.
With the advent of digital cameras, photographs are now stored on fixed and removable digital storage media (possibly kept in shoe boxes again!). But again, the solutions we see coming from the industry today mostly emphasize photo capture and storage, and offer little in terms of building photo libraries. Today’s digital cameras record the time, date and exposure data; some even permit a short audio comment to be recorded. However, there is little if any software that adequately uses this data to catalog and search for images. The hardest part, annotating an image with keywords and comments, remains an arduous task. Creating digital photo libraries will not be any easier without better tools.
With these concerns in mind, Ben Shneiderman and Catherine Plaisant organized the first Workshop on Personal Photo Libraries: Innovative Designs as part of the 17th Annual Symposium and Open House of the Human-Computer Interaction Lab at the University of Maryland. This one-day workshop was held on June 1, 2000 and attracted active participation from a group of about twenty attendees. The attendees were remarkable in breadth and scope: they hailed not only from academia, but also from major companies in the imaging and computer industry, and from as far away as China. They comprised not only software and hardware designers and manufacturers, but also users and creators of photo libraries. They also stretched the notion of a personal photo library: while the organizers of the workshop were primarily interested in the issues of organizing photographs created by the person maintaining the library, some participants were wrestling with the issues of photographs selected for a person’s use from a large archive of images captured by others.
The first speaker was Elizabeth Rosenzweig from Eastman Kodak. She noted that currently available software for organizing digital photos already held many of the needed solutions, but none offered the complete solution. In many cases, the missing key element is fun.
She then gave a design overview of an ongoing project at Kodak called the Electronic Shoebox Browser. The Browser is designed to help a family organize their photographs according to the five basic storytelling principles: who, what, where, when and why. She envisions a three-dimensional user interface to accommodate the tens of thousands of images that a family could produce over the years. Using various familiar metaphors and objects (calendars, time lines, maps, icons, etc.), the family members customize their interface according to their needs and fancy. While early versions of the system will likely rely heavily on user input for image annotations and metadata, Rosenzweig also touched on several follow-on projects that will attempt to alleviate this time-consuming but critical task. The first step is to put in place the organizational structure for the images based on the four dimensions afforded by the “who”, “what”, “where” and “when”. Capturing the “why” of an image is an interesting but difficult task that will be tackled later on.
The next speaker, Norunn Mikkelson from Microsoft, gave us a preview of a new photo cataloging feature to be incorporated in a future version of Picture It!, a consumer photo editing software package from Microsoft. According to Mikkelson, Picture It! has been the best-selling photo editing software worldwide since its introduction in 1995. As a usability engineer for the product, her research has found that its users are storing hundreds of images on their hard drives and on removable storage media, often with no particular filing scheme. The photo cataloger in the new Picture It! is designed to make all those images accessible by storing thumbnail-sized copies of them together with their annotations and metadata. Thus, although an image may not be on the user’s hard drive, and its directory path may not be known, a search performed through the photo cataloger would still locate it. Unlike much current cataloging software, which allows an object to belong to only one category (a number of MP3 jukebox programs come to mind), images in the photo cataloger may be assigned multiple categories and keywords. To ease the task of annotation, multiple images may be selected at once and assigned the same metadata in a single batch.
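The cataloging ideas described above can be sketched as a small data model: thumbnails and metadata live in the catalog itself, so a search succeeds even when the original file is on removable media. This is purely an illustration of the concept; the class and field names are my own inventions, not Microsoft's API.

```python
# Hypothetical sketch of a thumbnail catalog: searches run against the
# catalog's own metadata, so offline originals can still be found.

class CatalogEntry:
    def __init__(self, source_path, thumbnail):
        self.source_path = source_path   # may point at removable media
        self.thumbnail = thumbnail       # small copy kept in the catalog itself
        self.categories = set()          # an image may belong to many categories
        self.keywords = set()

class PhotoCatalog:
    def __init__(self):
        self.entries = []

    def add(self, source_path, thumbnail):
        entry = CatalogEntry(source_path, thumbnail)
        self.entries.append(entry)
        return entry

    def batch_annotate(self, entries, categories=(), keywords=()):
        # batch input: assign the same metadata to several selected images
        for e in entries:
            e.categories.update(categories)
            e.keywords.update(keywords)

    def search(self, term):
        # matches metadata, not the file system
        return [e for e in self.entries
                if term in e.keywords or term in e.categories]
```

Batch annotation is what keeps the scheme workable for hundreds of images: selecting a whole roll and tagging it "vacation" in one gesture costs almost nothing.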
Ben Shneiderman noted that contributing to the lack of annotation in personal photographs is that not only is annotation a painstaking task, the perceived value in the annotation is also not clear. Are you annotating a personal photo library for yourself, or for others? Do you annotate every single photograph (which makes it even more painful) or do you pick and choose only your favorite ones (which lessens the value of the library to others)?
The problem of perceived value does not cross the minds of commercial photo libraries, where the need to zero in on a particular image could decide the viability of the entire enterprise. Tom Kennedy and Stephen Cook from the Washington Post and Newsweek Interactive next presented their perspective on the problem. Unlike the archetypal personal photo library builder/user, their organization generates and acquires thousands of press photographs each year. To date, they have built a digital library of about half a million images dating back to 1986, which represents a small portion of their 100-year-old collection. These images are used by photo editors who need to locate pictures to use in articles, or to produce a “package” of images for a Web site. Kennedy said that their photo editors preferred to search for photographs by browsing through them visually, so the interface was designed to facilitate this task by pulling up thumbnail images quickly. To make it easier for photo editors, their query system accepts natural language queries. Kennedy stressed that the search system also accounts for synonyms of the input search criteria.
Unlike most smaller collections, theirs also posed a backend challenge: keeping an open software architecture that can tap into their older image management systems and integrate with future ones. This is an area from which builders of personal photo library software can learn as the range of image capture and storage solutions evolves and expands. Some workshop participants mentioned the development of a new XML schema for image data.
Next up were several presentations showcasing personal photo library interfaces and research from the HCIL. First, Hyunmo Kang presented PhotoFinder, a software application developed by Kang, Catherine Plaisant and Ben Shneiderman with funding support from Intel Architecture Labs. PhotoFinder addresses the problems of browsing and annotating personal photo collections. Say you have photographs of a group of people (perhaps colleagues and strangers attending a conference, or a family reunion with relatives you see only once in three years). Annotating such an image with keywords could tell you who was in the photo, but you might need descriptive comments to indicate that the person wearing the red sweater was John while the woman on the far right was Linda. That is a daunting task for many of my own photographs, which show 100 or more people standing on the stage of a concert hall. PhotoFinder implements a method which the authors call direct annotation (the University of Maryland is seeking to license this). You simply drag a name or label from a list and drop it on or next to the person or object in the photograph you want to identify. If you have a batch of photographs to label, simply select all the photos and drag the label over the selection.
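The drag-and-drop labeling just described amounts to a simple data model: an annotation is a label plus the position where it was dropped, so the record captures both who and where. The sketch below is my own illustration of that idea, not the licensed PhotoFinder code, and every name in it is invented.

```python
# Minimal sketch of "direct annotation": a label dropped on a photo keeps
# its drop position, so "John" can sit next to the man in the red sweater.

class Annotation:
    def __init__(self, label, x=None, y=None):
        self.label = label
        self.x, self.y = x, y  # drop position; None for whole-photo labels

class Photo:
    def __init__(self, filename):
        self.filename = filename
        self.annotations = []

    def drop_label(self, label, x=None, y=None):
        self.annotations.append(Annotation(label, x, y))

def drop_label_on_selection(photos, label):
    # dragging a label over a multi-photo selection labels every photo at once
    for p in photos:
        p.drop_label(label)
```

Keeping the coordinates alongside the label is what distinguishes direct annotation from an ordinary keyword field: the label can be redrawn over the photo exactly where it was dropped.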
In addition to implementing keyword queries, PhotoFinder also applies established visualization techniques to searching and browsing photographs. For example, a histogram display can plot images by date along the x-axis, with the y-axis showing the number of images. The actual thumbnails of the images satisfying the two variables are used as the data points on the graph. A quick glance at the subjects depicted in the thumbnails can reveal a lot about the content of an image collection.
Following the PhotoFinder demonstration, Shneiderman and several HCIL students presented the results of two usability experiments. The first tested the effectiveness of direct annotation in PhotoFinder against “traditional” text entry into an input box. The second asked whether users preferred to view all the images in a collection as tiny thumbnails without scrolling the display window, or as larger thumbnails with scrolling, and how this affected browsing speed. The experiments found that direct annotation was preferred over text entry, although text entry was slightly faster (not surprising if subjects have good typing and computer skills). Browsing non-scrolling thumbnail displays was faster, but users preferred larger thumbnails with scrolling for medium (about 36 photos) and larger sets of images.
A third experiment was designed to compare and evaluate the PhotoFinder against several publicly released software packages. The experiment confirmed the hypothesis that PhotoFinder’s multiple visual views of an image collection allowed users to locate images more quickly and with greater satisfaction. I am reminded of Tom Kennedy’s mention in his earlier presentation that his users prefer to browse visually. It looks like visual browsing is an important element of a photo library and the topic deserves further research.
A lunchtime walk through the University of Maryland campus provided great opportunities for informal (and informative!) discussions and a chance to show how serious the participants were about digital photography. When one person whipped out a camera for an impromptu group photo, three other participants also went for their cameras, all digital of course.
This led to the presentation by Henry Lieberman from the MIT Media Lab. Lieberman noted that when Kodak introduced its first camera in 1888 with the slogan “You push the button, we do the rest”, it literally meant that there were human agents in the photofinishing business to assist the photographer after the picture was taken. Today, the photographer is largely responsible for “the rest”: knowing when to remove the digital film from the camera (I lost a batch of images recently due to poor insertion of the memory card in the reader), how to extract its contents onto the hard drive, and so on. With so many non-task-oriented issues to deal with, annotation is probably the furthest thing from the mind of a user who is concentrating on just being able to see the images on the computer.
Lieberman’s answer to the problem is to deploy a software agent that combines the task of both annotation and retrieval. The demonstration of ARIA (Annotation and Retrieval Integration Agent) took the form of an e-mail program. The agent continuously monitors the user’s typing, searching for images that might be relevant to the content of the message and displaying them beside the message window. Images are thus constantly displayed and updated to the right of the text. The user then simply drags the desired image into the message area. The user can also drag words onto an image to annotate it with those words. Thus anything in the text that describes the image can then be used to further annotate the selected image: the annotation and retrieval interface is practically invisible. It does appear that unless there is some external information that can be tied into the photo (e.g. location data from a GPS receiver in the camera), some initial annotation is still needed. To make this easier, the application also includes support for reading memory cards from digital cameras, alleviating the need to switch to a different application. Using these approaches, the most frequently used images will probably be the most richly annotated, which automatically solves the problem of the perceived value of annotation.
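The core of the loop described above can be sketched as keyword overlap: score each image by how many of its annotation words appear in the message being typed, and let words dragged onto an image become new annotations. This is my own assumed reconstruction of the behavior, not MIT's ARIA code; the function names and scoring are invented for illustration.

```python
# Rough sketch of an ARIA-style annotate/retrieve loop: retrieval ranks
# images by word overlap with the message; dragging words back onto an
# image enriches its annotations, so use itself improves the library.

def tokenize(text):
    return {w.strip(".,!?").lower() for w in text.split()}

def rank_images(message_text, library):
    """library: {filename: set of annotation words} -> filenames, best first"""
    words = tokenize(message_text)
    scored = [(len(words & anns), name) for name, anns in library.items()]
    return [name for score, name in sorted(scored, reverse=True) if score > 0]

def annotate_by_drag(library, filename, dragged_words):
    # dragging words from the message onto an image adds them as annotations
    library[filename] |= tokenize(" ".join(dragged_words))
```

Calling `rank_images` on every keystroke is what makes the retrieval interface "invisible": the candidate images simply track whatever story the user is currently telling.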
Another semi-automatic approach to organizing and retrieving images was proposed by Wenyin Liu from Microsoft Research in China. Liu’s team works from the premise that retrieving images by their low-level feature similarities may help find images like the one the user is searching for. Of course, this brings up some spurious results (e.g. a search for sunsets might also return an image of an orange ball against a red background). The user’s acceptance, rejection or ranking of such similarity-retrieved images then serves as feedback indicating the relevance of each image to the search criteria. This relevance feedback is stored as annotation on the images, in addition to helping the system refine the search results. The semi-automatic approach is much more efficient than tedious, labor-intensive manual annotation.
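A minimal sketch of the idea, under stated assumptions: images are represented as low-level feature vectors, retrieval ranks them by cosine similarity, and the feedback step here is a Rocchio-style query update (a standard relevance-feedback technique I am using as a stand-in; Liu's actual system may work quite differently).

```python
import math

# Illustrative feature-based retrieval with relevance feedback: accepted
# results pull the query vector toward themselves, refining the next search.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def search(query, images):
    """images: {name: feature vector} -> names ranked by similarity"""
    return sorted(images, key=lambda n: cosine(query, images[n]), reverse=True)

def refine(query, relevant_vectors, alpha=0.5):
    # Rocchio-style update: blend the query with the centroid of the
    # images the user accepted; the acceptance can also be stored as
    # an annotation on those images.
    if not relevant_vectors:
        return query
    centroid = [sum(vals) / len(relevant_vectors) for vals in zip(*relevant_vectors)]
    return [(1 - alpha) * q + alpha * c for q, c in zip(query, centroid)]
```

The orange-ball-versus-sunset confusion falls out naturally here: the two share low-level color features, and only the user's feedback teaches the system which one was meant.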
The final three presentations dealt with the presentation interface of a personal photo library. Andrew Sailus from Eastman Kodak started the set by presenting the results from his group’s efforts at benchmarking the current crop of photo organization and album page generation software, retelling many of his own experiences using those products to print and produce his own photo albums at home. Andrew’s interest is in producing physical prints of digital images on paper, which is still a very important medium for sharing photographs.
In contrast, Intel’s Grand Canyon project, presented by John David Miller, tackles new and novel ways of presenting photographs. Miller recognizes that albums and slide shows must still be assembled after a set of photographs has been successfully retrieved, and that editing and assembling a presentation can be as painful as the original organizing tasks, depending on the user’s inclination. The Grand Canyon project explores the advantages of digital media in three ways:
- Automatic layout and presentation techniques can be applied. The Grand Canyon prototype application arranges photographs and text headlines in an animated 3D collage. The user meanders along a time tunnel, where content from personal photo albums mixes with news content to help them remember and re-experience the essence of that history.
- External knowledge of related events can enrich the presentation. For example, a presentation of family photographs over the years could be juxtaposed with headlines or news articles from the same period to provide some context for the hairstyles and clothing worn and the activities depicted in the photographs.
- Graphical representations and animation can engage the viewer more closely and make a more compelling presentation than static album pages or a simple slide show.
The final presentation, by Chia Shen and her colleague Baback Moghaddam, outlined the ongoing research at MERL (Mitsubishi Electric Research Lab) to explore new physical interfaces and ways of using images and other media for people to share experiences via storytelling. Recognizing that much of today's effort concentrates on authoring and searching, the Personal Digital Historian project emphasizes the exploration of one's photo collection and the sharing of experiences. Their exploratory, interactive interface will be supported by visual navigation mechanisms among people, location, time and event planes.
The PDH project is a long-term project and investigates areas including the design of media sharing devices and spaces (e.g. a table-top display in the living room coffee table), seamless storytelling and navigational tools that eliminate the distinction between authoring and presenting, active listening software that automatically retrieves related information, content-based retrieval using facial recognition and other image-based techniques and even using data mining techniques to enhance the discovery and presentation of certain events that we might have overlooked.
In conclusion, the workshop was a great opportunity for airing many different ideas about how we might create electronic shoeboxes for digital photography that don’t gather dust like those photographs in paper shoeboxes. Several common themes emerged.

The participants agreed that the emphasis in personal photo libraries is on storytelling and the sharing of experiences (Rosenzweig, Lieberman, Shen, Shneiderman). New or improved styles of presentation and layout can enhance the experience, especially if the presentation can be generated with some automation by the computer (Sailus, Miller, Liu, Shen, Shneiderman).

To ensure that images don’t gather virtual dust on the hard drive, it is important to annotate and add metadata to each image. Because annotation is the most painstaking task facing the photographer/librarian, much effort is being spent on more productive and less painful user interfaces for annotation, as well as on semi-automatic methods: the computer might continuously analyze the actions and stories you are telling with the images and use that data for automatic annotation, or it might analyze the content of the image itself.

And of course, the greatest test of any electronic shoebox is how well you can retrieve the images from it. We saw the advantages of 2D and 3D visualization techniques for browsing images as well as for presenting them. Several participants also presented the potential of content-based retrieval using facial recognition and other content-analysis techniques, and pointed out that the user’s interest or disinterest in the images retrieved by these semi-automatic methods can be used as relevance feedback that becomes part of the image metadata.
The tightly packed schedule of the workshop did not permit deeper discussion of each topic. But this appears to be an area where the questions and issues are well established but solutions are lagging. Recognizing the value of this workshop, the participants are already looking forward to the next one, as well as thinking of bringing the issues to a wider audience by organizing a panel at CHI 2001. I would also encourage the organizers to consider bringing the issues and findings to a wider audience at venues like the International Center of Photography, Photo-Plus East/West or the Photo Marketing Association’s annual show.
The author would like to thank the organizers and workshop participants for reviewing and editing this report.
Workshop presentation abstracts, slides and a complete list of workshop participants are available at
About the Author
Juey Chong Ong is a senior staff member at Digital Image Design Incorporated. He is also a photographer and sings in the New York Choral Society.
Juey Chong Ong
Digital Image Design Incorporated
72 Spring Street Fl 6
New York NY 10012
Tel: +1 (212) 343 2442 ext. 225